[tor-dev] torbutton user agent update scheme idea

Justin Findlay jfindlay at gmail.com
Sat Nov 9 20:45:52 UTC 2013

Sorry for always picking random stuff from the volunteer page, but 
having read this,

Programs like Torbutton aim to hide your browser's UserAgent string by 
replacing it with a uniform answer for every Tor user. That way the 
attacker can't splinter Tor's anonymity set by looking at that header. 
It tries to pick a string that is commonly used by non-Tor users too, so 
it doesn't stand out. Question one: how badly do we hurt ourselves by 
periodically updating the version of Firefox that Torbutton claims to 
be? If we update it too often, we splinter the anonymity sets ourselves. 
If we don't update it often enough, then all the Tor users stand out 
because they claim to be running a quite old version of Firefox. The 
answer here probably depends on the Firefox versions seen in the wild. 
Question two: periodically people ask us to cycle through N UserAgent 
strings rather than stick with one. Does this approach help, hurt, or 
not matter? Consider: cookies and recognizing Torbutton users by their 
rotating UserAgents; malicious websites who only attack certain 
browsers; and whether the answers to question one impact this answer.

I think the best answer is simple, although a little more than trivial 
to implement.

If you want to anonymize the user agent, you need to mimic the user 
agent changes made by normal users across the web, and I'm positing that 
in any statistically significant sample of web users, the behavior will 
be similar and simple enough on nearly all orders of magnitude 
available, i.e., normally distributed in any analytical dimension.

Think about the natural evolution of a typical web user's user agent. 
What are the factors that will result in a change of user agent?

1) A user may have more than one browser that they use on the same computer,
2) they may use the web on more than one device (phone, tablet, laptop),
3) and typical, stochastic upgrade patterns.

For (1) and (2) there is not much torbutton could do to coordinate 
reasonable obfuscation among instances separated by version, logic, 
space, and time without far more effort than it would be worth. 
In fact, (1) and (2) ought to naturally provide most if not all the 
anonymizing that can reasonably be accomplished from torbutton's 
perspective without any action at all on torbutton's part.

For (3), though, there is something that could be done by torbutton. 
Some factors to consider in constructing a stochastic user agent updater:
- which browsers automatically upgrade themselves?
- which browsers bother the user to upgrade?
- what are the typical user response patterns towards browser upgrades?

Remember, we're thinking about a single browser on a single system.  If 
there are things going on for that user external to this (reinstall 
Windows, upgrade Ubuntu, get a new computer, etc.), those effects are 
already accounted for by (1) and (2).

Some investigative questions/statements to lead an analysis on this 
could be something like:

'what is the distribution of browsers for human web users?'
'what is the distribution of systems and system versions for human web 
users?'
'what are the significant correlations on the cartesian product of these 
two dimensions?'
'how do the browser versions in this product space evolve through time?'
'2/3 of Firefox users are on Windows and their upgrade habits follow a 
temporal distribution that is a spike followed by an exponential decay 
of order 2.3',
'0.8 of the remaining Firefox users are on Linux and their update habits 
are dominated by package management systems'.
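
To make the "spike followed by an exponential decay" statement concrete, 
here is a minimal sketch of one way such an upgrade pattern could be 
modeled. It assumes upgrades start at the release and fall off with a 
single decay constant; the function names and the `decay_per_day` 
parameter are hypothetical, and no attempt is made to interpret the 
"order 2.3" figure above.

```python
import math


def upgrade_rate(days_since_release, decay_per_day):
    """Fraction of users upgrading per day, t days after a release.

    Assumed model: zero before the release, a spike at the release,
    then exponential decay with constant decay_per_day (hypothetical).
    """
    if days_since_release < 0:
        return 0.0
    return decay_per_day * math.exp(-decay_per_day * days_since_release)


def fraction_upgraded(days_since_release, decay_per_day):
    """Cumulative fraction of users who have upgraded after t days."""
    if days_since_release < 0:
        return 0.0
    return 1.0 - math.exp(-decay_per_day * days_since_release)
```

Under this assumed model, half of the users have upgraded after 
`math.log(2) / decay_per_day` days, so a fitted decay constant translates 
directly into a typical upgrade lag.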

Then, upon finding the most significant trends in browser update 
patterns, construct a mechanism for torbutton that mimics them on a per 
user basis:  Once a user installs torbutton, it samples (selects) a 
browser from the distribution of browsers and then follows that 
browser's typical upgrade pattern.  The problem with this idea, I guess, 
is that torbutton will have to phone home to find out when browser 
updates are adopted by users so that it can make its change at the 
expected adoption time.
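
A minimal sketch of the per-user mechanism, written in Python purely for 
illustration. Everything in it is an assumption: the share table and mean 
delays are made-up placeholders rather than measured data, the names 
(`BROWSER_SHARES`, `ClaimedUserAgent`, and so on) are hypothetical, and the 
exponentially distributed upgrade delay is just one possible model.

```python
import random

# Hypothetical browser shares; real values would have to be measured.
BROWSER_SHARES = {
    "Firefox": 0.25,
    "Chrome": 0.45,
    "Internet Explorer": 0.20,
    "Safari": 0.10,
}

# Hypothetical mean delay (days) between a release and a user's upgrade.
MEAN_UPGRADE_DELAY_DAYS = {
    "Firefox": 7.0,
    "Chrome": 3.0,
    "Internet Explorer": 60.0,
    "Safari": 30.0,
}


def pick_browser(rng):
    """Sample one browser name in proportion to its assumed share."""
    names = list(BROWSER_SHARES)
    weights = [BROWSER_SHARES[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def sample_upgrade_delay(browser, rng):
    """Draw a delay in days from an exponential with the assumed mean."""
    return rng.expovariate(1.0 / MEAN_UPGRADE_DELAY_DAYS[browser])


class ClaimedUserAgent:
    """Tracks which browser and version one installation claims to be."""

    def __init__(self, rng, initial_version):
        self.rng = rng
        self.browser = pick_browser(rng)  # chosen once, at install time
        self.version = initial_version
        self.pending = None  # (new_version, day on which to switch)

    def on_release(self, new_version, release_day):
        """Schedule the switch to new_version after a sampled delay."""
        delay = sample_upgrade_delay(self.browser, self.rng)
        self.pending = (new_version, release_day + delay)

    def version_on(self, day):
        """Return the claimed version on the given day."""
        if self.pending is not None and day >= self.pending[1]:
            self.version = self.pending[0]
            self.pending = None
        return self.version
```

Each installation draws its own delay, so across many users the claimed 
versions would spread out over time instead of all changing at once.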

Since the goal is to distribute torbutton users among user agents 
according to the distribution of all web users, you must first find out 
what that latter distribution is and how it is likely to evolve. 
This second part is not as bad as it seems, because I'm guessing that 
the expected adoption time, plus some standard deviations, falls late 
enough after the browser update that torbutton can push it out to most 
installed instances (it's not going to happen before, causality and all).
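
The timing argument can be sketched as a simple check, assuming a sample 
of observed adoption delays is available. The function names, the 
parameter `k` (how many standard deviations past the mean to wait), and 
`push_latency_days` are all hypothetical.

```python
import statistics


def switch_day(adoption_delays_days, k):
    """Days after release at which to switch: mean + k * sample stdev."""
    mean = statistics.mean(adoption_delays_days)
    spread = statistics.stdev(adoption_delays_days)
    return mean + k * spread


def push_arrives_in_time(adoption_delays_days, k, push_latency_days):
    """True if the pushed update can reach users before the switch day."""
    return push_latency_days <= switch_day(adoption_delays_days, k)
```

For example, with observed delays of 2, 4, and 6 days and `k = 1`, the 
switch day is 6 days after the release, so a push that takes 3 days to 
reach most instances would arrive in time.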

