[tor-commits] [tor-browser-spec/master] Add website fingerprinting and update check info.

mikeperry at torproject.org mikeperry at torproject.org
Mon Apr 28 15:18:48 UTC 2014

commit b7d2a02da9766ebd15ade97b7789ea98509fa949
Author: Mike Perry <mikeperry-git at fscked.org>
Date:   Thu Mar 7 18:29:06 2013 -0800

    Add website fingerprinting and update check info.
    Also fix some issues gk noticed.
 docs/design/design.xml |  216 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 197 insertions(+), 19 deletions(-)

diff --git a/docs/design/design.xml b/docs/design/design.xml
index 549a27d..96b31e0 100644
--- a/docs/design/design.xml
+++ b/docs/design/design.xml
@@ -159,9 +159,13 @@ MUST NOT bypass Tor proxy settings for any content.</para></listitem>
  <listitem><link linkend="state-separation"><command>State
- <para>The browser MUST NOT provide any stored state to the content window
-from other browsers or other browsing modes, including shared state from
-plugins, machine identifiers, and TLS session state.
+ <para>
+The browser MUST NOT provide the content window with any state from any other
+browsers or any non-Tor browsing modes. This includes shared state from
+independent plugins, and shared state from Operating System implementations of
+TLS and other support libraries.
  <listitem><link linkend="disk-avoidance"><command>Disk
@@ -257,8 +261,8 @@ linkability from fingerprinting browser behavior.
-The browser SHOULD provide an obvious, easy way to remove all of its
-authentication tokens and browser state and obtain a fresh identity.
+The browser MUST provide an obvious, easy way for the user to remove all of
+its authentication tokens and browser state and obtain a fresh identity.
 Additionally, the browser SHOULD clear linkable state by default automatically
 upon browser restart, except at user option.
@@ -294,7 +298,7 @@ url="https://blog.torproject.org/blog/toggle-or-not-toggle-end-torbutton">failur
 of Torbutton</ulink>: Even if users managed to install everything properly,
 the toggle model was too hard for the average user to understand, especially
 in the face of accumulating tabs from multiple states crossed with the current
-tor-state of the browser. 
+Tor-state of the browser. 
@@ -323,9 +327,10 @@ contribute to fingerprinting.
 Therefore, if plugins are to be enabled in private browsing modes, they must
 be restricted from running automatically on every page (via click-to-play
 placeholders), and/or be sandboxed to restrict the types of system calls they
-can execute. If the user decides to craft an exemption to allow a plugin to be
-used, it MUST only apply to the top level url bar domain, and not to all sites,
-to reduce cross-origin fingerprinting linkability.
+can execute. If the user agent allows the user to craft an exemption to allow
+a plugin to be used automatically, it must only apply to the top level url bar
+domain, and not to all sites, to reduce cross-origin fingerprinting
@@ -336,13 +341,13 @@ to reduce cross-origin fingerprinting linkability.
 failure of Torbutton</ulink> was the options panel. Each option
 that detectably alters browser behavior can be used as a fingerprinting tool.
 Similarly, all extensions <ulink
-url="http://blog.chromium.org/2010/06/extensions-in-incognito.html">SHOULD be
-disabled in the mode</ulink> except as an opt-in basis. We SHOULD NOT load
+url="http://blog.chromium.org/2010/06/extensions-in-incognito.html">should be
+disabled in the mode</ulink> except on an opt-in basis. We should not load
 system-wide and/or Operating System provided addons or plugins.
-Instead of global browser privacy options, privacy decisions SHOULD be made
+Instead of global browser privacy options, privacy decisions should be made
 url bar origin</ulink> to eliminate the possibility of linkability
@@ -354,7 +359,7 @@ privacy permissions.
 If the user has indicated they wish to record local history storage, these
-permissions can be written to disk. Otherwise, they MUST remain memory-only. 
+permissions can be written to disk. Otherwise, they should remain memory-only. 
      <listitem><command>No filters</command>
@@ -554,6 +559,13 @@ The adversary can also inject malicious content at the user's upstream router
 when they have Tor disabled, in an attempt to correlate their Tor and Non-Tor
+     <para>
+Additionally, at this position the adversary can block Tor, or attempt to
+recognize the traffic patterns of specific web pages at the entrance to the Tor
+     </para>
      <listitem><command>Physical Access</command>
@@ -722,6 +734,61 @@ was formerly available only to Javascript.
+     <listitem id="website-traffic-fingerprinting"><command>Website traffic fingerprinting</command>
+     <para>
+Website traffic fingerprinting is an attempt by the adversary to recognize the
+encrypted traffic patterns of specific websites. The most comprehensive study
+of the statistical properties of this attack against Tor was done by <ulink
+et al</ulink>. Unfortunately, the publication bias in academia has encouraged
+the production of a number of follow-on attack papers claiming "improved"
+success rates using this attack in recognizing only very small numbers of
+websites. Despite these subsequent results, we are skeptical of the efficacy
+of this attack in a real world scenario, especially in the face of any defenses.
+     </para>
+     <para>
+In general, with machine learning, as you increase the number of
+categories to classify with few reliable features to extract, either true
+positive accuracy goes down or the false positive rate goes up.
+     </para>
+      <para>
+In the case of this attack, the key factors that increase the classification
+requirements (and thus hinder a real world adversary who attempts this attack)
+are large numbers of dynamically generated pages, partially cached content,
+and non-web activity in the "Open World" scenario of the entire Tor network.
+This large set of classification categories is further confounded by a poor
+and often noisy available featureset, which is also relatively easy for the
+defender to manipulate.
+     </para>
+     <para>
+In fact, the ocean of possible Tor Internet activity makes it a certainty that
+an adversary attempting to classify a large number of sites with poor feature
+resolution will ultimately be overwhelmed by false positives. This problem is
+known in the IDS literature as the <ulink
+url="http://www.raid-symposium.org/raid99/PAPERS/Axelsson.pdf">Base Rate
+Fallacy</ulink>, and it is the primary reason that anomaly and activity
+classification-based IDS and antivirus systems have failed to materialize in
+the marketplace.
+     </para>
+     <para>
+Still, we do not believe that these issues are enough to dismiss the attack
+outright. But we do believe these factors make it both worthwhile and
+effective to <link linkend="traffic-fingerprinting-defenses">deploy
+light-weight defenses</link> that reduce the accuracy of this attack by
+further contributing noise to hinder successful feature extraction.
+     </para>
+     </listitem>
      <listitem><command>Remotely or locally exploit browser and/or
@@ -1713,6 +1780,117 @@ audio and video objects.
+  <sect2 id="other">
+   <title>Other Security Measures</title>
+   <para>
+In addition to the above mechanisms that are devoted to preserving privacy
+while browsing, we also have a number of technical mechanisms to address other
+privacy and security issues.
+   </para>
+   <orderedlist>
+    <listitem id="traffic-fingerprinting-defenses"><command>Website Traffic Fingerprinting Defenses</command>
+     <para>
+<link linkend="website-traffic-fingerprinting">Website Traffic
+Fingerprinting</link> is a statistical attack that attempts to recognize specific
+encrypted website activity.
+     </para>
+     <sect3>
+       <title>Design Goal:</title>
+       <blockquote>
+      <para>
+We want to deploy a mechanism that reduces the accuracy of features available
+for classification. This mechanism would either impact the true and false
+positive accuracy rates, <emphasis>or</emphasis> reduce the number of webpages
+that could be classified at a given accuracy rate.
+     </para>
+     <para>
+Ideally, this mechanism would be as light-weight as possible, and would be
+tunable in terms of overhead. We suspect that it may even be possible to
+deploy a mechanism that reduces feature extraction resolution without any
+network overhead. In the no-overhead category, we have <ulink
+url="http://freehaven.net/anonbib/cache/LZCLCP_NDSS11.pdf">HTTPOS</ulink> and
+use of HTTP pipelining and/or SPDY</ulink>. In the tunable/low-overhead
+category, we have <ulink
+Padding</ulink> and <ulink url="http://www.cs.sunysb.edu/~xcai/fp.pdf">
+Congestion-Sensitive BUFLO</ulink>. It may also be possible to <ulink
+url="https://trac.torproject.org/projects/tor/ticket/7028">tune such
+defenses</ulink> such that they only use existing spare Guard bandwidth capacity in the Tor
+     </para>
+       </blockquote>
+     </sect3>
+     <sect3>
+       <title>Implementation Status:</title>
+       <blockquote>
+       <para>
+Currently, we patch Firefox to <ulink
+pipeline order and depth</ulink>. Unfortunately, pipelining is very fragile.
+Many sites do not support it, and even sites that advertise support for
+pipelining may simply return error codes for successive requests, effectively
+forcing the browser into non-pipelined behavior. Firefox also has code to back
+off and reduce or eliminate the pipeline if this happens. These
+shortcomings and fallback behaviors are the primary reason that Google
+developed SPDY as opposed to simply extending HTTP to improve pipelining.
+     </para>
+     <para>
+Knowing this, we created the defense as an <ulink
+research prototype</ulink> to help evaluate what could be done in the best
+case with full server support (i.e. with SPDY). Unfortunately, the bias in
+favor of compelling attack papers has caused academia to thus far ignore our
+requests, instead publishing only cursory (yet "devastating") evaluations that
+fail to provide even simple statistics such as the rates of actual pipeline
+utilization during their evaluations.
+     </para>
+      </blockquote>
+    </sect3>
+    </listitem>
+    <listitem><command>Privacy-preserving update notification</command>
+     <para>
+In order to inform the user when their Tor Browser is out of date, we perform a
+privacy-preserving update check asynchronously in the background. The
+check uses Tor to download the file <ulink
+and searches that version list for the current value for the local preference
+<command>torbrowser.version</command>. If the value from our preference is
+present in the recommended version list, the check is considered to have
+succeeded and the user is up to date. If not, it is considered to have failed
+and an update is needed. The check is triggered upon browser launch, new
+window, and new tab, but is rate limited so as to happen no more frequently
+than once every 1.5 hours.
+     </para>
+     <para>
+If the check fails, we cache this fact, and update the Torbutton graphic to
+display a flashing warning icon and insert a menu option that provides a link
+to our download page. Additionally, we reset the value for the browser
+homepage to point to a <ulink
+url="https://check.torproject.org/?lang=en-US&amp;small=1&amp;uptodate=0">page that
+informs the user</ulink> that their browser is out of
+     </para>
+    </listitem>
+   </orderedlist>
+  </sect2>
   <sect2 id="firefox-patches">
    <title>Description of Firefox Patches</title>
@@ -2401,12 +2579,12 @@ source URL parameters.
 We believe the Referer header should be made explicit. If a site wishes to
 transmit its URL to third party content elements during load or during
-link-click, it should have to specify this as a property of the associated
-HTML tag. With an explicit property, it would then be possible for the user
-agent to inform the user if they are about to click on a link that will
-transmit referer information (perhaps through something as subtle as a
-different color for the destination URL). This same UI notification can also
-be used for links with the <ulink
+link-click, it should have to specify this as a property of the associated HTML
+tag. With an explicit property, it would then be possible for the user agent to
+inform the user if they are about to click on a link that will transmit referer
+information (perhaps through something as subtle as a different color in the
+lower toolbar for the destination URL). This same UI notification can also be
+used for links with the <ulink
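
Note on the update check described in the new "Privacy-preserving update
notification" item: its logic can be sketched roughly as below. This is a
minimal illustration under stated assumptions, not Torbutton's actual code.
The class name UpdateChecker, the fetch_versions callable (standing in for
the download of the recommended version list over Tor, whose format is not
given in this patch) and the injectable clock are all hypothetical. Only the
membership test against the torbrowser.version value, the cached result and
the 1.5 hour rate limit come from the text.

```python
import time

# From the text: the check happens "no more frequently than once every 1.5 hours".
CHECK_INTERVAL_SECONDS = 1.5 * 60 * 60


class UpdateChecker:
    """Hypothetical sketch of a rate-limited recommended-version check."""

    def __init__(self, local_version, fetch_versions, clock=time.monotonic):
        # local_version stands in for the torbrowser.version preference.
        self.local_version = local_version
        # fetch_versions is an assumed callable returning a list of version
        # strings; in the real browser this download would go through Tor.
        self.fetch_versions = fetch_versions
        self.clock = clock
        self.last_check = None   # time of the last real check, if any
        self.up_to_date = None   # cached outcome of the last real check

    def on_trigger(self):
        """Call on browser launch, new window, or new tab."""
        now = self.clock()
        recently_checked = (
            self.last_check is not None
            and now - self.last_check < CHECK_INTERVAL_SECONDS
        )
        if recently_checked:
            # Rate limited: reuse the cached result instead of re-fetching.
            return self.up_to_date
        self.last_check = now
        # The check succeeds if our version appears in the recommended list.
        self.up_to_date = self.local_version in self.fetch_versions()
        return self.up_to_date
```

A caller would treat a False result as "update needed" and then switch the
icon, menu entry and homepage as the patch describes. Injecting the clock
keeps the rate limit testable without waiting.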
