<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Hello tor-dev!</p>

    <p>My name is Kevin and I'm a PhD student at NYU. Recently I've been
      working on a "Tor Friendliness Scanner" (TFS): a scanner that
      measures which features of a given website are broken
      (non-functional) when accessed with the Tor Browser (TB), and
      offers actionable suggestions for fixing them. To do this, we
      first need an approximation of ground-truth data on how the
      website is supposed to work. We then need to compare that with how
      the website behaves in the TB to identify any differences.</p>

    <p>To approximate ground truth, we decided to
      modify* the Firefox (FF) browser to log every step in the
      creation of the Content Tree (also called the DOM tree) and to
      log the execution of all JavaScript functions (currently
      underway). We will then apply the same changes to the TB and
      scan popular Web sites using the modified FF and the
      modified TB at each of the three TB security slider settings.
      Finally, we will compare the resulting logs to determine where the
      tree creation processes differed* and why. These differences could
      help us illuminate two things:</p>

    <ol>

      <li>what functionality issues the Tor Browser encounters on

        popular Web sites, and</li>

      <li>what threats (beyond metadata surveillance) the TB is
        protecting its users from in the wild.</li>

    </ol>
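    <p>As a rough sketch only of the comparison step: the Python below
      uses the standard library's difflib to find where two event logs
      diverge. The log lines and the function name diff_logs are
      hypothetical placeholders, not the format the modified browsers
      actually emit.</p>
<pre>
```python
import difflib


def diff_logs(firefox_log, tor_log):
    """Return the spans where two event logs disagree.

    Each result is (tag, firefox_lines, tor_lines), where tag is one of
    "replace", "delete" or "insert" as reported by difflib.
    """
    matcher = difflib.SequenceMatcher(a=firefox_log, b=tor_log, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, firefox_log[i1:i2], tor_log[j1:j2]))
    return differences


# Hypothetical log lines, one per tree-construction step.
firefox_log = [
    "append html",
    "append body",
    "append canvas#chart",
    "append p.footer",
]
tor_log = [
    "append html",
    "append body",
    "append p.footer",
]

for tag, ff_lines, tb_lines in diff_logs(firefox_log, tor_log):
    print(tag, ff_lines, tb_lines)
```
</pre>
    <p>Here a "delete" entry means a step that appears in the FF log but
      not in the TB log, which is the kind of divergence we would then
      try to explain.</p>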

    <p>As far as I can tell, this method seems to capture a lot,
      but it's far from complete. For one thing, it won't
      detect any difference that arises from user interaction or
      input (such as a script launched by an onclick event). However, it
      does seem to make it possible to automate scanning for Tor
      Friendliness, which would allow for wide-scale use.
    </p>

    <p>We have moved ahead with development and, although it is not yet
      finished, we are (hopefully) very close to a working
      prototype. I would welcome any feedback on this method, and would
      like to hear whether anyone can think of an angle we have missed
      that would make the TFS more robust, easier to create, or both.
    </p>

    <p>Thanks for your time and consideration!</p>

    <p>Kevin</p>

    <p>*Note 1: Unfortunately we cannot simply rely on JavaScript for
      examining the content tree, since this needs to work on all three
      settings of the TB's security slider, and the "safest"
      setting disables JavaScript by default on all Web pages.
    </p>

    <p>*Note 2: Web pages can also differ in non-functional ways, such
      as showing different ads or displaying the current time. We are
      working on methods to distinguish these from functional differences,
      such as using ad blacklists to determine whether a given request or
      script is part of an ad, and then excluding it from the difference
      between the two trees.</p>
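    <p>To illustrate the blacklist idea, here is a minimal Python
      sketch that drops ad-related request URLs before any comparison
      is made. The host names, AD_HOSTS, is_ad_url and drop_ad_requests
      are all made up for this example; a real scan would load a
      maintained list and would probably need richer matching rules
      than host names alone.</p>
<pre>
```python
from urllib.parse import urlsplit

# Hypothetical blacklist entries, for illustration only.
AD_HOSTS = {"ads.example.com", "tracker.example.net"}


def is_ad_url(url, ad_hosts=AD_HOSTS):
    """True if the URL's host is a listed ad host or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == ad or host.endswith("." + ad) for ad in ad_hosts)


def drop_ad_requests(urls, ad_hosts=AD_HOSTS):
    """Return the URLs that are not served from a blacklisted host."""
    return [u for u in urls if not is_ad_url(u, ad_hosts)]


requests_seen = [
    "https://example.org/app.js",
    "https://cdn.ads.example.com/banner.js",
    "https://notads.example.com/site.css",
]
print(drop_ad_requests(requests_seen))
```
</pre>
    <p>Matching on the full host name or a dot-separated suffix keeps
      an unrelated host such as notads.example.com from being filtered
      by accident.</p>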

    <pre class="moz-signature" cols="72">-- 
Kevin Gallagher
Ph.D. Candidate
Center For Cybersecurity
NYU Tandon School of Engineering
Key Fingerprint: D02B 25CB 0F7D E276 06C3  BF08 53E4 C50F 8247 4861 </pre>

  </body>

</html>