Tue Aug 30 05:50:59 UTC 2016

#20025: document.characterSet enables fingerprinting of localization (only with
 Reporter:  dcf                       |          Owner:  tbb-team
     Type:  defect                    |         Status:  new
 Priority:  Medium                    |      Milestone:
Component:  Applications/Tor Browser  |        Version:
 Severity:  Normal                    |     Resolution:
 Keywords:  tbb-fingerprinting        |  Actual Points:
Parent ID:                            |         Points:
 Reviewer:                            |        Sponsor:

Comment (by dcf):

 I set up a demo page on two servers, one with HSTS and one without. Only
 the one with HSTS shows a difference in `document.characterSet` between
 the en-US and ko bundles. Note that neither server specifies the encoding
 in the `Content-Type` header, so the browser logs a warning in the console
 and has to infer the encoding.
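
 To check the two conditions above for a given response (no charset in
 `Content-Type`, and whether HSTS is sent), something like the following
 sketch could be used. The function names and the sample headers are
 hypothetical, and it only inspects a dict of headers that you have already
 fetched by other means.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    # get_param returns None when the header has no charset parameter.
    return msg.get_param("charset")


def describe_response(headers):
    """Summarize the two properties that matter for this demo."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "declared_charset": declared_charset(lowered.get("content-type")),
        "hsts": "strict-transport-security" in lowered,
    }


if __name__ == "__main__":
    # Hypothetical headers, shaped like what the HSTS demo server might send.
    sample = {
        "Content-Type": "text/html",
        "Strict-Transport-Security": "max-age=31536000",
    }
    print(describe_response(sample))
```

 A `declared_charset` of `None` corresponds to the case where the browser
 has to infer the encoding itself.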

 The technique from #10703 always finds `iso-8859-1`. (I think that
 technique has trouble distinguishing `iso-8859-1` and `windows-1252`.)
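
 As background on why those two encodings are easy to confuse (this is an
 assumption about the cause, not something verified against the #10703
 technique): `iso-8859-1` and `windows-1252` decode every byte identically
 except those in the range 0x80 to 0x9F. A minimal sketch using Python's
 built-in codecs, with a hypothetical helper name:

```python
def differing_bytes():
    """Return the byte values that the two encodings decode differently."""
    result = []
    for value in range(256):
        raw = bytes([value])
        latin1 = raw.decode("iso-8859-1")
        # A few bytes are undefined in Python's windows-1252 codec;
        # errors="replace" maps them to U+FFFD instead of raising.
        cp1252 = raw.decode("windows-1252", errors="replace")
        if latin1 != cp1252:
            result.append(value)
    return result


if __name__ == "__main__":
    diff = differing_bytes()
    print(len(diff), hex(diff[0]), hex(diff[-1]))
```

 A probe whose test bytes all fall outside that range cannot tell the two
 encodings apart.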

 == with HSTS ==

 HSTS demo page: https://people.torproject.org/~dcf/tor20025/check-

 document.characterSet is `windows-1252` for the en-US bundle and `EUC-KR`
 for the ko bundle.

 || en-US || ko ||
 || [[Image(en-us-with-hsts.png)]] || [[Image(ko-with-hsts.png)]] ||

 == without HSTS ==

 Non-HSTS demo page: https://people.eecs.berkeley.edu/~fifield/tor20025

 document.characterSet is `windows-1252` for both the en-US and ko bundles.

 || en-US || ko ||
 || [[Image(en-us-without-hsts.png)]] || [[Image(ko-without-hsts.png)]] ||

