[tor-bugs] #8182 [EFF-HTTPS Everywhere]: Explicitly figure out handling of internationalized domain names

Tor Bug Tracker & Wiki blackhole at torproject.org
Sat Mar 2 19:39:28 UTC 2013


#8182: Explicitly figure out handling of internationalized domain names
----------------------------------+-----------------------------------------
 Reporter:  schoen                |          Owner:  pde
     Type:  task                  |         Status:  new
 Priority:  major                 |      Milestone:     
Component:  EFF-HTTPS Everywhere  |        Version:     
 Keywords:                        |         Parent:     
   Points:                        |   Actualpoints:     
----------------------------------+-----------------------------------------

Comment(by mikkoharhanen):

 I created rulesets for the unused domain 'ä.fi'. In these rulesets, the
 letter a-umlaut (ä) was written in each of the following ways:
 - HTML entities
 - Punycode
 - UTF-8 characters
 - ISO-8859-15 characters

 The path of each test URL names the encodings used in the rule's 'from'
 and 'to' fields. For example, for the URL 'http://ä.fi/entity-to-puny/'
 the 'from' field uses an HTML entity and the 'to' field uses Punycode to
 represent a-umlaut. If the rule works, the address should be redirected
 to https.
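
 As a rough illustration of how such a rule is applied, the sketch below
 parses a ruleset and rewrites a URL. The ruleset text, the `rewrite`
 helper, and the use of Python are assumptions made for this example and
 are not taken from the a-uni.xml or a-latin.xml test files. It shows that
 an XML parser decodes the entity `&#228;` to 'ä' before the 'from' pattern
 is ever compiled as a regular expression.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical ruleset for the test domain. The 'from' field uses an HTML/XML
# character entity and the 'to' field uses Punycode, as in 'entity-to-puny'.
RULESET = r'''<ruleset name="a-umlaut test">
  <rule from="^http://&#228;\.fi/entity-to-puny/"
        to="https://xn--4ca.fi/entity-to-puny/"/>
</ruleset>'''


def rewrite(url, ruleset_xml):
    """Return the rewritten URL, or None if no rule matches."""
    root = ET.fromstring(ruleset_xml)
    for rule in root.iter("rule"):
        # The parser has already turned &#228; into the character 'ä'.
        new_url, count = re.subn(rule.get("from"), rule.get("to"), url)
        if count:
            return new_url
    return None


print(rewrite("http://ä.fi/entity-to-puny/", RULESET))
```

 Because the entity is resolved by the parser, the pattern is independent
 of the byte encoding of the ruleset file, which may be why the entity
 form behaves consistently in the results below.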

 a-uni.xml (file encoding: UTF-8)

 {{{
 [[Typed URL]]                   [[Resulted URL]]
 http://ä.fi/entity-to-entity/   --> [OK] https://ä.fi/entity-to-entity/
 http://ä.fi/entity-to-puny/     --> [OK] https://ä.fi/entity-to-puny/
 http://ä.fi/entity-to-uni/      --> [FAIL] https://ã¤.fi/entity-to-uni/

 http://ä.fi/puny-to-puny/       --> [FAIL] http://www.ä.fi/puny-to-puny/
 http://ä.fi/puny-to-entity/     --> [FAIL] http://www.ä.fi/puny-to-entity/
 http://ä.fi/puny-to-uni/        --> [FAIL] http://www.ä.fi/puny-to-uni/

 http://ä.fi/uni-to-uni/         --> [FAIL] http://www.ä.fi/uni-to-uni/
 http://ä.fi/uni-to-entity/      --> [FAIL] http://www.ä.fi/uni-to-entity/
 http://ä.fi/uni-to-puny/        --> [FAIL] http://www.ä.fi/uni-to-puny/
 }}}
 {{{
 [[Typed URL]]                           [[Resulted URL]]
 http://&#228;.fi/entity-to-entity/      --> [FAIL] http://www.&.com/#228;.fi/entity-to-entity/
 http://&#228;.fi/entity-to-puny/        --> [FAIL] http://www.&.com/#228;.fi/entity-to-puny/
 http://&#228;.fi/entity-to-uni/         --> [FAIL] http://www.&.com/#228;.fi/entity-to-uni/

 http://xn--4ca.fi/entity-to-entity/     --> [OK] https://ä.fi/entity-to-entity/
 http://xn--4ca.fi/entity-to-puny/       --> [OK] https://ä.fi/entity-to-puny/
 http://xn--4ca.fi/entity-to-uni/        --> [FAIL] https://ã¤.fi/entity-to-uni/

 http://xn--4ca.fi/puny-to-puny/         --> [FAIL] http://www.ä.fi/puny-to-puny/
 http://xn--4ca.fi/puny-to-entity/       --> [FAIL] http://www.ä.fi/puny-to-entity/
 http://xn--4ca.fi/puny-to-uni/          --> [FAIL] http://www.ä.fi/puny-to-uni/

 http://ã¤.fi/uni-to-uni/                --> [FAIL] http://www.ã¤.fi/uni-to-uni/
 http://ã¤.fi/uni-to-entity/             --> [FAIL] http://www.ã¤.fi/uni-to-entity/
 http://ã¤.fi/uni-to-puny/               --> [FAIL] http://www.ã¤.fi/uni-to-puny/
 }}}
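
 The 'ã¤' that appears in the failing results is consistent with the UTF-8
 bytes of 'ä' being read as Latin-1 and then lowercased, as hostnames
 usually are. The sketch below reproduces that transformation. It is one
 plausible explanation and not a confirmed diagnosis of what HTTPS
 Everywhere or Firefox does internally. The helper name is made up for
 this example.

```python
def misdecode_utf8_as_latin1(text):
    """Encode text as UTF-8, then wrongly decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


host = "ä.fi"
garbled = misdecode_utf8_as_latin1(host)  # two characters replace 'ä'
lowered = garbled.lower()                 # hostnames are case-insensitive

print(repr(host.encode("utf-8")))  # the raw bytes 0xC3 0xA4 for 'ä'
print(garbled)
print(lowered)
```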

 ***

 a-latin.xml (file encoding: ISO-8859-15)

 {{{
 [[Typed URL]]                   [[Resulted URL]]
 http://ä.fi/latin-to-latin/     --> [OK] https://ä.fi/latin-to-latin/
 http://ä.fi/latin-to-entity/    --> [OK] https://ä.fi/latin-to-entity/
 http://ä.fi/latin-to-puny/      --> [OK] https://ä.fi/latin-to-puny/

 http://ä.fi/entity-to-latin/    --> [OK] https://ä.fi/entity-to-latin/
 http://ä.fi/puny-to-latin/      --> [FAIL] http://www.ä.fi/puny-to-latin/

 http://xn--4ca.fi/latin-to-latin/       --> [OK] https://ä.fi/latin-to-
 latin/
 http://xn--4ca.fi/entity-to-latin/      --> [OK] https://ä.fi/entity-to-
 latin/
 http://xn--4ca.fi/puny-to-latin/        --> [FAIL] http://www.ä.fi/puny-
 to-latin/
 }}}

 Conclusions:
 - HTML entities always work
 - ISO-8859-15 (Latin) characters always work
 - Unicode (UTF-8) characters never work
 - Punycode works in output ('to') fields but not in input ('from') fields
 - Firefox converts Punycode before HTTPS Everywhere has the opportunity
 to redirect it
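
 For reference, the two hostname forms used in these tests can be
 converted into each other with Python's built-in `idna` codec, which
 implements IDNA 2003. This is only a sketch of the conversion itself.
 The helper names are assumptions for this example, and nothing here
 reflects how Firefox or HTTPS Everywhere performs the conversion.

```python
def to_ascii(host):
    """Return the Punycode (ASCII-compatible) form of a hostname."""
    return host.encode("idna").decode("ascii")


def to_unicode(host):
    """Return the Unicode form of a hostname given in either form."""
    return host.encode("idna").decode("idna")


def host_forms(host):
    """Both spellings of a hostname, e.g. to test a rule against each."""
    return {to_ascii(host), to_unicode(host)}


print(to_ascii("ä.fi"))
print(to_unicode("xn--4ca.fi"))
print(sorted(host_forms("ä.fi")))
```

 A rule matcher that normalized both the typed hostname and the 'from'
 pattern to a single form would not depend on which spelling the browser
 hands over. Whether that is the right fix for this ticket is left open
 here.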

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/8182#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

