[tor-dev] Proposal 285: Directory documents should be standardized as UTF-8

chelsea komlo me at chelseakomlo.com
Fri Nov 24 21:05:30 UTC 2017

It is great that we are identifying places to improve support for Rust
in Tor.

Along this same line of thinking, are there other places in Tor where we
will need to move to supporting UTF-8? For example, should the statefile
be UTF-8 also?

On 11/13/2017 01:51 PM, Nick Mathewson wrote:
> Filename: 285-utf-8.txt
> Title: Directory documents should be standardized as UTF-8
> Author: Nick Mathewson
> Created: 13 November 2017
> Status: Open
>
> 1. Summary and motivation
>
>    People frequently want to include non-ASCII text in their router
>    descriptors.  The Contact line is a favorite place to do this, but in
>    principle the platform line would also be pretty logical.
>    Unfortunately, there's no specified way to encode non-ASCII in our
>    directory documents.
>
>    Fortunately, almost everybody who does this uses UTF-8 anyway.
>
>    As we move towards Rust support in Tor, we gain another motivation
>    for standardizing on UTF-8, since Rust's native strings strongly prefer
>    UTF-8.
>
>    So, in this proposal, we describe a migration path to having all
>    directory documents be fully UTF-8.
>
> 2. Proposal
>
>    First, we should have Tor relays reject ContactInfo lines (and any
>    other lines copied directly into router descriptors) that are not
>    UTF-8.
>
>    At the same time, we should have authorities reject any router
>    descriptors or extrainfo documents that are not valid UTF-8.
>
>    Simultaneously, we can have all Tor instances reject all
>    non-directory-descriptor directory documents that are not UTF-8,
>    since none should exist today.
>
>    Finally, once the authorities have updated, we should have all Tor
>    instances reject all directory documents that are not UTF-8.  (We
>    should not take this step until the authorities have upgraded, or
>    else the behavior of updated and non-updated clients could be
>    distinguished.)
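
The rejection steps in section 2 all reduce to one check: are the raw
bytes of the line or document valid UTF-8? Below is a minimal sketch in
Rust, using only the standard library's std::str::from_utf8. The function
name check_document_utf8 is hypothetical and is not part of Tor or of the
proposal; it is only meant to illustrate the check.

```rust
/// Hypothetical helper (not from Tor or the proposal): checks whether a
/// directory document, or a single line such as ContactInfo, is valid UTF-8.
///
/// Returns the text on success, or the byte offset of the first invalid
/// sequence on failure, so a caller could log where the document went wrong.
fn check_document_utf8(doc: &[u8]) -> Result<&str, usize> {
    // from_utf8 validates without copying; valid_up_to() reports how many
    // leading bytes were valid before the first bad sequence.
    std::str::from_utf8(doc).map_err(|e| e.valid_up_to())
}
```

Under these assumptions, a relay or authority would reject the input
whenever the function returns Err. A Latin-1 encoded contact line, for
instance, fails the check, while the same text encoded as UTF-8 passes.
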
>
> 2.1. Hidden service descriptors' encrypted bodies
>
>    For the encrypted bodies of hidden service descriptors, we cannot
>    reject them at the authority level, and so we need to take a slightly
>    different approach to prevent client fingerprinting attacks.
>
>    First, we should make Tor instances start warning about any hidden
>    service descriptors whose bodies, post-decryption, contain non-UTF-8
>    plaintext.  At the same time, we add a consensus parameter to
>    indicate that hidden service descriptors with non-UTF-8 plaintexts
>    should be rejected entirely: "reject-encrypted-non-utf-8".  If that
>    parameter is set to 1, then hidden service clients will not only
>    warn, but reject the descriptors.
>
>    Once the vast majority of clients are running versions that support
>    the "reject-encrypted-non-utf-8" parameter, that parameter can be set
>    to 1.
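
The warn-then-reject behavior in section 2.1 can be sketched as a small
decision function. This is only an illustration, written in Rust: the
Action enum and the function handle_decrypted_body are hypothetical names,
and the consensus parameter is modeled as a plain integer that defaults to
0 when unset, which is an assumption rather than something the proposal
spells out.

```rust
/// Hypothetical outcome of checking a decrypted hidden service descriptor.
#[derive(Debug, PartialEq)]
enum Action {
    /// Plaintext is valid UTF-8: proceed as usual.
    Accept,
    /// Plaintext is not UTF-8, parameter not set to 1: warn, but keep going.
    WarnAndAccept,
    /// Plaintext is not UTF-8 and the parameter is 1: warn and reject.
    WarnAndReject,
}

/// Hypothetical helper: `reject_param` stands in for the value of the
/// "reject-encrypted-non-utf-8" consensus parameter (assumed 0 if unset).
fn handle_decrypted_body(plaintext: &[u8], reject_param: i32) -> Action {
    match std::str::from_utf8(plaintext) {
        Ok(_) => Action::Accept,
        Err(_) if reject_param == 1 => Action::WarnAndReject,
        Err(_) => Action::WarnAndAccept,
    }
}
```

Because every client reads the same consensus, all clients that support
the parameter would switch from warning to rejecting together, which is
what keeps them from being distinguished by their behavior.
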
