[tor-commits] [torspec/master] proposal 285: utf-8 all the things

nickm at torproject.org nickm at torproject.org
Mon Nov 13 18:51:03 UTC 2017


commit 5ba8d5a7d08c09ae9949f20eb0633fc381c2dbc6
Author: Nick Mathewson <nickm at torproject.org>
Date:   Mon Nov 13 13:50:59 2017 -0500

    proposal 285: utf-8 all the things
---
 proposals/000-index.txt |  2 ++
 proposals/285-utf-8.txt | 60 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 62 insertions(+)

diff --git a/proposals/000-index.txt b/proposals/000-index.txt
index 2ae06a9..3352d02 100644
--- a/proposals/000-index.txt
+++ b/proposals/000-index.txt
@@ -205,6 +205,7 @@ Proposals by number:
 282  Remove "Named" and "Unnamed" handling from consensus voting [OPEN]
 283  Move IPv6 ORPorts from microdescriptors to the microdesc consensus [OPEN]
 284  Hidden Service v3 Control Port [OPEN]
+285  Directory documents should be standardized as UTF-8 [OPEN]
 
 
 Proposals by status:
@@ -263,6 +264,7 @@ Proposals by status:
    282  Remove "Named" and "Unnamed" handling from consensus voting [for 0.3.3.x]
    283  Move IPv6 ORPorts from microdescriptors to the microdesc consensus [for 0.3.3.x]
    284  Hidden Service v3 Control Port
+   285  Directory documents should be standardized as UTF-8
  ACCEPTED:
    172  GETINFO controller option for circuit information
    173  GETINFO Option Expansion
diff --git a/proposals/285-utf-8.txt b/proposals/285-utf-8.txt
new file mode 100644
index 0000000..939399f
--- /dev/null
+++ b/proposals/285-utf-8.txt
@@ -0,0 +1,60 @@
+Filename: 285-utf-8.txt
+Title: Directory documents should be standardized as UTF-8
+Author: Nick Mathewson
+Created: 13 November 2017
+Status: Open
+
+1. Summary and motivation
+
+   People frequently want to include non-ASCII text in their router
+   descriptors.  The Contact line is a favorite place to do this, but in
+   principle the platform line would also be pretty logical.
+
+   Unfortunately, there's no specified way to encode non-ASCII in our
+   directory documents.
+
+   Fortunately, almost everybody who does it uses UTF-8 anyway.
+
+   As we move towards Rust support in Tor, we gain another motivation
+   for standardizing on UTF-8, since Rust's native strings strongly
+   prefer UTF-8.
+
+   So, in this proposal, we describe a migration path to having all
+   directory documents be fully UTF-8.
+
+2. Proposal
+
+   First, we should have Tor relays reject ContactInfo lines (and any
+   other lines copied directly into router descriptors) that are not
+   UTF-8.
+
+   At the same time, we should have authorities reject any router
+   descriptors or extrainfo documents that are not valid UTF-8.
+   Simultaneously, we can have all Tor instances reject all
+   non-directory-descriptor directory documents that are not UTF-8,
+   since none should exist today.
+
+   Finally, once the authorities have updated, we should have all Tor
+   instances reject all directory documents that are not UTF-8.  (We
+   should not take this step until the authorities have upgraded, or
+   else the behavior of updated and non-updated clients could be
+   distinguished.)
+
+2.1. Hidden service descriptors' encrypted bodies
+
+   For the encrypted bodies of hidden service descriptors, we cannot
+   reject them at the authority level, and so we need to take a slightly
+   different approach to prevent client fingerprinting attacks.
+
+   First, we should make Tor instances start warning about any hidden
+   service descriptors whose bodies, post-decryption, contain non-UTF-8
+   plaintext.  At the same time, we add a consensus parameter to
+   indicate that hidden service descriptors with non-UTF-8 plaintexts
+   should be rejected entirely: "reject-encrypted-non-utf-8".  If that
+   parameter is set to 1, then hidden service clients will not only
+   warn, but reject the descriptors.
+
+   Once the vast majority of clients are running versions that support
+   the "reject-encrypted-non-utf-8" parameter, that parameter can be set
+   to 1.
+
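
The checks described in sections 2 and 2.1 of the proposal can be
sketched roughly as follows.  This is only an illustration in Python,
not code from Tor: the function names, the dict used for consensus
parameters, and the assumption that an unset parameter behaves like 0
are all hypothetical.  "Valid UTF-8" here means whatever Python's strict
decoder accepts; the proposal itself does not spell out edge cases.

```python
import logging

log = logging.getLogger("utf8_sketch")

# Parameter name taken from the proposal text.
REJECT_PARAM = "reject-encrypted-non-utf-8"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def accept_directory_document(doc: bytes) -> bool:
    """Section 2 (hypothetical helper): reject documents that are not UTF-8."""
    return is_valid_utf8(doc)


def accept_hs_plaintext(plaintext: bytes, consensus_params: dict) -> bool:
    """Section 2.1 (hypothetical helper): always warn on non-UTF-8
    plaintext, and reject only when the consensus parameter is 1.

    Assumption: a missing parameter is treated as 0 (warn only).
    """
    if is_valid_utf8(plaintext):
        return True
    log.warning("hidden service descriptor plaintext is not valid UTF-8")
    return consensus_params.get(REJECT_PARAM, 0) != 1
```

For example, a ContactInfo value encoded as Latin-1 rather than UTF-8
would fail is_valid_utf8(), while the same text encoded as UTF-8 would
pass.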


