commit 282f7f5a138fe1c1c130ee210ac1180aff4bd2b2
Author: Karsten Loesing <karsten.loesing(a)gmx.net>
Date: Thu Mar 1 10:45:09 2012 +0100
Start merging proposals 158 and 162 into dir-spec.txt.
---
dir-spec.txt | 356 ++++++++++++++++++++++++++++++++++++++++++++++++++++++---
1 files changed, 337 insertions(+), 19 deletions(-)
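
The digest and download rules this patch adds to sections 3.6.2, 4.5 and
5.3 can be sketched as follows. This is only an illustration under stated
assumptions, not code from Tor: the function names and the hostname
"example.com" are hypothetical, "lexically earliest" is read as plain
string ordering, and the input is assumed to be the exact bytes of a
microdescriptor.

```python
import base64
import hashlib
from collections import Counter


def microdesc_digest(microdesc):
    """Base64 of the SHA256 hash of the microdescriptor bytes,
    with trailing '=' padding omitted (as in the "m" line)."""
    raw = hashlib.sha256(microdesc).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def micro_url(hostname, digests, compressed=True):
    """Build a /tor/micro/d/ URL. Items are joined with '-' because
    '+' is itself a base64 character."""
    suffix = ".z" if compressed else ""
    return "http://%s/tor/micro/d/%s%s" % (hostname, "-".join(digests), suffix)


def pick_consensus_digest(voted_digests):
    """Choose the most common digest among the votes; break ties in
    favor of the lexically earliest (assumed: plain string order)."""
    counts = Counter(voted_digests)
    best = max(counts.values())
    return min(d for d, n in counts.items() if n == best)


def digests_to_request(listed, cached):
    """Return 'all' when more than half of the listed microdescriptors
    are missing from the cache, else the list of missing digests."""
    missing = [d for d in listed if d not in cached]
    if len(missing) * 2 > len(listed):
        return "all"
    return missing
```

A SHA256 hash is 32 bytes, so its base64 form carries exactly one "="
of padding; the stripped digest is therefore always 43 characters long.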
diff --git a/dir-spec.txt b/dir-spec.txt
index f680458..be140a5 100644
--- a/dir-spec.txt
+++ b/dir-spec.txt
@@ -1057,7 +1057,66 @@
Authorities MUST generate a new signing key and corresponding
certificate before the key expires.
-3.2. Vote and consensus status documents
+3.2. Microdescriptors
+
+ Microdescriptors are a stripped-down version of router descriptors
+ generated by the directory authorities which may additionally contain
+ authority-generated information. Microdescriptors contain only the
+ most relevant parts that clients care about. Microdescriptors are
+ expected to be relatively static and only change about once per week.
+ Microdescriptors do not contain any information that clients need to
+ use to decide which servers to fetch information about, or which
+ servers to fetch information from.
+
+ Microdescriptors are a straight transform from the router descriptor
+ and the consensus method. Microdescriptors have no header or footer.
+ A microdescriptor is identified by the hash of its concatenated
+ elements without a signature by the router. Microdescriptors do not
+ contain any version information, because their version is determined
+ by the consensus method.
+
+3.2.1. Microdescriptors in consensus method 8 or later
+
+ Starting with consensus method 8, microdescriptors contain the
+ following elements taken from or based on the router descriptor. Order
+ matters here, because different directory authorities must be able to
+ transform a given router descriptor and consensus method into the exact
+ same microdescriptor.
+
+ "onion-key" NL a public key in PEM format
+
+ [Exactly once, at start]
+
+ The "onion-key" element as specified in 2.1.
+
+ [Should we mention that clients don't learn identity keys anymore
+ with this approach? Clients only need identity keys for their
+ entry guards, and in that case they learn the identity key from
+ the TLS handshake. But clients couldn't check identity keys of
+ non-entry nodes with the microdescriptor approach anymore, even if
+ they wanted. -KL]
+
+ "family" names NL
+
+ [At most once]
+
+ The "family" element as specified in 2.1.
+
+ "p" SP ("accept" / "reject") SP PortList NL
+
+ [At most once]
+
+ The exit-policy summary as specified in 3.3 and 3.5.2. A missing
+ "p" line is equivalent to "p reject 1-65535".
+
+ [Should we note the downside of this approach that clients never
+ learn exact exit policies now? Clients can only guess whether a
+ relay accepts their request, try the BEGIN request, and might get
+ end-reason-exit-policy if they guessed wrong, in which case
+ they'll have to try elsewhere. Or is this too much design
+ discussion for a spec? -KL]
+
+3.3. Vote and consensus status documents
Votes and consensuses are more strictly formatted then other documents
in this specification, since different authorities must be able to
@@ -1097,14 +1156,14 @@
[At most once for votes; does not occur in consensuses.]
A space-separated list of supported methods for generating
- consensuses from votes. See section 3.4.1 for details. Method "1"
+ consensuses from votes. See section 3.5.1 for details. Method "1"
MUST be included.
"consensus-method" SP Integer NL
[At most once for consensuses; does not occur in votes.]
- See section 3.4.1 for details.
+ See section 3.5.1 for details.
(Only included when the vote is generated with consensus-method 2 or
later.)
@@ -1387,6 +1446,19 @@
or does not support (if 'reject') for exit to "most
addresses".
+ "m" SP methods 1*(SP algorithm "=" digest) NL
+
+ [Any number, only in votes.]
+
+ Microdescriptor hashes for all consensus methods that an authority
+ supports and that use the same microdescriptor format. "methods"
+ is a comma-separated list of the consensus methods that the
+ authority believes will produce "digest". "algorithm" is the name
+ of the hash algorithm producing "digest", which can be "sha256" or
+ something else, depending on the consensus "methods" supporting
+ this algorithm. "digest" is the base64 encoding of the hash of
+ the router's microdescriptor with trailing =s omitted.
+
The footer section is delineated in all votes and consensuses supporting
consensus method 9 and above with the following:
@@ -1434,7 +1506,7 @@
Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests
Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
- These values are calculated as specified in Section 3.4.3.
+ These values are calculated as specified in Section 3.5.3.
The signature contains the following item, which appears Exactly Once
for a vote, and At Least Once for a consensus.
@@ -1450,7 +1522,7 @@
the signing authority, and "signing-key-digest" is the hex-encoded
digest of the current authority signing key of the signing authority.
-3.3. Assigning flags in a vote
+3.4. Assigning flags in a vote
(This section describes how directory authorities choose which status
flags to apply to routers, as of Tor 0.2.0.0-alpha-dev. Later directory
@@ -1574,9 +1646,9 @@
accept not for all addresses, ignoring all rejects for private
netblocks. "Most" addresses are permitted if no more than 2^25
IPv4 addresses (two /8 networks) were blocked. The list is encoded
- as described in 3.4.2.
+ as described in 3.5.2.
-3.4. Computing a consensus from a set of votes
+3.5. Computing a consensus from a set of votes
Given a set of votes, authorities compute the contents of the consensus
document as follows:
@@ -1659,7 +1731,7 @@
for the descriptor we are listing. (They should all be the
same. If they are not, we pick the most commonly listed
one, breaking ties in favor of the lexicographically larger
- vote.) The port list is encoded as specified in 3.4.2.
+ vote.) The port list is encoded as specified in 3.5.2.
* If consensus-method 6 or later is in use and if 3 or more
authorities provide a Measured= keyword in their votes for
@@ -1684,7 +1756,7 @@
All ties in computing medians are broken in favor of the smaller or
earlier item.
-3.4.1. Forward compatibility
+3.5.1. Forward compatibility
Future versions of Tor will need to include new information in the
consensus documents, but it is important that all authorities (or at least
@@ -1718,7 +1790,7 @@
making changes in the contents of consensus; not for making
backward-incompatible changes in their format.)
-3.4.2. Encoding port lists
+3.5.2. Encoding port lists
Whether the summary shows the list of accepted ports or the list of
rejected ports depends on which list is shorter (has a shorter string
@@ -1738,7 +1810,7 @@
use an accept-style summary and list as much of the port list as is
possible within these 1000 bytes. [XXXX be more specific.]
-3.4.3. Computing Bandwidth Weights
+3.5.3. Computing Bandwidth Weights
Let weight_scale = 10000
@@ -1900,13 +1972,110 @@
Handle bridges and strange exit policies:
Wgm=Wgg, Wem=Wee, Weg=Wed
-3.5. Detached signatures
+3.6. Consensus flavors
+
+ Consensus flavors are variants of the consensus that clients can choose
+ to download and use instead of the unflavored consensus. The purpose
+ of a consensus flavor is to remove or replace information in the
+ unflavored consensus without forcing clients to download information
+ they would not use anyway.
+
+ Directory authorities can produce and serve an arbitrary number of
+ flavors of the same consensus. A downside of creating too many new
+ flavors is that clients will be distinguishable based on which flavor
+ they download. A new flavor should not be created when adding a field
+ to an existing flavor would not be too onerous.
+
+ Examples of consensus flavors include:
+ - Publishing hashes of microdescriptors instead of hashes of
+ full descriptors (see 3.6.2).
+ - Including different digests of descriptors, instead of the
+ perhaps-soon-to-be-totally-broken SHA1.
+
+ Consensus flavors are derived from the unflavored consensus once the
+ voting process is complete. This is to avoid consensus synchronization
+ problems.
+
+ Every consensus flavor has a name consisting of a sequence of one
+ or more alphanumeric characters and dashes. For compatibility, the
+ current descriptor flavor is called "ns".
+
+ The supported consensus flavors are defined as part of the
+ authorities' consensus method.
+
+ All consensus flavors have in common that their first line is
+ "network-status-version" where version is 3 or higher, and the flavor
+ is a string consisting of alphanumeric characters and dashes:
+
+ "network-status-version" SP version SP flavor NL
+
+3.6.1. ns consensus
+
+ The ns consensus flavor is equivalent to the unflavored consensus
+ except for its first line which states its consensus flavor name:
+
+ "network-status-version" SP version SP "ns" NL
+
+ [At start, exactly once.]
+
+3.6.2. Microdescriptor consensus
+
+ The microdescriptor consensus is a consensus flavor that contains
+ microdescriptor hashes instead of descriptor hashes and that omits
+ exit-policy summaries which are contained in microdescriptors. The
+ microdescriptor consensus was designed to contain elements that are
+ small and frequently changing. Clients use the information in the
+ microdescriptor consensus to decide which servers to fetch information
+ about and which servers to fetch information from.
+
+ The microdescriptor consensus is based on the unflavored consensus with
+ the following exceptions:
+
+ "network-status-version" SP version SP "microdesc" NL
+
+ [At start, exactly once.]
+
+ The flavor name of a microdescriptor consensus is "microdesc".
+
+ Changes to router status entries are as follows:
+
+ "r" SP nickname SP identity SP publication SP IP SP ORPort
+ SP DirPort NL
+
+ [At start, exactly once.]
+
+ Similar to "r" lines in 3.3, but without the digest element.
+
+ "p" ... NL
+
+ [Zero times.]
+
+ Exit policy summaries are contained in microdescriptors and
+ therefore omitted in the microdescriptor consensus.
+
+ "m" SP digest NL
+
+ [Exactly once.]
+
+ "digest" is the base64 of the SHA256 hash of the router's
+ microdescriptor with trailing =s omitted. For a given router
+ descriptor digest and consensus method there should only be a
+ single microdescriptor digest in the "m" lines of all votes.
+ If different votes have different microdescriptor digests for
+ the same descriptor digest and consensus method, at least one
+ of the authorities is broken. If this happens, the microdesc
+ consensus should contain whichever microdescriptor digest is
+ most common. If there is no winner, we break ties in favor of
+ the lexically earliest digest.
+
+3.7. Detached signatures
Assuming full connectivity, every authority should compute and sign the
- same consensus directory in each period. Therefore, it isn't necessary to
- download the consensus computed by each authority; instead, the
- authorities only push/fetch each others' signatures. A "detached
- signature" document contains items as follows:
+ same consensus including any flavors in each period. Therefore, it
+ isn't necessary to download the consensus or any flavors of it computed
+ by each authority; instead, the authorities only push/fetch each
+ others' signatures. A "detached signature" document contains items as
+ follows:
"consensus-digest" SP Digest NL
@@ -1920,11 +2089,82 @@
[As in the consensus]
+ "additional-digest" SP flavor SP algname SP digest NL
+
+ [Any number.]
+
+ For each supported consensus flavor, every directory authority
+ adds one or more "additional-digest" lines. "flavor" is the name
+ of the consensus flavor, "algname" is the name of the hash
+ algorithm that is used to generate the digest, and "digest" is the
+ hex-encoded digest.
+
+ The hash algorithm for the microdescriptor consensus flavor is
+ defined as SHA256 with algname "sha256".
+
+ "additional-signature" SP flavor SP algname SP identity SP
+ signing-key-digest NL signature
+
+ [Any number.]
+
+ For each supported consensus flavor and defined digest algorithm,
+ every directory authority adds an "additional-signature" line.
+ "flavor" is the name of the consensus flavor. "algname" is the
+ name of the algorithm that was used to hash the identity and
+ signing keys, and to compute the signature. "identity" is the
+ hex-encoded digest of the authority identity key of the signing
+ authority, and "signing-key-digest" is the hex-encoded digest of
+ the current authority signing key of the signing authority.
+
+ The "sha256" signature format is defined as the RSA signature of
+ the OAEP+-padded SHA256 digest of the item to be signed. When
+ checking signatures, the signature MUST be treated as valid if the
+ signature material begins with SHA256(document), so that other
+ data can get added later.
+ [To be honest, I didn't fully understand the previous paragraph
+ and only copied it from the proposals. Review carefully. -KL]
+
"directory-signature"
[As in the consensus; the signature object is the same as in the
consensus document.]
+3.8. Consensus index
+
+ Authorities additionally may generate and serve a consensus-index
+ document. Its format is:
+
+ "consensus-index" SP version NL
+
+ [At start, exactly once.]
+
+ "valid-after" SP YYYY-MM-DD SP HH:MM:SS NL
+ "valid-until" SP YYYY-MM-DD SP HH:MM:SS NL
+
+ [As in the consensus]
+
+ "document" SP flavor SP length 1*(SP algname "=" digest) NL
+
+ [Any number.]
+
+ There must be one "document" line for each generated consensus
+ flavor. "length" describes the length of the signed portion of
+ a consensus (the signatures themselves are not included), along
+ with one or more "digests" of that signed portion. Digests are
+ given in hex. The algorithm "sha256" MUST be included; others
+ are allowed.
+ [What's the reason for using a different format than in the
+ "additional-digest" lines of detached signatures? -KL]
+
+ "directory-signature" SP algname SP identity SP signing-key-digest
+ NL signature
+
+ [Any number.]
+
+ [As "additional-signature" lines in detached signatures, but
+ without the "flavor" part.]
+
+ [Actually, is it a bug that the "flavor" part is missing? -KL]
4. Directory server operation
@@ -2035,6 +2275,15 @@
[XXX possible future features include support for downloading old
consensuses.]
+ An authority further makes the consensus index available at
+ /tor/status-vote/(current|next)/consensus-index[.z] .
+ [The URL above is not implemented as of February 21, 2012. -KL]
+
+ The authorities serve another consensus of each flavor "F" from the
+ locations
+ /tor/status-vote/(current|next)/consensus-F.z. and
+ /tor/status-vote/(current|next)/consensus-F/<FP1>+....z.
+
4.3. Downloading consensus status documents (caches only)
All directory servers (authorities and caches) try to keep a recent
@@ -2053,6 +2302,12 @@
and is fresh until 2:00, that cache will fetch a new consensus at
a random time between 2:00 and 2:30.]
+ Directory caches also fetch the consensus index and the referenced
+ consensus flavors from the authorities. Caches check the correctness
+ of consensus flavors, but do not check anything about an unrecognized
+ consensus document beyond its digest and length. Caches serve all
+ consensus flavors from the same locations as the directory authorities.
+
4.4. Downloading and storing router descriptors (authorities and caches)
Periodically (currently, every 10 seconds), directory servers check
@@ -2087,7 +2342,41 @@
Authorities SHOULD NOT download descriptors for routers that they would
immediately reject for reasons listed in 3.1.
-4.5. Downloading and storing extra-info documents
+4.5. Downloading and storing microdescriptors (caches only)
+
+ Directory mirrors should fetch, cache, and serve each microdescriptor
+ from the authorities.
+
+ The microdescriptors with base64 hashes <D1>,<D2>,<D3> are available
+ at:
+ http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>[.z]
+
+ <Dn> are base-64 encoded with trailing =s omitted for size and for
+ consistency with the microdescriptor consensus format. -s are used
+ instead of +s to separate items, since the + character is used in
+ base64 encoding.
+
+ All the microdescriptors from the current consensus should also be
+ available at:
+ http://<hostname>/tor/micro/all[.z]
+ so a client that's bootstrapping doesn't need to send a 70KB URL just
+ to name every microdescriptor it's looking for.
+ [Note that /tor/micro/all[.z] is not implemented as of February 21,
+ 2012. -KL]
+
+ Directory mirrors should check to make sure that the microdescriptors
+ they're about to serve match the right hashes (either the hashes from
+ the fetch URL or the hashes from the consensus, respectively).
+
+ [So, with the consensus index, caches can mirror consensus flavors they
+ don't understand. But there's no such mechanism to mirror unrecognized
+ descriptor types which might be referenced from those unrecognized
+ consensus flavors, right? Would it make sense to produce a descriptor
+ index to tell caches which descriptors to mirror, even if they don't
+ understand them? Without that, deploying a new consensus flavor might
+ take the same time as it takes now. -KL]
+
+4.6. Downloading and storing extra-info documents
All authorities, and any cache that chooses to cache extra-info documents,
and any client that uses extra-info documents, should implement this
@@ -2103,7 +2392,7 @@
to download from caches. We follow the same splitting and back-off rules
as in 4.4 (if a cache) or 5.3 (if a client).
-4.6. General-use HTTP URLs
+4.7. General-use HTTP URLs
"Fingerprints" in these URLs are base-16-encoded SHA1 hashes.
@@ -2210,6 +2499,9 @@
fingerprints. Servers MUST accept both upper and lower case fingerprints
in requests.
+ [XXX Add new URLs for microdescriptors, consensus flavors,
+ microdescriptor consensus, and consensus indexes. -KL]
+
5. Client operation: downloading information
Every Tor that is not a directory server (that is, those that do
@@ -2250,7 +2542,12 @@
of the one-hour interval is 45 minutes, and 7/8 of the remaining 75
minutes is 65 minutes.]
-5.2. Downloading and storing router descriptors
+ Clients may choose to download the microdescriptor consensus instead
+ of the general network status consensus. In that case they should use
+ the same update strategy as for the normal consensus. They should not
+ download more than one consensus flavor.
+
+5.2. Downloading and storing router descriptors or microdescriptors
Clients try to have the best descriptor for each router. A descriptor is
"best" if:
@@ -2287,6 +2584,19 @@
being published too far in the past.] [The code seems to discard
descriptors in all cases after they're 5 days old. True? -RD]
+ Clients that choose to download the microdescriptor consensus instead
+ of the general consensus must download the referenced microdescriptors
+ instead of router descriptors. Clients fetch and cache
+ microdescriptors preemptively from dir mirrors when starting up, like
+ they currently fetch descriptors. After bootstrapping, clients only
+ need to fetch the microdescriptors that have changed.
+
+ Clients maintain a cache of microdescriptors along with metadata such as
+ when each one was last referenced by a consensus and which identity key
+ it corresponds to. They keep a microdescriptor until it hasn't been
+ mentioned in any consensus for a week. Future clients might cache them
+ for longer or shorter times.
+
5.3. Managing downloads
When a client has no consensus network-status document, it downloads it
@@ -2307,6 +2617,14 @@
After receiving any response client MUST discard any network-status
documents and descriptors that it did not request.
+ When a client gets a new microdescriptor consensus, it looks to see if
+ there are any microdescriptors it needs to learn. If it needs to learn
+ more than half of the microdescriptors, it requests 'all', else it
+ requests only the missing ones. Clients MAY try to determine whether
+ the upload bandwidth for listing the microdescriptors they want is more
+ or less than the download bandwidth for the microdescriptors they do
+ not want.
+
6. Using directory information
Everyone besides directory authorities uses the approaches in this section