Request for comments: tor protocol overhaul
arma at mit.edu
Mon Apr 21 06:10:37 UTC 2003
Nick and I have been working on spec'ing out how to do incremental path
building (and teardown), plus letting connections exit from nodes partway
down the path (we call this concept "leaky pipes"). Here's the latest
iteration of the design (we won't be starting to implement this until
at least after the demo).
Nick, Paul, Andrei, Bram, Lucky, Adam, Marc, others, comments would be
very useful, especially wrt the questions I have throughout.
1. Motivation: what does this buy us?
First, we can now handle exit policies more easily -- allowing a new
exit connection to something our current exit doesn't allow means just
adding one more hop to the path, not building a whole new path.
Second, we can make connections unlinkable to one another by changing the
last hop on the path, which is much cheaper than changing the entire path.
[Argue with me on this one -RD]
Third, we can partially obscure from the intermediate nodes when we're
changing our exit points. By making circuit-extend cells indistinguishable
from data cells except at the end of the circuit, we no longer have
easily trackable create cells that lay an entire circuit at once; so we
can consider mixing, delaying, etc all data cells, without having the
circuit-creation process be the weak point.
Fourth, our signaling system can support long-range dummies (cells that
are only distinguishable from normal data at their last hop).
Fifth, we don't have to track onions for replay anymore. [I haven't come
up with a proof on this one yet; argue with me. -RD]
Sixth, we get forward anonymity: the keys at each hop are ephemeral,
so an adversary who owns a node in the path can't record the stream,
then arrive at each successive hop and demand the stream's decryption.
2. Changes to cell structure/handling
Data cells now have the following payload structure:
Topic command [1 byte]
Topic ID [8 bytes]
Payload [248-1-8=239 bytes]
Because every node in a circuit can now be an exit node, ORs now need
a way to recognize when a data cell is intended for them. They do
this by checking the Topic ID field: if (after decrypting the cell)
the node recognizes the Topic ID, the data cell is intended for that
node's use. Otherwise, the node passes the cell to the next node on the
circuit as before.
The Topic ID "zero" (0x0000000000000000) is used for control messages not
attached to a particular topic. The zero Topic ID is always treated as
Rather than decrypting data cells repeatedly with all known keys, the OP
must now check after each decryption to see whether the Topic ID matches
a known Topic ID for the corresponding node.
If a topic ID is accidentally matched even once in circuit, the whole
circuit is screwed. But 64 bits of topic ID to prevent accidental
matching is plenty. Consider 8 hops to a path and 7 exit connections
per hop. That's 64 possible patterns that might accidentally match at
each cell. We're still looking at 58 bits of security -- so we expect
the first collision to occur after around 2^66 bytes of data.
3. Extending a circuit
We use a new Topic command, "EXTEND" (value 5) to request that the last
node on a circuit extend the circuit by a single node. The Topic ID of
this cell must be zero. The payload should contain:
Port [2 bytes]
Address [4 bytes]
Onion skin [212 bytes]
Upon receiving an EXTEND data cell, the last node in the circuit
extracts the port and address portions of the cell, and finds (or opens)
a connection to the next node on the circuit. It then chooses a fresh
ACI, and sends a CREATE cell to that node, with the contents of the
"onion skin" as the payload.
Because we're using single onion layers now (rather than full onions),
we use a different format for the onion. Introducing an expiration time,
and Diffie-Helman key agreement, the onion skin contains:
Expiration [4 bytes]
[Do we still need this expiration timestamp? That is, do we need to
track for replays at all? I think the ephemeral key may help greatly;
see section 4 below. -RD]
Symmetric key [16 bytes]
g^X [192 bytes]
The first 128 bytes of the skin are encrypted with the node's RSA key [can
somebody confirm we'll be ok without OAEP here? It won't fit. -RD]. The
rest is encrypted with the symmetric key.
Upon receiving a create cell, an OR allocates a new circuit with the
provided ACI, and (at its convenience) decrypts the skin and completes
the Diffie-Hellman key agreement by generating g^Y and g^XY. For future
data cells, the node will use the last 16 bytes of g^XY as the forward
key, and the next-to-last sixteen bytes as the back key. Kudos to Adam
Back for suggesting the idea of an ephemeral key exchange with each hop.
Once it has completed the DH calculation, the OR replies with
an *unencrypted* data cell, with Topic Command "EXTENDED"
(value 6), Topic ID zero, and contents equal to g^Y. Once the
OP has received g^Y, it can compute g^XY, and will now share
forward and reverse keys.
4. Redundancy checking on data BEGIN cells
Begin data cells (like all data cells) must already be padded with
zeros. We further specify that the "address:port\0" must now be
"address:port" followed by 8 zeros, which the exit node must check. Thus
we force Alice (or the replayed or spoofed Alice) to know the session key
(which is different every time) in order to get any connections to work.
[Should we go the next step and put in 8 bytes of randomness plus the
20 byte SHA1 of the address:port\0randomness? That will prevent address
redirection attacks (guess the address he's going to, xor it out, xor
in your own address) and other tagging attacks due to the malleability
from the stream ciphers. -RD]
5. Truncating a circuit
We introduce a "TRUNCATE" (value 7) topic command, with Topic ID zero,
that causes the receiving node to send a DESTROY cell to the next node
in the circuit, and send an acknowledging "TRUNCATED" cell (value 8)
back up the circuit.
Also, a node must deliver a TRUNCATED cell backward for each circuit
when its forward connection to another node closes. Thus when an onion
router goes down only the topics at that router or beyond are affected
(so the kill-a-node-and-watch-who-closes attack is less helpful).
6. Flow control
Topic-level flow control is still all set. But circuit-level flow control
gets much trickier.
Without circuit-level flow control, onion proxies acting as firewalls for
large institutions (which would thus have hundreds of topics happening
at once) will push too many cells down the circuit and never have a way
of knowing that the circuit's full.
Circuit-level sendme cells serve two functions. First of all, they tell
the guy at the other end of the circuit that he can initiate N new cells.
Secondly, they tell the intermediate nodes that the guy at the far end
is allowed to send those cells.
Only Alice is able to initiate forward data cells. Any node might be
able to initiate data cells backward, if it has an exit connection.
So we keep the idea of circuit-level sendme cells. The increment produced
by a circuit-level sendme is fixed systemwide: say, 100. Imagine Alice
has a path open to Bob1 through Bob5.
When Bobn has eaten 100 data cells, he initiates a sendme back
to Alice. Sendme cells must now have a payload, which starts out
(unencrypted) as 8 bytes of zero. The crypts at each hop happen just
like for data cells. Sendmes don't count as far as windows are concerned.
So when Bobn sends a sendme to Alice, she can learn which Bob it was.
Similarly, when Alice sends a sendme to Bobn, he can recognize it because
its payload is zero's rather than crypted garbage.
Only when a sendme ends at Bobn is he allowed to initiate more cells.
Alice is only allowed to initiate cells to nodes that are nearer to,
or the same as, the node who sent the sendme.
So each Bob keeps the same window info he's got right now. Alice has
to remember each Bob's windows (she can store them in the cpath info,
where the per-hop ciphers already live). It looks like the cpath might
have to store data cell queues also, for when we've got a data cell
destined for a node whose circuit window is empty.
More information about the tor-dev