tor-commits
Threads by month
- ----- 2025 -----
- June
- May
- April
- March
- February
- January
- ----- 2024 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2023 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2022 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2021 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2020 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2019 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2018 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2017 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2016 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2015 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2014 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2013 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2012 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2011 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
November 2011
- 16 participants
- 987 discussions

01 Nov '11
commit 84ec5aca5f5735f445840f6f574842b71365bbde
Author: Roger Dingledine <arma(a)torproject.org>
Date: Tue Nov 1 04:22:41 2011 -0400
update the spec to match recent reality
in particular, some cells can be variable-length; and we don't encourage
clients to use two-cert authentication.
---
tor-spec.txt | 21 ++++++++++-----------
1 files changed, 10 insertions(+), 11 deletions(-)
diff --git a/tor-spec.txt b/tor-spec.txt
index 52a9217..28eca98 100644
--- a/tor-spec.txt
+++ b/tor-spec.txt
@@ -278,8 +278,9 @@ see tor-design.pdf.
OPs alike if their certificates were missing or malformed.]
Once a TLS connection is established, the two sides send cells
- (specified below) to one another. Cells are sent serially. All
- cells are CELL_LEN bytes long. Cells may be sent embedded in TLS
+ (specified below) to one another. Cells are sent serially. Standard
+ cells are CELL_LEN bytes long, but variable-length cells also exist; see
+ Section 3. Cells may be sent embedded in TLS
records of any size or divided across TLS records, but the framing
of TLS records MUST NOT leak information about the type or contents
of the cells.
@@ -291,13 +292,12 @@ see tor-design.pdf.
also hold a TLS connection with no circuits open, if it is likely that a
circuit will be built soon using that connection.
- (As an exception, directory servers may try to stay connected to all of
- the ORs -- though this will be phased out for the Tor 0.1.2.x release.)
-
- To avoid being trivially distinguished from relays, client-only Tor
- instances are encouraged but not required to use a two-certificate chain
- as well. Clients SHOULD NOT keep using the same certificates when
- their IP address changes. Clients MAY send no certificates at all.
+ Client-only Tor instances are encouraged to avoid using handshake
+ variants that include certificates, if those certificates provide
+ any persistent tags to the relays they contact. If clients do use
+ certificates, they SHOULD NOT keep using the same certificates when
+ their IP address changes. Clients MAY send certificates using any
+ of the above handshake variants.
3. Cell Packet format
@@ -311,7 +311,7 @@ see tor-design.pdf.
Command [1 byte]
Payload (padded with 0 bytes) [PAYLOAD_LEN bytes]
- On a version 2 connection, all cells are as in version 1 connections,
+ On a version 2 or 3 connection, all cells are as in version 1 connections,
except for variable-length cells, whose format is:
CircID [2 octets]
@@ -324,7 +324,6 @@ see tor-design.pdf.
higher connection, variable-length cells are indicated by a command
byte equal to 7 ("VERSIONS"), or greater than or equal to 128.
-
The CircID field determines which circuit, if any, the cell is
associated with.
1
0

01 Nov '11
commit 213765575ff8c1b8381360ec58107b04cf0e597a
Author: Karsten Loesing <karsten.loesing(a)gmx.net>
Date: Tue Nov 1 07:51:03 2011 +0100
Fix consensus params check some more.
---
src/org/torproject/chc/MetricsWebsiteReport.java | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/src/org/torproject/chc/MetricsWebsiteReport.java b/src/org/torproject/chc/MetricsWebsiteReport.java
index 01d126c..09ab17c 100644
--- a/src/org/torproject/chc/MetricsWebsiteReport.java
+++ b/src/org/torproject/chc/MetricsWebsiteReport.java
@@ -392,7 +392,7 @@ public class MetricsWebsiteReport implements Report {
Map<String, String> voteConsensusParams =
vote.getConsensusParams();
boolean conflictOrInvalid = false;
- if (voteConsensusParams == null) {
+ if (voteConsensusParams != null) {
for (Map.Entry<String, String> e :
voteConsensusParams.entrySet()) {
if (!consensusConsensusParams.containsKey(e.getKey()) ||
1
0

[metrics-web/master] Add bwauthpid to the list of valid consensus parameters.
by karsten@torproject.org 01 Nov '11
by karsten@torproject.org 01 Nov '11
01 Nov '11
commit 9059f5a36888cd550d552fb2f235bd839edeb0aa
Author: Karsten Loesing <karsten.loesing(a)gmx.net>
Date: Tue Nov 1 07:51:25 2011 +0100
Add bwauthpid to the list of valid consensus parameters.
---
src/org/torproject/chc/MetricsWebsiteReport.java | 2 +-
src/org/torproject/chc/NagiosReport.java | 2 +-
src/org/torproject/chc/StdOutReport.java | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/org/torproject/chc/MetricsWebsiteReport.java b/src/org/torproject/chc/MetricsWebsiteReport.java
index 09ab17c..8744c8b 100644
--- a/src/org/torproject/chc/MetricsWebsiteReport.java
+++ b/src/org/torproject/chc/MetricsWebsiteReport.java
@@ -385,7 +385,7 @@ public class MetricsWebsiteReport implements Report {
("circwindow,CircuitPriorityHalflifeMsec,refuseunknownexits,"
+ "cbtdisabled,cbtnummodes,cbtrecentcount,cbtmaxtimeouts,"
+ "cbtmincircs,cbtquantile,cbtclosequantile,cbttestfreq,"
- + "cbtmintimeout,cbtinitialtimeout").split(",")));
+ + "cbtmintimeout,cbtinitialtimeout,bwauthpid").split(",")));
Map<String, String> consensusConsensusParams =
downloadedConsensus.getConsensusParams();
for (Status vote : this.downloadedVotes) {
diff --git a/src/org/torproject/chc/NagiosReport.java b/src/org/torproject/chc/NagiosReport.java
index e550625..341d68b 100644
--- a/src/org/torproject/chc/NagiosReport.java
+++ b/src/org/torproject/chc/NagiosReport.java
@@ -166,7 +166,7 @@ public class NagiosReport implements Report {
("circwindow,CircuitPriorityHalflifeMsec,refuseunknownexits,"
+ "cbtdisabled,cbtnummodes,cbtrecentcount,cbtmaxtimeouts,"
+ "cbtmincircs,cbtquantile,cbtclosequantile,cbttestfreq,"
- + "cbtmintimeout,cbtinitialtimeout").split(",")));
+ + "cbtmintimeout,cbtinitialtimeout,bwauthpid").split(",")));
for (Status vote : this.downloadedVotes) {
Map<String, String> voteConsensusParams =
vote.getConsensusParams();
diff --git a/src/org/torproject/chc/StdOutReport.java b/src/org/torproject/chc/StdOutReport.java
index 805343f..850ecc0 100644
--- a/src/org/torproject/chc/StdOutReport.java
+++ b/src/org/torproject/chc/StdOutReport.java
@@ -206,7 +206,7 @@ public class StdOutReport implements Report {
("circwindow,CircuitPriorityHalflifeMsec,refuseunknownexits,"
+ "cbtdisabled,cbtnummodes,cbtrecentcount,cbtmaxtimeouts,"
+ "cbtmincircs,cbtquantile,cbtclosequantile,cbttestfreq,"
- + "cbtmintimeout,cbtinitialtimeout").split(",")));
+ + "cbtmintimeout,cbtinitialtimeout,bwauthpid").split(",")));
for (Status vote : this.downloadedVotes) {
Map<String, String> voteConsensusParams =
vote.getConsensusParams();
1
0

[pytorctl/master] Merge commit 'a58246918f58f3fc5663f8772df39cdcd3ccce8d'
by mikeperry@torproject.org 01 Nov '11
by mikeperry@torproject.org 01 Nov '11
01 Nov '11
commit 66ed91b5c1453b9c25b7070b702e2e548efb4f80
Merge: e15ed99 a582469
Author: Mike Perry <mikeperry-git(a)fscked.org>
Date: Mon Oct 31 21:34:30 2011 -0700
Merge commit 'a58246918f58f3fc5663f8772df39cdcd3ccce8d'
SQLSupport.py | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
1
0

01 Nov '11
commit a58246918f58f3fc5663f8772df39cdcd3ccce8d
Author: Mike Perry <mikeperry-git(a)fscked.org>
Date: Mon Oct 31 21:33:20 2011 -0700
Add TorCtl piece of the fix for #1984.
---
SQLSupport.py | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/SQLSupport.py b/SQLSupport.py
index b7c9a82..40cbdf7 100644
--- a/SQLSupport.py
+++ b/SQLSupport.py
@@ -674,6 +674,7 @@ class RouterStats(Entity):
f.write("node_id=$"+s.router.idhex+" nick="+s.router.nickname)
f.write(" strm_bw="+str(cvt(s.sbw,0)))
f.write(" filt_bw="+str(cvt(s.filt_sbw,0)))
+ f.write(" circ_fail_rate="+str(cvt(s.circ_to_rate,0)))
f.write(" desc_bw="+str(int(cvt(s.avg_desc_bw,0))))
f.write(" ns_bw="+str(int(cvt(s.avg_bw,0)))+"\n")
1
0

r25193: {website} "unsupported" is totally the wrong way to dissuade people (website/trunk/docs/en)
by Roger Dingledine 01 Nov '11
by Roger Dingledine 01 Nov '11
01 Nov '11
Author: arma
Date: 2011-11-01 03:35:45 +0000 (Tue, 01 Nov 2011)
New Revision: 25193
Modified:
website/trunk/docs/en/faq.wml
Log:
"unsupported" is totally the wrong way to dissuade people
Modified: website/trunk/docs/en/faq.wml
===================================================================
--- website/trunk/docs/en/faq.wml 2011-10-31 02:39:39 UTC (rev 25192)
+++ website/trunk/docs/en/faq.wml 2011-11-01 03:35:45 UTC (rev 25193)
@@ -828,7 +828,7 @@
You can use <a
href="https://gitweb.torproject.org/torbrowser.git/blob_plain/HEAD:/build-scripts…">our
old polipo config file</a> if you like. However, please realize that
-this approach is unsupported.
+this approach is not recommended for novice users.
</p>
<hr>
1
0

[torspec/master] First finished draftoid for xxx-new-crypto-sketch
by nickm@torproject.org 01 Nov '11
by nickm@torproject.org 01 Nov '11
01 Nov '11
commit 3176a378a54b99dbbe3638b0da3766981ca4a87c
Author: Nick Mathewson <nickm(a)torproject.org>
Date: Mon Oct 31 15:55:55 2011 -0400
First finished draftoid for xxx-new-crypto-sketch
Needs a round of proofreading before I call it a "draft".
---
proposals/ideas/xxx-new-crypto-sketch.txt | 352 ++++++++++++++++++++---------
1 files changed, 241 insertions(+), 111 deletions(-)
diff --git a/proposals/ideas/xxx-new-crypto-sketch.txt b/proposals/ideas/xxx-new-crypto-sketch.txt
index 504e658..83711c3 100644
--- a/proposals/ideas/xxx-new-crypto-sketch.txt
+++ b/proposals/ideas/xxx-new-crypto-sketch.txt
@@ -1,19 +1,7 @@
Title: Design sketch for new crypto ops
-Date: 19 Oct 2011
+Date: 31 Oct 2011
Author: Nick Mathewson
-TODO:
- - Write missing sections
- - Fix XXXXs
- - Write performance notes
- - Find rransom's extend proposal
- - Proofread
- - Change relay-cipher section:
- - Pad with zeros.
- - Encrypt-then-mac.
- - Revise analysis.
- - Try to be a little less monlogue-ish, a little more
-
0. Overview
The point of this document is to discuss what crypto we ought to be using.
@@ -28,7 +16,9 @@ TODO:
Addressed in proposls 176, 184. We say a little here though in
section 5.
CREATE/EXTEND CRYPTO --
- Addressed in xxx-ntor-handshake.txt and rransom's extend draft
+ Addressed in xxx-ntor-handshake.txt and rransom's extend draft at
+ [*] and subsequent discussion on the tor-dev mailing list. Not
+ considered here.
RELAY CRYPTO
Addressed here in Section 6
DIRECTORY SYSTEM
@@ -36,50 +26,63 @@ TODO:
HIDDEN SERVICE SYSTEM
Addressed in a forthcoming document by rransom.
+[*] https://lists.torproject.org/pipermail/tor-dev/2011-March/002547.html
+
1. Base algorithm choice
There seem to be two main candidate algorithms for signatures: RSA
with big keys (hereinafter "RSA>1024"); and Ed25519, which is DSA with
the sharp edges filed off on an Edwards curve related to DJB's
Curve25519. We can look at other ECC groups too. {But see ECC
- Notes.}
+ Notes in 1.1 below.}
- For Diffie Hellman: Curve25519 seems like a decent choice; failing
- that, another XXXX. DH on Z_p with big groups (hereinafter "DH>1024").
- {But see ECC notes.}
+ FOR DIFFIE-HELLMAN: Curve25519 seems like a decent choice; failing
+ that, one of the NIST P-groups. Failing that, DH on Z_p with big
+ groups (hereinafter "DH>1024"). {But see ECC Notes in 1.1 below.}
- For a hash function: SHA256, switching to SHA3 in 2012 when it comes
+ FOR A HASH FUNCTION: SHA256, switching to SHA3 in 2012 when it comes
out. It might be worthwhile waiting for SHA3 in most places, and
- skipping over the sha256 stage entirely.
+ skipping over the SHA256 stage entirely.
- For a stream cipher: AES-CTR is in one sense a conservative choice
+ FOR A STREAM CIPHER: AES-CTR is in one sense a conservative choice
inasmuch as AES is well-analyzed, but AES's well-known issues with
cache-based timing attacks are pretty worrisome. We can mitigate that
some by using random secret IVs for AES-CTR, so that we will be
encrypting neither attacker-chosen nor attacker-known plaintext with
- out AES cipher, but that's a bit kludgy. Salsa20 is what rransom
- likes these days, but IMO we aren't competent to tell whether it looks
- good or not; the existing attacks against it don't look like very bad
- news to me, but who knows whether it's getting enough attention that
- we can read. See also ChaCha; see also the other EuroCrypt {XXXX}
- winners/finalists; see also SHA3 if the SHA3 winner specifies a way to
- use it as a stream cipher, or specifies an underlying stream/block
- cipher.
-
- For a random number generator: We currently use OpenSSL seeded with
- RAND_poll and with platform entropy. OpenSSL uses RC4 (XXX check).
- The platform entropy management can be messy, obscure, or both. I
- suggest that:
-
- * we should seed our PRNG with more entropy sources if we can find
+ out AES cipher, but that's a bit kludgy. There are also supposed to
+ be time-invariant implementations that use Intel's AESNI instructions
+ where available, and time-invariant implementations that use
+ bit-slicing.
+
+ Salsa20 is what rransom likes these days, but IMO we aren't competent
+ to tell whether it looks good or not; the existing attacks against it
+ don't look like very bad news to me, but who knows whether it's
+ getting enough attention that we can read. See also ChaCha; see also
+ the other eSTREAM winners/ finalists; see also SHA3 if the SHA3 winner
+ specifies a way to use it as a stream cipher, or specifies an
+ underlying stream/block cipher.
+
+ If we're feeling cautious, we could run two independently-keyed stream
+ ciphers and xor their streams together.
+
+ FOR A RANDOM NUMBER GENERATOR: We currently use OpenSSL seeded with
+ RAND_poll and with platform entropy. OpenSSL uses a message-digest-
+ based algorithm from SSLeay (See http://linux.die.net/man/3/sslrand
+ for the ugly details.) The platform entropy management can be messy,
+ obscure, or both. I suggest that:
+
+ * We should seed our PRNG with more entropy sources if we can find
some promising code with an appropriate license
- * instead of just using OpenSSL's PRNG, we should use OpenSSL's PRNG
- xor'd with some other good PRNG. (Is Yarrow still cool? And is
- there a combine operation better than xor?)
+ * Instead of just using OpenSSL's PRNG, we should use OpenSSL's
+ MD-based PRNG xor'd with some other good PRNG. (Fortuna,
+ maybe. Is there a combine operation better than xor? See also SHA3
+ if the SHA3 winner is one that specifies a PRNG mode of
+ operation.)
+ * We should consider splicing this combined-stream PRNG into OpenSSL
+ as the RNG it uses for SSL and key generation.
* We should re-seed the RNG before and after very sensitive
operations, like private key generation.
-
1.1. ECC notes
ECC is the brave new[*] crypto of the future! It's faster[**] than
@@ -117,7 +120,7 @@ TODO:
like that. We should check that out.
[*] Actually, it's older than onion routing, and older than some
- members of the Tor project.
+ members of the Tor Project.
[**] Actually, because of the common practice of choosing a small-ish
prime value (65537) for e in RSA, RSA public key operations can be a
@@ -138,14 +141,14 @@ TODO:
- To determine which part of the circuit ID space to use on a Tor
instance's links.
-2.1. New identities, option 1: RSA>1024, slow migration.
+2.1. New identities, option 1: "RSA>1024, slow migration"
- We use RSA for identity keys indefinitely. Nearly all operations done
- with an identity key are signature checking; signing happens only a
- few times an hour per node even with pathological cases. Since
- signature checking is really cheap with RSA, there's no speed
- advantage for ECC here. (There is a space advantage, since the keys
- are much smaller.)
+ In this option, we use RSA for identity keys indefinitely. Nearly all
+ operations done with an identity key are signature checking; signing
+ happens only a few times an hour per node even with pathological
+ cases. Since signature checking is really cheap with RSA, there's no
+ speed advantage for ECC here. (There is a space advantage, since the
+ keys are much smaller.)
The easiest way to migrate to longer identity keys is to tell all Tors
to begin accepting longer identity keys now, and to tweak all our
@@ -156,13 +159,14 @@ TODO:
specified. Once all versions of Tor can accept long identity keys, we
raise the maximum size from 1024 to somewhere in the 2048-4096 range.
-2.2. New identities option 2: RSA>1024, faster migration.
+2.2. New identities option 2: "RSA>1024, faster migration"
- We use RSA for identity keys indefinitely as above. But we allow
- nodes to begin having longer identities now, even though older Tors
- won't understand them. This implies, of course, that every such node
- needs to have at least 2 identities: one RSA1024 identity for backward
- compatibility, one RSA>1024 identity for more secure identification.
+ In this option, we use RSA for identity keys indefinitely as above.
+ But we allow nodes to begin having longer identities now, even though
+ older Tors won't understand them. This implies, of course, that every
+ such node needs to have at least 2 identities: one RSA1024 identity
+ for backward compatibility, one RSA>1024 identity for more secure
+ identification.
We would have these identities cross-certify as follows: All keys
would be listed in the router descriptor. RSA>1024 keys would be
@@ -201,9 +205,9 @@ TODO:
implications of letting nodes have multiple identity keys.
We'll need to advertise these new identities in consensus directories
- too; see XXXX below for more info there.
+ too; see 4.2 below for more info there.
-2.3. New identities option 3: RSA>1024 and/or Ed25519, faster migration
+2.3. New identities option 3: "RSA>1024 and/or Ed25519, faster migration"
As in option 2 above, but new keys can also be Ed25519. If we expect
that not all installations will allow Ed25519 (see "ECC Notes",
@@ -269,12 +273,18 @@ TODO:
nodes without RSA1024 identity keys (off until all clients and nodes
accept new identity keys).
-2.6. Implications for directory information
+2.6. Selective correctness attacks
- Clients must know a hash for each node's identity key, or else they
- can't make an authenticated connection to the node or tell ORs how to
- extend to the node.
+ For any scheme based on having multiple signature types on a router
+ descriptor or other document, an attacker could mount a partitioning
+ attack by making a document which older clients will accept but newer
+ clients will reject.
+
+ It's easy to prevent this at the consensus step: directory authorities
+ MUST NOT accept any descriptor unless all clients will be able to
+ verify it.
+ For bridge descriptors, we need to investigate more carefully.
3. New fingerprints
@@ -328,6 +338,47 @@ TODO:
In the UI we'll lose the property that no node has more than one
fingerprint: I do not believe that this actually hurts us.
+3.3. Implications for directory information
+
+ Clients must know a hash for each node's identity key, or else they
+ can't make an authenticated connection to the node or tell ORs how to
+ extend to the node.
+
+ This means that if client Alice wants to connect to node Bob, Alice
+ must have a fingerprint of Bob's ID key such that she understands the
+ ID key type and the fingerprint algorithm. If Alice wants to extend
+ from Bob to Carol, she must have a fingerprint of Carol's ID key such
+ that Bob understands the ID key type and the fingerprint algorithm.
+
+ So for every node, Alice must not only know a fingerprint that *she*
+ can use for that node, but also a set fingerprint such that every node
+ can understand at least one fingerprint in the set.
+
+ This implies a proliferation of fingerprints! We should tread
+ carefully here. To prevent proliferation, the easiest solution is not
+ to add too many new types, and to have a good plan for retiring older
+ types.
+
+3.4. Implications for EXTEND cells
+
+ As mentioned in 3.3, when a client Alice can tell node Bob to extend
+ to node Carol, she needs to give Bob a fingerprint for Carol that Bob
+ will understand: one where Bob understands the digest algorithm, and
+ understands the identity key type.
+
+ There are two ways we can do this:
+
+ 1) Alice's EXTEND cell contains every fingerprint for Carol that
+ Alice knows about. Bob treats the cell as valid if every one he
+ can verify is correct.
+
+ 2) Alice knows which fingerprint types Bob understands (either via
+ his version, or something else in his directory info). She
+ selects a fingerprint for Carol using the best one of these
+ types.
+
+ The first seems more robust to me, if we have space for enough bytes.
+ If we proliferate too many types, though, we'll need to do the second.
4. Directory changes
@@ -336,17 +387,51 @@ TODO:
In some places, directory objects cross-reference one another by SHA1
hash. They should use a better hash algorithm instead.
- XXX
+ This does make problems in a few cases.
+
+ Router descriptors and extrainfo descriptors:
+
+ One problematic case is in in determining node families. If node A
+ and node B want to list each other as being in the same family,
+ they need to do so in a way that clients can interpret. That could
+ mean listing SHA1-RSA1024 fingerprints so old clients understand,
+ AND new fingerprints for security. (But *that* could create
+ interesting partitioning attacks wherein your family looks
+ different depending on who's looking.)
+
+ Solution: we need to move the responsibility for combining node
+ families into the consensus voting process, so clients don't
+ need to understand the cross-reference types themselves.
+
+ Another case is in certifying extrainfo documents from descriptors.
+ For that, we can list multiple extrainfo digests, either on the
+ extrainfo line, or on additional lines.
+
+ Voting and consensus documents:
+
+ Adding more fingerprints in votes isn't a problem; votes are a tiny
+ fraction of authority bw usage. Adding more hashes is easy.
+
+ For consensus documents, we ought to have flavors that you can
+ download depending on what set of fingerprints types you
+ understand.
+
+ For integrity purposes, consensuses can refer to microdescriptors
+ or descriptors by any digest type that the client understands. But
+ for downloading purposes, the digest type must be one that
+ directory caches also support: see 4.4.
4.2. More fingerprints
Because extending from node A to node B requires that we have node B's
fingerprint in a way that node A will understand, it is not enough to
- get a set of identity fingerprints for each node in the format _we_
- like best. Instead, we must .
+ get a set of identity fingerprints for each node in the format that
+ the client likes best -- see 3.3 and 3.4 above. So every flavor of
+ consensus we serve needs to include a node identity in a format the
+ client understands, and a node identities in formats such that every
+ node will understand at least one.
-
-4.x. An option: compound signatures on directory objects
+4.3. An option: compound signatures on directory objects
In Tor 0.2.2.x and later, when we check a signature on a directory
object (not including hidden service descriptors), we only look at
@@ -357,13 +442,23 @@ TODO:
with a 1024-bit key therefore leaves 128-11-20==97 more bytes we
could use for a SHA2 or a SHA3 hash.)
-4.x.
+4.4. Downloading by digest
+ We should have directory caches support downloading objects by more
+ hash types. Right now, descriptors are downloaded by their SHA1
+ hashes and microdescriptors by their SHA256 hashes. This is okay for
+ now, but once SHA3 is out, we should support downloading all of these
+ by SHA3 digest.
5. Link crypto changes
+ Currently we use TLS. That's fine.
+ We should however look to longer link keys, bigger DH groups, etc.
+ Once TLS versiosn 1.1/1.2 is available in OpenSSL, we should move to
+ use them, I think. We should also look into how quickly we can
+ deprecate TLS 1.0 and SSL <= 3 usage.
6. Relay crypto changes
@@ -390,7 +485,7 @@ TODO:
6.1. Option 1: Use AES-CTR in a less scary mode
- When doing key expansion, in addition to establishing Kf, Kb, Df, and
+ When doing key expansion, in addition to establishing Kf, Kb, Df, and
Db, also establish IVf and IVb. Use the current relay crypto, except
instead of starting the counters at 0, start them at IVf and IVb.
This way, an attacker doesn't have any known plaintexts to work with,
@@ -414,7 +509,7 @@ TODO:
have the key used for each block be an HMAC of all previously
received ciphertext.)
-Option 3: As 1, but tagging attacks garble the circuit in the same block.
+6.3. Option 3: As 1, but tagging attacks garble the circuit in the same block.
Use a large-block cipher mode, such as BEAR or LIONESS (depending on
whether we need a PRP or SPRP). Base the key material for each block
@@ -429,7 +524,7 @@ Option 3: As 1, but tagging attacks garble the circuit in the same block.
start from the end, but it didn't seem much better. We should
investigate relative performance, though.}
-Option 4: Shall we have middle nodes be able to fast-stop bad data?
+6.4. Option 4: Shall we have middle nodes be able to fast-stop bad data?
In all the above options, if a cell is altered, the middle node can
at best turn that cell and the rest of the cells on the circuit into
@@ -439,8 +534,10 @@ Option 4: Shall we have middle nodes be able to fast-stop bad data?
Might we prefer to do as in mixnets, and have nodes kill circuits
upon receiving altered cells?
- I'm not so sure. It's relatively expensive in per-cell overhead, and
- the next-best attack to the one it prevents isn't so great.
+ It's not such an obvious improvement. Including more MACs is more
+ expensive in per-cell overhead. The attacks that we would foil this
+ way but not with Option 3 are not so much better than the the passive or
+ timing-based-active end-to-end attacks that would still remain.
Consider that if option 3 is in place, an end-to-end attacker who
wants to do a tagging attack at one node can garble the rest of the
@@ -451,37 +548,43 @@ Option 4: Shall we have middle nodes be able to fast-stop bad data?
corresponding gap at the exit. The only advantage of the garbling
attack would be that garbled cells are presumably rarer than circuit
closes or traffic pauses, and thus easier to use to distinguish
- target circuits. But that's still bunk; the other attacks win fine,
- and the pause attack doesn't risk detection so much.
+ target circuits. But that's still questionable: the other attacks
+ win fine, and the pause attack doesn't risk detection as much.
+
+ So why might we want to do this? First, the overhead doesn't need to
+ be as bad as you might first expect (see below). Second, it would be
+ nice to increase the security margin as much as possible: "attacks
+ only get better".
- Still, to do this one, we'd want to have outgoing and incoming
- circuits treated differently. Incoming cells could work as in 1 or 2
- or 3 above; outgoing cells would want to have a header portion as in
- mixmaster, where each node checks that the first N bits of the header
+ So let's figure out how it would look.
+
+ To do this one, we'd want to have outgoing and incoming circuits
+ treated differently. Incoming cells would get decrypted as in 1
+ above, except that we'd have a MAC on them. For outgoing cells,
+ each node would check that the first N bytes of the cell
match a MAC of all data seen so far, *including the rest of the
- cell*. They'd then decrypt the rest of the cell, remove the first N
- bits of the header, and re-pad the header with N bits at the end,
- taken from a PRNG whose seed is shared with the client. (This is
- basically how mixmaster works, and how mixminion works in the common
- case.)
+ cell*. They'd then remove the first N bytes, and re-pad the cell
+ with bytes from a PRNG, and decrypt the resulting re-padded cell.
+ (This is basically how mixmaster works, and how mixminion works in
+ the common case.)
The space overhead here is kind of large: N bits per cell per node.
- In the most paranoid case, if we used 256-byte HMACs on 3-node paths,
+ In the most paranoid case, if we used 256-bit HMACs on 3-node paths,
that's 96 bytes per cell, which is more than 20% of the total length.
But we can probably do better if we let the CREATE operation also
tell the node some N to check. For example, the first node doesn't
need to check any bits. The second and third nodes could check 64
- bits apiece; that only has 16 bytes overhead, and high probability of
- catching any changes. (Birthday attacks don't matter here, and an
- attacker who mounts this attack for long enough to accidentally find
- a 64-bit mac will break so many circuits in the process as to become
- totally unreliable.
+ bits apiece; that only has 16 bytes overhead total, and high
+ probability of catching any changes. (Birthday attacks don't matter
+ here, and an attacker who mounts this attack for long enough to
+ accidentally find a 64-bit MAC will break so many circuits in the
+ process as to become totally unreliable.)
- But all of this leaks the path lengths and position on the path to
+ All of this leaks the path lengths and position on the path to
various nodes. We might open ourselves up to partitioning attacks if
different clients choose different numbers of bits. What's more, we
might leak the length of the path to the last node by how much junk
- there is at the end of the cell.
+ there is at the end of the cell. So we'd need to be careful!
Here's a simple construction for this format, to be concrete:
@@ -522,22 +625,23 @@ Option 4: Shall we have middle nodes be able to fast-stop bad data?
When the client receives a relay cell body, it iteratively does:
For node i in circuit from 1..N:
- Let cells_i = all previous cells which we decided were from i,
+ Let cells_i = all previous cells which we previously decided
+ were from node i, or relayed by node i,
and let cellbody = the body of the cell, except for the last
- MACBYTESb[i] bytes.
- {XXXX I'd be more comfortable if the MAC covered all cells
- passed by every node on the circuit.}
+ MACBYTESb[i] bytes,
+ and let cell mac = the last MACBYTESb[i] bytes of this cell.
- If MACBYTESb[i]>0, check whether MAC(cells_i | cellbody,
- Mb[i])[:MACBYTESb[i]] the last MACBYTESb[i] bytes of the
- cell. If so, this cell is from node i.
+ If cellmac is nonempty, check wither cellmac = mac_received,
+ where mac_received is the first MACBYTESb[i] bytes of
+ MAC(cells_i | cellbody, Mb[i]). If so, this cell is from node
+ i.
- If this cell is from node i, decrypt the first
- CELL_DATA_LEN-MACBYTESb[i] bytes of the cell using the stream
- keyed with Kb[i],IVb[i]. Act on it as a relay cell.
+ If this cell is from node i, add cellbody to cells_i, then
+ decrypt cellbody using the stream keyed with Kb[i],IVb[i].
+ Act on it as a relay cell.
- Otherwise decrypt the entire cell, MAC included, with the
- stream keyed with Kb[i], IVb[i].
+ Otherwise add the entire cell to cells_i, and decrypt it, MAC
+ included, with the stream keyed with Kb[i], IVb[i].
If no node sent this cell: it's junk and somebody is probably
messing with us! Destroy the circuit.
@@ -560,19 +664,19 @@ Option 4: Shall we have middle nodes be able to fast-stop bad data?
Let PAD[i] = the next MACBYTESf[i] bytes from the PRNG keyed
with SEEDf[i], for i in 1...N.
- Let STREAM[i] = the next CELL_DATA_LEN-MACBYTESf[i] bytes of
+ Let STREAM[i] = the next CELL_DATA_LEN bytes of
the stream keyed by Kf[i],IV[i], for i in 1...N.
Let PADSEEN[1] == ""
For i in 2...N:
- Let L = len(PADSEEN[i-1])
- Let PADSEEN[i] = (PADSEEN[i-1] xor
- STREAM[i-1][CELL_DATA_LEN-L:]) | PAD[i-1]
+ Let L = len(PADSEEN[i-1]) + len(PAD[i-1])
+ Let PADSEEN[i] = (PADSEEN[i-1] | PAD[i-1]) xor
+ STREAM[i-1][CELL_DATA_LEN-L:]
For i in N down to 1:
- Let Encbody = Body xor STREAM[i][len(body)]
+ Let Encbody = Body xor STREAM[i][:len(Body)]
Let extra = "RECOGNIZED" if i == N, "OK" otherwise
Let cells[i] = cells[i] | Body | PADSEEN[i]
Let M = MAC(cells[i] | extra , Mf[i])
@@ -589,10 +693,10 @@ Option 4: Shall we have middle nodes be able to fast-stop bad data?
= MAC(cells|rest|"OK", Mb)[:MACBYTESf], this cell is not for
us, but is valid. Otherwise, destroy the circuit.
- Decrypt REST using the stream cipher keyed with Kf,IVf. If
- this cell is for us, act on it as a relay cell. Otherwise, let
- PAD = the next MACBYTESf[i] bytes of the PRNG keyed with SEEDf,
- and relay REST | PAD.
+ Let PAD = the next MACBYTESf[i] bytes of the PRNG keyed with
+ SEEDf, and decrypt REST | PAD using the stream cipher keyed
+ with Kf,IVf. If this cell is for us, act on it as a relay
+ cell. Otherwise, relay it.
ANOTHER VARIANT:
@@ -629,6 +733,32 @@ Option 4: Shall we have middle nodes be able to fast-stop bad data?
sufficient to do something like setting MACBYTES=8 for the last
hop, and MACBYTES=8 for one hop in the middle.
+ OTHER VARIANTS:
+
+ Can we combine this approach with one of the approaches in 2 or
+ 3 above to ensure that if corrupt data passes (because of our
+ use of truncated HMACs) it still corrupts the stream?
+
+ Can/should we use GCM or something here instead of separate
+ encrypt/hmac operations? It doesn't seem that GCM per se would
+ apply without some tweaking, which we probably do not have the
+ expertise to do.
+
+ OVERHEAD NOTES:
+
+ When computing additional overhead with this method, note that
+ it lets us replace the old 4 byte "digest" field and the 2 byte
+ "recognized" field.
+
+ I note in passing that we need at most 9 bits for the length
+ field, and most 6 bits for the command field, yet we're using a
+ total of 3 bytes for those 15 bits. That's an opportunity to
+ save another byte.
+
+6.5. Other ideas for
+
+
+
ACKS
Lots of the good ideas and concerns here are due to Robert Ransom.
1
0

[torspec/master] Re-flow new-crypto-sketch to a 72 columns, add a TODO
by nickm@torproject.org 01 Nov '11
by nickm@torproject.org 01 Nov '11
01 Nov '11
commit 5d397a19f34212598b98fe1d4fc0ad5945cc0809
Author: Nick Mathewson <nickm(a)torproject.org>
Date: Mon Oct 31 11:06:30 2011 -0400
Re-flow new-crypto-sketch to a 72 columns, add a TODO
---
proposals/ideas/xxx-new-crypto-sketch.txt | 617 +++++++++++++++--------------
1 files changed, 326 insertions(+), 291 deletions(-)
diff --git a/proposals/ideas/xxx-new-crypto-sketch.txt b/proposals/ideas/xxx-new-crypto-sketch.txt
index a4893e1..504e658 100644
--- a/proposals/ideas/xxx-new-crypto-sketch.txt
+++ b/proposals/ideas/xxx-new-crypto-sketch.txt
@@ -2,6 +2,18 @@ Title: Design sketch for new crypto ops
Date: 19 Oct 2011
Author: Nick Mathewson
+TODO:
+ - Write missing sections
+ - Fix XXXXs
+ - Write performance notes
+ - Find rransom's extend proposal
+ - Proofread
+ - Change relay-cipher section:
+ - Pad with zeros.
+ - Encrypt-then-mac.
+ - Revise analysis.
+ - Try to be a little less monlogue-ish, a little more
+
0. Overview
The point of this document is to discuss what crypto we ought to be using.
@@ -26,96 +38,100 @@ Author: Nick Mathewson
1. Base algorithm choice
- There seem to be two main candidate algorithms for signatures: RSA with big
- keys (hereinafter "RSA>1024"); and Ed25519, which is DSA with the sharp
- edges filed off on an Edwards curve related to DJB's Curve25519. We can
- look at other ECC groups too. {But see ECC Notes.}
-
- For Diffie Hellman: Curve25519 seems like a decent choice; failing that,
- another . DH
- on Z_p with big groups (hereinafter "DH>1024"). {But see ECC notes.}
-
- For a hash function: SHA256, switching to SHA3 in 2012 when it comes out.
- It might be worthwhile waiting for SHA3 in most places, and skipping over
- the sha256 stage entirely.
-
- For a stream cipher: AES-CTR is in one sense a conservative choice inasmuch
- as AES is well-analyzed, but AES's well-known issues with cache-based
- timing attacks are pretty worrisome. We can mitigate that some by using
- random secret IVs for AES-CTR, so that we will be encrypting neither
- attacker-chosen nor attacker-known plaintext with out AES cipher, but
- that's a bit kludgy. Salsa20 is what rransom likes these days, but IMO we
- aren't competent to tell whether it looks good or not; the existing attacks
- against it don't look like very bad news to me, but who knows whether it's
- getting enough attention that we can read. See also ChaCha; see also the
- other EuroCrypt {XXXX} winners/finalists; see also SHA3 if the SHA3 winner
- specifies a way to use it as a stream cipher, or specifies an underlying
- stream/block cipher.
+ There seem to be two main candidate algorithms for signatures: RSA
+ with big keys (hereinafter "RSA>1024"); and Ed25519, which is DSA with
+ the sharp edges filed off on an Edwards curve related to DJB's
+ Curve25519. We can look at other ECC groups too. {But see ECC
+ Notes.}
+
+ For Diffie Hellman: Curve25519 seems like a decent choice; failing
+ that, another XXXX. DH on Z_p with big groups (hereinafter "DH>1024").
+ {But see ECC notes.}
+
+ For a hash function: SHA256, switching to SHA3 in 2012 when it comes
+ out. It might be worthwhile waiting for SHA3 in most places, and
+ skipping over the sha256 stage entirely.
+
+ For a stream cipher: AES-CTR is in one sense a conservative choice
+ inasmuch as AES is well-analyzed, but AES's well-known issues with
+ cache-based timing attacks are pretty worrisome. We can mitigate that
+ some by using random secret IVs for AES-CTR, so that we will be
+ encrypting neither attacker-chosen nor attacker-known plaintext with
+ out AES cipher, but that's a bit kludgy. Salsa20 is what rransom
+ likes these days, but IMO we aren't competent to tell whether it looks
+ good or not; the existing attacks against it don't look like very bad
+ news to me, but who knows whether it's getting enough attention that
+ we can read. See also ChaCha; see also the other EuroCrypt {XXXX}
+ winners/finalists; see also SHA3 if the SHA3 winner specifies a way to
+ use it as a stream cipher, or specifies an underlying stream/block
+ cipher.
For a random number generator: We currently use OpenSSL seeded with
- RAND_poll and with platform entropy. OpenSSL uses RC4 (XXX check). The
- platform entropy management can be messy, obscure, or both. I suggest
- that:
- * we should seed our PRNG with more entropy sources if we can find some
- promising code with an appropriate license
+ RAND_poll and with platform entropy. OpenSSL uses RC4 (XXX check).
+ The platform entropy management can be messy, obscure, or both. I
+ suggest that:
+
+ * we should seed our PRNG with more entropy sources if we can find
+ some promising code with an appropriate license
* instead of just using OpenSSL's PRNG, we should use OpenSSL's PRNG
- xor'd with some other good PRNG. (Is Yarrow still cool? And is there
- a combine operation better than xor?)
- * We should re-seed the RNG before and after very sensitive operations,
- like private key generation.
+ xor'd with some other good PRNG. (Is Yarrow still cool? And is
+ there a combine operation better than xor?)
+ * We should re-seed the RNG before and after very sensitive
+ operations, like private key generation.
1.1. ECC notes
- ECC is the brave new[*] crypto of the future! It's faster[**] than doing
- crypto in Z_n (as we do for RSA and DH now) for equivalent levels of
- security, and the resulting outputs are much shorter.
-
- As near as I can tell as a layman, Certicom is muddying the waters as much
- as possible wrt claiming that it's nigh impractical to deploy ECC without
- licensing their patents. This is rather like the silliness that PKP used
- to pull back in the day, where they claimed that their patents covered not
- only the existing public key cryptography algorithms, but also the very
- idea of public key cryptography itself.
-
- DJB claims that for every patent he's aware of, either that patent doesn't
- cover his code, or that patent is invalid because of prior art. I'm not
- going to try to evaluate these claims, since I'm not supposed to be reading
- patents for typical "let's avoid the appearance of knowing infringement"
- reasons. But before we dive into the world of ECC, we should see if we can
- ask any friendly patent attorneys and ECC experts for a second or third
- opinion here.
-
- I note in passing that nearly all of the patents that DJB mentions in his
- list would appear to expire over the next 12 months or so.
-
- Additionally, there are ECC groups out there less fast than DJB's, but more
- widely available and analyzed. We should consider some of those too.
-
- One final issue to investigate is whether using these algorithms will make
- any major free software distribution decide not to include us. I seem to
- recall seeing that one or two of the big ones had at one point decided to
- ship OpenSSL only with ECC disabled, either because of real patent
- concerns, or because of an opinion that the certicom license for ECC use in
- TLS was problematic for free software, or something like that. We should
- check that out.
-
- [*] Actually, it's older than onion routing, and older than some members of
- the Tor project.
-
- [**] Actually, because of the common practice of choosing a small-ish prime
- value (65537) for e in RSA, RSA public key operations can be a little
- faster than equivalent-security ECDH or ECDSA operations. The private key
- operations in RSA are still much much slower.
+ ECC is the brave new[*] crypto of the future! It's faster[**] than
+ doing crypto in Z_n (as we do for RSA and DH now) for equivalent
+ levels of security, and the resulting outputs are much shorter.
+
+ As near as I can tell as a layman, Certicom is muddying the waters as
+ much as possible wrt claiming that it's nigh impractical to deploy ECC
+ without licensing their patents. This is rather like the silliness
+ that PKP used to pull back in the day, where they claimed that their
+ patents covered not only the existing public key cryptography
+ algorithms, but also the very idea of public key cryptography itself.
+
+ DJB claims that for every patent he's aware of, either that patent
+ doesn't cover his code, or that patent is invalid because of prior
+ art. I'm not going to try to evaluate these claims, since I'm not
+ supposed to be reading patents for typical "let's avoid the appearance
+ of knowing infringement" reasons. But before we dive into the world
+ of ECC, we should see if we can ask any friendly patent attorneys and
+ ECC experts for a second or third opinion here.
+
+ I note in passing that nearly all of the patents that DJB mentions in
+ his list would appear to expire over the next 12 months or so.
+
+ Additionally, there are ECC groups out there less fast than DJB's, but
+ more widely available and analyzed. We should consider some of those
+ too.
+
+ One final issue to investigate is whether using these algorithms will
+ make any major free software distribution decide not to include us. I
+ seem to recall seeing that one or two of the big ones had at one point
+ decided to ship OpenSSL only with ECC disabled, either because of real
+ patent concerns, or because of an opinion that the certicom license
+ for ECC use in TLS was problematic for free software, or something
+ like that. We should check that out.
+
+ [*] Actually, it's older than onion routing, and older than some
+ members of the Tor project.
+
+ [**] Actually, because of the common practice of choosing a small-ish
+ prime value (65537) for e in RSA, RSA public key operations can be a
+ little faster than equivalent-security ECDH or ECDSA operations. The
+ private key operations in RSA are still much much slower.
2. New identities
Identity keys and their fingerprints are used:
- - To sign router descriptors
+ - To sign router descriptors.
- To identify nodes in consensus directories
- To make sure we're talking to the right node in the link handshake
- - To make sure that the extending node is talking to the right next node
- when sending an extend cell
+ - To make sure that the extending node is talking to the right next
+ node when sending an extend cell
- To identify particular nodes in the hidden service subsystem
- To identify nodes in the UI in various places
- Internally, to identify a node uniquely in the codebase.
@@ -124,35 +140,36 @@ Author: Nick Mathewson
2.1. New identities, option 1: RSA>1024, slow migration.
- We use RSA for identity keys indefinitely. Nearly all operations done with
- an identity key are signature checking; signing happens only a few times an
- hour per node even with pathological cases. Since signature checking is
- really cheap with RSA, there's no speed advantage for ECC here. (There is
- a space advantage, since the keys are much smaller.)
-
- The easiest way to migrate to longer identity keys is to tell all Tors to
- begin accepting longer identity keys now, and to tweak all our protocols so
- that longer RSA identity keys are understood. We should then have a pair
- of parameters in the consensus that determines the largest and smallest
- acceptable identity key size in the network. Clients and servers should
- reject any keys longer or shorter than specified. Once all versions of Tor
- can accept long identity keys, we raise the maximum size from 1024 to
- somewhere in the 2048-4096 range.
+ We use RSA for identity keys indefinitely. Nearly all operations done
+ with an identity key are signature checking; signing happens only a
+ few times an hour per node even with pathological cases. Since
+ signature checking is really cheap with RSA, there's no speed
+ advantage for ECC here. (There is a space advantage, since the keys
+ are much smaller.)
+
+ The easiest way to migrate to longer identity keys is to tell all Tors
+ to begin accepting longer identity keys now, and to tweak all our
+ protocols so that longer RSA identity keys are understood. We should
+ then have a pair of parameters in the consensus that determines the
+ largest and smallest acceptable identity key size in the network.
+ Clients and servers should reject any keys longer or shorter than
+ specified. Once all versions of Tor can accept long identity keys, we
+ raise the maximum size from 1024 to somewhere in the 2048-4096 range.
2.2. New identities option 2: RSA>1024, faster migration.
- We use RSA for identity keys indefinitely as above. But we allow nodes to
- begin having longer identities now, even though older Tors won't understand
- them. This implies, of course, that every such node needs to have at least
- 2 identities: one RSA1024 identity for backward compatibility, one RSA>1024
- identity for more secure identification.
+ We use RSA for identity keys indefinitely as above. But we allow
+ nodes to begin having longer identities now, even though older Tors
+ won't understand them. This implies, of course, that every such node
+ needs to have at least 2 identities: one RSA1024 identity for backward
+ compatibility, one RSA>1024 identity for more secure identification.
- We would have these identities cross-certify as follows: All keys would be
- listed in the router descriptor. RSA>1024 keys would be called something
- other than identity-key, so as not to confuse older clients. A signature
- with the RSA>1024 key would appear right before the current RSA1024
- signature. This way, signed material would include both keys, and would be
- signed by both keys.
+ We would have these identities cross-certify as follows: All keys
+ would be listed in the router descriptor. RSA>1024 keys would be
+ called something other than identity-key, so as not to confuse older
+ clients. A signature with the RSA>1024 key would appear right before
+ the current RSA1024 signature. This way, signed material would
+ include both keys, and would be signed by both keys.
[In other words, descriptors would look something like:
@@ -169,11 +186,13 @@ Author: Nick Mathewson
...
ext-signature
-----BEGIN SIGNATURE-----
- signature of everything through "ext-signature\n", using the long key
+ signature of everything through "ext-signature\n",
+ using the long key
-----END SIGNATURE-----
router-signature
-----BEGIN SIGNATURE-----
- signature of everything through "router-signature\n", using the short key
+ signature of everything through "router-signature\n",
+ using the short key
-----END SIGNATURE-----
]
@@ -181,15 +200,15 @@ Author: Nick Mathewson
See "UI notes" in the "new fingerprints" section below for some of the
implications of letting nodes have multiple identity keys.
- We'll need to advertise these new identities in consensus directories too;
- see XXXX below for more info there.
+ We'll need to advertise these new identities in consensus directories
+ too; see XXXX below for more info there.
2.3. New identities option 3: RSA>1024 and/or Ed25519, faster migration
- As in option 2 above, but new keys can also be Ed25519. If we expect that
- not all installations will allow Ed25519 (see "ECC Notes", section 1.1),
- we'll need to say that every server with an Ed25519 key must also have an
- RSA>1024 key.
+ As in option 2 above, but new keys can also be Ed25519. If we expect
+ that not all installations will allow Ed25519 (see "ECC Notes",
+ section 1.1), we'll need to say that every server with an Ed25519 key
+ must also have an RSA>1024 key.
2.4. Implications for current use of identity keys
@@ -204,27 +223,29 @@ Author: Nick Mathewson
The current v3 link handshake can handle presenting multiple identity
certificates in the CERT cell. We should consider ourselves to be
- connected to a node with identity X if _any_ of the identity certificates
- that it presents in its authenticated CERT cell has identity X. To handle
- EXTEND cells correctly, we should verify every identity we can.
+ connected to a node with identity X if _any_ of the identity
+ certificates that it presents in its authenticated CERT cell has
+ identity X. To handle EXTEND cells correctly, we should verify every
+ identity we can.
- To make sure that the extending node is talking to the right next node
when sending an extend cell
- The new extend cell format needs to allow the client to tell the extending
- node about some identity for the destination node that the extending node
- will be able to understand. This is a capability of the extending node
- that the client needs to be able to check. (Also, the extend cell needs to
- hash that identity in a form the extending node can understand; but that's
- a fingerprint issue.)
+ The new extend cell format needs to allow the client to tell the
+ extending node about some identity for the destination node that the
+ extending node will be able to understand. This is a capability of
+ the extending node that the client needs to be able to check. (Also,
+ the extend cell needs to hash that identity in a form the extending
+ node can understand; but that's a fingerprint issue.)
- To determine which part of the circuit ID space to use on a Tor
instance's links.
- We can continue to use RSA1024 identity key comparison here by default. We
- can also use some other parameter of the v3 handshake, or introduce a new
- link protocol where if the initiator authenticates, the initiator always
- gets the low circIDs and the responder always gets the high ones.
+ We can continue to use RSA1024 identity key comparison here by
+ default. We can also use some other parameter of the v3 handshake, or
+ introduce a new link protocol where if the initiator authenticates,
+ the initiator always gets the low circIDs and the responder always
+ gets the high ones.
- To identity nodes in consensus directories
- To identify nodes in the UI in various places
@@ -239,20 +260,20 @@ Author: Nick Mathewson
2.5. Migrating away from short ID keys entirely
Eventually no version of Tor that requires 1024-bit identity keys will
- remain. When that happens, we should stop using them entirely. That means
- that if we take any path other than the "slow migration" path of 2.1, we'll
- need to make everything that looks at a node's identity also accept nodes
- with _only_ a RSA>1024/Ed25519 identity.
+ remain. When that happens, we should stop using them entirely. That
+ means that if we take any path other than the "slow migration" path of
+ 2.1, we'll need to make everything that looks at a node's identity
+ also accept nodes with _only_ a RSA>1024/Ed25519 identity.
- At the directory service level, we should have an option to allow nodes
- without RSA1024 identity keys (off until all clients and nodes accept new
- identity keys).
+ At the directory service level, we should have an option to allow
+ nodes without RSA1024 identity keys (off until all clients and nodes
+ accept new identity keys).
2.6. Implications for directory information
- Clients must know a hash for each node's identity key, or else they can't
- make an authenticated connection to the node or tell ORs how to extend to
- the node.
+ Clients must know a hash for each node's identity key, or else they
+ can't make an authenticated connection to the node or tell ORs how to
+ extend to the node.
3. New fingerprints
@@ -261,43 +282,46 @@ Author: Nick Mathewson
encoding of the RSA1024 identity key. We encode this in hex almost
everywhere, and prefix it with a $.
- I propose that fingerprints of the future be determined by taking a digest
- using SHA256 or SHA3 of:
+ I propose that fingerprints of the future be determined by taking a
+ digest using SHA256 or SHA3 of:
"Hash Algorithm Name", "Key Type Name", encoded key
- When representing these internally, we should include the hash algorithm
- that was used. When representing them in the UI, we should use the
- notation %b64, where b64 is a base-64 encoding, omitting the trailing =s.
+ When representing these internally, we should include the hash
+ algorithm that was used. When representing them in the UI, we should
+ use the notation %b64, where b64 is a base-64 encoding, omitting the
+ trailing =s.
- (Other plausible characters to use are @, ?, +, ~, =, etc. I like %, but
- can be persuaded. Bikeshed bikeshed bikeshed.)
+ (Other plausible characters to use are @, ?, +, ~, =, etc. I like %,
+ but can be persuaded. Bikeshed bikeshed bikeshed.)
- Since 43 base-64 characters is enough to represent a 256-bit digest, with 2
- bits left over, I propose that the b64 value encode
+ Since 43 base-64 characters is enough to represent a 256-bit digest,
+ with 2 bits left over, I propose that the b64 value encode
hh | D(hash algorithm name, key type, encoded key),
+
where hh is a 2-bit value, with one of the values:
00 -- sha256
01 -- sha3
10 -- to be determined
11 -- reserved.
- We should investigate in the interface whether it's plausible to allow a
- prefix of a node ID where the full ID would otherwise be required. That
- seems risky for short prefixes, though.
+ We should investigate in the interface whether it's plausible to allow
+ a prefix of a node ID where the full ID would otherwise be required.
+ That seems risky for short prefixes, though.
3.1. How many fingerprints is that anyway?!
- Suppose that we allow sha256 and sha3 as hash algorithms, and we allow each
- node to have 3 identity keys: one RSA1024, one RSA>1024, and one ECC. Then
- we would have 7 fingerprints (6 plus the legacy SHA1(RSA1024) fingerprint),
- for a total of 20+6*32==212 bytes per node.
+ Suppose that we allow sha256 and sha3 as hash algorithms, and we allow
+ each node to have 3 identity keys: one RSA1024, one RSA>1024, and one
+ ECC. Then we would have 7 fingerprints (6 plus the legacy
+ SHA1(RSA1024) fingerprint), for a total of 20+6*32==212 bytes per
+ node.
- It's not a horrible problem to accept them all in the UI, but the UI isn't
- the only place that needs to know fingerprints. Instead, let's say that
- RSA1024 identities are only identified with SHA1 hashes. This limits our
- fingerprint load to a more manageable 20+32*2 == 84 bytes per node. Still
- not great, though.
+ It's not a horrible problem to accept them all in the UI, but the UI
+ isn't the only place that needs to know fingerprints. Instead, let's
+ say that RSA1024 identities are only identified with SHA1 hashes.
+ This limits our fingerprint load to a more manageable 20+32*2 == 84
+ bytes per node. Still not great, though.
3.2. What does this imply for the UI?
@@ -317,20 +341,21 @@ Author: Nick Mathewson
4.2. More fingerprints
Because extending from node A to node B requires that we have node B's
- fingerprint in a way that node A will understand, it is not enough to get a
- set of identity fingerprints for each node in the format _we_ like best.
- Instead, we must .
+ fingerprint in a way that node A will understand, it is not enough to
+ get a set of identity fingerprints for each node in the format _we_
+ like best. Instead, we must .
4.x. An option: compound signatures on directory objects
- In Tor 0.2.2.x and later, when we check a signature on a directory object
- (not including hidden service descriptors), we only look at the first
- DIGEST_LEN bytes of the RSA-signed data. Once 0.2.1.x is obsolete, or on
- any types of signatures not checked in 0.2.1.x, we can use the rest of the
- space. (We're using PKCS1 padding on our signatures, which has an
- overhead of 11 bytes. Signing a SHA1 hash with a 1024-bit key therefore
- leaves 128-11-20==97 more bytes we could use for a SHA2 or a SHA3 hash.)
+ In Tor 0.2.2.x and later, when we check a signature on a directory
+ object (not including hidden service descriptors), we only look at
+ the first DIGEST_LEN bytes of the RSA-signed data. Once 0.2.1.x is
+ obsolete, or on any types of signatures not checked in 0.2.1.x, we
+ can use the rest of the space. (We're using PKCS1 padding on our
+ signatures, which has an overhead of 11 bytes. Signing a SHA1 hash
+ with a 1024-bit key therefore leaves 128-11-20==97 more bytes we
+ could use for a SHA2 or a SHA3 hash.)
4.x.
@@ -342,121 +367,127 @@ Author: Nick Mathewson
6. Relay crypto changes
-There are a few things we might want out of improved relay crypto. They
-include:
+ There are a few things we might want out of improved relay crypto.
+ They include:
- Resistance to end-to-end bitwise tagging attacks.
- Better resistance to malleability.
- - If using counter mode, no block-cipher operations on any value known
- to the attacker.
+ - If using counter mode, no block-cipher operations on any value
+ known to the attacker.
-I'll try to provide these in increasing order of difficulty. None of these
-is necessarily correct; I should look for a security proof or a better
-construction for any that we seem likely to use.
+ I'll try to provide these in increasing order of difficulty. None of
+ these is necessarily correct; I should look for a security proof or a
+ better construction for any that we seem likely to use.
-Rationales: Our existing malleability resistance is a kludge. Doing no
-block-cipher ops on attacker-known values increases our security margins a
-little. Our arguments about tagging attacks hold that an attacker who
-controls both ends has plenty of ways to win even if tagging attacks are
-foiled; nonetheless, most of these ways are technically slightly more
-difficult than xor-based tagging, and it could be useful to boost our
-defense-in-depth a little bit, just in case other active end-to-end attacks
-turn out to be harder than we'd though.)
+ Rationales: Our existing malleability resistance is a kludge. Doing
+ no block-cipher ops on attacker-known values increases our security
+ margins a little. Our arguments about tagging attacks hold that an
+ attacker who controls both ends has plenty of ways to win even if
+ tagging attacks are foiled; nonetheless, most of these ways are
+ technically slightly more difficult than xor-based tagging, and it
+ could be useful to boost our defense-in-depth a little bit, just in
+ case other active end-to-end attacks turn out to be harder than we'd
+ though.)
-Option 1: Use AES-CTR in a less scary mode
+6.1. Option 1: Use AES-CTR in a less scary mode
- When doing key expansion, in addition to establishing Kf, Kb, Df, and Db,
- also establish IVf and IVb. Use the current relay crypto, except instead
- of starting the counters at 0, start them at IVf and IVb. This way, an
- attacker doesn't have any known plaintexts to work with, which makes AES a
- little more robust.
+ When doing key expansion, in addition to establishing Kf, Kb, Df, and
+ Db, also establish IVf and IVb. Use the current relay crypto, except
+ instead of starting the counters at 0, start them at IVf and IVb.
+ This way, an attacker doesn't have any known plaintexts to work with,
+ which makes AES a little more robust.
-Option 2: As 1, but tagging attacks garble the circuit after one block.
+6.2. Option 2: As 1, but tagging attacks garble the circuit after one block.
Keep an HMAC of all previously received encrypted cells on a circuit.
- When decrypting a cell, use this HMAC value to determine the first 64 bits
- of the counter; increment the low 64 bits of the counter as usual.
+ When decrypting a cell, use this HMAC value to determine the first 64
+ bits of the counter; increment the low 64 bits of the counter as
+ usual.
- This way, if an adversary flips any bits before passing the stream through
- an honest node, no _subsequent_ block will be recoverable.
+ This way, if an adversary flips any bits before passing the stream
+ through an honest node, no _subsequent_ block will be recoverable.
- To prevent any part of the stream from being re-used, close any circuit if
- the low 64 bits of the counter would ever wrap (that is, around 295
- million terabytes).
+ To prevent any part of the stream from being re-used, close any
+ circuit if the low 64 bits of the counter would ever wrap (that is,
+ around 295 million terabytes).
- (If we're using a stream cipher with fast re-key, then we can just have
- the key used for each block be an HMAC of all previously received
- ciphertext.)
+ (If we're using a stream cipher with fast re-key, then we can just
+ have the key used for each block be an HMAC of all previously
+ received ciphertext.)
Option 3: As 1, but tagging attacks garble the circuit in the same block.
Use a large-block cipher mode, such as BEAR or LIONESS (depending on
- whether we need a PRP or SPRP). Base the key material for each block on
- an HMAC of all previous blocks' ciphertexts.
+ whether we need a PRP or SPRP). Base the key material for each block
+ on an HMAC of all previous blocks' ciphertexts.
- This way, if an adversary makes any alteration in a block, that block and
- all subsequent blocks will be garbled. It's more expensive than 2,
- though, especially if we need to use a LIONESS construction.
+ This way, if an adversary makes any alteration in a block, that block
+ and all subsequent blocks will be garbled. It's more expensive than
+ 2, though, especially if we need to use a LIONESS construction.
- {I considered IGE here, with a trick where odd-numbered nodes on a circuit
- start from the front of the block and even-numbered nodes start from the
- end, but it didn't seem much better. We should investigate relative
- performance, though.}
+ {I considered IGE here, with a trick where odd-numbered nodes on a
+ circuit start from the front of the block and even-numbered nodes
+ start from the end, but it didn't seem much better. We should
+ investigate relative performance, though.}
Option 4: Shall we have middle nodes be able to fast-stop bad data?
- In all the above options, if a cell is altered, the middle node can at
- best turn that cell and the rest of the cells on the circuit into garbage,
- which the last node won't deliver (if honest) or can't deliver (if dishonest).
-
- Might we prefer to do as in mixnets, and have nodes kill circuits upon
- receiving altered cells?
-
- I'm not so sure. It's relatively expensive in per-cell overhead, and the
- next-best attack to the one it prevents isn't so great.
-
- Consider that if option 3 is in place, an end-to-end attacker who wants to
- do a tagging attack at one node can garble the rest of the circuit, and
- see if the output is garbled at the exit node. But such an attacker could
- just as easily close the circuit at one of those nodes and watch for a
- corresponding close event, or even better -- simply pause traffic on that
- circuit for a while and watch for a corresponding gap at the exit. The
- only advantage of the garbling attack would be that garbled cells are
- presumably rarer than circuit closes or traffic pauses, and thus easier to
- use to distinguish target circuits. But that's still bunk; the other
- attacks win fine, and the pause attack doesn't risk detection so much.
-
- Still, to do this one, we'd want to have outgoing and incoming circuits
- treated differently. Incoming cells could work as in 1 or 2 or 3 above;
- outgoing cells would want to have a header portion as in mixmaster, where
- each node checks that the first N bits of the header match a MAC of all
- data seen so far, *including the rest of the cell*. They'd then decrypt
- the rest of the cell, remove the first N bits of the header, and re-pad
- the header with N bits at the end, taken from a PRNG whose seed is shared
- with the client. (This is basically how mixmaster works, and how
- mixminion works in the common case.)
-
- The space overhead here is kind of large: N bits per cell per node. In
- the most paranoid case, if we used 256-byte HMACs on 3-node paths, that's
- 96 bytes per cell, which is more than 20% of the total length. But we can
- probably do better if we let the CREATE operation also tell the node some
- N to check. For example, the first node doesn't need to check any bits.
- The second and third nodes could check 64 bits apiece; that only has 16
- bytes overhead, and high probability of catching any changes. (Birthday
- attacks don't matter here, and an attacker who mounts this attack for long
- enough to accidentally find a 64-bit mac will break so many circuits in
- the process as to become totally unreliable.
-
- But all of this leaks the path lengths and position on the path to various
- nodes. We might open ourselves up to partitioning attacks if different
- clients choose different numbers of bits. What's more, we might leak the
- length of the path to the last node by how much junk there is at the end
- of the cell.
+ In all the above options, if a cell is altered, the middle node can
+ at best turn that cell and the rest of the cells on the circuit into
+ garbage, which the last node won't deliver (if honest) or can't
+ deliver (if dishonest).
+
+ Might we prefer to do as in mixnets, and have nodes kill circuits
+ upon receiving altered cells?
+
+ I'm not so sure. It's relatively expensive in per-cell overhead, and
+ the next-best attack to the one it prevents isn't so great.
+
+ Consider that if option 3 is in place, an end-to-end attacker who
+ wants to do a tagging attack at one node can garble the rest of the
+ circuit, and see if the output is garbled at the exit node. But such
+ an attacker could just as easily close the circuit at one of those
+ nodes and watch for a corresponding close event, or even better --
+ simply pause traffic on that circuit for a while and watch for a
+ corresponding gap at the exit. The only advantage of the garbling
+ attack would be that garbled cells are presumably rarer than circuit
+ closes or traffic pauses, and thus easier to use to distinguish
+ target circuits. But that's still bunk; the other attacks win fine,
+ and the pause attack doesn't risk detection so much.
+
+ Still, to do this one, we'd want to have outgoing and incoming
+ circuits treated differently. Incoming cells could work as in 1 or 2
+ or 3 above; outgoing cells would want to have a header portion as in
+ mixmaster, where each node checks that the first N bits of the header
+ match a MAC of all data seen so far, *including the rest of the
+ cell*. They'd then decrypt the rest of the cell, remove the first N
+ bits of the header, and re-pad the header with N bits at the end,
+ taken from a PRNG whose seed is shared with the client. (This is
+ basically how mixmaster works, and how mixminion works in the common
+ case.)
+
+ The space overhead here is kind of large: N bits per cell per node.
+ In the most paranoid case, if we used 256-byte HMACs on 3-node paths,
+ that's 96 bytes per cell, which is more than 20% of the total length.
+ But we can probably do better if we let the CREATE operation also
+ tell the node some N to check. For example, the first node doesn't
+ need to check any bits. The second and third nodes could check 64
+ bits apiece; that only has 16 bytes overhead, and high probability of
+ catching any changes. (Birthday attacks don't matter here, and an
+ attacker who mounts this attack for long enough to accidentally find
+ a 64-bit mac will break so many circuits in the process as to become
+ totally unreliable.
+
+ But all of this leaks the path lengths and position on the path to
+ various nodes. We might open ourselves up to partitioning attacks if
+ different clients choose different numbers of bits. What's more, we
+ might leak the length of the path to the last node by how much junk
+ there is at the end of the cell.
Here's a simple construction for this format, to be concrete:
The CREATE operation's KDF produces the following outputs:
- Kf, IVf (stream cipher key, and possibly IV, for forward direction)
- Kb, IVb (stream cipher key, and possibly IV, for reverse direction)
+ Kf, IVf (stream cipher key and IV for forward direction)
+ Kb, IVb (stream cipher key and IV for reverse direction)
Mf (MAC key for forward direction)
Mb (MAC key for reverse direction)
SEEDf (PRNG key for forward direction)
@@ -465,27 +496,28 @@ Option 4: Shall we have middle nodes be able to fast-stop bad data?
MACBYTESb (an integer between 0 and 32 inclusive)
CANEXIT (boolean: can we exit from this hop?)
- Let Kf[i], Mf[i], etc denote the parameter Kf, Mf, etc as shared between
- the client and the i'th node in its circuit.
-
- Relay cells sent towards the client have the following plaintext format:
+ Let Kf[i], Mf[i], etc denote the parameter Kf, Mf, etc as shared
+ between the client and the i'th node in its circuit.
+ Relay cells sent towards the client have the following plaintext
+ format:
Body:
Content:
Relay Command [1 byte]
StreamID [2 bytes]
Length [2 bytes]
Data [Up to CELL_DATA_LEN-5-MACBYTESb bytes]
- Padding [randomly generated as needed to fill the cell]
+ Padding [randomly generated as needed to fill the
+ cell]
MAC(All previous encrypted content + encrypted content,
Mb)[:MACBYTESb] [MACBYTESb bytes]
- The originator of the client-bound cell encrypts the content with the
- next part of its Kb,IVb stream, then appends the MAC.
+ The originator of the client-bound cell encrypts the content with
+ the next part of its Kb,IVb stream, then appends the MAC.
- Non-clients receiving a client-bound relay cell encrypt the entire cell
- body, MAC included, with the next part of the stream cipher that was
- keyed with Kb,IVb.
+ Non-clients receiving a client-bound relay cell encrypt the entire
+ cell body, MAC included, with the next part of the stream cipher
+ that was keyed with Kb,IVb.
When the client receives a relay cell body, it iteratively does:
@@ -497,15 +529,15 @@ Option 4: Shall we have middle nodes be able to fast-stop bad data?
passed by every node on the circuit.}
If MACBYTESb[i]>0, check whether MAC(cells_i | cellbody,
- Mb[i])[:MACBYTESb[i]] the last MACBYTESb[i] bytes of the cell. If
- so, this cell is from node i.
+ Mb[i])[:MACBYTESb[i]] the last MACBYTESb[i] bytes of the
+ cell. If so, this cell is from node i.
If this cell is from node i, decrypt the first
CELL_DATA_LEN-MACBYTESb[i] bytes of the cell using the stream
keyed with Kb[i],IVb[i]. Act on it as a relay cell.
- Otherwise decrypt the entire cell, MAC included, with the stream
- keyed with Kb[i], IVb[i].
+ Otherwise decrypt the entire cell, MAC included, with the
+ stream keyed with Kb[i], IVb[i].
If no node sent this cell: it's junk and somebody is probably
messing with us! Destroy the circuit.
@@ -535,7 +567,8 @@ Option 4: Shall we have middle nodes be able to fast-stop bad data?
For i in 2...N:
Let L = len(PADSEEN[i-1])
- Let PADSEEN[i] = (PADSEEN[i-1] xor STREAM[i-1][CELL_DATA_LEN-L:]) | PAD[i-1]
+ Let PADSEEN[i] = (PADSEEN[i-1] xor
+ STREAM[i-1][CELL_DATA_LEN-L:]) | PAD[i-1]
For i in N down to 1:
@@ -549,15 +582,15 @@ Option 4: Shall we have middle nodes be able to fast-stop bad data?
To receive an outbound cell:
- Let M be the first MACBYTESf bytes of the cell, let REST be the rest
- of the cell, and let "cells" be all previous cells on this circuit.
- If CANEXIT, and M = MAC(cells|rest|"RECOGNIZED", Mb)[:MACBYTESf],
- and MACBYTESf > 0, this cell is for us. If M = MAC(cells|rest|"OK",
- Mb)[:MACBYTESf], this cell is not for us, but is valid. Otherwise,
- destroy the circuit.
+ Let M be the first MACBYTESf bytes of the cell, let REST be the
+ rest of the cell, and let "cells" be all previous cells on this
+ circuit. If CANEXIT, and M = MAC(cells|rest|"RECOGNIZED",
+ Mb)[:MACBYTESf], and MACBYTESf > 0, this cell is for us. If M
+ = MAC(cells|rest|"OK", Mb)[:MACBYTESf], this cell is not for
+ us, but is valid. Otherwise, destroy the circuit.
- Decrypt REST using the stream cipher keyed with Kf,IVf. If this
- cell is for us, act on it as a relay cell. Otherwise, let
+ Decrypt REST using the stream cipher keyed with Kf,IVf. If
+ this cell is for us, act on it as a relay cell. Otherwise, let
PAD = the next MACBYTESf[i] bytes of the PRNG keyed with SEEDf,
and relay REST | PAD.
@@ -572,28 +605,30 @@ Option 4: Shall we have middle nodes be able to fast-stop bad data?
PICKING MACBYTESf,MACBYTESb.
We don't need to worry about birthday attacks:
- Because we're using a MAC, only the parties who are making the
- MACs could try to do a brute-force search for a collision, but
- they have no reason to do so.
- If a collision occurs accidentally, an adversary can't substitute
- an earlier-seen cell for a later one with the same MAC, since the
- MAC covers not only the cell, but all previous cells on the
- circuit.
- So 16 bytes is about the most we should ever do, given our usual
- security parameters. Let me moot the number 8 for MACBYTESb.
+ Because we're using a MAC, only the parties who are making
+ the MACs could try to do a brute-force search for a
+ collision, but they have no reason to do so.
+
+ If a collision occurs accidentally, an adversary can't
+ substitute an earlier-seen cell for a later one with the
+ same MAC, since the MAC covers not only the cell, but all
+ previous cells on the circuit.
+
+ So 16 bytes is about the most we should ever do, given our
+ usual security parameters. Let me moot the number 8 for
+ MACBYTESb.
For outbound cells, for any hop we can exit from, choosing
- MACBYTESf=6 gets us the current security level. For the first hop,
- assuming we don't exit from it, choosing MACBYTESf=0 is totally
- safe, since the link crypto guarantees that nothing was corrupted on
- the way.
+ MACBYTESf=6 gets us the current security level. For the first
+ hop, assuming we don't exit from it, choosing MACBYTESf=0 is
+ totally safe, since the link crypto guarantees that nothing was
+ corrupted on the way.
In general, to prevent an end-to-end tagging attack, it seems
sufficient to do something like setting MACBYTES=8 for the last
hop, and MACBYTES=8 for one hop in the middle.
-
ACKS
Lots of the good ideas and concerns here are due to Robert Ransom.
1
0
commit daf782e82c2ab1b66da07155ff83622923ba4140
Author: Nick Mathewson <nickm(a)torproject.org>
Date: Mon Oct 31 16:47:58 2011 -0400
Proofreading by seborn
---
proposals/ideas/xxx-new-crypto-sketch.txt | 62 ++++++++++++++---------------
1 files changed, 30 insertions(+), 32 deletions(-)
diff --git a/proposals/ideas/xxx-new-crypto-sketch.txt b/proposals/ideas/xxx-new-crypto-sketch.txt
index 83711c3..551b403 100644
--- a/proposals/ideas/xxx-new-crypto-sketch.txt
+++ b/proposals/ideas/xxx-new-crypto-sketch.txt
@@ -13,10 +13,10 @@ Author: Nick Mathewson
IDENTITY KEYS AND FINGERPRINTS
Addressed here in Section 2.
LINK CRYPTO (TLS) --
- Addressed in proposls 176, 184. We say a little here though in
- section 5.
+ Addressed in proposals 176, 184. We say a little here in section 5,
+ though.
CREATE/EXTEND CRYPTO --
- Addressed in xxx-ntor-handshake.txt and rransom's extend draft at
+ Addressed in xxx-ntor-handshake.txt and rransom's EXTEND draft at
[*] and subsequent discussion on the tor-dev mailing list. Not
considered here.
RELAY CRYPTO
@@ -41,7 +41,7 @@ Author: Nick Mathewson
groups (hereinafter "DH>1024"). {But see ECC Notes in 1.1 below.}
FOR A HASH FUNCTION: SHA256, switching to SHA3 in 2012 when it comes
- out. It might be worthwhile waiting for SHA3 in most places, and
+ out. It might be worthwhile waiting for SHA3 in most places and
skipping over the SHA256 stage entirely.
FOR A STREAM CIPHER: AES-CTR is in one sense a conservative choice
@@ -49,7 +49,7 @@ Author: Nick Mathewson
cache-based timing attacks are pretty worrisome. We can mitigate that
some by using random secret IVs for AES-CTR, so that we will be
encrypting neither attacker-chosen nor attacker-known plaintext with
- out AES cipher, but that's a bit kludgy. There are also supposed to
+ our AES cipher, but that's a bit kludgy. There are also supposed to
be time-invariant implementations that use Intel's AESNI instructions
where available, and time-invariant implementations that use
bit-slicing.
@@ -90,7 +90,7 @@ Author: Nick Mathewson
levels of security, and the resulting outputs are much shorter.
As near as I can tell as a layman, Certicom is muddying the waters as
- much as possible wrt claiming that it's nigh impractical to deploy ECC
+ much as possible wrt claiming that it's nigh-impractical to deploy ECC
without licensing their patents. This is rather like the silliness
that PKP used to pull back in the day, where they claimed that their
patents covered not only the existing public key cryptography
@@ -131,12 +131,12 @@ Author: Nick Mathewson
Identity keys and their fingerprints are used:
- To sign router descriptors.
- - To identify nodes in consensus directories
- - To make sure we're talking to the right node in the link handshake
+ - To identify nodes in consensus directories.
+ - To make sure we're talking to the right node in the link handshake.
- To make sure that the extending node is talking to the right next
- node when sending an extend cell
- - To identify particular nodes in the hidden service subsystem
- - To identify nodes in the UI in various places
+ node when sending an extend cell.
+ - To identify particular nodes in the hidden service subsystem.
+ - To identify nodes in the UI in various places.
- Internally, to identify a node uniquely in the codebase.
- To determine which part of the circuit ID space to use on a Tor
instance's links.
@@ -219,11 +219,11 @@ Author: Nick Mathewson
Let's review our use of identity keys again and make sure that we can
handle all of them with the ideas above.
- - To sign router descriptors
+ - To sign router descriptors.
We discussed this in 2.2.
- - To make sure we're talking to the right node in the link handshake
+ - To make sure we're talking to the right node in the link handshake.
The current v3 link handshake can handle presenting multiple identity
certificates in the CERT cell. We should consider ourselves to be
@@ -233,14 +233,14 @@ Author: Nick Mathewson
identity we can.
- To make sure that the extending node is talking to the right next node
- when sending an extend cell
+ when sending an extend cell.
The new extend cell format needs to allow the client to tell the
extending node about some identity for the destination node that the
extending node will be able to understand. This is a capability of
the extending node that the client needs to be able to check. (Also,
the extend cell needs to hash that identity in a form the extending
- node can understand; but that's a fingerprint issue.)
+ node can understand, but that's a fingerprint issue.)
- To determine which part of the circuit ID space to use on a Tor
instance's links.
@@ -251,19 +251,19 @@ Author: Nick Mathewson
the initiator always gets the low circIDs and the responder always
gets the high ones.
- - To identity nodes in consensus directories
- - To identify nodes in the UI in various places
+ - To identify nodes in consensus directories.
+ - To identify nodes in the UI in various places.
- Internally, to identify a node uniquely in the codebase.
See sections 3 and 4 below.
- - To identify particular nodes in the hidden service subsystem
+ - To identify particular nodes in the hidden service subsystem.
Out of scope.
2.5. Migrating away from short ID keys entirely
- Eventually no version of Tor that requires 1024-bit identity keys will
+ Eventually, no version of Tor that requires 1024-bit identity keys will
remain. When that happens, we should stop using them entirely. That
means that if we take any path other than the "slow migration" path of
2.1, we'll need to make everything that looks at a node's identity
@@ -307,9 +307,11 @@ Author: Nick Mathewson
Since 43 base-64 characters is enough to represent a 256-bit digest,
with 2 bits left over, I propose that the b64 value encode
- hh | D(hash algorithm name, key type, encoded key),
- where hh is a 2-bit value, with one of the values:
+ hh | D(hash algorithm name, key type, encoded key)
+
+ where hh is a 2-bit value, with one of the following values:
+
00 -- sha256
01 -- sha3
10 -- to be determined
@@ -356,7 +358,7 @@ Author: Nick Mathewson
This implies a proliferation of fingerprints! We should tread
carefully here. To prevent proliferation, the easiest solution is not
- to add too many new types, and to have a good plan for retiring older
+ to add too many new types and to have a good plan for retiring older
types.
3.4. Implications for EXTEND cells
@@ -428,7 +430,7 @@ Author: Nick Mathewson
get a set of identity fingerprints for each node in the format that
the client likes best -- see 3.3 and 3.4 above. So every flavor of
consensus we serve needs to include a node identity in a format the
- client understands, and a node identities in formats such that every
+ client understands, and node identities in formats such that every
node will understand at least one.
4.3. An option: compound signatures on directory objects
@@ -481,7 +483,7 @@ Author: Nick Mathewson
technically slightly more difficult than xor-based tagging, and it
could be useful to boost our defense-in-depth a little bit, just in
case other active end-to-end attacks turn out to be harder than we'd
- though.)
+ thought.)
6.1. Option 1: Use AES-CTR in a less scary mode
@@ -541,7 +543,7 @@ Author: Nick Mathewson
Consider that if option 3 is in place, an end-to-end attacker who
wants to do a tagging attack at one node can garble the rest of the
- circuit, and see if the output is garbled at the exit node. But such
+ circuit and see if the output is garbled at the exit node. But such
an attacker could just as easily close the circuit at one of those
nodes and watch for a corresponding close event, or even better --
simply pause traffic on that circuit for a while and watch for a
@@ -563,7 +565,7 @@ Author: Nick Mathewson
above, except that we'd have a MAC on them. For outgoing cells,
each node would check that the first N bytes of the cell
match a MAC of all data seen so far, *including the rest of the
- cell*. They'd then remove the first N bytes, and re-pad the cell
+ cell*. They'd then remove the first N bytes, re-pad the cell
with bytes from a PRNG, and decrypt the resulting re-padded cell.
(This is basically how mixmaster works, and how mixminion works in
the common case.)
@@ -629,7 +631,7 @@ Author: Nick Mathewson
were from node i, or relayed by node i,
and let cellbody = the body of the cell, except for the last
MACBYTESb[i] bytes,
- and let cell mac = the last MACBYTESb[i] bytes of this cell.
+ and let cellmac = the last MACBYTESb[i] bytes of this cell.
If cellmac is nonempty, check wither cellmac = mac_received,
where mac_received is the first MACBYTESb[i] bytes of
@@ -702,7 +704,7 @@ Author: Nick Mathewson
If we restrict MACBYTESf values to range 0..HL/2, where HL is the
length of the MAC output, we can replace
- MAC(x | "RECOGIZED")[:MACBYTESf] and MAC(x | "OK")[:MACBYTESf]
+ MAC(x | "RECOGNIZED")[:MACBYTESf] and MAC(x | "OK")[:MACBYTESf]
with
MAC(x)[:MACBYTESf] and MAC(x)[HL-MACBYTESf:]
@@ -755,10 +757,6 @@ Author: Nick Mathewson
total of 3 bytes for those 15 bits. That's an opportunity to
save another byte.
-6.5. Other ideas for
-
-
-
ACKS
Lots of the good ideas and concerns here are due to Robert Ransom.
1
0
commit d55883abf351b0df0a46ee6f5adc17dad61da9de
Author: Nick Mathewson <nickm(a)torproject.org>
Date: Tue Oct 18 21:43:36 2011 -0400
start a draft for "new crypto ops"
---
proposals/ideas/xxx-new-crypto-sketch.txt | 601 +++++++++++++++++++++++++++++
proposals/ideas/xxx-ntor-handshake.txt | 17 +-
2 files changed, 614 insertions(+), 4 deletions(-)
diff --git a/proposals/ideas/xxx-new-crypto-sketch.txt b/proposals/ideas/xxx-new-crypto-sketch.txt
new file mode 100644
index 0000000..a4893e1
--- /dev/null
+++ b/proposals/ideas/xxx-new-crypto-sketch.txt
@@ -0,0 +1,601 @@
+Title: Design sketch for new crypto ops
+Date: 19 Oct 2011
+Author: Nick Mathewson
+
+0. Overview
+
+ The point of this document is to discuss what crypto we ought to be using.
+ See "Initial Thoughts on Migrating Tor to New Cryptography" from last year
+ for general guidelines and principles.
+
+ In broad strokes, the parts of our crypto are:
+
+ IDENTITY KEYS AND FINGERPRINTS
+ Addressed here in Section 2.
+ LINK CRYPTO (TLS) --
+ Addressed in proposls 176, 184. We say a little here though in
+ section 5.
+ CREATE/EXTEND CRYPTO --
+ Addressed in xxx-ntor-handshake.txt and rransom's extend draft
+ RELAY CRYPTO
+ Addressed here in Section 6
+ DIRECTORY SYSTEM
+ Addressed here.
+ HIDDEN SERVICE SYSTEM
+ Addressed in a forthcoming document by rransom.
+
+1. Base algorithm choice
+
+ There seem to be two main candidate algorithms for signatures: RSA with big
+ keys (hereinafter "RSA>1024"); and Ed25519, which is DSA with the sharp
+ edges filed off on an Edwards curve related to DJB's Curve25519. We can
+ look at other ECC groups too. {But see ECC Notes.}
+
+ For Diffie Hellman: Curve25519 seems like a decent choice; failing that,
+ another . DH
+ on Z_p with big groups (hereinafter "DH>1024"). {But see ECC notes.}
+
+ For a hash function: SHA256, switching to SHA3 in 2012 when it comes out.
+ It might be worthwhile waiting for SHA3 in most places, and skipping over
+ the sha256 stage entirely.
+
+ For a stream cipher: AES-CTR is in one sense a conservative choice inasmuch
+ as AES is well-analyzed, but AES's well-known issues with cache-based
+ timing attacks are pretty worrisome. We can mitigate that some by using
+ random secret IVs for AES-CTR, so that we will be encrypting neither
+ attacker-chosen nor attacker-known plaintext with out AES cipher, but
+ that's a bit kludgy. Salsa20 is what rransom likes these days, but IMO we
+ aren't competent to tell whether it looks good or not; the existing attacks
+ against it don't look like very bad news to me, but who knows whether it's
+ getting enough attention that we can read. See also ChaCha; see also the
+ other EuroCrypt {XXXX} winners/finalists; see also SHA3 if the SHA3 winner
+ specifies a way to use it as a stream cipher, or specifies an underlying
+ stream/block cipher.
+
+ For a random number generator: We currently use OpenSSL seeded with
+ RAND_poll and with platform entropy. OpenSSL uses RC4 (XXX check). The
+ platform entropy management can be messy, obscure, or both. I suggest
+ that:
+ * we should seed our PRNG with more entropy sources if we can find some
+ promising code with an appropriate license
+ * instead of just using OpenSSL's PRNG, we should use OpenSSL's PRNG
+ xor'd with some other good PRNG. (Is Yarrow still cool? And is there
+ a combine operation better than xor?)
+ * We should re-seed the RNG before and after very sensitive operations,
+ like private key generation.
+
+
+1.1. ECC notes
+
+ ECC is the brave new[*] crypto of the future! It's faster[**] than doing
+ crypto in Z_n (as we do for RSA and DH now) for equivalent levels of
+ security, and the resulting outputs are much shorter.
+
+ As near as I can tell as a layman, Certicom is muddying the waters as much
+ as possible wrt claiming that it's nigh impractical to deploy ECC without
+ licensing their patents. This is rather like the silliness that PKP used
+ to pull back in the day, where they claimed that their patents covered not
+ only the existing public key cryptography algorithms, but also the very
+ idea of public key cryptography itself.
+
+ DJB claims that for every patent he's aware of, either that patent doesn't
+ cover his code, or that patent is invalid because of prior art. I'm not
+ going to try to evaluate these claims, since I'm not supposed to be reading
+ patents for typical "let's avoid the appearance of knowing infringement"
+ reasons. But before we dive into the world of ECC, we should see if we can
+ ask any friendly patent attorneys and ECC experts for a second or third
+ opinion here.
+
+ I note in passing that nearly all of the patents that DJB mentions in his
+ list would appear to expire over the next 12 months or so.
+
+ Additionally, there are ECC groups out there less fast than DJB's, but more
+ widely available and analyzed. We should consider some of those too.
+
+ One final issue to investigate is whether using these algorithms will make
+ any major free software distribution decide not to include us. I seem to
+ recall seeing that one or two of the big ones had at one point decided to
+ ship OpenSSL only with ECC disabled, either because of real patent
+ concerns, or because of an opinion that the certicom license for ECC use in
+ TLS was problematic for free software, or something like that. We should
+ check that out.
+
+ [*] Actually, it's older than onion routing, and older than some members of
+ the Tor project.
+
+ [**] Actually, because of the common practice of choosing a small-ish prime
+ value (65537) for e in RSA, RSA public key operations can be a little
+ faster than equivalent-security ECDH or ECDSA operations. The private key
+ operations in RSA are still much much slower.
+
+2. New identities
+
+ Identity keys and their fingerprints are used:
+ - To sign router descriptors
+ - To identify nodes in consensus directories
+ - To make sure we're talking to the right node in the link handshake
+ - To make sure that the extending node is talking to the right next node
+ when sending an extend cell
+ - To identify particular nodes in the hidden service subsystem
+ - To identify nodes in the UI in various places
+ - Internally, to identify a node uniquely in the codebase.
+ - To determine which part of the circuit ID space to use on a Tor
+ instance's links.
+
+2.1. New identities, option 1: RSA>1024, slow migration.
+
+ We use RSA for identity keys indefinitely. Nearly all operations done with
+ an identity key are signature checking; signing happens only a few times an
+ hour per node even with pathological cases. Since signature checking is
+ really cheap with RSA, there's no speed advantage for ECC here. (There is
+ a space advantage, since the keys are much smaller.)
+
+ The easiest way to migrate to longer identity keys is to tell all Tors to
+ begin accepting longer identity keys now, and to tweak all our protocols so
+ that longer RSA identity keys are understood. We should then have a pair
+ of parameters in the consensus that determines the largest and smallest
+ acceptable identity key size in the network. Clients and servers should
+ reject any keys longer or shorter than specified. Once all versions of Tor
+ can accept long identity keys, we raise the maximum size from 1024 to
+ somewhere in the 2048-4096 range.
+
+2.2. New identities option 2: RSA>1024, faster migration.
+
+ We use RSA for identity keys indefinitely as above. But we allow nodes to
+ begin having longer identities now, even though older Tors won't understand
+ them. This implies, of course, that every such node needs to have at least
+ 2 identities: one RSA1024 identity for backward compatibility, one RSA>1024
+ identity for more secure identification.
+
+ We would have these identities cross-certify as follows: All keys would be
+ listed in the router descriptor. RSA>1024 keys would be called something
+ other than identity-key, so as not to confuse older clients. A signature
+ with the RSA>1024 key would appear right before the current RSA1024
+ signature. This way, signed material would include both keys, and would be
+ signed by both keys.
+
+ [In other words, descriptors would look something like:
+
+ router foo...
+ ...
+ identity-key
+ -----BEGIN RSA KEY-----
+ 1024-bit RSA key here
+ -----END RSA KEY-----
+ ext-identity-key
+ -----BEGIN RSA KEY-----
+ 3072-bit RSA key here
+ -----END RSA KEY-----
+ ...
+ ext-signature
+ -----BEGIN SIGNATURE-----
+ signature of everything through "ext-signature\n", using the long key
+ -----END SIGNATURE-----
+ router-signature
+ -----BEGIN SIGNATURE-----
+ signature of everything through "router-signature\n", using the short key
+ -----END SIGNATURE-----
+
+ ]
+
+ See "UI notes" in the "new fingerprints" section below for some of the
+ implications of letting nodes have multiple identity keys.
+
+ We'll need to advertise these new identities in consensus directories too;
+ see XXXX below for more info there.
+
+2.3. New identities option 3: RSA>1024 and/or Ed25519, faster migration
+
+ As in option 2 above, but new keys can also be Ed25519. If we expect that
+ not all installations will allow Ed25519 (see "ECC Notes", section 1.1),
+ we'll need to say that every server with an Ed25519 key must also have an
+ RSA>1024 key.
+
+2.4. Implications for current use of identity keys
+
+ Let's review our use of identity keys again and make sure that we can
+ handle all of them with the ideas above.
+
+ - To sign router descriptors
+
+ We discussed this in 2.2.
+
+ - To make sure we're talking to the right node in the link handshake
+
+ The current v3 link handshake can handle presenting multiple identity
+ certificates in the CERT cell. We should consider ourselves to be
+ connected to a node with identity X if _any_ of the identity certificates
+ that it presents in its authenticated CERT cell has identity X. To handle
+ EXTEND cells correctly, we should verify every identity we can.
+
+ - To make sure that the extending node is talking to the right next node
+ when sending an extend cell
+
+ The new extend cell format needs to allow the client to tell the extending
+ node about some identity for the destination node that the extending node
+ will be able to understand. This is a capability of the extending node
+ that the client needs to be able to check. (Also, the extend cell needs to
+ hash that identity in a form the extending node can understand; but that's
+ a fingerprint issue.)
+
+ - To determine which part of the circuit ID space to use on a Tor
+ instance's links.
+
+ We can continue to use RSA1024 identity key comparison here by default. We
+ can also use some other parameter of the v3 handshake, or introduce a new
+ link protocol where if the initiator authenticates, the initiator always
+ gets the low circIDs and the responder always gets the high ones.
+
+ - To identity nodes in consensus directories
+ - To identify nodes in the UI in various places
+ - Internally, to identify a node uniquely in the codebase.
+
+ See sections 3 and 4 below.
+
+ - To identify particular nodes in the hidden service subsystem
+
+ Out of scope.
+
+2.5. Migrating away from short ID keys entirely
+
+ Eventually no version of Tor that requires 1024-bit identity keys will
+ remain. When that happens, we should stop using them entirely. That means
+ that if we take any path other than the "slow migration" path of 2.1, we'll
+ need to make everything that looks at a node's identity also accept nodes
+ with _only_ a RSA>1024/Ed25519 identity.
+
+ At the directory service level, we should have an option to allow nodes
+ without RSA1024 identity keys (off until all clients and nodes accept new
+ identity keys).
+
+2.6. Implications for directory information
+
+ Clients must know a hash for each node's identity key, or else they can't
+ make an authenticated connection to the node or tell ORs how to extend to
+ the node.
+
+
+3. New fingerprints
+
+ Right now we compute fingerprints by taking the SHA1 hash of an ASN1
+ encoding of the RSA1024 identity key. We encode this in hex almost
+ everywhere, and prefix it with a $.
+
+ I propose that fingerprints of the future be determined by taking a digest
+ using SHA256 or SHA3 of:
+
+ "Hash Algorithm Name", "Key Type Name", encoded key
+
+ When representing these internally, we should include the hash algorithm
+ that was used. When representing them in the UI, we should use the
+ notation %b64, where b64 is a base-64 encoding, omitting the trailing =s.
+
+ (Other plausible characters to use are @, ?, +, ~, =, etc. I like %, but
+ can be persuaded. Bikeshed bikeshed bikeshed.)
+
+ Since 43 base-64 characters is enough to represent a 256-bit digest, with 2
+ bits left over, I propose that the b64 value encode
+ hh | D(hash algorithm name, key type, encoded key),
+ where hh is a 2-bit value, with one of the values:
+ 00 -- sha256
+ 01 -- sha3
+ 10 -- to be determined
+ 11 -- reserved.
+
+ We should investigate in the interface whether it's plausible to allow a
+ prefix of a node ID where the full ID would otherwise be required. That
+ seems risky for short prefixes, though.
+
+3.1. How many fingerprints is that anyway?!
+
+ Suppose that we allow sha256 and sha3 as hash algorithms, and we allow each
+ node to have 3 identity keys: one RSA1024, one RSA>1024, and one ECC. Then
+ we would have 7 fingerprints (6 plus the legacy SHA1(RSA1024) fingerprint),
+ for a total of 20+6*32==212 bytes per node.
+
+ It's not a horrible problem to accept them all in the UI, but the UI isn't
+ the only place that needs to know fingerprints. Instead, let's say that
+ RSA1024 identities are only identified with SHA1 hashes. This limits our
+ fingerprint load to a more manageable 20+32*2 == 84 bytes per node. Still
+ not great, though.
+
+3.2. What does this imply for the UI?
+
+ In the UI we'll lose the property that no node has more than one
+ fingerprint: I do not believe that this actually hurts us.
+
+
+4. Directory changes
+
+4.1. Better cross-referencing
+
+ In some places, directory objects cross-reference one another by SHA1
+ hash. They should use a better hash algorithm instead.
+
+ XXX
+
+4.2. More fingerprints
+
+ Because extending from node A to node B requires that we have node B's
+ fingerprint in a way that node A will understand, it is not enough to get a
+ set of identity fingerprints for each node in the format _we_ like best.
+ Instead, we must .
+
+
+4.x. An option: compound signatures on directory objects
+
+ In Tor 0.2.2.x and later, when we check a signature on a directory object
+ (not including hidden service descriptors), we only look at the first
+ DIGEST_LEN bytes of the RSA-signed data. Once 0.2.1.x is obsolete, or on
+ any types of signatures not checked in 0.2.1.x, we can use the rest of the
+ space. (We're using PKCS1 padding on our signatures, which has an
+ overhead of 11 bytes. Signing a SHA1 hash with a 1024-bit key therefore
+ leaves 128-11-20==97 more bytes we could use for a SHA2 or a SHA3 hash.)
+
+4.x.
+
+
+5. Link crypto changes
+
+
+
+
+6. Relay crypto changes
+
+There are a few things we might want out of improved relay crypto. They
+include:
+ - Resistance to end-to-end bitwise tagging attacks.
+ - Better resistance to malleability.
+ - If using counter mode, no block-cipher operations on any value known
+ to the attacker.
+
+I'll try to provide these in increasing order of difficulty. None of these
+is necessarily correct; I should look for a security proof or a better
+construction for any that we seem likely to use.
+
+Rationales: Our existing malleability resistance is a kludge. Doing no
+block-cipher ops on attacker-known values increases our security margins a
+little. Our arguments about tagging attacks hold that an attacker who
+controls both ends has plenty of ways to win even if tagging attacks are
+foiled; nonetheless, most of these ways are technically slightly more
+difficult than xor-based tagging, and it could be useful to boost our
+defense-in-depth a little bit, just in case other active end-to-end attacks
+turn out to be harder than we'd though.)
+
+Option 1: Use AES-CTR in a less scary mode
+
+ When doing key expansion, in addition to establishing Kf, Kb, Df, and Db,
+ also establish IVf and IVb. Use the current relay crypto, except instead
+ of starting the counters at 0, start them at IVf and IVb. This way, an
+ attacker doesn't have any known plaintexts to work with, which makes AES a
+ little more robust.
+
+Option 2: As 1, but tagging attacks garble the circuit after one block.
+
+ Keep an HMAC of all previously received encrypted cells on a circuit.
+ When decrypting a cell, use this HMAC value to determine the first 64 bits
+ of the counter; increment the low 64 bits of the counter as usual.
+
+ This way, if an adversary flips any bits before passing the stream through
+ an honest node, no _subsequent_ block will be recoverable.
+
+ To prevent any part of the stream from being re-used, close any circuit if
+ the low 64 bits of the counter would ever wrap (that is, around 295
+ million terabytes).
+
+ (If we're using a stream cipher with fast re-key, then we can just have
+ the key used for each block be an HMAC of all previously received
+ ciphertext.)
+
+Option 3: As 1, but tagging attacks garble the circuit in the same block.
+
+ Use a large-block cipher mode, such as BEAR or LIONESS (depending on
+ whether we need a PRP or SPRP). Base the key material for each block on
+ an HMAC of all previous blocks' ciphertexts.
+
+ This way, if an adversary makes any alteration in a block, that block and
+ all subsequent blocks will be garbled. It's more expensive than 2,
+ though, especially if we need to use a LIONESS construction.
+
+ {I considered IGE here, with a trick where odd-numbered nodes on a circuit
+ start from the front of the block and even-numbered nodes start from the
+ end, but it didn't seem much better. We should investigate relative
+ performance, though.}
+
+Option 4: Shall we have middle nodes be able to fast-stop bad data?
+
+ In all the above options, if a cell is altered, the middle node can at
+ best turn that cell and the rest of the cells on the circuit into garbage,
+ which the last node won't deliver (if honest) or can't deliver (if dishonest).
+
+ Might we prefer to do as in mixnets, and have nodes kill circuits upon
+ receiving altered cells?
+
+ I'm not so sure. It's relatively expensive in per-cell overhead, and the
+ next-best attack to the one it prevents isn't so great.
+
+ Consider that if option 3 is in place, an end-to-end attacker who wants to
+ do a tagging attack at one node can garble the rest of the circuit, and
+ see if the output is garbled at the exit node. But such an attacker could
+ just as easily close the circuit at one of those nodes and watch for a
+ corresponding close event, or even better -- simply pause traffic on that
+ circuit for a while and watch for a corresponding gap at the exit. The
+ only advantage of the garbling attack would be that garbled cells are
+ presumably rarer than circuit closes or traffic pauses, and thus easier to
+ use to distinguish target circuits. But that's still bunk; the other
+ attacks win fine, and the pause attack doesn't risk detection so much.
+
+ Still, to do this one, we'd want to have outgoing and incoming circuits
+ treated differently. Incoming cells could work as in 1 or 2 or 3 above;
+ outgoing cells would want to have a header portion as in mixmaster, where
+ each node checks that the first N bits of the header match a MAC of all
+ data seen so far, *including the rest of the cell*. They'd then decrypt
+ the rest of the cell, remove the first N bits of the header, and re-pad
+ the header with N bits at the end, taken from a PRNG whose seed is shared
+ with the client. (This is basically how mixmaster works, and how
+ mixminion works in the common case.)
+
+ The space overhead here is kind of large: N bits per cell per node. In
+ the most paranoid case, if we used 256-byte HMACs on 3-node paths, that's
+ 96 bytes per cell, which is more than 20% of the total length. But we can
+ probably do better if we let the CREATE operation also tell the node some
+ N to check. For example, the first node doesn't need to check any bits.
+ The second and third nodes could check 64 bits apiece; that only has 16
+ bytes overhead, and high probability of catching any changes. (Birthday
+ attacks don't matter here, and an attacker who mounts this attack for long
+ enough to accidentally find a 64-bit mac will break so many circuits in
+ the process as to become totally unreliable.
+
+ But all of this leaks the path lengths and position on the path to various
+ nodes. We might open ourselves up to partitioning attacks if different
+ clients choose different numbers of bits. What's more, we might leak the
+ length of the path to the last node by how much junk there is at the end
+ of the cell.
+
+ Here's a simple construction for this format, to be concrete:
+
+ The CREATE operation's KDF produces the following outputs:
+ Kf, IVf (stream cipher key, and possibly IV, for forward direction)
+ Kb, IVb (stream cipher key, and possibly IV, for reverse direction)
+ Mf (MAC key for forward direction)
+ Mb (MAC key for reverse direction)
+ SEEDf (PRNG key for forward direction)
+ And it also sets the following user-selected parameter:
+ MACBYTESf (an integer between 0 and 32 inclusive)
+ MACBYTESb (an integer between 0 and 32 inclusive)
+ CANEXIT (boolean: can we exit from this hop?)
+
+ Let Kf[i], Mf[i], etc denote the parameter Kf, Mf, etc as shared between
+ the client and the i'th node in its circuit.
+
+ Relay cells sent towards the client have the following plaintext format:
+
+ Body:
+ Content:
+ Relay Command [1 byte]
+ StreamID [2 bytes]
+ Length [2 bytes]
+ Data [Up to CELL_DATA_LEN-5-MACBYTESb bytes]
+ Padding [randomly generated as needed to fill the cell]
+ MAC(All previous encrypted content + encrypted content,
+ Mb)[:MACBYTESb] [MACBYTESb bytes]
+
+ The originator of the client-bound cell encrypts the content with the
+ next part of its Kb,IVb stream, then appends the MAC.
+
+ Non-clients receiving a client-bound relay cell encrypt the entire cell
+ body, MAC included, with the next part of the stream cipher that was
+ keyed with Kb,IVb.
+
+ When the client receives a relay cell body, it iteratively does:
+
+ For node i in circuit from 1..N:
+ Let cells_i = all previous cells which we decided were from i,
+ and let cellbody = the body of the cell, except for the last
+ MACBYTESb[i] bytes.
+ {XXXX I'd be more comfortable if the MAC covered all cells
+ passed by every node on the circuit.}
+
+ If MACBYTESb[i]>0, check whether MAC(cells_i | cellbody,
+ Mb[i])[:MACBYTESb[i]] the last MACBYTESb[i] bytes of the cell. If
+ so, this cell is from node i.
+
+ If this cell is from node i, decrypt the first
+ CELL_DATA_LEN-MACBYTESb[i] bytes of the cell using the stream
+ keyed with Kb[i],IVb[i]. Act on it as a relay cell.
+
+ Otherwise decrypt the entire cell, MAC included, with the stream
+ keyed with Kb[i], IVb[i].
+
+ If no node sent this cell: it's junk and somebody is probably
+ messing with us! Destroy the circuit.
+
+ When the client *sends* a cell outbound to node N:
+
+ Let cells[i] start at "" for all i in 1...N initially, and get
+ updated as below.
+
+ Let MACLEN = SUM(MACBYTESf[1...N])
+
+ Let Body =
+ Relay Command [1 byte]
+ StreamID [2 bytes]
+ Length [2 bytes]
+ Data [Up to CELL_DATA_LEN-5-MACLEN bytes]
+ Padding [randomly generated,
+ CELL_DATA_LEN-5-MACLEN-len(Data) bytes]
+
+ Let PAD[i] = the next MACBYTESf[i] bytes from the PRNG keyed
+ with SEEDf[i], for i in 1...N.
+
+ Let STREAM[i] = the next CELL_DATA_LEN-MACBYTESf[i] bytes of
+ the stream keyed by Kf[i],IV[i], for i in 1...N.
+
+ Let PADSEEN[1] == ""
+
+ For i in 2...N:
+ Let L = len(PADSEEN[i-1])
+ Let PADSEEN[i] = (PADSEEN[i-1] xor STREAM[i-1][CELL_DATA_LEN-L:]) | PAD[i-1]
+
+ For i in N down to 1:
+
+ Let Encbody = Body xor STREAM[i][len(body)]
+ Let extra = "RECOGNIZED" if i == N, "OK" otherwise
+ Let cells[i] = cells[i] | Body | PADSEEN[i]
+ Let M = MAC(cells[i] | extra , Mf[i])
+
+ Let Body = M[:MACBYTESf[i]] | EncBody
+
+
+ To receive an outbound cell:
+
+ Let M be the first MACBYTESf bytes of the cell, let REST be the rest
+ of the cell, and let "cells" be all previous cells on this circuit.
+ If CANEXIT, and M = MAC(cells|rest|"RECOGNIZED", Mb)[:MACBYTESf],
+ and MACBYTESf > 0, this cell is for us. If M = MAC(cells|rest|"OK",
+ Mb)[:MACBYTESf], this cell is not for us, but is valid. Otherwise,
+ destroy the circuit.
+
+ Decrypt REST using the stream cipher keyed with Kf,IVf. If this
+ cell is for us, act on it as a relay cell. Otherwise, let
+ PAD = the next MACBYTESf[i] bytes of the PRNG keyed with SEEDf,
+ and relay REST | PAD.
+
+ ANOTHER VARIANT:
+
+ If we restrict MACBYTESf values to range 0..HL/2, where HL is the
+ length of the MAC output, we can replace
+ MAC(x | "RECOGIZED")[:MACBYTESf] and MAC(x | "OK")[:MACBYTESf]
+ with
+ MAC(x)[:MACBYTESf] and MAC(x)[HL-MACBYTESf:]
+
+ PICKING MACBYTESf,MACBYTESb.
+
+ We don't need to worry about birthday attacks:
+ Because we're using a MAC, only the parties who are making the
+ MACs could try to do a brute-force search for a collision, but
+ they have no reason to do so.
+
+ If a collision occurs accidentally, an adversary can't substitute
+ an earlier-seen cell for a later one with the same MAC, since the
+ MAC covers not only the cell, but all previous cells on the
+ circuit.
+ So 16 bytes is about the most we should ever do, given our usual
+ security parameters. Let me moot the number 8 for MACBYTESb.
+
+ For outbound cells, for any hop we can exit from, choosing
+ MACBYTESf=6 gets us the current security level. For the first hop,
+ assuming we don't exit from it, choosing MACBYTESf=0 is totally
+ safe, since the link crypto guarantees that nothing was corrupted on
+ the way.
+
+ In general, to prevent an end-to-end tagging attack, it seems
+ sufficient to do something like setting MACBYTES=8 for the last
+ hop, and MACBYTES=8 for one hop in the middle.
+
+
+ACKS
+
+ Lots of the good ideas and concerns here are due to Robert Ransom.
+
+ Michael Stone helped some with "relay option 4" above.
diff --git a/proposals/ideas/xxx-ntor-handshake.txt b/proposals/ideas/xxx-ntor-handshake.txt
index 41af5c7..1f988fc 100644
--- a/proposals/ideas/xxx-ntor-handshake.txt
+++ b/proposals/ideas/xxx-ntor-handshake.txt
@@ -67,13 +67,20 @@ Protocol:
NODEID: ID -- H_LENGTH bytes
KEYID: KEYID(B) -- H_LENGTH bytes
CLIENT_PK: X -- G_LENGTH bytes
+ PARAMSLEN: -- 2 bytes
+ PARMS: -- PARAMSLEN byets
+
+ (The "PARAMS" component is used to encode any additional authenticated
+ information that's needed for establishing the right kind of circuit.)
The server generates a keypair of y,Y = KEYGEN(), and computes
- secret_input = EXP(X,y) | EXP(X,b) | ID | B | X | Y | PROTOID
+ secret_input = EXP(X,y) | EXP(X,b) | ID | B | X | Y | PARAMSLEN | PARAMS
+ | PROTOID
KEY_SEED = H(secret_input, t_key)
verify = H(secret_input, t_verify)
- auth_input = verify | ID | B | Y | X | PROTOID | "Server"
+ auth_input = verify | ID | B | Y | X | PARAMSLEN | PARAMS | PROTOID
+ | "Server"
The server sends a CREATED cell containing:
@@ -82,10 +89,12 @@ Protocol:
The client then checks Y is in G^* [see below], and computes
- secret_input = EXP(Y,x) | EXP(B,x) | ID | B | X | Y | PROTOID
+ secret_input = EXP(Y,x) | EXP(B,x) | ID | B | X | Y | PARAMSLEN | PARAMS
+ | PROTOID
KEY_SEED = H(secret_input, t_key)
verify = H(secret_input, t_verify)
- auth_input = verify | ID | B | Y | X | PROTOID | "Server"
+ auth_input = verify | ID | B | Y | X | PARAMLENS | PARAMS | PROTOID
+ | "Server"
The client verifies that AUTH == H(auth_input, t_mac).
1
0