Re: [tor-dev] Sanitizing bridge descriptors containing ed25519 fields

30 May 2015

On Fri, May 29, 2015 at 4:23 PM, Karsten Loesing <karsten@torproject.org> wrote:
...
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Forwarding from a private thread with Nick.
- -------- Forwarded Message --------
Subject: Re: Whoops
Date: Fri, 29 May 2015 21:20:57 +0200
From: Karsten Loesing <karsten@torproject.org>
To: Nick Mathewson <nickm@torproject.org>
Ugh, long mail ahead.  This turns out to be more difficult than
expected...
Just like life itself!  Never fear, we will solve all problems and
build a better world.
...
On 29/05/15 19:29, Nick Mathewson wrote:
...
On Fri, May 29, 2015 at 11:04 AM, Karsten Loesing
<karsten@torproject.org> wrote:
...
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Sure!
My main question is which of these new fields we'll have to
sanitize in bridge descriptors.
The current idea of sanitizing bridge identities is that Tonga
would give out server descriptors if you give it a bridge
identity.  We want to avoid that, which is why we're SHA1()'ing
fingerprints and removing cryptographic material.
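
As a rough illustration of the SHA1() step just described, here is a minimal Python sketch. It is not the metrics-db code (that is Java); the function name sanitize_fingerprint is hypothetical, and it assumes the hash is taken over the 20 binary bytes of the fingerprint rather than over its hex text.

```python
import hashlib


def sanitize_fingerprint(fingerprint_hex: str) -> str:
    """Return the uppercase hex SHA-1 of a bridge fingerprint.

    Assumption: the input is the 40-character hex fingerprint (spaces
    allowed) and the hash is computed over its 20 decoded bytes.
    """
    raw = bytes.fromhex(fingerprint_hex.replace(" ", ""))
    if len(raw) != 20:
        raise ValueError("expected a 20-byte (40 hex character) fingerprint")
    return hashlib.sha1(raw).hexdigest().upper()
```

The output has the same length and format as the input, so a sanitized fingerprint can stand in wherever the original appeared without revealing it.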
What about the new identity?  Would we have to sanitize that in
any way?  And if so, would we want to SHA1() it, or is there a
more ed25519y way to do this?
I guess the better question might be: are there plans for Tonga
to give out descriptors if you tell it an ed25519 identity?  If
not, do you see any potential trouble in leaving it unchanged in
sanitized bridge descriptors?
I would suggest that we sanitize all the crosscert stuff, and the
ed25519 identity, and the ed25519 signing cert.  Does this need to
be done using some language I know?  If so I'll be happy to hack
it up for you if you point me to the current code that does it.
Thanks for the offer, really, but if I can, I'd rather write
this code myself once I know what it's supposed to do.  The reason is
that setting up this code and providing you with sample data might be
more effort than writing it myself.  Hope that's okay, too.
Only if you're curious, here's the current code that sanitizes bridge
descriptors:
https://gitweb.torproject.org/metrics-db.git/tree/src/org/torproject/ernie/d...
But feel free to ignore that code, and let's talk conceptually or by
example.
Okay.  (Sadly for me, it's Java.  I haven't touched Java in about 13
years, and probably shouldn't be trusted with it.)
...
(The authority might someday give out bridges based on this
information. Who knows? Not me.  Better to be safe than sorry
IMO.)
Okay.
...
To sanitize an ed25519 identity, I'd suggest SHA256.  Best avoid
SHA1.
Sure, that would work.
By the way, here's how we're currently sanitizing bridge descriptors:
https://collector.torproject.org/formats.html#bridge-descriptors
Following those steps, I'd do the following things (quoting an actual
bridge descriptor as input here; edit: scrubbed potentially sensitive
fields, sorry for the linebreaks!):
...
router euler [scrubbed] 8000 0 0
identity-ed25519
-----BEGIN ED25519 CERT-----
[scrubbed]
-----END ED25519 CERT-----
Base64-decode that block, throw it into SHA256(), base64-encode the
result, format as block.  But wouldn't the result be much shorter?
There's no new "fingerprint" equivalent, like "fingerprint-ed25519",
is there?
Oh dear.  That blob is a certificate, not a key.  It changes over
time, because the key that it certifies varies over time.

The format is documented in section 2.1 of proposal 220; the actual
identity key is in an extension labeled with type 04 (see section
2.2.1).
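
To make that concrete, here is a hedged Python sketch of pulling the identity key out of the certificate and hashing it with SHA256, as suggested earlier in the thread. It assumes the proposal 220 layout as I understand it: a fixed header (1-byte version, 1-byte cert type, 4-byte expiration, 1-byte key type, 32-byte certified key, 1-byte extension count), then extensions of the form 2-byte length, 1-byte type, 1-byte flags, data, then a 64-byte signature. The function names are hypothetical, and the signature is not verified.

```python
import base64
import hashlib
import struct

ED25519_KEY_LEN = 32
SIGNATURE_LEN = 64
# Extension type 04 carries the signing (identity) key, per section 2.2.1.
EXT_TYPE_SIGNED_WITH_ED25519_KEY = 0x04
# version + cert type + expiration + key type + certified key + n_extensions
HEADER_LEN = 1 + 1 + 4 + 1 + ED25519_KEY_LEN + 1


def extract_identity_key(cert: bytes) -> bytes:
    """Return the 32-byte key from the type-04 extension of a certificate."""
    if len(cert) < HEADER_LEN + SIGNATURE_LEN:
        raise ValueError("certificate too short")
    n_extensions = cert[HEADER_LEN - 1]
    pos = HEADER_LEN
    end = len(cert) - SIGNATURE_LEN  # extensions stop where the signature starts
    for _ in range(n_extensions):
        if pos + 4 > end:
            raise ValueError("truncated extension header")
        ext_len, ext_type, _flags = struct.unpack_from(">HBB", cert, pos)
        pos += 4
        if pos + ext_len > end:
            raise ValueError("truncated extension data")
        data = cert[pos:pos + ext_len]
        pos += ext_len
        if ext_type == EXT_TYPE_SIGNED_WITH_ED25519_KEY:
            if ext_len != ED25519_KEY_LEN:
                raise ValueError("unexpected identity key length")
            return data
    raise ValueError("no type-04 extension found")


def sanitize_identity(cert_base64: str) -> str:
    """Base64-decode a certificate block and return base64(SHA256(identity key))."""
    cert = base64.b64decode(cert_base64)  # default mode ignores line breaks
    digest = hashlib.sha256(extract_identity_key(cert)).digest()
    return base64.b64encode(digest).decode("ascii")
```

Hashing the extracted key rather than the whole blob gives a value that stays stable while the signing key rotates, which hashing the certificate itself would not.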

Maybe we should add a fingerprint-ed25519 line?  It would be
redundant, but maybe valuable.  What do you think?

-- 
Nick