Hello Tor devs,
Namecoin is interested in collaboration with Tor in relation to
human-readable .onion names; I'm reaching out to see how open the Tor
community would be to this, and to get feedback on how exactly the
integration might work.
The new hidden service spec is going to substantially increase the
length of .onion names, which presents usability concerns. Namecoin
provides a way to resolve a human-readable .bit name to a .onion name.
Another benefit of Namecoin is that it provides a way to look up TLS
fingerprints for clearnet .bit sites, which reduces the risk of MITM
attacks on clearnet websites from malicious or compromised CAs.
I had the pleasure of meeting Mike Perry at the Decentralized Web Summit
at the Internet Archive in June; I talked to him about Namecoin's rough
plans and he suggested I post here. I understand that Riccardo Spagni
from Monero discussed this topic as well with Roger Dingledine at the
Security in Times of Surveillance conference at Ei_PSI.
The two main concerns that I expect would be brought up involve
anonymity and blockchain size. Here's how we plan to deal with these
issues:
Namecoin already provides location-anonymity for name registrations,
assuming that the client's traffic is routed via Tor. It's also necessary to broadcast
transactions for different names to different peers, which isn't coded
yet, but this is just coding work rather than an engineering challenge
-- a usable workaround today is running multiple Namecoin wallets.
The more interesting challenge is blockchain anonymity for
registrations, due to the linkability required for blockchain
validation. An important point here is that transactions for a given
name are inherently linkable to each other, and that this isn't
problematic. The problem would come when multiple names are linked
together, or when a name is linked with currency transactions. The
solution I've come up with is to use atomic cross-chain trades, which
let a user buy namecoins using a cryptocurrency that is designed to
provide anonymity (such as Monero or Zcash, both of which have
cryptographic proofs of anonymity, given a certain anonymity set and
security assumptions). The user would use an anonymous cryptocurrency
to buy a small amount of namecoins (enough to register a single name and
keep it renewed for a while). If the user wanted to register another
name, she would perform another atomic cross-chain trade, receiving
namecoins that are not linked to the namecoins obtained for the first
name. As long as those namecoins are not mixed by the wallet software,
the names remain unlinked.
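
The "not mixed by the wallet software" requirement is essentially a
coin-selection rule. Here is a minimal Python sketch of that rule,
assuming a hypothetical wallet that tags every coin with the cross-chain
trade that funded it. The names `Coin`, `trade_id` and `select_coins`
are illustrative only and are not part of any Namecoin wallet API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Coin:
    txid: str      # transaction that holds the namecoins
    amount: int    # value in the smallest unit
    trade_id: str  # hypothetical tag: which cross-chain trade funded this coin


def select_coins(coins: List[Coin], trade_id: str, needed: int) -> List[Coin]:
    """Pick inputs for a name transaction using coins from one trade only.

    Refusing to fall back to coins from other trades is what keeps two
    names from being linked through a shared transaction.
    """
    chosen: List[Coin] = []
    total = 0
    for coin in coins:
        if coin.trade_id != trade_id:
            continue  # never mix coins across trades
        chosen.append(coin)
        total += coin.amount
        if total >= needed:
            return chosen
    raise ValueError("insufficient unlinked funds; perform another trade")
```

If the coins from one trade run out, the sketch fails rather than
borrowing from another trade, so the user has to do a fresh trade
instead of silently linking two names.
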
Many users won't want to download the full Namecoin blockchain (around 3
GB at the moment). I have a proof-of-concept SPV-based Namecoin name
lookup client working as of early June. I just got a large part of that
code upstreamed into libdohj, and I'm working on getting the rest
upstreamed and released. It's in Java (based on BitcoinJ), so it's not
subject to the memory safety concerns that C/C++ code is. The SPV name
lookups are implemented in 3 ways, depending on the user's needs:
Option A:
1. Block headers are synced over the Namecoin P2P network. (Over
clearnet this takes about 5 minutes the first time it runs.)
2. An index mapping unexpired block heights to block hashes is
constructed, so that lookups can be done quickly. (This occurs when the
SPV client starts, after syncup has completed; it's fast enough that I
haven't found a need to benchmark it.)
3. When a name lookup request is received, the client asks a remote API
server for the height of the last update of the name.
4. The client looks up the block hash of that height from its index, and
requests that block over the P2P network.
5. The client verifies that the received block matches the correct hash
and that the block follows Namecoin rules (e.g. verifying the merkle root).
6. The client looks through the transactions in the block until it finds
the one that updates the name.
7. The client retrieves the value of the name from that transaction, and
returns it to the user.
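
The merkle root check in step 5 of Option A can be sketched as follows.
This is illustrative Python, not the Java code in libdohj. It assumes
the Bitcoin-style double-SHA-256 merkle tree that Namecoin inherits, and
the function names are made up for this example. Hashes are handled in
raw internal byte order, not the reversed hex form that block explorers
display.

```python
import hashlib
from typing import List


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin-style transaction and block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(tx_hashes: List[bytes]) -> bytes:
    """Fold a list of transaction hashes into a single merkle root."""
    if not tx_hashes:
        raise ValueError("a block has at least one transaction")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd-sized level: duplicate the last hash
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def block_matches_header(raw_txs: List[bytes], header_merkle_root: bytes) -> bool:
    """True if the transactions received over P2P hash to the header's root."""
    return merkle_root([dsha256(tx) for tx in raw_txs]) == header_merkle_root
```

Because the header is already part of the synced chain, a peer that
hands over a tampered block fails this check and the block can simply be
requested from a different peer.
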
Option B:
1 through 3. Same as Option A.
4. The API server also provides the full content of the transaction, as
well as a merkle proof of inclusion in the block.
5. The client verifies that the merkle proof links the hash of the
provided transaction to the merkle root of the block header with the
given height.
6. The client retrieves the value of the name from the provided
transaction, and returns it to the user.
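
Step 5 of Option B is a standard merkle branch verification. Here is a
minimal Python sketch, again assuming Bitcoin-style double SHA-256. The
proof encoding (a list of sibling hashes plus the transaction's position
in the block) is an assumption made for illustration and is not
necessarily the format the API server uses.

```python
import hashlib
from typing import List


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin-style hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def verify_merkle_proof(tx_hash: bytes, branch: List[bytes],
                        index: int, root: bytes) -> bool:
    """Walk from a transaction hash up to the merkle root.

    `branch` lists the sibling hash at each level, bottom first.
    `index` is the transaction's position in the block; its bits say
    whether our node is the left (0) or right (1) child at each level.
    """
    node = tx_hash
    for sibling in branch:
        if index & 1:
            node = dsha256(sibling + node)  # we are the right child
        else:
            node = dsha256(node + sibling)  # we are the left child
        index >>= 1
    return node == root
```

The client only needs the header it already synced plus a few dozen
bytes per tree level, which is why Option B avoids downloading the full
block.
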
Option C:
1. Block headers are synced over the Namecoin P2P network, as well as
full blocks for the past year (meaning that all full blocks that contain
unexpired name data will be synced). (Over clearnet this takes about 10
minutes the first time it runs.)
2. An index mapping names to transactions is constructed as the full
blocks are downloaded. (This uses LevelDB.)
3. When a name lookup request is received, the client looks up the
transaction in the LevelDB index.
4. The client retrieves the value of the name from that transaction, and
returns it to the user.
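
The index in Option C can be pictured roughly like this. The Python
sketch uses an in-memory dict as a stand-in for LevelDB and stores the
name's value directly rather than the whole transaction. The class and
method names are hypothetical, and the exact expiration boundary
(whether a name lapses at an age of exactly 36000 blocks or one block
later) is an assumption.

```python
from typing import Dict, List, Optional, Tuple

EXPIRATION_BLOCKS = 36000  # Namecoin expiration period cited in this email


class NameIndex:
    """Maps each name to its most recent update (dict stand-in for LevelDB)."""

    def __init__(self) -> None:
        self._latest: Dict[str, Tuple[int, str]] = {}  # name -> (height, value)

    def add_block(self, height: int, name_updates: List[Tuple[str, str]]) -> None:
        """Record the name updates of one block; call in increasing height order."""
        for name, value in name_updates:
            self._latest[name] = (height, value)  # newer update replaces older

    def lookup(self, name: str, tip_height: int) -> Optional[str]:
        """Return the current value of a name, or None if unknown or expired."""
        entry = self._latest.get(name)
        if entry is None:
            return None
        height, value = entry
        if tip_height - height >= EXPIRATION_BLOCKS:
            return None  # last update is too old, so the name has expired
        return value
```

A lookup touches only local state, which is why this mode is both the
fastest and the one that generates no network traffic per lookup.
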
For Options A and B, if the API server is malicious, it can do any of
the following:
1. Falsely claim that the name doesn't exist.
2. Provide outdated name data that is less than 36000 blocks old (the
expiration period for Namecoin).
(Option C is not vulnerable to either of those attacks.)
If multiple API servers are consulted, and they return different
results, it is easy to tell which is lying (although I haven't
implemented any such logic yet).
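
One way such logic could look (a sketch only, since nothing like this
exists in the client yet): each server's claimed height is checked as in
Option A or B, and the greatest height that verifies wins. A claim that
the name doesn't exist can't be proven, so it stands only if no other
server supplies a verifiable update. The function name and the `verify`
callback are hypothetical.

```python
from typing import Callable, Iterable, Optional


def newest_verified_height(
    claims: Iterable[Optional[int]],
    verify: Callable[[int], bool],
) -> Optional[int]:
    """Pick the most recent name update among several API server answers.

    `claims` holds one entry per server: a block height, or None for
    "name doesn't exist". `verify(height)` stands in for the Option A/B
    check that the block at that height really updates the name.
    """
    best: Optional[int] = None
    for height in claims:
        if height is None:
            continue  # unprovable claim; it stands only if nobody disproves it
        if verify(height) and (best is None or height > best):
            best = height
    return best
```

Any server whose answer was older than the winner, or that denied the
name exists, was either stale or lying. This only helps if at least one
consulted server is honest.
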
The API server cannot do any of the following:
1. Provide name data that isn't from the blockchain with the most work.
2. Provide name data that is more than 36000 blocks old (the expiration
period for Namecoin).
The reason an API server is used in Options A and B instead of the P2P
network is that the P2P network is unauthenticated and easy to Sybil.
The P2P network is great for getting data that is independently
verifiable (e.g. block headers and contents of blocks), but it's unwise
to rely on the P2P network to get unverifiable data such as a block
height of a name. An API server is authenticated (currently via
CA-based TLS, but a cert pin or PGP signing is certainly doable), which
reduces the possible points of attack. This is analogous to why Tor
uses centralized directory authorities -- authenticated trust points are
harder to Sybil.
(We do have longer term plans to introduce a way for SPV clients to get
the latest transaction associated with a name, without using an API
server or needing to download any full blocks, but that's out of scope
of this email.)
Options A and B do reveal to the API server which name is being looked
up. If Option A is used, it also reveals to a P2P peer which block height
is being looked up (which narrows the set of names by a factor of
~36000). Therefore, Tor stream isolation should be used in such cases.
(That's not implemented yet.) Option C doesn't generate any network
traffic on lookups, so it doesn't reveal anything.
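
As a rough idea of what stream isolation could look like on the SPV
client side: Tor keeps streams with different SOCKS5 username/password
pairs apart (the IsolateSOCKSAuth behaviour of SocksPort), so the client
could use distinct random credentials per isolation context. The Python
sketch below only builds the credentials and a proxy URL. The class
name, the choice of isolation key and the default port 9050 are
assumptions, and nothing here is implemented in the actual client.

```python
import secrets
from typing import Dict, Tuple


class IsolationCredentials:
    """Hands out one random SOCKS username/password pair per isolation key."""

    def __init__(self) -> None:
        self._by_key: Dict[str, Tuple[str, str]] = {}

    def for_key(self, isolation_key: str) -> Tuple[str, str]:
        """Return stable credentials for a key, creating them on first use.

        The key is whatever the caller wants kept apart, e.g. the name
        being looked up or an identifier of the originating stream.
        """
        if isolation_key not in self._by_key:
            # Random values, so the credentials themselves leak nothing.
            self._by_key[isolation_key] = (secrets.token_hex(8),
                                           secrets.token_hex(8))
        return self._by_key[isolation_key]

    def proxy_url(self, isolation_key: str,
                  host: str = "127.0.0.1", port: int = 9050) -> str:
        """Build a SOCKS5 proxy URL carrying the key's credentials."""
        user, password = self.for_key(isolation_key)
        return f"socks5h://{user}:{password}@{host}:{port}"
```

Both the API server request and the P2P block request for a lookup would
go through the proxy URL for that lookup's key, so that two lookups
cannot be correlated by sharing a circuit.
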
In my testing, an SPV-based name lookup using Option A takes around 650
milliseconds (over clearnet). The vast majority of this is latency to
the API server (the server I'm testing with is on a low-budget hosting
plan). The portion consisting of a block retrieval over P2P takes
around 98 milliseconds (although it varies by block size). Option C
takes around 4 milliseconds.
The storage overhead of Option C's LevelDB database is around 400 MB
right now, although I believe it's feasible to reduce this significantly.
There are a few options I can think of for integrating this with Tor for
.onion naming. One would be to modify OnioNS to call the Namecoin SPV
client. This would concern me because OnioNS is in C++, which
introduces the risk of memory safety vulnerabilities. Another would be
to use an intermediate proxy like Yawning's or-ctl-filter. A third
option would be to try to get external name resolution implemented in
Tor itself, which I believe Jeff Burdges has suggested in the past. If
Option A or B is used, any solution would need to pass the stream
isolation info to the SPV client.
Integrating this with Tor Browser for TLS certificate validation might
involve a Firefox patch. There are tricks that can be done with the
CertDB and SiteSecurityService XPCOM interfaces that will do the job
without Firefox patches, but XPCOM is being phased out by Mozilla in
favor of WebExtensions, and I'm unaware of any equivalent features in
WebExtensions. (Also, it's unclear to me whether CertDB and
SiteSecurityService would introduce isolation issues -- I can't think of
any obvious attacks, but I haven't thought very hard about it.) I'm
trying to engage with Mozilla to see if we can work out a WebExtensions
feature for this, but nothing conclusive has happened on that front yet.
On the subject of reproducible builds, I've never tried to build Java
code in Gitian, so I'm not certain how difficult it's going to be.
Since Android uses Java, maybe the Guardian Project devs would have some
insight into the best way to do it. One of the Namecoin developers
(Joseph Bisch) is really good with reproducible builds (you probably
know him since he's the author of the Debian guest support in Gitian),
so I'm reasonably confident that a way to do it can be found.
I'd love to hear feedback on all of this.
Cheers,
-Jeremy Rand
Lead Application Engineer of Namecoin