Hi Elias,
On Mar 9, 2016, at 5:31 AM, Elias Rohrer ml@tnull.de wrote:
Hello Marcela! Nice to meet you, too, and thank you for that interesting Paper! It seems I made a lot of wrong assumptions in my last mail, ugh, thats kind of embarrassing..
I really appreciate your interest in our paper and CONIKS, thanks :) No worries, there are a lot of subtle aspects about CONIKS, so it can take some time to fully see and understand them.
Yes, we propose that the identity providers can act as auditors and cross-verify each other in this way, and clients may query any auditor when they are checking for equivocation. However, I should note that we intended this to be the *ideal deployment* (and I apologize if this doesn’t come through clearly enough in the paper). CONIKS also allows third-party auditors whose sole job it is to check that the chain of signed tree roots is linear. In practice, we expect that auditors and identity providers will be separate in most cases (at least until there is a standard for CONIKS protocols and data formats, but we’re not there yet).
It’s not accurate to conflate auditing with being a key server, or say that auditing is a subset of the key server’s functionality. Auditors only need to keep records of identity providers’ signed tree roots, and they never download or record any key directory contents, so it doesn’t really make sense to equip auditors with key server functionality and “turn it off” when you’re not acting as a key server. Key servers have the more complex and rich functionality needing to maintain significantly larger data structures and perform more cryptographic computations. This means that key servers have very different space and availability requirements than auditors do.
Ah, yes, that is true. Just turning of features of the server to get an auditor might indeed be a bad idea. But wouldn't there be a lot of overlapping functionality in the server, auditor and even the clients? E.g. the crypto primitives for building and checking the tree? Maybe it would be viable to create a libCONIKS which could be used by all three components?
So it might seem at first glance that there’s a lot of overlapping functionality between the server, auditor and clients, but once you start implementing each one, it’s not really the case. Think about the different tasks of each component: the server is generating a bunch of digital signatures (e.g. signed tree roots, lookup indices) and hashes (Merkle tree, lookup indices, signed tree roots) to build the Merkle trees and hash chain of signed tree roots. Clients, on the other hand, do a bunch of signature and hash verifications (which, yes to be fair requires hashing data as well, but for different reasons) to verify individual authentication paths and signed tree roots. So there’s really no overlap in functionality with the server. Similarly, the auditors need to verify the signed tree roots so there’s some signature and hash verification as well, but in a different way than clients need to do this.
While generation and verification of these various crypto primitives go hand in hand, I would caution against putting them all in a single library if different components of the system will need different portions. If you look at the reference implementation, the only common code is the definition of the data formats. When I first started writing the reference implementation, I actually thought that it would be useful (and possible) to build a single libCONIKS, but sharing underlying crypto primitives isn’t reason enough to put all of the functionality in one place since the application of those primitives is what really matters. So I really think it makes more sense to make separate modules for server, client and auditor.
On Page 4 the authors state: "We can implement a VUF using any deterministic, existentially unforgeable signature scheme [42]. The signature scheme must be deterministic or else (in our application) the identity provider could insert multiple bindings for a user at different locations each with a valid authentication path. In practice, this could be a classic RSA signature [51] (using a deterministic padding scheme such as PKCS v. 1.5 [26]) or BLS “short signatures” [9], which provide the best space efficiency. Many common discrete-log based signature schemes such as DSA [30] are not immediately applicable as they require a random nonce. In CONIKS, we generate the index for a user u as: i = h (sign_K (u)) where h() is a one-way hash function and sign() is a deterministic, existentially unforgeable signature scheme. The extra hash function is added because some bits of sign_K (u) might otherwise leak through binding validity proofs, and signature schemes are generally not unforgeable given some bits of the valid signature. The hash function prevents this by ensuring the final lookup index leaks no information about the value of sign_K (u).”
I understand that using the signature of the user identity ensures that no information indicating its existence on the server is leaked. But why are they hashing the signature again? RSA-signatures normally already consist of a signed hash-value. Why would it bad to leak bits of the signature?
The answer to this question is in the portion of the paper you quote: "signature schemes are generally not unforgeable given some bits of the valid signature.” There’s some crypto-jargon here, which you may not be familiar with if you haven't taken a crypto course. What this sentence is trying to say is that, given some bits of a valid signature, an adversary could be able to find some message or data and a corresponding digital signature which is valid and which has not been generated by the legitimate signer (in this case the identity provider). This is called an existential forgery, and it means that an adversary can create forged lookup indices if the signature isn’t hashed. The hash on top of the signature also ensures that the lookup index can’t be easily computed from the VUF.
I actually had a crypto course some time ago, so crypto jargon should be fine.. However it seems that part of my brain needs some more training to get reactivated ;) I want to get this into my brain so here are some more (probably pretty naive) questions and thoughts on the topic:
- When bits of a valid signature leak, an external attacker could forge a colliding index for a different user? This would allow an attacker to register the same public key to a different username they control? So therefore a hash is used?
So the problem with leaked bits isn’t about an attacker forging an index. The problem is actually about privacy. Let’s go back and think about why the server transforms usernames into these lookup indices. The main purpose is so that the proofs of inclusion (i.e. the Merkle authentication paths) for one user don’t leak information about *any other* users in the tree. So if I receive the proof of inclusion for Alice, I should not be able to figure out who else might be in the tree. Remember that the proof of inclusion gives us the prefix of Alice’s lookup index, so in other words, we want to make sure that an attacker can’t find a name that results in an index with a shared prefix with Alice since this would be a privacy violation. If you use just a hash function, an attacker can reasonably hash a large set of usernames until it finds one that shares its prefix with Alice's index and determine whether this user exists by querying the server for the corresponding key.
Using only the digital signature doesn’t solve this problem since there are attacks that allow you to recover the rest of the signature bits given some of the first bits (I don’t know which specific attacks do this, this is the explanation I received from my collaborator). So given Alice’s index prefix, an attacker could try to guess if the index for a specific user (e.g. Bob) shares the same prefix, recover the remaining bits of the signature, and verify — using the provider's index public verification key -- whether this is a valid signature for Bob’s name. This is also a privacy violation. So by hashing the signature, anyone who receives the authentication path for Alice cannot determine based on this path, which other usernames exist in the tree since this would require finding the signatures whose hash shares a prefix with Alice’s index, a much more difficult problem.
- What still confuses me: When I recall correctly, 'existential forgery' is the lowest level of forgery since the attacker may not have control over the contents of the signed message, but may forge a valid signature for that message.
You propose to use a existentially unforgeable signature scheme for your VUF. But:
- As you explicitly use an existentially unforgeable signature scheme, why can the index then be forged if bits of this (existentially unforgeable) signature are leaked to an attacker? Is the hash needed when an external attacker has no knowledge of the private key used to sign the index?
- You later propose RSA for the VUF, but isn't RSA *the* example for which existential forgery can be done?
We realized the potential dangers of using RSA, so we designed a different VUF scheme. I think it’s best if you read the new paper to see the description.
- So am I right to assume that the attacker model shifts from an external one to a malicious identity provider? As he knows his private key, the signed data and the signature, what is the hash function hiding here?
See my explanation of the VUF above. CONIKS tries to protect against external *and* internal attackers. As I explained above, the hash function in the private lookup index is used to protect users’ privacy from external attackers. Like you said, the key server knows can know who all has name-to-key mappings in its own key directory. CONIKS deals with the internal attackers (i.e. malicious or compromised key server) using the tamper-evident data structure and signed tree roots to leave evidence of fake keys and equivocation.
Also, if I see this correctly, they don't do it in their reference implementation (https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_s... https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java<https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_s... https://github.com/coniks-sys/coniks-ref-implementation/blob/master/coniks_server/src/org/coniks/coniks_server/SignatureOps.java>), instead they just use SHA256withRSA... So, am I misreading the paper? Or do they actually mean sign_K(h(u)), which would be the normal 'hash and sign' operation of RSA? Or does it make sense to rehash the signature and their own implementation is not consistent with it? This is probably no important point, however I want to make sure I'm not missing a piece concerning the security of the user identities.
Unfortunately, the reference implementation is still a work in progress (although I’ve had to take some time off from working on it due to other research projects), and is missing several features described in the paper. The SignatureOps class in the reference implementation simply implements wrappers around Java signature functions, so that’s not where the user lookup index calculation happens. The lookup index calculation happens in the unameToIndex() function in conks_server.ServerUtils. Now, I will note that the reference implementation currently *isn’t* doing the proper index calculation as it only currently hashes the username, so thank you for making me go back and double-check my code! But it would be relative straight-forward to change this and to add the digital signature computation that needs to be done before the hashing.
Ok, good to know that, I'll keep it in mind when reading the reference code! However, this is embarrassing, I only had a glance an the reference implementation, sorry for linking to an unrelated spot in the code and making assumptions...
I think my next steps will be to read the peer-reviewed version of your Paper and to refresh my crypto-knowledge a bit. I'd appreciate if you had some pointers to resources relevant to this project for me (other than classic textbooks). I may even try to have a look at the Micali/Rabin paper on VRFs.
As for good pointers for this project, I’m not sure that the Micali/Rabin paper on VRFs specifically is really going to be all that helpful for implementing CONIKS because the paper describes VRFs at a level of detail that isn’t needed to understand why it may be better to use an EC-based VUF construction than RSA. I apologize about the confusion arising from the older version of our paper
If you’re interested in the very formal crypto details of VRFs, you should read the paper, but for this project it’s more important to understand the security and privacy problems that CONIKS tries to solve and how it achieves those goals, and to understand how CONIKS fits into the workflow of the existing Tor Messenger service. Hopefully, our Usenix paper will explain most of the security and privacy design decisions, but definitely feel free to ask more questions! Another good resource might be to read about equivocation and fork consistency (David Mazieres’ paper on SUNDR first introduced this concept), and certificate transparency (there’s a good ACM Queue article) for more background on the problems CONIKS tries to solve and the design decisions we made. Goldberg et al’s paper on preventing DNS zone enumeration is a fairly recent paper that also shows how to use VUFs to prevent enumeration of sensitive information, so that might be another good resource. For figuring out how CONIKS can be integrated into Tor Messenger, I would look at the portions of the Tor Messenger code that currently handle encryption keys (e.g. key generation, registration, key lookups), and come up with a plan of how CONIKS will affect the existing workflow.
I hope this helps! Best, Marcela