[reposting this message with permission. It is a reply that I sent to Aaron, where I quoted an email from him about this proposal. Tim and Aaron had additional responses, which I'll let them quote here or not as they think best.]
On Sat, Aug 5, 2017 at 1:38 PM, Aaron Johnson aaron.m.johnson@nrl.navy.mil wrote: [...]
- There are a couple of documents in PrivCount that are missing: the deployment document and the configuration document. These set up things like the identities/public keys of the parties, the planned time of the measurements, the statistics to be computed, and the noise levels to use. These values must be agreed upon by all parties (in some cases, such as disagreement about noise, the security/privacy guarantees could otherwise fail). How do you plan to replace these?
So, I hadn't planned to remove these documents so much as to leave them out of scope for this proposal. Right now, in the code, there's no actual way to configure any of these things.
Thinking aloud:
I think we should engineer that piece by piece. We already have the consensus directory system as a way to communicate information that needs to be securely updated, and where everybody needs to update at once, so I'd like to reuse that to the extent that it's appropriate.
For some parts of it, I think we can use versions and named sets. For other parts, we want to be flexible, so that we can rotate keys frequently, react to tally reporters going offline, and so on. There may need to be more than one distribution mechanism for this metainfo.
These decisions will also be application-dependent: I've been thinking mainly of "always-on" applications, like network metrics, performance measurement, anomaly-detection [*], and so on. But I am probably under-engineering for "time-limited" applications like short-term research experiments.
- I believe that instead of dealing with Tally Reporter (TR) failures using multiple subsets, you could instead simply use (t,n) secret sharing, which would survive any t-1 failures (but also allow any subset of size t to determine the individual DC counts). The DC would create one blinding value B and then use Shamir secret sharing to send a share of B to each TR. To aggregate, each TR would first add together its shares, which would yield a share of the sum of the blinding values from all DCs. Then the TRs could simply reconstruct that sum publicly, which, when subtracted from the public blinded noisy counts, would reveal the final noisy sum. This would be more efficient than having each TR publish multiple potential inputs to different subsets of TRs.
So, I might have misunderstood the purpose here: I thought that the instances were to handle misbehaving DCs as well as malfunctioning TRs.
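To make sure I'm reading the (t,n) suggestion the same way you intend it, here's a rough Python sketch of the aggregation flow it describes. This is purely illustrative, not PrivCount code: the field size, the share_blinding()/reconstruct() helpers, the threshold, and the example counts are all stand-ins of my own.

import secrets

PRIME = 2**127 - 1  # assumed field size, large enough for the counter sums

def _eval_poly(coeffs, x):
    # Evaluate a polynomial (coeffs[0] is the constant term) at x mod PRIME.
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % PRIME
    return acc

def share_blinding(blinding, t, n):
    # Split one blinding value into n Shamir shares; any t can reconstruct it.
    coeffs = [blinding] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    return [(x, _eval_poly(coeffs, x)) for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0 recovers the shared secret.
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return total

t, n = 3, 5                      # threshold and number of TRs (made-up values)
dc_counts = [12, 40, 7]          # hypothetical noisy per-DC counters
tr_share_sums = [0] * n          # each TR's running sum of received shares
blinded_total = 0

# Each DC blinds its noisy counter and Shamir-shares the blinding value.
for count in dc_counts:
    blinding = secrets.randbelow(PRIME)
    blinded_total = (blinded_total + count + blinding) % PRIME
    for idx, (x, y) in enumerate(share_blinding(blinding, t, n)):
        tr_share_sums[idx] = (tr_share_sums[idx] + y) % PRIME

# Any t TRs publish their summed shares; the sum of all blinding values is
# reconstructed publicly and subtracted from the blinded total, revealing
# only the aggregate noisy sum.
subset = [(x + 1, tr_share_sums[x]) for x in range(t)]
blinding_sum = reconstruct(subset)
print((blinded_total - blinding_sum) % PRIME)  # == sum(dc_counts)

As I understand it, the key point is that Shamir sharing is linear: each TR adding its shares gives a share of the summed blinding values, so no individual DC's blinding value ever needs to be reconstructed.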
- Storing at the DC the blinding values encrypted to the TRs seems to violate forward privacy, in that if during the measurement the adversary compromises a DC and then later (even after the final release) compromises the key of a TR, the adversary could determine the state of the DC’s counter at the time of compromise. This also applies to the optimization in Sec. 6, where a shared secret is hashed to produce the blinding values.
Well, the adversary would need to compromise the key of _every_ TR in at least one instance, or they couldn't recover the actual counters.
I guess we could, as in the original design (IIUC), send the encrypted blinding values (or public DH key in sec 6) immediately from the DC when it generates them, and then throw them away client-side. Now the adversary would need to break into all the TRs while they were holding these encrypted blinding values.
Or, almost equivalently, I think we could make the TR public encryption keys only get used for one round. That's good practice in general, and it's a direction I generally like.
And of course, DCs should use forward-secure TLS for talking to the TRs, so that an eavesdropper doesn't learn anything.
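As a concrete (purely illustrative) sketch of that direction: the DC derives each blinding value by hashing an ephemeral DH exchange against a per-round TR public key, sends the ephemeral public key to the TR right away, and keeps neither the ephemeral secret nor the shared secret. The names and field size below are my own inventions; this isn't PrivCount or Tor code.

import hashlib
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import x25519

PRIME = 2**127 - 1  # same assumed field as the counters

def derive_blinding(tr_round_public):
    # Return (blinding value, ephemeral public key bytes to send to the TR).
    eph = x25519.X25519PrivateKey.generate()
    shared = eph.exchange(tr_round_public)
    blinding = int.from_bytes(hashlib.sha256(shared).digest(), "big") % PRIME
    eph_pub = eph.public_key().public_bytes(
        serialization.Encoding.Raw, serialization.PublicFormat.Raw)
    # eph and shared go out of scope here; the DC keeps neither, so it cannot
    # later reconstruct the blinding value even if fully compromised.
    return blinding, eph_pub

# Stand-in for the TR's per-round public key (in reality published by the TR).
tr_round_public = x25519.X25519PrivateKey.generate().public_key()

blinding, eph_pub = derive_blinding(tr_round_public)
counter = blinding % PRIME              # counter starts out blinded
counter = (counter + 42) % PRIME        # events are added as they occur
# eph_pub is sent to the TR immediately (over forward-secure TLS) and then
# forgotten by the DC.

With per-round TR keys on top of this, an adversary would have to compromise a TR's key during the round itself; old keys are useless once deleted.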
[*] One anomaly detection mechanism I've been thinking of is to look at different "protocol-warn" log messages. These log messages indicate that some third party is not complying with the protocol. They're usually logged at info, since there's nothing an operator can do about them, but it would be good for us to get a notification if some of them spike all of a sudden.
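The detection itself could be quite simple, e.g. compare each interval's protocol-warn count to a recent moving average. A toy sketch (nothing like this exists today; the class name and threshold are made up):

from collections import deque

class SpikeDetector:
    def __init__(self, history=24, threshold=3.0):
        self.history = deque(maxlen=history)  # counts from recent intervals
        self.threshold = threshold            # "spike" = threshold x average

    def observe(self, count):
        # Record one interval's protocol-warn count; return True on a spike.
        spike = False
        if self.history:
            avg = sum(self.history) / len(self.history)
            spike = count > self.threshold * max(avg, 1.0)
        self.history.append(count)
        return spike

detector = SpikeDetector()
for count in [3, 2, 4, 3, 2, 30]:   # hypothetical per-hour counts
    if detector.observe(count):
        print("protocol-warn spike:", count)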