[tor-dev] Feedback on obfuscating hidden-service statistics

Tue Nov 25 17:41:35 UTC 2014

"A. Johnson" <aaron.m.johnson at nrl.navy.mil> writes:

> Hello all,
>
> <snip>
>
>> We put in some simple obfuscations in order to not reveal too
>> sensitive data: we multiplied actual values with a random number in
>> [0.9, 1.1] before including those obfuscated values in extra-info
>> descriptors.  Maybe there's something smarter we could do?  Or is this
>> okay?
>
> I actually think that additive rather than multiplicative noise
> (i.e. randomness) makes sense here. Let’s suppose that you would like
> to obscure any individual connection that contains C cells or fewer
> (obscuring extremely and unusually large connections seems hopeless
> but unnecessary). That is, you don’t want the (distribution of the) RP
> cell count from any relay to change by much whether or not C cells are
> removed. The standard differential privacy approach would be to *add*
> noise from the Laplace distribution Lap(\epsilon/C), where \epsilon
> controls how much the statistics *distribution* can multiplicatively
> differ. I’m not saying that we need to add noise exactly from that
> distribution (maybe we weaken the guarantee slightly to get better
> accuracy), but the same idea applies. This would apply the same to
> both large and small relays. You *want* to learn roughly how much RP
> traffic each relay has - you just want to obscure the exact number
> within some tolerance.
>

Hello Aaron,

I posted an initial draft of the proposal here:
https://lists.torproject.org/pipermail/tor-dev/2014-November/007863.html
Any feedback would be awesome.

Specifically, I would be interested in understanding the concept of
additive noise a bit better. As you can see, the proposal draft is
still using multiplicative noise, and if you think that additive is
better we should change it. Unfortunately, I couldn't find any good
resources on the Internet explaining the difference between additive
and multiplicative noise. Could you expand a bit on what you said
above? Or link to a paper that explains more? Or link to some other
system that is doing additive noise (or, even better, to its implementation)?

Thanks!
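
To make the question concrete, here is a minimal sketch of the two
approaches as described in the quoted text above. It is an illustration
under stated assumptions, not the proposal's implementation: the
function names, the `spread`, `c_cells` and `epsilon` parameters, and
the example values are all hypothetical, and the Laplace noise is
parameterized by its scale C/epsilon, which is the usual convention for
the Laplace mechanism.

```python
import math
import random


def multiplicative_noise(value, rng, spread=0.1):
    """Scale value by a uniform factor in [1 - spread, 1 + spread].

    With spread=0.1 this matches the [0.9, 1.1] factor in the quoted text.
    The absolute size of the noise grows with the value itself.
    """
    return value * rng.uniform(1.0 - spread, 1.0 + spread)


def laplace_sample(scale, rng):
    """Draw from Laplace(0, scale).

    The difference of two independent exponentials with mean `scale`
    is Laplace-distributed with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def additive_noise(value, rng, c_cells, epsilon):
    """Add Laplace noise with scale c_cells / epsilon (assumed convention).

    The noise does not depend on the value, so small and large relays
    are obscured by the same absolute amount.
    """
    return value + laplace_sample(c_cells / epsilon, rng)


def laplace_density(x, center, scale):
    """Density at x of a Laplace distribution centered at `center`."""
    return math.exp(-abs(x - center) / scale) / (2.0 * scale)


if __name__ == "__main__":
    rng = random.Random()
    c_cells, epsilon = 500, 0.5  # arbitrary example values
    for true_count in (1_000, 1_000_000):
        m = multiplicative_noise(true_count, rng)
        a = additive_noise(true_count, rng, c_cells, epsilon)
        print(f"true={true_count} multiplicative={m:.0f} additive={a:.0f}")
```

The design difference shows up when C cells are removed from a small
relay. With multiplicative noise, a true count of 1000 can only be
reported as something in [900, 1100], while a true count of 500 can
only be reported in [450, 550], so a single published value tells the
two cases apart. With additive Laplace noise every output is possible
in both cases, and the ratio of their likelihoods is bounded by
e^epsilon.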