[tor-bugs] #22422 [Core Tor/Tor]: Add noise to PaddingStatistics

Tue Jun 6 01:16:58 UTC 2017

#22422: Add noise to PaddingStatistics
--------------------------+------------------------------------
 Reporter:  teor          |          Owner:
     Type:  defect        |         Status:  new
 Priority:  High          |      Milestone:  Tor: 0.3.1.x-final
Component:  Core Tor/Tor  |        Version:  Tor: 0.3.1.1-alpha
 Severity:  Normal        |     Resolution:
 Keywords:                |  Actual Points:
Parent ID:                |         Points:  0.5
 Reviewer:                |        Sponsor:
--------------------------+------------------------------------

Comment (by teor):

 Replying to [comment:2 mikeperry]:
 > Karsten and I discussed this about a year ago, and came to the
 conclusion that rounding to 10k cells was sufficient, especially since
 these counts are accumulated over a full 24 hour period. Relays are
 already reporting higher resolution for BW read and write history, and
 relays that opt in have higher resolution for cell statistics too.

 Then we should (eventually) fix these higher resolution statistics by
 adding noise to them too.

 > Is there a specific thing we're worried about with the current numbers?

 We are not adding noise, so we are relying on the other user activity
 being variable enough to hide an individual user's activity. There's no
 guarantee that will happen.

 Here's one possible attack:

 1. I want to detect the padding being used by a particular client, to see
 if it is connecting to a particular guard. I know the likely padding
 amount for this client.

 2. I have some high-resolution figures available that have no noise added
 (for example, BW read and write history). I use these to estimate the
 final padding totals.

 3. I manipulate the final padding totals for the guard to be just below a
 rounding threshold.

 4. If the client connects, the guard reports a figure above the threshold.
 If the client does not, the guard reports a figure below the threshold.

 5. I repeat steps 2-4 until I know with enough certainty whether the
 client is connecting. (This takes time that depends on the variability in
 the system.)

 If I want to enhance this attack, I can use multiple statistics, or reduce
 the amount of variability in the system.
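
 A minimal sketch of steps 3 and 4 above, under stated assumptions: counts
 are rounded up to the next multiple of 10k cells, and the attacker's
 estimate of the other traffic is accurate to within a small margin. All
 names and numbers here (round_up, reported_total, the cell counts) are
 hypothetical and only illustrate the threshold effect. This is not Tor
 code.

```python
BIN = 10_000  # rounding granularity mentioned in the quoted comment


def round_up(count, bin_size=BIN):
    """Round a non-negative integer count up to a multiple of bin_size."""
    return -(-count // bin_size) * bin_size


def reported_total(background, attacker_cells, client_cells):
    """Figure the guard would publish: rounded sum, no noise added."""
    return round_up(background + attacker_cells + client_cells)


# Hypothetical numbers, chosen only for illustration.
background = 123_456     # attacker's estimate of everyone else's padding
client_padding = 500     # likely padding amount for the target client
margin = 100             # allowance for error in the attacker's estimate

# Step 3: push the total to just below the next rounding threshold.
attacker_cells = round_up(background) - margin - background

# Step 4: the published figure differs depending on the client.
absent = reported_total(background, attacker_cells, 0)
present = reported_total(background, attacker_cells, client_padding)

print(absent, present)  # 130000 140000
```

 With no noise, the two cases always land in different bins once the total
 sits close enough to the threshold. In practice the attacker's estimate is
 imperfect, which is why step 5 repeats the measurement.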

 > Can we quantify the additional privacy we'd get from noise vs just
 making the rounding larger? Should we do one, or the other, or both?

 Rounding does not guarantee you any privacy. The larger the rounding
 amount, and the more variability in the system, the less likely any
 particular total will expose a user's activity, but there is always a
 chance that it will.

 (But rounding is really good for grouping similar noisy figures, and
 helping people understand the precision of the data. That's why we should
 do it.)

 You get guaranteed privacy from noise. The larger the noise, the more user
 activity is guaranteed to be hidden, and the longer the period over which
 it stays hidden. You don't have to round to get this guarantee: adding
 noise is enough. You also don't have to rely on any other activity in the
 system to get this guarantee.
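
 A rough sketch of the difference noise makes, using Laplace noise as one
 common choice (the comment above does not name a distribution). The scale,
 the "sensitivity" and "epsilon" values, and the function names are all
 assumptions made for illustration. The counts are the hypothetical ones an
 attacker would arrange to sit either side of a 10k rounding threshold.

```python
import math
import random

BIN = 10_000  # rounding granularity; rounding is applied after the noise


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_report(count, scale, rng, bin_size=BIN):
    """Add noise to the true count, then round up to a multiple of bin_size."""
    noisy = count + laplace_noise(scale, rng)
    return max(0, math.ceil(noisy / bin_size) * bin_size)


def fraction_above(count, threshold, scale, trials=2000, seed=1):
    """Fraction of simulated reports that exceed the threshold."""
    rng = random.Random(seed)
    hits = sum(noisy_report(count, scale, rng) > threshold
               for _ in range(trials))
    return hits / trials


# Assumed values: hide up to 500 cells of activity with epsilon = 0.1.
scale = 500 / 0.1

absent = fraction_above(129_900, 130_000, scale, seed=1)
present = fraction_above(130_400, 130_000, scale, seed=2)
print(absent, present)
```

 Under these assumptions both cases report a figure above the threshold
 roughly half the time, so a single published total says very little about
 whether the client connected. An attacker can still average over many
 reporting periods, which is why the size of the noise determines how much
 activity is hidden and for how long.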

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/22422#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online