[tor-dev] Anonymous Local Count Statistics Using PCSA - GSoC
tschorsch at informatik.hu-berlin.de
Sat Apr 1 11:19:48 UTC 2017
this sounds like an interesting summer project.
Since you are interested in using PCSA, our work on privacy-preserving statistics, which actually develops a privacy-enhanced version of PCSA, might be helpful. We also propose it as a way to collect distributed statistics.
In our HotPETs paper , we sketch the basic idea. In our journal paper , we provide additional details on the algorithm. If you have any questions, just let me know.
> On 30. Mar 2017, at 03:45, samir menon <menon.samir at gmail.com> wrote:
> Hi there!
> I'm Samir, a Computer Science student at Stanford University, with a
> focus in applied cryptography and computer security. This summer, I
> want to work (through GSoC) on computing usage statistics without
> keeping IP addresses in memory (see tickets #7532 and #15469)  .
> Currently, we keep sets of IP's (or hashed IP's) in memory so that we
> can compute the number of unique client connections. This has been
> pointed out as a pretty serious concern, because the IP's themselves
> are sensitive info that we don't want an attacker to acquire, but the
> statistics are relatively valuable.
> As Nick first pointed out in #15469, we can use proven techniques to
> compute these statistics without actually explicitly storing any IP's
> (or IP hashes) in memory. The technique I want to use, "Probabilistic
> Counting with Stochastic Averaging", or PCSA, is relatively
> well-studied, and can provide good estimates (<5% error) of the number
> of unique elements in a time series.
> The basic idea is to count the number of 0's before the least
> significant 1 in every (Jenkins hashed) IP, and then recognize that
> the more unique IP's we encounter, the more likely it is that we see a
> hashed IP with a large number of 0's before the least significant 1.
> (Shoutout to Jaskaran and  for helping me understand this). A more
> detailed explanation and more resources for understanding PCSA are in
> the proposal.
> Here is my draft proposal (also attached, but links don't work):
> I'd love to hear feedback on it - what's feasible, what's most useful,
> and what I should focus on, etc. You can also chat with me about it on
> IRC at `samir2`!
> ~Samir Menon
> menon.samir at gmail.com
> Stanford University, B.S. Computer Science, 2019
>  https://trac.torproject.org/projects/tor/ticket/7532
>  https://trac.torproject.org/projects/tor/ticket/15469
>  https://www.cs.princeton.edu/~rs/talks/AC11-Cardinality.pdf
> tor-dev mailing list
> tor-dev at lists.torproject.org
More information about the tor-dev