Also, I think that counting users by IP is still a fine way to do it (absent the privacy issue that PCSA tries to address). I was just stating that my understanding based on talking to the Tor Metrics people is that the plan is to handle the privacy issue by moving to per-connection country statistics instead of by implementing PCSA.

I also wonder how the privacy of PCSA actually compares to the privacy of per-country (noisy) counting, especially if the local statistics could be stored locally in a differentially-private way (again, this requires an accuracy analysis). As Tschorsch and Scheuermann note [0], the FM sketch used by PCSA can indicate the presence of an individual user (Sec. 4). Thus they propose to add noise by independently flipping some of the PCSA bits (Sec. 5). This seems quite similar to the differentially-private technique of adding noise to a counter. It is not clear to me that it is better to suffer the inaccuracy of the PCSA sketching plus that of the added noise when one could simply rely on adding differentially-private noise, especially when the latter provides a precise notion of privacy where the former does not.
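
To make the noisy-counter alternative concrete, here is a minimal sketch of releasing a per-country count with Laplace noise. The function name, the epsilon value in the usage line, and the sensitivity of 1 (each user changes the count by at most one) are assumptions chosen for illustration; this is not something Tor Metrics has specified.

```python
import random

def noisy_count(true_count, epsilon, sensitivity=1.0, rng=random):
    """Return true_count plus Laplace(0, sensitivity / epsilon) noise.

    Hypothetical helper: epsilon and sensitivity are illustrative parameters.
    """
    scale = sensitivity / epsilon
    # The difference of two independent exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise

# Example: a hypothetical true count of 1000 users, epsilon = 0.5.
print(noisy_count(1000, epsilon=0.5))
```

With this approach the expected absolute error is sensitivity / epsilon regardless of the size of the count, so the accuracy cost of a given privacy level is known up front.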

Best,
Aaron

[0] Florian Tschorsch and Björn Scheuermann, "An algorithm for privacy-preserving distributed user statistics", Computer Networks 57 (2013).

On Apr 2, 2017, at 9:07 AM, Aaron Johnson <aaron.m.johnson@nrl.navy.mil> wrote:

Sorry, I should have been more clear there. Tor Metrics estimates the total number of users by counting the number of directory downloads and dividing by an estimated expected number of directory downloads per user per day (10, I believe). This statistic is in the graph under the “Relay Users” tab on <https://metrics.torproject.org/userstats-relay-country.html>.
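
As a rough sketch of that arithmetic (the function name is hypothetical, and the default of 10 requests per user per day is the figure recalled above, not a verified constant):

```python
def estimate_daily_users(directory_requests, requests_per_user_per_day=10):
    """Estimate users from a daily count of directory downloads.

    The divisor of 10 is an assumption taken from the figure quoted above.
    """
    return directory_requests / requests_per_user_per_day

# Example with a made-up request count.
print(estimate_daily_users(20_000_000))
```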

Best,
Aaron

On Apr 2, 2017, at 8:51 AM, Veer Kalantri <mads.531998@gmail.com> wrote:

about which stats are you talking Aaron?


On Sun, Apr 2, 2017 at 5:45 PM, Aaron Johnson <aaron.m.johnson@nrl.navy.mil> wrote:
> These statistics not just tell about the user's country but also keep a
> track of unique IP addresses connecting from each country. This is
> needed so as to present more realistic stats. If we increment counter on
> any IP address instead of unique IP address then the statistics would
> also reflect  user(s) connecting again and again. If we don't count
> Unique IPs, we would have stats about per country usage rather than per
> country users. We could do much better and implement a way(as described
> by the OP of thread) that counts unique IPs at the same time preserves
> privacy.

It is true that this would count connections rather than unique IPs. However, Tor already infers the number of users by counting directory downloads and then adjusting that number based on how many downloads each user is expected to make. In addition, each user doesn’t necessarily correspond to a different IP because of NAT, and so counting connections may actually be more accurate.
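
A toy illustration of the NAT point, using made-up data: three users each make two connections, and two of them share one public IP. The addresses, user labels, and the figure of two connections per user are all assumptions for the example.

```python
def count_unique_ips(connections):
    """Count distinct source IPs among (ip, user) connection records."""
    return len({ip for ip, _user in connections})

def estimate_users_from_connections(connections, connections_per_user):
    """Estimate users by dividing total connections by an expected per-user rate."""
    return len(connections) / connections_per_user

# Hypothetical records: alice and bob sit behind the same NAT address.
connections = [
    ("203.0.113.5", "alice"), ("203.0.113.5", "alice"),
    ("203.0.113.5", "bob"), ("203.0.113.5", "bob"),
    ("198.51.100.7", "carol"), ("198.51.100.7", "carol"),
]

print(count_unique_ips(connections))                    # unique-IP count
print(estimate_users_from_connections(connections, 2))  # connection-based estimate
```

In this toy case the unique-IP count sees two users while the connection-based estimate recovers three, but the latter is only as good as the assumed per-user connection rate.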

Best,
Aaron
_______________________________________________
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
