Also, I think that counting users by IP is still a fine way to do it (absent the privacy issue that PCSA tries to address). I was only saying that, as I understand it from talking to the Tor Metrics people, the plan is to handle the privacy issue by moving to per-connection country statistics rather than by implementing PCSA.
I would also wonder how the privacy of PCSA actually compares to the privacy of per-country (noisy) counting, especially if the local statistics could be locally stored in a differentially-private way (again, this requires an accuracy analysis). As Tschorsch and Scheuermann note [0], the FM sketch used by PCSA can indicate the presence of an individual user (Sec. 4). Thus they propose to add noise by independently flipping some of the PCSA bits (Sec. 5). This seems quite similar to the differentially-private technique of adding noise to a counter. It is not clear to me that it is better to suffer the inaccuracy of the PCSA sketching plus that of the added noise when one could simply rely on adding differentially-private noise, especially when the latter provides a precise notion of privacy where the former does not.
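To make the "noise on a counter" alternative concrete, here is a minimal sketch of per-country counting with Laplace noise. It is only an illustration under stated assumptions: the function names, the choice of epsilon, and the sensitivity of 1 (each connection changes one country's count by at most 1) are mine, not anything Tor Metrics has specified, and it says nothing about the accuracy analysis that would still be needed.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_country_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Return per-country counts with Laplace noise added to each counter.

    `counts` maps a country code to a connection count. The sensitivity
    of 1 is an assumption: one connection affects one counter by 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return {cc: n + laplace_noise(scale, rng) for cc, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical counts, purely for illustration.
    print(noisy_country_counts({"de": 100, "us": 250}, epsilon=0.5))
```

With this kind of mechanism the error is governed by the single parameter epsilon, whereas the bit-flipping approach adds its perturbation on top of the sketch's own estimation error.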
Best, Aaron
[0] Florian Tschorsch and Björn Scheuermann, "An algorithm for privacy-preserving distributed user statistics", Computer Networks 57 (2013).
On Apr 2, 2017, at 9:07 AM, Aaron Johnson aaron.m.johnson@nrl.navy.mil wrote:
Sorry, I should have been more clear there. Tor Metrics estimates the total number of users by counting the number of directory downloads and dividing by an estimated expected number of directory downloads per user per day (10, I believe). This statistic is in the graph under the “Relay Users” tab on <https://metrics.torproject.org/userstats-relay-country.html>.
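As a rough sketch of that arithmetic (the function name is hypothetical, and the default of 10 downloads per user per day is just the figure I recalled above, not a confirmed constant):

```python
def estimate_daily_users(directory_downloads, downloads_per_user_per_day=10):
    """Estimate users from a daily count of directory downloads.

    The divisor is an assumed average number of downloads each user
    makes per day; 10 is the value recalled in this thread.
    """
    if downloads_per_user_per_day <= 0:
        raise ValueError("downloads_per_user_per_day must be positive")
    return directory_downloads / downloads_per_user_per_day


if __name__ == "__main__":
    # Hypothetical input: 2,000,000 downloads observed in one day.
    print(estimate_daily_users(2_000_000))
```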
Best, Aaron
On Apr 2, 2017, at 8:51 AM, Veer Kalantri <mads.531998@gmail.com> wrote:
Which stats are you talking about, Aaron?
On Sun, Apr 2, 2017 at 5:45 PM, Aaron Johnson <aaron.m.johnson@nrl.navy.mil> wrote:
These statistics not only report the user's country but also keep track of the unique IP addresses connecting from each country. This is needed to present more realistic stats. If we increment the counter for every connection rather than for each unique IP address, the statistics will also reflect users who connect again and again. If we don't count unique IPs, we get stats about per-country usage rather than per-country users. We could do much better and implement a way (as described by the OP of this thread) that counts unique IPs while preserving privacy.
It is true that this would count connections rather than unique IPs. However, Tor already infers the number of users by counting directory downloads and then adjusting that number based on how many each user is expected to make. In addition, each user doesn’t necessarily correspond to a different IP because of NAT, and so counting connections may actually be more accurate.
Best, Aaron
_______________________________________________
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev