Also, I think that counting users by IP is still a fine way to do it (absent the privacy issue that PCSA tries to address). I was just stating that my understanding, based on talking to the Tor Metrics people, is that the plan is to handle the privacy issue by moving to per-connection country statistics rather than by implementing PCSA.
I also wonder how the privacy of PCSA actually compares to the privacy of per-country (noisy) counting, especially if the local statistics could be stored locally in a differentially-private way (again, this requires an accuracy analysis). As Tschorsch and Scheuermann note [0], the FM sketch used by PCSA can indicate the presence of an individual user (Sec. 4). They therefore propose to add noise by independently flipping some of the PCSA bits (Sec. 5). This seems quite similar to the differentially-private technique of adding noise to a counter. It is not clear to me that it is better to suffer the inaccuracy of the PCSA sketching plus that of the added noise when one could simply rely on adding differentially-private noise, especially since the latter provides a precise notion of privacy and the former does not.
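To make the comparison a bit more concrete, here is a minimal Python sketch of the two kinds of noise. It is only an illustration under assumptions of mine: the function names, the epsilon value, and the sensitivity of 1 (each user changes a count by at most one) are hypothetical, and flip_bits is a generic independent bit flip, not necessarily the exact perturbation from [0].

```python
import random

def noisy_count(true_count, epsilon, sensitivity=1.0, rng=random):
    """Add Laplace noise with scale sensitivity/epsilon to a counter.

    Assumes each user changes the count by at most `sensitivity`.
    The difference of two i.i.d. exponential samples with mean `scale`
    is Laplace-distributed with that scale.
    """
    scale = sensitivity / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise

def flip_bits(bits, p, rng=random):
    """Flip each sketch bit independently with probability p.

    Generic illustration only; not the exact scheme of [0].
    """
    return [b ^ 1 if rng.random() < p else b for b in bits]

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical per-country count of 1000 with epsilon = 1.
    print(noisy_count(1000, epsilon=1.0, rng=rng))
    # Hypothetical 8-bit sketch row, each bit flipped with probability 0.1.
    print(flip_bits([1, 1, 1, 0, 1, 0, 0, 0], p=0.1, rng=rng))
```

With the counter, the error is just the Laplace noise (standard deviation sqrt(2)/epsilon in this sketch), whereas with the sketch one has the estimator's own error on top of whatever the flipped bits add.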
Best,
Aaron
[0] Florian Tschorsch and Björn Scheuermann, "An algorithm for privacy-preserving distributed user statistics", Computer Networks 57 (2013).
Sorry, I should have been clearer there. Tor Metrics estimates the total number of users by counting the number of directory downloads and dividing by an estimated expected number of directory downloads per user per day (10, I believe). This statistic is in the graph under the “Relay Users” tab on <https://metrics.torproject.org/userstats-relay-country.html>.
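As a rough illustration of that arithmetic (not the actual Tor Metrics code), a sketch in Python might look like the following. The function name and the sample download count are made up, and the divisor of 10 is just the figure I recalled above.

```python
def estimate_daily_users(directory_downloads, downloads_per_user_per_day=10.0):
    """Estimate users as total directory downloads divided by the
    expected number of downloads per user per day.

    The default of 10 is the recalled figure, not a verified constant.
    """
    return directory_downloads / downloads_per_user_per_day

if __name__ == "__main__":
    # Hypothetical: 20 million directory downloads observed in one day.
    print(estimate_daily_users(20_000_000))
```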
Best,
Aaron