[tor-talk] Tor users from Finland jumped from 25 000 to 200 000

David Fifield david at bamsoftware.com
Fri Jan 14 06:58:22 UTC 2022

On Fri, Jan 14, 2022 at 05:38:14AM +0200, Markus Ottela via tor-talk wrote:
> The creation of the Onion Service uses tempfile to create a temporary
> directory each time a new Onion Service is spun up, but as per the log
> files, there were only 25 Onion Services created during that time.

Restarting tor multiple times with a fresh tempdir each time would make
you appear as multiple clients. If you ran 25 copies of the script, then
you would be counted as 25 clients, since each instance of tor would be
making its own separate directory requests. But I don't know if that's
enough to explain the large effect on the estimated number of users. It
depends on how often the scripts were restarting tor. The user counts
are built on the assumption that a tor client makes a directory request
every 144 minutes, on average. If the script restarted tor more
frequently than that, it would be counted as more than one client. But
to count as even 10,000 clients, each of the 25 script instances would
have to be restarting tor every 144*60*25 / 10000 = 21.6 seconds on
average.

At least, that's according to my understanding of how it works.
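
The arithmetic above can be written out as a rough sketch. It assumes,
as in the paragraph above, one directory request per client every 144
minutes on average, and additionally assumes that each restart of tor
with a fresh directory results in exactly one counted request. The
function names are made up for illustration; this is not how the
metrics code itself is written.

```python
# Assumption from the message: a tor client makes one directory request
# every 144 minutes on average.
REQUEST_INTERVAL_SECONDS = 144 * 60


def apparent_clients(instances, restart_interval_seconds):
    """Rough number of clients that `instances` script instances would be
    counted as, if every restart triggers one directory request
    (an assumption, not a measured fact)."""
    if restart_interval_seconds >= REQUEST_INTERVAL_SECONDS:
        # Restarting less often than a normal client refreshes its
        # directory information: each instance looks like one client.
        return float(instances)
    return instances * REQUEST_INTERVAL_SECONDS / restart_interval_seconds


def restart_interval_for(target_clients, instances):
    """Average seconds between restarts needed for `instances` script
    instances to be counted as `target_clients` clients."""
    return instances * REQUEST_INTERVAL_SECONDS / target_clients


print(restart_interval_for(10_000, 25))  # 21.6
```

Under these assumptions, 25 instances restarting less often than once
per 144 minutes would simply count as 25 clients.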

> As for the client side, a new requests session was created for each
> connection*. I assumed Tor would keep a tunnel open to one guard node, and
> that each new session/connection would pass through it.

I don't think the number of requests sessions matters for this. The user
count estimates do not depend on the number of streams, number of
circuits, or anything like that, as far as I know.

> *In hindsight, I should've only done the GET requests inside the loop.
> Here's the script I was running:
> https://gist.github.com/maqp/0e5dcf542ebb97baf98d198115e931ea
> Markus
> On 13.1.2022 20.34, David Fifield wrote:
> > On Thu, Jan 13, 2022 at 06:09:24PM +0200, Markus Ottela via tor-talk wrote:
> > > I've been experiencing weird behavior with Tor + Stem + Flask Onion Services
> > > dying randomly once every 1..5 days. I wrote a script that's making
> > > connections to a test Onion Service to see when exactly the servers
> > > disappear -- and creating logs based on that. The system spins up a new
> > > requests client instance for each connection, so those might be what's
> > > appearing on the graph. I'm just puzzled why they'd appear as different
> > > users, given that the public IP has remained static. (Also the script
> > > automatically spins up a new Onion Service once it's been down for an hour, so
> > > that could explain the spikes.)
> > > 
> > > Again I'm not sure that's what this is about, but both the start time, and
> > > the most recent major downtime spikes match. I've killed testing, let's see
> > > if it returns to normal; I think there's enough data to open a ticket about
> > > my issue anyway.
> > That's an interesting hypothesis. The user count estimate does not use
> > IP addresses; rather it counts directory requests. See:
> > https://gitweb.torproject.org/metrics-web.git/tree/src/main/resources/doc/users-q-and-a.txt?id=6c2679ec1797976e171a68bbd3d7442a34f0a5d1
> > 
> > > Q: How is it even possible to count users in an anonymity network?
> > > A: We actually don't count users, but we count requests to the
> > > directories that clients make periodically to update their list of
> > > relays and estimate user numbers indirectly from there.
> > > Q: What if a user runs tor on a laptop and changes their IP address a
> > > few times per day?  Don't you overcount that user?
> > > A: No, because that user updates their list of relays as often as a
> > > user that doesn't change IP address over the day.
> > In your experiments, were you starting tor with an empty DataDirectory
> > and a cold directory cache each time (e.g., in a freshly initialized
> > container), or were you reusing the same DataDirectory? The former I
> > would expect to have an effect on estimated users; the latter not.
