Hello,
we recently started discussing further statistics that we could measure about hidden services [0]. We believe that most of the interesting statistics should be collected using a relay-hiding statistics aggregation scheme [1], but we were also thinking about stats that could be collected in the shorter term.
I will now enumerate the stats that Aaron considers interesting and low-hanging-fruit:
(1) Number of descriptor updates (total count and distribution)
(2) Number of RPs established on relays
(3) Number of circuits using TAP and nTor
(4) Number of descriptors with encrypted introduction points
This time I'm going to put extra focus on how to use these statistics and _what questions they help us answer_. If these stats don't help us answer any interesting questions, they are not that useful. Also, this time we should have an *exact strategy* on how to use specific stats to derive the results we want, so that we don't spend 2 months after we write the code to figure out how to do extrapolations.
So here are some first thoughts. For each statistic I also include the preliminary analysis from the stats tech report [2]:
(1) Number of descriptor updates (total count and distribution) (Sec. 4.2.4)
Relays count how many descriptor updates they see per service. Assuming that stats are published daily (which is not necessary), this is going to be a number between 1 and 24 (since RendPostPeriod is currently one hour) and services pick a new directory after 24 hours (see rendcommon.c:get_time_period()).
Benefits: This could reveal overall HS descriptor stability, which reflects the frequency of events causing descriptor updates, such as changing IPs or changing authentication keys. Also, this could reveal client errors or DoS attacks on HSDirs.
Risks: Depending on how many HSes are behind each HSDir, this statistic might or might not reveal uptime information about specific services. Still it doesn’t seem like something we want to risk. Also, if the result is greater than 24, it means that an HS with modded RendPostPeriod was publishing to that HSDir (and that the HSDir doesn’t have many clients). Do we want to reveal that? OTOH, it seems to me that if the directory is serving many services, this statistic doesn’t really provide any insight. In addition, this could be used to reveal the introduction points used by a hidden service (assuming its address is known, but its descriptors are encrypted) by DoSing suspected IPs and observing whether the responsible HSDirs report a higher number of descriptor updates.
[ohmygodel] Obfuscating this number by the maximum amount of descriptor updates per service that we would expect (so, 24) would provide reasonable privacy for the update frequency of any individual service while still providing a sufficiently accurate estimate of the average number of updates. This statistic would also be easily improved via anonymous statistics reporting.
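To make the obfuscation idea above concrete, here is a minimal sketch of how a relay might round and noise its daily descriptor-update count. The sensitivity of 24 comes from the text (one update per hour per service); the epsilon value, the bin size, and all function names are assumptions for illustration, not something specified in this thread.

```python
import random


def round_up_to_bin(value, bin_size):
    """Round a non-negative integer up to the next multiple of bin_size."""
    return -(-value // bin_size) * bin_size


def sample_laplace(scale, rng):
    """Laplace(0, scale) sample: the difference of two i.i.d. exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def obfuscate_count(true_count, sensitivity=24, epsilon=0.3, bin_size=24, rng=None):
    """Return the value a relay would publish instead of true_count.

    sensitivity: the most a single service can add to the count per day
                 (24, since RendPostPeriod is one hour).
    epsilon, bin_size: ASSUMED values, chosen only for illustration.
    """
    rng = rng or random.Random()
    binned = round_up_to_bin(true_count, bin_size)
    noise = sample_laplace(sensitivity / epsilon, rng)
    return binned + round(noise)
```

Note that the published value can be negative when the true count is small. That is expected with additive noise and averages out when many relays' reports are summed.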
I'm not yet convinced this is a useful stat. What is its use and which *questions* would it help us answer?
I'm assuming that we would publish only the total count here, since revealing the exact distribution could leak information about specific hidden services.
Also, this is related to the "Number of unique HSes per HSDir" statistic that we are already doing. This means that we can do the division and arrive at "Average number of descriptor updates per HS". I'm not sure if I like this, since there are *specific* HSes corresponding to each HSDir. Are we sure that there are no edge cases where this can be exploited to learn their uptime? I'm not.
(2) Number of RPs established on relays (Sec. 4.3.3)
Relays report how many ESTABLISH_RENDEZVOUS cells they received.
Benefits: The number of received ESTABLISH_RENDEZVOUS cells indicates how many connection attempts there are by clients to services that are running. This number is different from the number of descriptor fetches, which happen when clients don't know yet whether a service is running, which will be omitted if clients still have a descriptor cached from a previous connection, and which we may not even gather because of privacy concerns. We can easily weight the number of ESTABLISH_RENDEZVOUS cells with the probability of choosing a relay as rendezvous point to estimate the total number of such cells in the network.
Risk: This should be reported with a time delay so that it is unlikely that any existing rendezvous connections are included in the statistic. Otherwise, an adversary could attempt to disrupt existing connections at the RP as a DoS.
[ohmygodel] This seems straightforward and useful to collect, and it is closely related to the number of data cells at an RP, which we are already collecting.
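The weighting step mentioned in the Benefits paragraph can be sketched as follows. This is only one plausible way to do the extrapolation: divide the sum of reported counts by the summed probability that a client picks one of the reporting relays as its RP. The function name and the input format are hypothetical, and the probabilities would have to be derived from consensus weights, which this sketch does not do.

```python
def extrapolate_network_total(reports):
    """Estimate the network-wide number of ESTABLISH_RENDEZVOUS cells.

    reports: iterable of (reported_count, rp_probability) pairs, one per
             reporting relay, where rp_probability is the (assumed known)
             probability that a client chooses that relay as its RP.
    """
    reports = list(reports)
    total_reported = sum(count for count, _ in reports)
    total_probability = sum(prob for _, prob in reports)
    if total_probability <= 0:
        raise ValueError("reporting relays must cover a positive RP probability")
    return total_reported / total_probability


# Three reporting relays that together see about 0.8% of RP selections.
example = [(120, 0.002), (300, 0.005), (45, 0.001)]
estimate = extrapolate_network_total(example)
```

With obfuscated inputs the estimate inherits the noise of every report, so the more relays that report, the tighter the extrapolated total should be.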
OK, I can see how this stat would give us the number of "connection attempts there are by clients to services that are running". Is this a number we are interested in? I guess so maybe.
It seems related to the "How many hidden service clients are there?" question, which is obviously interesting, but it's not exactly there.
This stat seems more innocent to collect for the simple reason that clients pick RPs at random, so they don't correspond to specific hidden services like HSDir stats do.
However, there is again a correlation between "Number of RP cells" and this one. This would give the "Average number of cells per rendezvous circuit". To be honest, this might be a more interesting statistic than "number of RPs established", but it's more closely related to the application data of the hidden service circuit, so it should be analyzed with more scrutiny.
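As a rough sketch of that derived statistic, assuming both inputs are already network-wide extrapolations (the function and parameter names are made up for illustration):

```python
def average_cells_per_rendezvous_circuit(total_rp_cells, total_established_rps):
    """Ratio of extrapolated RP cells to extrapolated established RPs.

    Both inputs may be noisy; a non-positive denominator gives no usable
    estimate, and a negative numerator is clamped to zero.
    """
    if total_established_rps <= 0:
        return None
    return max(total_rp_cells, 0) / total_established_rps
```

Keep in mind that not every established RP ends up carrying a connection, so this is an average per established rendezvous point rather than per completed client connection.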
A relay-hiding statistics aggregation scheme would make this statistic easier to collect.
(3) Number of circuits using TAP and nTor
Older clients (0.2.3.x) would build/extend circuits using TAP, newer clients would use nTor for that. IPs and RPs can report the number of introduction circuits that were built using either of the two methods. More precisely, relays would remember for each circuit how it was built, and as soon as they receive an ESTABLISH_INTRO or ESTABLISH_RENDEZVOUS cell, they increment one of two counters. See ticket 13466 for details.
Benefits: We would learn what fraction of clients and what fraction of services run older tor versions (0.2.3.x or older).
Risks: As tor-0.2.3.x gets less common and only a few hidden services still use it, an adversary would be able to track their introduction points by checking which relays still report TAP clients in their statistics.
[ohmygodel] We should be able to hide the effect of circuits from any one HS or client by adding obfuscation. Having IPs report statistics could reveal unknown IPs, however. Also, why not have all relays report this in addition to RPs? That seems useful and easy.
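The counting logic described above (remember how each circuit was built, then bump a counter on the first ESTABLISH_INTRO or ESTABLISH_RENDEZVOUS cell) could look roughly like this. It is a Python sketch, not tor's C code; the class and field names are hypothetical, and keeping separate TAP/nTor counters per cell type is my assumption about how the "two counters" would be split by role.

```python
from collections import Counter
from dataclasses import dataclass

HS_CELLS = ("ESTABLISH_INTRO", "ESTABLISH_RENDEZVOUS")


@dataclass
class Circuit:
    handshake: str          # "tap" or "ntor", recorded when the circuit was created
    counted: bool = False   # ensures each circuit is counted at most once


class HandshakeStats:
    def __init__(self):
        # Keys are (cell_command, handshake) pairs.
        self.counts = Counter()

    def on_cell(self, circ, command):
        """Call for every cell received on a circuit; ignores non-HS cells."""
        if command not in HS_CELLS or circ.counted:
            return
        circ.counted = True
        self.counts[(command, circ.handshake)] += 1


stats = HandshakeStats()
old_client = Circuit(handshake="tap")
new_service = Circuit(handshake="ntor")
stats.on_cell(old_client, "ESTABLISH_RENDEZVOUS")
stats.on_cell(old_client, "ESTABLISH_RENDEZVOUS")  # repeat cell, not counted again
stats.on_cell(new_service, "ESTABLISH_INTRO")
```

Summing the counters over the handshake type gives plain circuit counts per cell type, so any obfuscation would need to be applied to each counter before it is published.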
This statistic can reveal other information too since it's basically a circuit count. For example, if you count and publish the number of circuits containing ESTABLISH_INTRO, you get the "Number of IPs established on the network" statistic. If you count and publish the number of circuits containing ESTABLISH_RENDEZVOUS, you get the "Number of RPs established on relays" statistic I discussed in the previous section.
Also, why do we care how many hidden services are using older versions of Tor? And why do we care how many clients are using older versions of Tor? Is this to specifically detect botnet activity?
Also, why do this just for hidden services?
(4) Number of descriptors with encrypted introduction points
Relays can look at published hidden-service descriptors and count descriptors with plain-text vs. encrypted introduction point sections.
Benefits: We would learn what fraction of services uses authentication features. This statistic won't be available after implementing rend-spec-ng (proposal 224).
Risks: There is no obvious risk from sharing this number if aggregated over a large enough time period.
[ohmygodel] Both the benefit and risk seem low here. Roger mentioned this statistic as being interesting to him; maybe we should implement it for that reason alone.
This seems like a stat that would answer a very concrete question: "How many hidden services are using authorization currently?".
Answering this question seems useful for evaluating the user base and popularity of this feature.
However, I'm not sure if I want to learn this information at all. People who use hidden service authorization are cautious users, and it seems weird to count them like this. It might be okay if there are 10000 of these hidden services, but if there are only 100, I wouldn't want to out them like this. More thinking required.
[0]: https://lists.torproject.org/pipermail/tor-dev/2015-February/008228.html
[1]: https://lists.torproject.org/pipermail/tor-dev/2015-January/008086.html
[2]: https://people.torproject.org/~karsten/volatile/hidden-service-stats-2015-01...
Hi George,
I’m glad you’re putting serious thought into these stats. I’ll give you my perspective on some of the issues you raise.
> I will now enumerate the stats that Aaron considers interesting and low-hanging-fruit:
I should mention that all of these came from a list that Roger suggested, so you might try to get further thoughts from him.
> This time I'm going to put extra focus on how to use these statistics and _what questions they help us answer_. If these stats don't help us answer any interesting questions, they are not that useful.
I think that overall many statistics are useful just to check for abuse, misconfiguration, or bugs. If the statistic is way out of line of what we would expect, especially when compared to other statistics, then that would reveal an unexpected and potentially problematic behavior.
> Also, this time we should have an *exact strategy* on how to use specific stats to derive the results we want, so that we don't spend 2 months after we write the code to figure out how to do extrapolations.
I agree that it is important to be confident that we can use the data that we collect. Paul and I actually went through many of the desired statistics early on (during the kickoff meeting in mid-September) sketching out how extrapolation would work. I had attached that document to Trac ticket #13509, although it may be hard to understand.
> (1) Number of descriptor updates (total count and distribution) (Sec. 4.2.4)
> ...
> I'm not yet convinced this is a useful stat. What is its use and which *questions* would it help us answer?
In addition to revealing if somebody is sending way too many updates, it would help us understand the general level of churn of hidden services. Are there lots of short-lived services?
> I'm assuming that we would publish only the total count here, since revealing the exact distribution could leak information about specific hidden services.
I believe that the distribution can be revealed to some extent safely. You choose a small number of bins chopping up the possible numbers of updates, and then publish the counts for each bin in the same way that you would publish a single overall count. The details are in the stats tech report.
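A minimal sketch of the binning step, assuming four bins chosen purely for illustration (the tech report may pick different boundaries). Each per-bin count would then be rounded and noised before publication in the same way as a single overall count; that part is left out here.

```python
from bisect import bisect_right


def bin_update_counts(updates_per_service, bin_edges):
    """Count how many services fall into each bin of daily update counts.

    bin_edges: sorted lower bounds; bin i covers
               [bin_edges[i], bin_edges[i + 1]) and the last bin is open-ended.
    """
    counts = [0] * len(bin_edges)
    for n in updates_per_service:
        idx = bisect_right(bin_edges, n) - 1
        if idx < 0:
            raise ValueError(f"value {n} is below the first bin edge")
        counts[idx] += 1
    return counts


# ASSUMED bins: 0-1, 2-12, 13-24, and 25+ updates per day.
edges = [0, 2, 13, 25]
histogram = bin_update_counts([1, 24, 24, 5, 30, 12], edges)
# histogram == [1, 2, 2, 1]
```

Fewer, wider bins mean each published count is larger relative to the noise, at the cost of a coarser picture of the distribution.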
> Also, this is related to the "Number of unique HSes per HSDir" statistic that we are already doing. This means that we can do the division and arrive at "Average number of descriptor updates per HS". I'm not sure if I like this, since there are *specific* HSes corresponding to each HSDir. Are we sure that there are no edge cases where this can be exploited to learn their uptime? I'm not.
I do think that if you know of a specific HS, then you can watch the descriptor update stats from its HSDir over time and gradually learn about how many times that HS updates its descriptors. But if you know of a specific HS, you can do that anyway simply by fetching the descriptors. Thus this doesn’t seem like a problem to me.
> (2) Number of RPs established on relays
> ...
> OK, I can see how this stat would give us the number of "connection attempts there are by clients to services that are running". Is this a number we are interested in? I guess so maybe.
I think this is very interesting. How much traffic tends to flow over a typical HS circuit? Are there a huge number of established RPs relative to the amount of traffic (this could indicate either DoS or botnet clients)? Do clients make lots of little connections or fewer large ones?
> (3) Number of circuits using TAP and nTor
> ...
> This statistic can reveal other information too since it's basically a circuit count. For example, if you count and publish the number of circuits containing ESTABLISH_INTRO, you get the "Number of IPs established on the network" statistic. If you count and publish the number of circuits containing ESTABLISH_RENDEZVOUS, you get the "Number of RPs established on relays" statistic I discussed in the previous section.
Agreed.
> Also, why do we care how many hidden services are using older versions of Tor? And why do we care how many clients are using older versions of Tor? Is this to specifically detect botnet activity?
Roger has mentioned this a couple of times, both in the context of identifying botnet activity. I think more generally, it would be helpful to Tor to understand the distribution of software versions in active use among clients and HSes. This would help them better target upgrading if necessary to improve user security, and it could reveal when older versions are out of use and can be safely end-of-lifed.
> Also, why do this just for hidden services?
It is interesting for HSes in particular because it would help figure out how much HS activity is from botnets. I agree that it is interesting more generally as well.
> (4) Number of descriptors with encrypted introduction points
> ...
> This seems like a stat that would answer a very concrete question: "How many hidden services are using authorization currently?".
> Answering this question seems useful for evaluating the user base and popularity of this feature.
Yes, agreed. Among other things, this could help direct Tor to improve the usability of such a feature.
> However, I'm not sure if I want to learn this information at all. People who use hidden service authorization are cautious users, and it seems weird to count them like this. It might be okay if there are 10000 of these hidden services, but if there are only 100, I wouldn't want to out them like this. More thinking required.
I agree that no individual service should be revealed. That is why we would round and add noise as usual. That would hide the existence of any small number of services (we have used 8 for similar purposes).
Cheers, Aaron