commit 8d61d733dad03a3bace814cfe5972759f9716757 Author: Karsten Loesing karsten.loesing@gmx.net Date: Wed Nov 26 10:38:47 2014 +0100
Import hidden-service-stats report from pad. --- 2015/hidden-service-stats/.gitignore | 3 + 2015/hidden-service-stats/hidden-service-stats.tex | 1016 ++++++++++++++++++++ 2015/hidden-service-stats/tortechrep.cls | 1 + 3 files changed, 1020 insertions(+)
diff --git a/2015/hidden-service-stats/.gitignore b/2015/hidden-service-stats/.gitignore new file mode 100644 index 0000000..2c5e321 --- /dev/null +++ b/2015/hidden-service-stats/.gitignore @@ -0,0 +1,3 @@ +.DS_Store +hidden-service-stats.pdf + diff --git a/2015/hidden-service-stats/hidden-service-stats.tex b/2015/hidden-service-stats/hidden-service-stats.tex new file mode 100644 index 0000000..ef8f46c --- /dev/null +++ b/2015/hidden-service-stats/hidden-service-stats.tex @@ -0,0 +1,1016 @@ +\documentclass{tortechrep} +\usepackage{url} +\usepackage{hyperref} +\usepackage{longtable} + +\begin{document} + +\title{Hidden-service statistics reported by relays} + +\author{David Goulet, George Kadianakis, Karsten Loesing} + +\contact{\href{mailto:dgoulet@torproject.org}{dgoulet@torproject.org},% +\href{mailto:asn@torproject.org}{asn@torproject.org},% +\href{mailto:karsten@torproject.org}{karsten@torproject.org}} + +\reportid{DRAFT} +\date{January XX, 2015} + +\maketitle + +% Text conventions +% - Each sentence ends with a newline, even inside a paragraph. +% - Lines are at most 74 characters wide. +% - Abbreviations are best avoided. +% - Code, cell names, etc. are put inside \verb+...+. + +\begin{abstract} +This document discusses new hidden-service related statistics to be +gathered by relays and reported to the directory authorities in their +extra-info descriptors. +\end{abstract} + +\section{Motivation} + +We have little insight into hidden-service usage in the Tor network. +The statistics discussed in this document shall help us get a basic +understanding of hidden-service usage, improve their performance, find +bugs, etc. + +\section{Design} + +The statistics discussed here can all be gathered by relays taking one of +three possible roles in the rendezvous protocol: as 1) introduction point, +2) rendezvous point, or 3) hidden-service directory. 
+All statistics will be reported by relays to the directory authorities in +their extra-info descriptors, possibly every 24 hours. + +General considerations for gathering hidden-service statistics: + +\begin{itemize} +\item Should we report number and type of failures in the protocol, if +these statistics are not sufficient to actually debug a problem? +Could be a starting point to look at actual logs from relays. +But is this what statistics are for? +\item Should we not report statistics if a relay acted as dir/IPo/RPo for +less than a certain threshold of clients/services? +Can we make sure that an adversary doesn't generate traffic on their own +to push a relay above that threshold and report a tiny number of real +users? +\end{itemize} + +There are no plans for gathering hidden-service statistics on hidden +servers or clients, mostly because there is no data-collecting +infrastructure in place and because privacy implications are even less +clear in the case of single clients or servers reporting statistics than +in the case of relays serving dozens or hundreds of hidden services and +their clients. + +Note: there is an evaluation in the next section that can lead to +positive/negative/neutral recommendations for actually proposing and +implementing statistics. + +\subsection{Statistics from relays acting as introduction points} + +The following statistics are related to relays acting as introduction +points. +These cover (1) services establishing introduction points +(\verb+ESTABLISH_INTRO+ cell) and (2) clients sending introductions to +introduction points (\verb+INTRODUCE1+ cell).% and 3) the server +%responding to the introduction point (the server does not respond to the +%introduction point). 
+ +\subsubsection{Statistics on hidden services establishing introduction +points} + +\paragraph{Number of attempts to establish an introduction point (1.1.1.)} + +\subparagraph{Details} + +A relay counts how many \verb+ESTABLISH_INTRO+ cells it receives during +the statistics interval. + +\subparagraph{Benefits} + +We could validate that we have a ``uniform'' random distribution among +chosen introduction points in the network. +If not, there might be a problem. + +\subparagraph{Risks} +Assuming that introduction points are chosen with good randomness, +meaning that every relay has the same chance to be picked, there are no +obvious risks in sharing this. +If not, we still don't see a real risk in an attacker learning that a +specific relay got chosen X times instead of the measured average Y. + +\paragraph{Time from establishing a circuit to becoming an introduction +point (1.1.2.)} + +% (the following distinction cannot be made, AFAIK. here's what happens: +% we receive a CREATE (?) cell from another relay that establishes the +% circuit to us, and then we receive an ESTABLISH_INTRO cell. if the time +% difference between those two events is small, we can guess that the +% client built a new circuit for using us as introduction point, or that +% she extended an existing circuit by one hop to do so. if the time +% difference is more than, say, a second, we can guess that the client +% created this circuit a while ago and only recently decided to +% cannibalize it and use it as introduction circuit. statistics would +% tell us what fraction of circuits is newly built and what was +% cannibalized; well, allow guesses about the two cases.) + +\subparagraph{Details} + +A relay measures the time difference between a circuit extension from the +previous relay in the circuit and receiving an \verb+ESTABLISH_INTRO+ cell. 
+A very small time difference implies that the circuit was built/extended +specifically for use as introduction point, whereas a larger time +difference hints to the hidden service re-using a pre-built circuit for +the introduction point. + +% [dgoulet]: "if the time difference between those two events is small, we +% can guess that the client built a new circuit for using us as +% introduction point, or that she extended an existing circuit by one hop +% to do so" -- I think it should be that if the time difference is small, +% one can assume a cannibalized circuit else a new circuit. +% [karsten]: it might be that we're thinking of different things when we +% say cannibalizing a circuit. here are three possible cases, from the +% perspective of a client/service: +% - create, extend, extend, ............. wait ................., +% establish introduction point -> large time difference observed by +% last relay +% - create, extend, extend, ............. wait ........., extend, +% establish introduction point -> small time difference observed by +% last relay +% - create, extend, extend, establish introduction point +% -> small time difference observed by last relay +% which of these do you call cannibalized? (I'm not sure what the code +% says here.) +% [dgoulet]: To be honest not sure which one is right but what I can see +% from the code is that a client circuit can be cannibalized for the +% introducing part which is extended with an extra hop. Now the way I see +% it is that there is probably a noticable time difference between using +% an already created circuit for which we simply extend one hop versus +% establishing a new one of 4 hops. + + +% Newly established circuit. +% Benefits: Performance reason, this can be useful to know the real cost +% (on average) of becoming an IP. Can lead to understanding bottle necks +% across the network or maybe identify relay that are misbehaving. +% Risks: That can be tricky. 
Having a specific time frame on a circuit +% establishment can maybe lead to some traffic correlation. +% Cannibalized circuit. +% Benefits: Performance reason, we can see if cannibalizing a circuit is +% actually a gain from a new one. This value also could tell us what's the +% fraction of circuit that are cannibalized and the net performance gain +% of that which could lead to maybe better heuristic on choosing/creating +% circuit to be cannibalized. +% Risks: Also tricky. That info could tell us clearly if the IP circuit is +% on a new or already established circuit which changes the traffic +% timing. Not sure how useful it is to an attacker though. + +\subparagraph{Benefits} + +We would learn what fraction of introduction points can be established on +short notice using pre-built circuits vs. first having to build or extend +circuits. +This is something we would ideally measure on hidden services, but given +that we don't have statistics from those, measuring this on introduction +points seems like a fine workaround. + +% Both of these stats should probably report an average and a variance +% instead of a <timestamp> + <circ. creation time>, that would be a +% disaster. (yes, please, no data about single events; that wouldn't fit +% into descriptors anyway, and it would reveal far too much detail.) +% I really wonder if an attacker could use this average to partition part +% of the network to predict where the circuit can be located? + +\subparagraph{Risks} + +No obvious risks. % only talking about aggregate statistics here, not +% single observations. + +\paragraph{Number of failed attempts to establish an introduction point +(1.1.3.)} + +\subparagraph{Details} + +A relay cannot decline to be an introduction point. +However, an \verb+ESTABLISH_INTRO+ cell might be malformed (wrong public +key, bad signature, etc.). 
+The relay would count the number of declined \verb+ESTABLISH_INTRO+ cells +and report them along with the total number of received +\verb+ESTABLISH_INTRO+ cells. +Or it would report successes and failures, rather than totals and +failures. + +\subparagraph{Benefits} + +Malformed \verb+ESTABLISH_INTRO+ cells indicate either a very bad bug in +the code or a deliberate action (data mangling, unknown attack, DoS, ...). + +% [dgoulet]: After an IRC discussion with arma and asn, I remember that +% this one could be "cool to have" but without more information that we +% can't collect for privacy reasons, this stat would not help at all in +% the end game. The question remains if we should simply keep it or not +% even if right now we don't see an added value? +% [karsten:] right, this is a fine question, not only limited to this +% statistic. I added a new paragraph to the section start for "general +% considerations for gathering hidden-service statistics". + +\subparagraph{Risks} + +No obvious risks. + +\paragraph{Lifetime of introduction circuits (1.1.4.)} + +\subparagraph{Details} + +How long did an introduction circuit last? +Relays would report statistics like mean/median time, variance/IQR, and/or +percentiles here. + +\subparagraph{Benefits} + +The longer introduction circuits last, the better, from a performance +point of view. +If many circuits break after a short time period, that indicates that +services should attempt to make better path-selection decisions for +building introduction circuits. + +\paragraph{Reasons for terminating established introduction points +(1.1.5.)} + +\subparagraph{Details} + +Relays report frequencies of circuit terminations requested by services +vs. different types of failures. + +\subparagraph{Benefits} + +If more than a small percentage of terminations are failures, we would +need to decide how to make things more robust. + +\subparagraph{Risks} + +No obvious risks. + +\paragraph{Number of introduction circuits built with TAP vs. 
nTor +(1.1.6.)} + +\subparagraph{Details} + +Older tor versions (0.2.3.x or older) build/extend circuits using TAP, +newer versions use nTor for that. +Relays can report the number of introduction circuits that were built +using either of the two methods. +More precisely, relays would remember for each circuit how it was built, +and as soon as they receive an \verb+ESTABLISH_INTRO+ cell, they increment +one of two counters. +See ticket 13466 for details. + +\subparagraph{Benefits} + +We would learn what fraction of hidden services run older tor versions +(0.2.3.x or older). + +\subsubsection{Statistics on clients connecting to introduction points} + +\paragraph{Total number of introductions received from clients (1.2.1.)} + +\subparagraph{Details} + +Relays report how many \verb+INTRODUCE1+ cells they received from clients. + +\subparagraph{Benefits} + +Each \verb+INTRODUCE1+ cell indicates that a client is in fact trying to +reach a hidden service, so the number of cells could give us a rough +estimate of how many clients are actually connecting to and using hidden +services. + +\subparagraph{Risks} + +Unclear. +On the one hand, this is basically the same risk as reporting the number +of times a relay is picked as an introduction point. +On the other hand, an adversary could fetch a hidden-service descriptor, +learn that a particular relay was an introduction point for that service, +and then see the relay receive many \verb+INTRODUCE1+ cells. +Basically, this statistic could be used to learn how many connection +requests a very popular hidden service gets. + +% [dgoulet]: I think, after discussing it with Nick, that this might be OK +% if the relay reports this stat for a lot of HS meaning the relay has at +% least been an IP for multiple HS thus this stat can't be correlated to +% one specific HS. Now, the period here can be difficult to get right. +% RendPostPeriod is at 1 hour but is the HS actually changes the IP set +% every upload period? 
If yes, that means that over let's say 24 hours, that +% INTRODUCE1 cell could potentially more than one HS which seems to me ok. +% Might be still dicey if wrongly implemented. +% [karsten]: this is also a fine question, not limited to this statistic; +% which is why I moved it to the section start, too. but I'm unclear what +% this has to do with RendPostPeriod. servers don't create a new set of +% introduction points every hour, AFAIK. +% [dgoulet]: No they don't, I confirmed in the code. + +\paragraph{Number of introductions received by established introduction +point (1.2.2.)} + +\subparagraph{Details} + +Relays can serve as introduction point for an arbitrary number of hidden +services. +Relays could report statistics (like percentiles) on received +\verb+INTRODUCE1+ cells per introduction circuit. + +\subparagraph{Benefits} + +This statistic would tell us something about usage diversity of hidden +services. +A special case would be the number or fraction of established introduction +points that never see a single \verb+INTRODUCE1+ cell. +It's unclear what we'd do with this information, though. + +\paragraph{Number of discarded client introductions by reason (1.2.3.)} + +\subparagraph{Details} + +How many \verb+INTRODUCE1+ cells have been discarded because of unknown +service/malformed (?)/whatever-can-go-wrong, by introduction point? + +\subparagraph{Benefits} + +Anything exceeding a small portion of discarded \verb+INTRODUCE1+ cells +shows either a very bad bug in the code or a deliberate action (data +mangling, unknown attack, DoS, ...). + +% [dgoulet]: That is again a "cool to have" stat but not sure how it would +% help us investigate. It can I guess trigger an alarm but apart from +% that... +% [karsten]: right, see section start. + +\subparagraph{Risks} + +No obvious risks. 
+More precisely, if absolute numbers are reported, the risk is the same as +the risk of reporting the number of received \verb+INTRODUCE1+ cells; if +only fractions are reported, the risk is lower. + +\subparagraph{Recommendation} + +\paragraph{Time between establishing introduction point and receiving the +first client introduction (1.2.4.)} + +\subparagraph{Details} + +Relays report the time between receiving the \verb+ESTABLISH_INTRO+ cell +and receiving the first \verb+INTRODUCE1+ cell. + +\subparagraph{Benefits} + +This statistic tells us how long it takes for the hidden service to +include a relay in its descriptor and publish that descriptor, and for the +first client to fetch that descriptor and use that relay for its +introduction. +This may not be very useful, but is listed here for completeness. +% [dgoulet]: That would basically leak the RendPostPeriod (if IP changes +% at each upload) of the HS. Not sure how an attacker could use that to +% his/her advantage but to consider. +% [karsten]: again, I think you're wrong about introduction points +% changing at each upload. +% [dgoulet]: Yup, IP do *NOT* change at each upload. + +\subparagraph{Risks} + +No obvious risks. + +\paragraph{Number of client introductions coming in via circuits built +with TAP vs. nTor (1.2.5.)} + +\subparagraph{Details} + +Relays remember whether an incoming circuit was built using TAP or nTor. +Whenever they receive an \verb+INTRODUCE1+ cell they increment a counter +for either TAP or nTor. +See ticket 13466 for details. + +\subparagraph{Benefits} + +We would learn what fraction of hidden-service clients run older tor +versions (0.2.3.x or older). + +%3) Stats about the server responding to the introduction point. +% (this does not happen in the protocol) +% +% - How many INTRODUCE2 replayed cell we've observed? +% This can actually be an active attack or a client sending multiple +% INTRODUCE1 cell via different introduction points. 
+% - Benefits: Could give us an idea of how many client are misbehaving +% (actually need to confirm if the tor client can send multiple +% INTRODUCE1 for the same service). +% - Risks: No obvious risks. +% +% Note that the amount of valid INTRODUCE2 cell seen should correspond +% to the amount of RP circuit launched (might be a cannibalized one). +% So, having that stat could be useful +% to again simply correlate that stat with the RP amout stat. Finding +% out that there is a discrepancy could help us narrow down performance +% issue. +% +% (removed the following, because receiving INTRODUCE1 triggers an event +% that ends with responding with INTRODUCE_ACK.) +% - How many INTRODUCE_ACK were sent to the client? +% - Benefits: Can be coupled with the how many INTRODUCE1 we've seen and +% look for discrepancy. The difference of intro1 and intro_ack could not +% be explained though without a reason why the HS dropped it or if the +% HS did receive the intro1 at all. So, this stat can be fun to have but +% not really useful for performance tuning I would say. +% - Risks: No obvious risks. + +\subsection{Statistics from relays acting as rendezvous points} + +The following statistics are all related to relays acting as rendezvous +points. +These statistics cover the whole process from (1) clients establishing +rendezvous points, (2) servers connecting to a client's rendezvous point, +and (3) clients creating streams to the server, exchanging data, and +tearing down the circuit. +These phases of the rendezvous protocol are also used to organize the +statistics below. +All statistics focus on the number or timing of cells exchanged in the +rendezvous protocol and underlying OR protocol. + +\subsubsection{Statistics on clients establishing rendezvous points} + +\paragraph{Number of established rendezvous points (2.1.1.)} + +\subparagraph{Details} + +Relays report how many \verb+ESTABLISH_RENDEZVOUS+ cells they received. 
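A count reported by a single relay only covers the rendezvous points
established at that relay.
One way to turn such counts into a network-wide estimate is to divide the
observed number of cells by the probability that clients choose the
reporting relay as rendezvous point.
The following is a minimal sketch under that assumption; the function name
and the shape of its input are hypothetical and not part of tor.

```python
def estimate_network_total(reports):
    """Extrapolate per-relay cell counts to a network-wide estimate.

    `reports` is an iterable of (observed_cells, selection_probability)
    pairs, one per reporting relay.  The selection probability is assumed
    to be known from elsewhere (for example, derived from consensus
    weights); this sketch does not compute it.
    """
    total_cells = 0
    total_probability = 0.0
    for observed_cells, selection_probability in reports:
        if observed_cells < 0 or not 0.0 < selection_probability <= 1.0:
            raise ValueError("invalid report")
        total_cells += observed_cells
        total_probability += selection_probability
    if total_probability == 0.0:
        raise ValueError("no reports to extrapolate from")
    # Reporting relays together cover `total_probability` of all
    # rendezvous-point selections, so scale the observed sum accordingly.
    return total_cells / total_probability
```

Summing counts and probabilities before dividing keeps relays with a very
small selection probability from dominating the estimate.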
+ +\subparagraph{Benefits} + +The number of received \verb+ESTABLISH_RENDEZVOUS+ cells indicates how +many connection attempts there are by clients to services that are +running. +This number is different from the number of descriptor fetches which +happen when clients don't know yet whether a service is running, which +will be omitted if clients still have a descriptor cached from a previous +connection, and which we may not even gather because of privacy concerns. +We can easily extrapolate from the number of \verb+ESTABLISH_RENDEZVOUS+ +cells reported by a relay, using the probability of choosing that relay +as rendezvous point, to estimate the total number of such cells in the +network. + +\subparagraph{Risks} + +There is no obvious risk from sharing this number if aggregated over a +large enough time period. + +\paragraph{Time from circuit creation to establishing rendezvous point +(2.1.2.)} + +\subparagraph{Details} + +Relays report statistics on the time from circuit creation to receiving +an \verb+ESTABLISH_RENDEZVOUS+ cell. + +\subparagraph{Benefits} + +The time from receiving a circuit creation request to seeing the +\verb+ESTABLISH_RENDEZVOUS+ cell can help us optimize the rendezvous +protocol for performance. +The current implementation either builds a new circuit or extends an +existing circuit by one hop before sending the \verb+ESTABLISH_RENDEZVOUS+ +cell. +So, the measured time will be close to zero. +But if we ever decide to re-use existing circuits for rendezvous without +extending them by another hop, this metric will give us an idea of the +adoption of that change. +Admittedly, this benefit is not huge. + +\subparagraph{Risks} + +There is no obvious risk related to this statistic. + +% [dgoulet]: Reporting a mean/average and with maybe a threshold before we +% publish like "we need 100 rdv cell before reporting this stat" ? +% [karsten]: agreed that this may seem useful in general. 
but what if the +% adversary sends 100 cells themselves to help us get past the threshold +% and report a tiny number of actual user cells? but I added an item to +% the section start where we can discuss whether this is a good safeguard +% in general. +% Right well that's a time statistic and not an amount so if an attacker +% would establish 100 RP I guess he/she indeed poisoning the stat?... + +\subparagraph{Recommendation} + +\paragraph{Number of rendezvous point establishment requests coming in via +circuits built with TAP vs. nTor (2.1.3.)} + +\subparagraph{Details} + +Relays remember whether an incoming circuit was built using TAP or nTor. +Whenever they receive an \verb+ESTABLISH_RENDEZVOUS+ cell they increment a +counter for either TAP or nTor. +See ticket 13466 for details. + +\subparagraph{Benefits} + +We would learn what fraction of hidden-service clients run older tor +versions (0.2.3.x or older). + +% How much RP traffic was transfererd through RP circuits? (see below re: +% RELAY cells) + +% Average traffic transfered through RP circuits? (see below re: RELAY +% cells) + +\subsubsection{Statistics on servers connecting to a client's rendezvous +point} + +\paragraph{Number of server rendezvous (2.2.1.)} + +\subparagraph{Details} + +Relays report the total number of \verb+RENDEZVOUS1+ cells they receive. + +\subparagraph{Benefits} + +The number of received \verb+RENDEZVOUS1+ cells tells us how many +connection requests are actually accepted by servers. +This number may be lower than the number of \verb+ESTABLISH_RENDEZVOUS+ +cells, because of failures in connection establishment, authentication +failures, or other reasons. + +\subparagraph{Risks} + +There is no obvious risk from this metric, because it's unrelated to any +given client or server. + +% [dgoulet]: Wondering if there is a real benefit here? I guess if we see +% 100 RENDEZVOUS1 and onlye *one* ESTABLISH_RENDEZVOUS, that might signal +% an issue... ? 
+% [karsten]: the idea is that things can go wrong between establishing a +% rendezvous point and the server sending a rendezvous. knowing what +% fraction of established rendezvous are actually used tells us something. +% and I think you mean 1 RENDEZVOUS1 and 100 ESTABLISH_RENDEZVOUS in your +% example. because 100 RENDEZVOUS1 for a single ESTABLISH_RENDEZVOUS +% would for sure look funny. +% [dgoulet]: well either way, it's an issue :)... If the HS sends a big +% amount of RENDEZVOUS1 to Alice's RP for which Alice only created one RP +% (one ESTABLISH_RENDEZVOUS), that's quite an issue (loop that went wrong +% :). + +\paragraph{Time from establishing a rendezvous point to receiving the +server rendezvous (2.2.2.)} + +\subparagraph{Details} + +Relays report the time from receiving an \verb+ESTABLISH_RENDEZVOUS+ cell +to receiving the corresponding \verb+RENDEZVOUS1+ cell. + +\subparagraph{Benefits} + +The time between receiving an \verb+ESTABLISH_RENDEZVOUS+ cell from the +client and the corresponding \verb+RENDEZVOUS1+ cell from the server tells +us a lot about performance of the rendezvous protocol. +The rendezvous point is the only place in the protocol that witnesses +events near the beginning and near the end of the connection establishment +process. +If we ever want to improve the substeps in between, this metric is the +only way to measure effectiveness of improvements in the deployed network. + +\subparagraph{Risks} + +Again, there are no obvious risks from gathering this statistic. + +\paragraph{Number of server rendezvous with unknown rendezvous cookie +(2.2.3.)} + +\subparagraph{Details} + +Relays report the number of \verb+RENDEZVOUS1+ cells with an unknown +rendezvous cookie. + +\subparagraph{Benefits} + +The number of \verb+RENDEZVOUS1+ cells that cannot be matched with a +previously established rendezvous circuit can be interesting for analyzing +problems in the protocol. 
+We might even distinguish between rendezvous cookies that were previously +known to the relay and those that seem entirely unrelated. +The benefit gained from this statistic is not huge, though. + +\subparagraph{Risks} + +No obvious risks. + +\paragraph{Number of server rendezvous coming in via circuits built with +TAP vs. nTor (2.2.4.)} + +\subparagraph{Details} + +Relays remember whether an incoming circuit was built using TAP or nTor. +Whenever they receive a \verb+RENDEZVOUS1+ cell they increment a counter +for either TAP or nTor. +See ticket 13466 for details. + +\subparagraph{Benefits} + +We would learn what fraction of hidden services run older tor versions +(0.2.3.x or older). + +\subsubsection{Statistics on clients creating streams to the server, +exchanging data, and tearing down the circuit} + +\paragraph{Time from server rendezvous to first client data (2.3.1.)} + +\subparagraph{Details} + +Relays report the time from receiving a \verb+RENDEZVOUS1+ cell to seeing +the first \verb+RELAY+ cell sent from the client. + +\subparagraph{Benefits} +The time between receiving a \verb+RENDEZVOUS1+ cell from the server (and +relaying it as \verb+RENDEZVOUS2+ cell to the client) and receiving the +first \verb+RELAY+ cell from the client is another performance indicator +of the protocol. + +\subparagraph{Risks} + +There are no obvious risks from learning the time between these two +substeps in the rendezvous protocol. + +\paragraph{Amount of data sent over connected rendezvous circuits in +either direction (2.3.2.)} + +\subparagraph{Details} + +Relays report the number of \verb+RELAY+ cells sent in either direction. + +\subparagraph{Benefits} + +The number of \verb+RELAY+ cells sent by either client or server can give +us a detailed view on hidden service usage. +In contrast to common Tor usage, there is no point in the rendezvous +protocol where we could count transferred bytes. +The number of cells is the best approximation that we have. 
+In addition to the total number of cells, the number of cells by direction +can indicate how common classical client-server protocols are compared to +peer-to-peer models. +As a special case, we'd want to know what fraction of circuits has zero +\verb+RELAY+ cells, which would indicate a connection problem late in the +process. + +\subparagraph{Risks} + +In contrast to the cells discussed above, \verb+RELAY+ cells contain +actual user content. +The pattern of \verb+RELAY+ cells could also be used to fingerprint a +given server or even client. +While the total number of cells by direction aggregated over a certain +time period should be okay to measure, any statistics going further than +that need closer analysis. + +\paragraph{Time from first client data to tearing down circuit (2.3.3.)} + +\subparagraph{Details} + +Relays report the time from seeing the first \verb+RELAY+ cell sent by the +client to the circuit being torn down by either client or server. + +\subparagraph{Benefits} + +The time from receiving the first \verb+RELAY+ cell to tearing down the +circuit indicates typical session length of hidden service connections. +We'd be able to say whether typical hidden-service connections are rather +short-lived or long-lived. +This information may help us make educated guesses about the type of +applications run over hidden services. +It may also help us improve the selection criteria for rendezvous points. + +\subparagraph{Risks} + +Session length is quite sensitive data that could be correlated with +circuit lifetimes at other places in the network. +Fortunately, the rendezvous point is not specific to any given client or +service, which makes this information slightly less sensitive. +Still, this metric needs further analysis. + +% How many rendezvous requests finally succeeded? +% Opposite: What percentage of the time did the rendezvous fail to happen? +% (rendezvous can fail at different steps. 
one way to count failures is +% to compare number of ESTABLISH_RENDEZVOUS, RENDEZVOUS1, and subsequent +% RELAY cells.) +% How much time did it take to splice the RP circuit? (#13194) (you mean +% time from RENDEZVOUS1 to first RELAY cell?) + +\subsection{Statistics from relays acting as hidden-service directories} + +% HSDirs threat model notes +Hidden Service directories periodically receive HS descriptors from hidden +services. +They cache them, and then serve them to any clients that ask for them. + +Hidden service directories are placed in a hash ring, and each hidden +service picks a slice of hidden service directories from that hash ring. +Given the address of a hidden service, it's easy to learn which +directories are responsible for it. +This makes hidden-service directory statistics dangerous since they can +potentially be matched to specific hidden services. + +Furthermore, each hidden service has 6 directories, and each directory +serves a different set of services. +This means that attackers have 6 different data points per hidden service +every hour that can be used to reduce measurement noise. + +The following statistics are grouped by (1) hidden services publishing +descriptors and (2) clients fetching descriptors from hidden-service +directories. + +\subsubsection{Statistics on hidden services publishing descriptors to +hidden-service directories} + +\paragraph{Number of cached descriptors (3.1.1.)} + +\subparagraph{Details} + +Relays keep a local count of cached hidden-service descriptors. +Every time they add or remove a descriptor to their cache, relays update +their counter and record the time of change. +At the end of the statistics period they calculate statistics like +minimum, maximum, average number of hosted descriptors during the +statistics interval. +(There may be more efficient ways to implement these statistics that avoid +keeping a full history with timestamps, which are not discussed here.) 
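One such approach, given here only as a sketch and not as tor's
implementation, is to keep a running time-weighted sum of the counter
instead of a history of changes.
The class and method names below are hypothetical, and timestamps are
assumed to be seconds that never decrease between calls.

```python
class DescriptorCacheStats:
    """Running minimum, maximum, and time-weighted average of a counter."""

    def __init__(self, start_time, initial_count=0):
        self.start_time = start_time
        self.last_change = start_time
        self.count = initial_count
        self.minimum = initial_count
        self.maximum = initial_count
        # Sum over time of (count * seconds spent at that count).
        self.weighted_sum = 0.0

    def _advance(self, now):
        # Account for the time spent at the current count since last change.
        self.weighted_sum += self.count * (now - self.last_change)
        self.last_change = now

    def update(self, now, delta):
        """Record that `delta` descriptors were added (+) or removed (-)."""
        self._advance(now)
        self.count += delta
        self.minimum = min(self.minimum, self.count)
        self.maximum = max(self.maximum, self.count)

    def summary(self, end_time):
        """Return min, max, and average for the interval ending at end_time."""
        self._advance(end_time)
        elapsed = end_time - self.start_time
        average = self.weighted_sum / elapsed if elapsed > 0 else float(self.count)
        return {"min": self.minimum, "max": self.maximum, "avg": average}
```

With this approach a relay stores a handful of numbers per statistics
interval, regardless of how often descriptors are added or removed.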
+ +\subparagraph{Benefits} + +This is an interesting statistic that would allow us to understand how +widely used hidden services are, and also detect sudden changes in the +number of services (botnets, chat protocols, etc.). +Also, learning the number of hidden services per directory will help us +find bugs in the hash ring code and also understand how loaded directories +are. +Note that when \verb+rend-spec-ng.txt+ gets implemented, it will be harder +for hidden service directories to learn the number of served services +since the descriptor will be encrypted. +However, directories will still be able to approximate the number of +services by checking the number of descriptors received per publishing +period. +If this ever becomes a problem we can imagine publishing fake descriptors +to confuse the directories. + +\subparagraph{Risks} + +Publishing this stat would allow someone who is indexing hidden services +to be able to say ``I have seen 76~\% of all HSes''. +We would really like to avoid having such an enumeration-facilitating +property. +We could be persuaded that with some heavy stats obfuscation (heavier than +the bridge stats obfuscation), this statistic might be plausible. +By statistics obfuscation, we mean obfuscating the numbers so that the +attacker can only say ``I'm somewhere between 60~\% and 75~\% of all +HSes''. +This is a bit related to differential privacy as we understand it, but +much more basic. + +\paragraph{Number of descriptor updates per service (3.1.2.)} + +\subparagraph{Details} + +Relays count how many descriptor updates they see per service. +Assuming that stats are published daily (which is not necessary), this is +going to be a number between 1 and 24, since \verb+RendPostPeriod+ is +currently one hour and services pick a new directory after 24 hours (see +\verb+rendcommon.c:get_time_period()+). 
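As a rough check of the upper bound, and assuming that a service publishes
exactly once per \verb+RendPostPeriod+ to the same directory for the whole
statistics interval, the largest expected number of updates per service is
\[
  n_{\max} = \frac{T_{\mathrm{stats}}}{T_{\mathrm{post}}}
           = \frac{24~\mathrm{h}}{1~\mathrm{h}} = 24,
\]
where $T_{\mathrm{stats}}$ and $T_{\mathrm{post}}$ are symbols introduced
here for the statistics interval and \verb+RendPostPeriod+, respectively.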
+
+\subparagraph{Risks}
+
+Depending on how many hidden services are behind each hidden-service
+directory, this statistic might or might not reveal uptime information
+about specific services.
+Still, it doesn't seem like something we want to risk.
+Also, if the result is greater than 24, it means that a hidden service
+with a modified \verb+RendPostPeriod+ was publishing to that directory
+(and that the directory doesn't have many clients).
+Do we want to reveal that?
+On the other hand, if the directory is serving many services, this
+statistic doesn't really provide any insight.
+
+\paragraph{Size of hidden service descriptors (3.1.3.)}
+
+\subparagraph{Details}
+
+Relays report the total or average size of received hidden-service
+descriptors.
+
+\subparagraph{Benefits}
+
+These statistics are not very helpful if reported by directories that
+serve many services.
+Any bugs or irregularities of one service will be smoothed out by all the
+other services.
+Basically, the only thing we would learn is approximately how much disk
+space descriptors take, and maybe the average number of contained
+introduction points (if we also know the number of services).
+This statistic does not seem very useful.
+
+\paragraph{Number of introduction points contained in descriptors
+(3.1.4.)}
+
+\subparagraph{Details}
+
+Relays report the average number of introduction points contained in
+hidden-service descriptors, possibly also percentiles.
+
+\subparagraph{Benefits}
+
+It would be interesting to know whether services deviate from the default
+number of introduction points.
+However, it is unclear what we would do with this information.
+This statistic will also become unavailable once \verb+rend-spec-ng.txt+
+is implemented.
+
+\paragraph{Number of descriptors with encrypted introduction points
+(3.1.5.)}
+
+\subparagraph{Details}
+
+Relays can look at published hidden-service descriptors and count
+descriptors with plain-text vs. encrypted introduction point sections.
+
+\subparagraph{Benefits}
+
+We would learn what fraction of services uses authentication features.
+This statistic won't be available after implementing
+\verb+rend-spec-ng.txt+.
+
+\paragraph{Number of descriptors published over circuits built with TAP
+vs. nTor (3.1.6.)}
+
+\subparagraph{Details}
+
+Relays remember whether an incoming circuit was built using TAP or nTor.
+Whenever they receive a descriptor publication request, they increment a
+counter for either TAP or nTor.
+See ticket 13466 for details.
+
+\subparagraph{Benefits}
+
+We would learn what fraction of hidden services run older tor versions
+(0.2.3.x or older).
+
+\paragraph{Number of descriptors published to the wrong directory
+(3.1.7.)}
+
+\subparagraph{Details}
+
+A relay reports the number of descriptors published to it that it is not
+responsible for.
+
+\subsubsection{Statistics on clients fetching descriptors from
+hidden-service directories}
+
+\paragraph{Number of descriptor fetch requests (3.2.1.)}
+
+\subparagraph{Details}
+
+A relay reports the total number of descriptor fetch requests, regardless
+of the requested hidden service identity.
+
+\subparagraph{Risks}
+
+An adversary can use this statistic to evaluate the popularity of a
+hidden service.
+An adversary can also use this statistic to detect big changes in the
+number of visitors of popular hidden services.
+Of course, there will be noise in the statistics since multiple services
+correspond to each directory, but the adversary could reduce the noise
+after observing the same service rotating to different directories, and
+also by examining the statistics of all 6 directories that correspond to
+the service.
+This doesn't seem like a problem that is solvable with simple obfuscation
+of statistics, and we suggest not gathering this statistic at all.
+
+\paragraph{Number of descriptor fetch requests by hidden service identity
+(3.2.2.)}
+
+\subparagraph{Details}
+
+Relays report the distribution of descriptor fetch requests to hidden
+service identities.
+
+\paragraph{Number of descriptor fetch requests for non-existent descriptor
+(3.2.3.)}
+
+\subparagraph{Details}
+
+Relays count the number of fetch requests for hidden service identities
+whose descriptors they don't have in their cache.
+We need to enumerate the reasons why a client would ask for the wrong
+descriptor.
+(But how do we find out?)
+For example: a) clock synchronization issues, b) differing views of the
+network, c) ``the hidden service hasn't published recently'', d) ``the
+hidden service is offline for months''.
+
+\subparagraph{Benefits}
+
+This seems like a statistic that could potentially help us find bugs in
+Tor.
+
+\subparagraph{Risks}
+
+This statistic could reveal things that we don't really understand and
+might reveal information about specific services.
+
+\paragraph{Number of descriptors fetched over circuits built with TAP vs.
+nTor (3.2.4.)}
+
+\subparagraph{Details}
+
+Relays remember whether an incoming circuit was built using TAP or nTor.
+Whenever they receive a descriptor fetch request, they increment a counter
+for either TAP or nTor.
+See ticket 13466 for details.
+
+\subparagraph{Benefits}
+
+We would learn what fraction of hidden-service clients run older tor
+versions (0.2.3.x or older).
+
+%- How many HSes is the HSDir hosting descriptors for? (harder to do with
+%rend-spec-ng) (assuming that each HS desc is for one HS, this is already
+%covered above.)
+%- How many updates for the same HS desc did the HSDir see? (already covered
+%above, it seems.)
+
+\section{Evaluation}
+
+Adding new statistics to something as sensitive as hidden services has two
+sides: one side is the benefit from gathering data that can be used to
+improve them, but the other side is potential harm to users.
+The following table assigns points to both benefits and risks.
+Each statistic can earn between 0 and 2 benefit points and between 0 and 2
+(negative) risk points.
+The sum of both points provides us with a priority list from statistics
+that make a lot of sense and don't pose much risk to statistics that are
+mostly useless and at the same time very risky.
+In the table, B stands for benefit points, R for risk points, and S for
+their sum.
+
+\begin{longtable}{p{1cm}p{1cm}p{1cm}p{12cm}}
+B & R & S & Statistic \\
+$+$ & $0$ & $+$ & Number of attempts to establish an introduction point
+(1.1.1.) \\
+$0$ & $0$ & $0$ & Time from establishing a circuit to becoming an
+introduction point (1.1.2.) \\
+$+$ & $0$ & $+$ & Number of failed attempts to establish an introduction
+point (1.1.3.) \\
+$+$ & $0$ & $+$ & Lifetime of introduction circuits (1.1.4.) \\
+$+$ & $0$ & $+$ & Reasons for terminating established introduction points
+(1.1.5.) \\
+$+$ & $0$ & $+$ & Number of introduction circuits built with TAP vs. nTor
+(1.1.6.) \\
+$+$ & $-$ & $0$ & Total number of introductions received from clients
+(1.2.1.) \\
+$+$ & $-$ & $0$ & Number of introductions received by established
+introduction point (1.2.2.) \\
+$+$ & $-$ & $0$ & Number of discarded client introductions by reason
+(1.2.3.) \\
+$0$ & $0$ & $0$ & Time between establishing introduction point and receiving
+the first client introduction (1.2.4.) \\
+$+$ & $0$ & $+$ & Number of client introductions coming in via circuits
+built with TAP vs. nTor (1.2.5.) \\
+$+$ & $0$ & $+$ & Number of established rendezvous points (2.1.1.) \\
+$0$ & $0$ & $0$ & Time from circuit creation to establishing rendezvous
+point (2.1.2.) \\
+$+$ & $0$ & $+$ & Number of rendezvous point establishment requests coming
+in via circuits built with TAP vs. nTor (2.1.3.) \\
+$++$ & $0$ & $++$ & Number of server rendezvous (2.2.1.) \\
+$++$ & $0$ & $++$ & Time from establishing a rendezvous point to receiving the
+server rendezvous (2.2.2.) \\
+$+$ & $0$ & $+$ & Number of server rendezvous with unknown rendezvous cookie
+(2.2.3.) \\
+$+$ & $0$ & $+$ & Number of server rendezvous coming in via circuits built
+with TAP vs. nTor (2.2.4.) \\
+$++$ & $0$ & $++$ & Time from server rendezvous to first client data (2.3.1.)
+\\
+$++$ & $-$ & $+$ & Amount of data sent over connected rendezvous circuits in
+either direction (2.3.2.) \\
+$+$ & $-$ & $0$ & Time from first client data to tearing down circuit
+(2.3.3.) \\
+$++$ & $-$ & $+$ & Number of cached descriptors (3.1.1.) \\
+$+$ & $-$ & $0$ & Number of descriptor updates per service (3.1.2.) \\
+$0$ & $0$ & $0$ & Size of hidden service descriptors (3.1.3.) \\
+$+$ & $0$ & $+$ & Number of introduction points contained in descriptors
+(3.1.4.) \\
+$+$ & $0$ & $+$ & Number of descriptors with encrypted introduction points
+(3.1.5.) \\
+$+$ & $0$ & $+$ & Number of descriptors published over circuits built with
+TAP vs. nTor (3.1.6.) \\
+ & & & Number of descriptors published to the wrong directory
+(3.1.7.) \\
+$+$ & $--$ & $-$ & Number of descriptor fetch requests (3.2.1.) \\
+$+$ & $--$ & $-$ & Number of descriptor fetch requests by hidden service
+identity (3.2.2.) \\
+$+$ & $0$ & $+$ & Number of descriptor fetch requests for non-existent
+descriptor (3.2.3.) \\
+$+$ & $0$ & $+$ & Number of descriptors fetched over circuits built with TAP
+vs. nTor (3.2.4.) \\
+\end{longtable}
+
+\end{document}
+
diff --git a/2015/hidden-service-stats/tortechrep.cls b/2015/hidden-service-stats/tortechrep.cls
new file mode 120000
index 0000000..4c24db2
--- /dev/null
+++ b/2015/hidden-service-stats/tortechrep.cls
@@ -0,0 +1 @@
+../../tortechrep.cls
\ No newline at end of file