[tech-reports/master] Import hidden-service-stats report from pad.

17 Jun 2015

commit 8d61d733dad03a3bace814cfe5972759f9716757
Author: Karsten Loesing <karsten.loesing@gmx.net>
Date:   Wed Nov 26 10:38:47 2014 +0100

    Import hidden-service-stats report from pad.
---
 2015/hidden-service-stats/.gitignore               |    3 +
 2015/hidden-service-stats/hidden-service-stats.tex | 1016 ++++++++++++++++++++
 2015/hidden-service-stats/tortechrep.cls           |    1 +
 3 files changed, 1020 insertions(+)

diff --git a/2015/hidden-service-stats/.gitignore b/2015/hidden-service-stats/.gitignore
new file mode 100644
index 0000000..2c5e321
--- /dev/null
+++ b/2015/hidden-service-stats/.gitignore
@@ -0,0 +1,3 @@
+.DS_Store
+hidden-service-stats.pdf
+
diff --git a/2015/hidden-service-stats/hidden-service-stats.tex b/2015/hidden-service-stats/hidden-service-stats.tex
new file mode 100644
index 0000000..ef8f46c
--- /dev/null
+++ b/2015/hidden-service-stats/hidden-service-stats.tex
@@ -0,0 +1,1016 @@
+\documentclass{tortechrep}
+\usepackage{url}
+\usepackage{hyperref}
+\usepackage{longtable}
+
+\begin{document}
+
+\title{Hidden-service statistics reported by relays}
+
+\author{David Goulet, George Kadianakis, Karsten Loesing}
+
+\contact{\href{mailto:dgoulet@torproject.org}{dgoulet@torproject.org},%
+\href{mailto:asn@torproject.org}{asn@torproject.org},%
+\href{mailto:karsten@torproject.org}{karsten@torproject.org}}
+
+\reportid{DRAFT}
+\date{January XX, 2015}
+
+\maketitle
+
+% Text conventions
+%  - Each sentence ends with a newline, even inside a paragraph.
+%  - Lines are at most 74 characters wide.
+%  - Abbreviations are best avoided.
+%  - Code, cell names, etc. are put inside \verb+...+.
+
+\begin{abstract}
+This document discusses new hidden-service related statistics to be
+gathered by relays and reported to the directory authorities in their
+extra-info descriptors.
+\end{abstract}
+
+\section{Motivation}
+
+We have little insight into hidden-service usage in the Tor network.
+The statistics discussed in this document shall help us get a basic
+understanding of hidden-service usage, improve hidden-service
+performance, find bugs, etc.
+
+\section{Design}
+
+The statistics discussed here can all be gathered by relays taking one of
+three possible roles in the rendezvous protocol: as 1) introduction point,
+2) rendezvous point, or 3) hidden-service directory.
+All statistics will be reported by relays to the directory authorities in
+their extra-info descriptors, possibly every 24 hours.
+
+General considerations for gathering hidden-service statistics:
+
+\begin{itemize}
+\item Should we report number and type of failures in the protocol, if
+these statistics are not sufficient to actually debug a problem?
+Could be a starting point to look at actual logs from relays.
+But is this what statistics are for?
+\item Should we not report statistics if a relay acted as dir/IPo/RPo for
+less than a certain threshold of clients/services?
+Can we make sure that an adversary doesn't generate traffic on their own
+to push a relay above that threshold and report a tiny number of real
+users?
+\end{itemize}
+
+There are no plans for gathering hidden-service statistics on hidden
+servers or clients, mostly because there is no data-collecting
+infrastructure in place and because privacy implications are even less
+clear in the case of single clients or servers reporting statistics than
+in the case of relays serving dozens or hundreds of hidden services and
+their clients.
+
+Note: there is an evaluation in the next section that can lead to
+positive/negative/neutral recommendations for actually proposing and
+implementing statistics.
+
+\subsection{Statistics from relays acting as introduction points}
+
+The following statistics are related to relays acting as introduction
+points.
+These cover (1) services establishing introduction points
+(\verb+ESTABLISH_INTRO+ cell) and (2) clients sending introductions to
+introduction points (\verb+INTRODUCE1+ cell).% and 3) the server
+%responding to the introduction point (the server does not respond to the
+%introduction point).
+
+\subsubsection{Statistics on hidden services establishing introduction
+points}
+
+\paragraph{Number of attempts to establish an introduction point (1.1.1.)}
+
+\subparagraph{Details}
+
+A relay counts how many \verb+ESTABLISH_INTRO+ cells it receives during
+the statistics interval.
+
+\subparagraph{Benefits}
+
+We could validate that we have a ``uniform'' random distribution among
+chosen introduction points in the network.
+If not, there might be a problem.
+
+\subparagraph{Risks}
+Assuming good randomness, meaning that every relay has the same chance of
+being picked, there are no obvious risks in sharing this statistic.
+Even otherwise, we don't see a real risk in an attacker learning that a
+specific relay got chosen X times instead of the measured average Y.
+
+\paragraph{Time from establishing a circuit to becoming an introduction
+point (1.1.2.)}
+
+% (the following distinction cannot be made, AFAIK.  here's what happens:
+% we receive a CREATE (?) cell from another relay that establishes the
+% circuit to us, and then we receive an ESTABLISH_INTRO cell.  if the time
+% difference between those two events is small, we can guess that the
+% client built a new circuit for using us as introduction point, or that
+% she extended an existing circuit by one hop to do so.  if the time
+% difference is more than, say, a second, we can guess that the client
+% created this circuit a while ago and only recently decided to
+% cannibalize it and use it as introduction circuit.  statistics would
+% tell us what fraction of circuits is newly built and what was
+% cannibalized; well, allow guesses about the two cases.)
+
+\subparagraph{Details}
+
+A relay measures the time difference between a circuit extension from the
+previous relay in the circuit to receiving an \verb+ESTABLISH_INTRO+ cell.
+A very small time difference implies that the circuit was built/extended
+specifically for use as introduction point, whereas a larger time
+difference hints at the hidden service re-using a pre-built circuit for
+the introduction point.
+
+% [dgoulet]: "if the time difference between those two events is small, we
+% can guess that the client built a new circuit for using us as
+% introduction point, or that she extended an existing circuit by one hop
+% to do so" -- I think it should be that if the time difference is small,
+% one can assume a cannibalized circuit else a new circuit.
+% [karsten]: it might be that we're thinking of different things when we
+% say cannibalizing a circuit.  here are three possible cases, from the
+% perspective of a client/service:
+%   - create, extend, extend, ............. wait .................,
+%   establish introduction point   ->   large time difference observed by
+%   last relay
+%   - create, extend, extend, ............. wait ........., extend,
+%   establish introduction point   ->   small time difference observed by
+%   last relay
+%   - create, extend, extend, establish introduction point
+%   ->   small time difference observed by last relay
+%  which of these do you call cannibalized?  (I'm not sure what the code
+%  says here.)
+% [dgoulet]: To be honest not sure which one is right but what I can see
+% from the code is that a client circuit can be cannibalized for the
+% introducing part which is extended with an extra hop. Now the way I see
+% it is that there is probably a noticeable time difference between using
+% an already created circuit for which we simply extend one hop versus
+% establishing a new one of 4 hops.
+
+
+% Newly established circuit.
+% Benefits: Performance reason, this can be useful to know the real cost
+% (on average) of becoming an IP. Can lead to understanding bottlenecks
+% across the network or maybe identify relay that are misbehaving.
+% Risks: That can be tricky. Having a specific time frame on a circuit
+% establishment can maybe lead to some traffic correlation.
+% Cannibalized circuit.
+% Benefits: Performance reason, we can see if cannibalizing a circuit is
+% actually a gain from a new one. This value also could tell us what's the
+% fraction of circuit that are cannibalized and the net performance gain
+% of that which could lead to maybe better heuristic on choosing/creating
+% circuit to be cannibalized.
+% Risks: Also tricky. That info could tell us clearly if the IP circuit is
+% on a new or already established circuit which changes the traffic
+% timing. Not sure how useful it is to an attacker though.
+
+\subparagraph{Benefits}
+
+We would learn what fraction of introduction points can be established on
+short notice using pre-built circuits vs. first having to build or extend
+circuits.
+This is something we would measure on hidden services, but given that we
+don't have statistics from those, measuring this on introduction points
+seems like a fine workaround.
+
+% Both of these stats should probably report an average and a variance
+% instead of a <timestamp> + <circ. creation time>, that would be a
+% disaster.  (yes, please, no data about single events; that wouldn't fit
+% into descriptors anyway, and it would reveal far too much detail.)
+% I really wonder if an attacker could use this average to partition part
+% of the network to predict where the circuit can be located?
+
+\subparagraph{Risks}
+
+No obvious risks.  % only talking about aggregate statistics here, not
+% single observations.
+
+\paragraph{Number of failed attempts to establish an introduction point
+(1.1.3.)}
+
+\subparagraph{Details}
+
+A relay cannot decline to be an introduction point.
+However, an \verb+ESTABLISH_INTRO+ cell might be malformed (wrong public
+key, bad signature, etc.).
+The relay would count the number of declined \verb+ESTABLISH_INTRO+ cells
+and report them along with the total number of received
+\verb+ESTABLISH_INTRO+ cells.
+Or it would report successes and failures, rather than totals and
+failures.
+
+\subparagraph{Benefits}
+
+Malformed \verb+ESTABLISH_INTRO+ cells indicate either a very bad bug in
+the code or a deliberate action (data mangling, unknown attack, DoS, ...).
+
+% [dgoulet]: After an IRC discussion with arma and asn, I remember that
+% this one could be "cool to have" but without more information that we
+% can't collect for privacy reasons, this stat would not help at all in
+% the end game. The question remains if we should simply keep it or not
+% even if right now we don't see a added value?
+% [karsten:] right, this is a fine question, not only limited to this
+% statistic.  I added a new paragraph to the section start for "general
+% considerations for gathering hidden-service statistics".
+
+\subparagraph{Risks}
+
+No obvious risks.
+
+\paragraph{Lifetime of introduction circuits (1.1.4.)}
+
+\subparagraph{Details}
+
+How long did an introduction circuit last?
+Relays would report statistics like mean/median time, variance/IQR, and/or
+percentiles here.
+
+\subparagraph{Benefits}
+
+The longer introduction circuits last, the better for performance.
+If many circuits break after a short time period, that indicates that
+services should attempt to make better path-selection decisions for
+building introduction circuits.
+
+\paragraph{Reasons for terminating established introduction points
+(1.1.5.)}
+
+\subparagraph{Details}
+
+Relays report frequencies of circuit terminations requested by services
+vs. different types of failures.
+
+\subparagraph{Benefits}
+
+If more than a small percentage of terminations are failures, we should
+look into how to make introduction circuits more robust.
+
+\subparagraph{Risks}
+
+No obvious risks.
+
+\paragraph{Number of introduction circuits built with TAP vs. nTor
+(1.1.6.)}
+
+\subparagraph{Details}
+
+Older clients (0.2.3.x or older) would build/extend circuits using TAP,
+newer clients would use nTor for that.
+Relays can report the number of introduction circuits that were built
+using either of the two methods.
+More precisely, relays would remember for each circuit how it was built,
+and as soon as they receive an \verb+ESTABLISH_INTRO+ cell, they increment
+one of two counters.
+See ticket 13466 for details.
+
+\subparagraph{Benefits}
+
+We would learn what fraction of hidden services run older tor versions
+(0.2.3.x or older).
+
+\subsubsection{Statistics on clients connecting to introduction points}
+
+\paragraph{Total number of introductions received from clients (1.2.1.)}
+
+\subparagraph{Details}
+
+Relays report how many \verb+INTRODUCE1+ cells they received from clients.
+
+\subparagraph{Benefits}
+
+Each such cell indicates that a client is in fact trying to reach a hidden
+service, so the number of cells could give us a rough estimate of how
+many clients are actually connecting to and using hidden services.
+
+\subparagraph{Risks}
+
+Unclear.
+On the one hand, this is basically the same risk as the number of times a
+relay is picked as an introduction point.
+On the other hand, an adversary could fetch a hidden-service descriptor,
+learn that a particular relay was an introduction point for that service,
+and then see the relay receive many \verb+INTRODUCE1+ cells.
+Basically, this statistic could be used to learn how many connection
+requests a very popular hidden service gets.
+
+% [dgoulet]: I think, after discussing it with Nick, that this might be OK
+% if the relay reports this stat for a lot of HS meaning the relay has at
+% least been an IP for multiple HS thus this stat can't be correlate to
+% one specific HS. Now, the period here can be difficult to get right.
+% RendPostPeriod is at 1 hour but is the HS actually changes the IP set
+% every upload period? If yes, that means that over let say 24 hours, that
+% INTRODUCE1 cell could potentially more than one HS which seems to me ok.
+% Might be still dicy if wrongly implemented.
+% [karsten]: this is also a fine question, not limited to this statistic;
+% which is why I moved it to the section start, too.  but I'm unclear what
+% this has to do with RendPostPeriod.  servers don't create a new set of
+% introduction points every hour, AFAIK.
+% [dgoulet]: No they don't, I confirmed in the code.
+
+\paragraph{Number of introductions received by established introduction
+point (1.2.2.)}
+
+\subparagraph{Details}
+
+Relays can serve as introduction point for an arbitrary number of hidden
+services.
+Relays could report statistics (like percentiles) on received
+\verb+INTRODUCE1+ cells per introduction circuit.
+
+\subparagraph{Benefits}
+
+This statistic would tell us something about usage diversity of hidden
+services.
+A special case would be the number or fraction of established introduction
+points that never see a single \verb+INTRODUCE1+ cell.
+It's unclear what we'd do with this information, though.
+
+\paragraph{Number of discarded client introductions by reason (1.2.3.)}
+
+\subparagraph{Details}
+
+How many \verb+INTRODUCE1+ cells have been discarded because of unknown
+service/malformed (?)/whatever-can-go-wrong, by introduction point?
+
+\subparagraph{Benefits}
+
+Anything exceeding a small portion of discarded \verb+INTRODUCE1+ cells
+shows either a very bad bug in the code or a deliberate action (data
+mangling, unknown attack, DoS, ...).
+
+% [dgoulet]: That is again a "cool to have" stat but not sure how it would
+% help us investigate. It can I guess trigger an alarm but apart from
+% that...
+% [karsten]: right, see section start.
+
+\subparagraph{Risks}
+
+No obvious risks.
+More precisely, if absolute numbers are reported, the risk is the same as
+the risk of reporting the number of received \verb+INTRODUCE1+ cells; if
+only fractions are reported, it's not that bad.
+
+\subparagraph{Recommendation}
+
+\paragraph{Time between establishing introduction point and receiving the
+first client introduction (1.2.4.)}
+
+\subparagraph{Details}
+
+Relays report the time between \verb+ESTABLISH_INTRO+ and first
+\verb+INTRODUCE1+ cell.
+
+\subparagraph{Benefits}
+
+This statistic tells us how long it takes for the hidden service to
+include a relay in its descriptor and publish that descriptor, and for the
+first client to fetch that descriptor and use that relay for its
+introduction.
+This may not be very useful, but is listed here for completeness.
+% [dgoulet]: That would basically leak the RendPostPeriod (if IP changes
+% at each upload) of the HS. Not sure how an attacker could use that to
+% his/her advantage but to consider.
+% [karsten]: again, I think you're wrong about introduction points
+% changing at each upload.
+% [dgoulet]: Yup, IP do *NOT* change at each upload.
+
+\subparagraph{Risks}
+
+No obvious risks.
+
+\paragraph{Number of client introductions coming in via circuits built
+with TAP vs. nTor (1.2.5.)}
+
+\subparagraph{Details}
+
+Relays remember whether an incoming circuit was built using TAP or nTor.
+Whenever they receive an \verb+INTRODUCE1+ cell they increment a counter
+for either TAP or nTor.
+See ticket 13466 for details.
+
+\subparagraph{Benefits}
+
+We would learn what fraction of hidden-service clients run older tor
+versions (0.2.3.x or older).
+
+%3) Stats about the server responding to the introduction point.
+%   (this does not happen in the protocol)
+%
+% - How many INTRODUCE2 replayed cell we've observed?
+%   This can actually be an active attack or a client sending multiple
+%   INTRODUCE1 cell via different introduction points.
+%   - Benefits: Could give us an idea of how many client are misbehaving
+%   (actually need to confirm if the tor client can send multiple
+%   INTRODUCE1 for the same service).
+%   - Risks: No obvious risks.
+%
+%   Note that the amount of valid INTRODUCE2 cell seen should correspond
+%   to the amount of RP circuit launched (might be a cannibalized one).
+%   So, having that stat could be useful
+%   to again simply correlate that stat with the RP amount stat. Finding
+%   out that there is a discrepancy could help us narrow down performance
+%   issue.
+%
+% (removed the following, because receiving INTRODUCE1 triggers an event
+% that ends with responding with INTRODUCE_ACK.)
+% - How many INTRODUCE_ACK were sent to the client?
+%   - Benefits: Can be coupled with the how many INTRODUCE1 we've seen and
+%   look for discrepancy. The difference of intro1 and intro_ack could not
+%   be explained though without a reason why the HS dropped it or if the
+%   HS did receive the intro1 at all. So, this stat can be fun to have but
+%   not really useful for performance tuning I would say.
+%   - Risks: No obvious risks.
+
+\subsection{Statistics from relays acting as rendezvous points}
+
+The following statistics are all related to relays acting as rendezvous
+points.
+These statistics cover the whole process from (1) clients establishing
+rendezvous points, (2) servers connecting to a client's rendezvous point,
+and (3) clients creating streams to the server, exchanging data, and
+tearing down the circuit.
+These phases of the rendezvous protocol are also used to organize the
+statistics below.
+All statistics focus on the number or timing of cells exchanged in the
+rendezvous protocol and underlying OR protocol.
+
+\subsubsection{Statistics on clients establishing rendezvous points}
+
+\paragraph{Number of established rendezvous points (2.1.1.)}
+
+\subparagraph{Details}
+
+Relays report how many \verb+ESTABLISH_RENDEZVOUS+ cells they received.
+
+\subparagraph{Benefits}
+
+The number of received \verb+ESTABLISH_RENDEZVOUS+ cells indicates how
+many connection attempts there are by clients to services that are
+running.
+This number is different from the number of descriptor fetches which
+happen when clients don't know yet whether a service is running, which
+will be omitted if clients still have a descriptor cached from a previous
+connection, and which we may not even gather because of privacy concerns.
+We can easily weight the number of \verb+ESTABLISH_RENDEZVOUS+ cells with
+the probability of choosing a relay as rendezvous point to estimate the
+total number of such cells in the network.
+
+\subparagraph{Risks}
+
+There is no obvious risk from sharing this number if aggregated over a
+large enough time period.
+
+\paragraph{Time from circuit creation to establishing rendezvous point
+(2.1.2.)}
+
+\subparagraph{Details}
+
+Relays report statistics on the time from circuit creation to receiving
+an \verb+ESTABLISH_RENDEZVOUS+ cell.
+
+\subparagraph{Benefits}
+
+The time from receiving a circuit creation request to seeing the
+\verb+ESTABLISH_RENDEZVOUS+ cell can help us optimize the rendezvous
+protocol for performance.
+The current implementation either builds a new circuit or extends an
+existing circuit by one hop before sending the \verb+ESTABLISH_RENDEZVOUS+
+cell.
+So, the measured time will be close to zero.
+But if we ever decide to re-use existing circuits for rendezvous without
+extending them by another hop, this metric will give us an idea on the
+adoption of that change.
+Admittedly, this benefit is not huge.
+
+\subparagraph{Risks}
+
+There is no obvious risk related to this statistic.
+
+% [dgoulet]: Reporting a mean/average and with maybe a threshold before we
+% publish like "we need 100 rdv cell before reporting this stat" ?
+% [karsten]: agreed that this may seem useful in general.  but what if the
+% adversary sends 100 cells themselves to help us get past the threshold
+% and report a tiny number of actual user cells?  but I added an item to
+% the section start where we can discuss whether this is a good safeguard
+% in general.
+% Right well that's a time statistic and not an amount so if an attacker
+% would establish 100 RP I guess he/she indeed poisoning the stat?...
+
+\subparagraph{Recommendation}
+
+\paragraph{Number of rendezvous point establishment requests coming in via
+circuits built with TAP vs. nTor (2.1.3.)}
+
+\subparagraph{Details}
+
+Relays remember whether an incoming circuit was built using TAP or nTor.
+Whenever they receive an \verb+ESTABLISH_RENDEZVOUS+ cell they increment a
+counter for either TAP or nTor.
+See ticket 13466 for details.
+
+\subparagraph{Benefits}
+
+We would learn what fraction of hidden-service clients run older tor
+versions (0.2.3.x or older).
+
+% How much RP traffic was transferred through RP circuits?  (see below re:
+% RELAY cells)
+
+% Average traffic transferred through RP circuits?  (see below re: RELAY
+% cells)
+
+\subsubsection{Statistics on servers connecting to a client's rendezvous
+point}
+
+\paragraph{Number of server rendezvous (2.2.1.)}
+
+\subparagraph{Details}
+
+Relays report the total number of \verb+RENDEZVOUS1+ cells they receive.
+
+\subparagraph{Benefits}
+
+The number of received \verb+RENDEZVOUS1+ cells tells us how many
+connection requests are actually accepted by servers.
+This number may be lower than the number of \verb+ESTABLISH_RENDEZVOUS+
+cells, because of failures in connection establishment, authentication
+failures, or other reasons.
+
+\subparagraph{Risks}
+
+There is no obvious risk from this metric, because it's unrelated to any
+given client or server.
+
+% [dgoulet]: Wondering if there is a real benefit here? I guess if we see
+% 100 RENDEZVOUS1 and only *one* ESTABLISH_RENDEZVOUS, that might signal
+% an issue... ?
+% [karsten]: the idea is that things can go wrong between establishing a
+% rendezvous point and the server sending a rendezvous.  knowing what
+% fraction of established rendezvous are actually used tells us something.
+% and I think you mean 1 RENDEZVOUS1 and 100 ESTABLISH_RENDEZVOUS in your
+% example.  because 100 RENDEZVOUS1 for a single ESTABLISH_RENDEZVOUS
+% would for sure look funny.
+% [dgoulet]: well either way, it's an issue :)... If the HS sends a big
+% amount of RENDEZVOUS1 to Alice's RP for which Alice only created one RP
+% (one  ESTABLISH_RENDEZVOUS), that's quite an issue (loop that went wrong
+% :).
+
+\paragraph{Time from establishing a rendezvous point to receiving the
+server rendezvous (2.2.2.)}
+
+\subparagraph{Details}
+
+Relays report the time from receiving an \verb+ESTABLISH_RENDEZVOUS+ cell
+to receiving the corresponding \verb+RENDEZVOUS1+ cell.
+
+\subparagraph{Benefits}
+
+The time between receiving an \verb+ESTABLISH_RENDEZVOUS+ cell from the
+client and the corresponding \verb+RENDEZVOUS1+ cell from the server tells
+us a lot about performance of the rendezvous protocol.
+The rendezvous point is the only place in the protocol that witnesses
+events near the beginning and near the end of the connection establishment
+process.
+If we ever want to improve the substeps in between, this metric is the
+only way to measure effectiveness of improvements in the deployed network.
+
+\subparagraph{Risks}
+
+Again, there are at least no obvious risks from gathering this statistic.
+
+\paragraph{Number of server rendezvous with unknown rendezvous cookie
+(2.2.3.)}
+
+\subparagraph{Details}
+
+Relays report the number of \verb+RENDEZVOUS1+ cells with unknown
+rendezvous cookie.
+
+\subparagraph{Benefits}
+
+The number of \verb+RENDEZVOUS1+ cells that cannot be matched with a
+previously established rendezvous circuit can be interesting for analyzing
+problems in the protocol.
+We might even distinguish between rendezvous cookies that were previously
+known to the relay and those that seem entirely unrelated.
+The benefit gained from this statistic is not huge though.
+
+\subparagraph{Risks}
+
+No obvious risks.
+
+\paragraph{Number of server rendezvous coming in via circuits built with
+TAP vs. nTor (2.2.4.)}
+
+\subparagraph{Details}
+
+Relays remember whether an incoming circuit was built using TAP or nTor.
+Whenever they receive a \verb+RENDEZVOUS1+ cell they increment a counter
+for either TAP or nTor.
+See ticket 13466 for details.
+
+\subparagraph{Benefits}
+
+We would learn what fraction of hidden services run older tor versions
+(0.2.3.x or older).
+
+\subsubsection{Statistics on clients creating streams to the server,
+exchanging data, and tearing down the circuit}
+
+\paragraph{Time from server rendezvous to first client data (2.3.1.)}
+
+\subparagraph{Details}
+
+Relays report the time from receiving a \verb+RENDEZVOUS1+ cell to seeing
+the first \verb+RELAY+ cell sent from the client.
+
+\subparagraph{Benefits}
+The time between receiving a \verb+RENDEZVOUS1+ cell from the server (and
+relaying it as \verb+RENDEZVOUS2+ cell to the client) and receiving the
+first \verb+RELAY+ cell from the client is another performance indicator
+of the protocol.
+
+\subparagraph{Risks}
+
+There are no obvious risks from learning the time between these two
+substeps in the rendezvous protocol.
+
+\paragraph{Amount of data sent over connected rendezvous circuits in
+either direction (2.3.2.)}
+
+\subparagraph{Details}
+
+Relays report the number of \verb+RELAY+ cells sent in either direction.
+
+\subparagraph{Benefits}
+
+The number of \verb+RELAY+ cells sent by either client or server can give
+us a detailed view on hidden service usage.
+In contrast to common Tor usage, there is no point in the rendezvous
+protocol where we could count transferred bytes.
+The number of cells is the best approximation that we have.
+In addition to the total number of cells, the number of cells by direction
+can indicate how common classical client-server protocols are compared to
+peer-to-peer models.
+As a special case, we'd want to know what fraction of circuits has zero
+\verb+RELAY+ cells, which would indicate a connection problem late in the
+process.
+
+\subparagraph{Risks}
+
+In contrast to the cells discussed above, \verb+RELAY+ cells contain
+actual user content.
+The pattern of \verb+RELAY+ cells could also be used to fingerprint a
+given server or even client.
+While total number of cells by direction aggregated over a certain time
+period should be okay to measure, any statistics going further than that
+need closer analysis.
+
+\paragraph{Time from first client data to tearing down circuit (2.3.3.)}
+
+\subparagraph{Details}
+
+Relays report the time from seeing the first \verb+RELAY+ cell sent by the
+client to the circuit being torn down by either client or server.
+
+\subparagraph{Benefits}
+
+The time from receiving the first \verb+RELAY+ cell to tearing down the
+circuit indicates typical session length of hidden service connections.
+We'd be able to say whether typical hidden-service connections are rather
+short-lived or long-lived.
+This information may help us make educated guesses on the type of
+applications run over hidden services.
+It may also help us improve the selection criteria for rendezvous points.
+
+\subparagraph{Risks}
+
+Session length is quite sensitive data that could be correlated with
+circuit lifetimes at other places in the network.
+Fortunately, the rendezvous point is not specific to any given client or
+service, which makes this information slightly less sensitive.
+Still, this metric needs further analysis.
+
+% How many rendezvous requests finally succeeded?
+% Opposite: What percentage of the time did the rendezvous fail to happen?
+% (rendezvous can fail at different steps.  one way to count failures is
+% to compare number of ESTABLISH_RENDEZVOUS, RENDEZVOUS1, and subsequent
+% RELAY cells.)
+% How much time did it take to splice the RP circuit? (#13194)  (you mean
+% time from RENDEZVOUS1 to first RELAY cell?)
+
+\subsection{Statistics from relays acting as hidden-service directories}
+
+% HSDirs threat model notes
+Hidden-service directories periodically receive descriptors from hidden
+services.
+They cache them, and then serve them to any clients that ask for them.
+
+Hidden service directories are placed in a hash ring, and each hidden
+service picks a slice of hidden service directories from that hash ring.
+Given the address of a hidden service, it's easy to learn which
+directories are responsible for it.
+This makes hidden-service directory statistics dangerous since they can
+potentially be matched to specific hidden services.
+
+Furthermore, each hidden service has 6 directories, and each directory
+serves a different set of services.
+This means that attackers have 6 different data points per hidden service
+every hour that can be used to reduce measurement noise.
+
+The following statistics are grouped by (1) hidden services publishing
+descriptors and (2) clients fetching descriptors from hidden-service
+directories.
+
+\subsubsection{Statistics on hidden services publishing descriptors to
+hidden-service directories}
+
+\paragraph{Number of cached descriptors (3.1.1.)}
+
+\subparagraph{Details}
+
+Relays keep a local count of cached hidden-service descriptors.
+Every time they add a descriptor to or remove one from their cache, relays
+update their counter and record the time of change.
+At the end of the statistics period they calculate statistics like
+minimum, maximum, and average number of hosted descriptors during the
+statistics interval.
+(There may be more efficient ways to implement these statistics that avoid
+keeping a full history with timestamps, which are not discussed here.)
+
+\subparagraph{Benefits}
+
+This is an interesting statistic that would allow us to understand how
+widely used hidden services are, and also detect sudden changes in the
+number of services (botnets, chat protocols, etc.).
+Also, learning the number of hidden services per directory will help us
+find bugs in the hash ring code and also understand how loaded directories
+are.
+When \verb+rend-spec-ng.txt+ gets implemented, it will be harder for
+hidden service directories to learn the number of served services since
+the descriptor will be encrypted.
+However, directories will still be able to approximate the number of
+services by checking the number of descriptors received per publishing
+period.
+If this ever becomes a problem we can imagine publishing fake descriptors
+to confuse the directories.
+
+\subparagraph{Risks}
+
+Publishing this statistic would allow someone who is indexing hidden
+services to say ``I have seen 76~\% of all hidden services''.
+We would really like to avoid having such an enumeration-facilitating
+property.
+We could be persuaded that with some heavy statistics obfuscation
+(heavier than the bridge statistics obfuscation), this statistic might be
+viable.
+By statistics obfuscation, we mean obfuscating the numbers so that the
+attacker can only say ``I have seen somewhere between 60~\% and 75~\% of
+all hidden services''.
+This is loosely related to differential privacy as we understand it, but
+much more basic.
+
+\paragraph{Number of descriptor updates per service (3.1.2.)}
+
+\subparagraph{Details}
+
+Relays count how many descriptor updates they see per service.
+Assuming that statistics are published daily (which is not necessary),
+this is going to be a number between 1 and 24, since
+\verb+RendPostPeriod+ is currently one hour and services pick a new
+directory after 24 hours (see \verb+rendcommon.c:get_time_period()+).
+
+\subparagraph{Risks}
+
+Depending on how many hidden services are behind each hidden-service
+directory, this statistic might or might not reveal uptime information
+about specific services.
+Still, it doesn't seem like something we want to risk.
+Also, a result greater than 24 means that a hidden service with a
+modified \verb+RendPostPeriod+ was publishing to that directory (and that
+the directory doesn't have many clients).
+Do we want to reveal that?
+On the other hand, if the directory is serving many services, this
+statistic doesn't really provide any insight.
+
+\paragraph{Size of hidden service descriptors (3.1.3.)}
+
+\subparagraph{Details}
+
+Relays report the total/average size of received hidden service
+descriptors.
+
+\subparagraph{Benefits}
+
+These statistics are not very helpful if reported by directories that
+serve many services.
+Any bugs or irregularities of one service will be smoothed out by all the
+other services.
+Basically, the only thing we would learn is approximately how much disk
+space descriptors take, and maybe the average number of contained
+introduction points (if we also know the number of services).
+This statistic does not seem very useful.
+
+\paragraph{Number of introduction points contained in descriptors
+(3.1.4.)}
+
+\subparagraph{Details}
+
+Relays report the average number of introduction points contained in
+hidden-service descriptors, possibly also percentiles.
+
+\subparagraph{Benefits}
+
+It would be interesting to know whether services deviate from the default
+number of introduction points.
+However, it's unclear what we would do with this information.
+This statistic will no longer be available once rend-spec-ng is
+implemented.
+
+\paragraph{Number of descriptors with encrypted introduction points
+(3.1.5.)}
+
+\subparagraph{Details}
+
+Relays can look at published hidden-service descriptors and count
+descriptors with plain-text vs. encrypted introduction point sections.
+
+\subparagraph{Benefits}
+
+We would learn what fraction of services uses authentication features.
+This statistic won't be available after implementing rend-spec-ng.
+
+\paragraph{Number of descriptors published over circuits built with TAP
+vs. nTor (3.1.6.)}
+
+\subparagraph{Details}
+
+Relays remember whether an incoming circuit was built using TAP or nTor.
+Whenever they receive a descriptor publication request they increment a
+counter for either TAP or nTor.
+See ticket 13466 for details.
+
+\subparagraph{Benefits}
+
+We would learn what fraction of hidden services run older tor versions
+(0.2.3.x or older).
+
+\paragraph{Number of descriptors published to the wrong directory
+(3.1.7.)}
+
+\subparagraph{Details}
+
+A relay reports the number of published descriptors that it is not
+responsible for.
+
+\subsubsection{Statistics on clients fetching descriptors from
+hidden-service directories}
+
+\paragraph{Number of descriptor fetch requests (3.2.1.)}
+
+\subparagraph{Details}
+
+A relay reports the total number of descriptor fetch requests, regardless
+of the requested hidden service identity.
+
+\subparagraph{Risks}
+
+An adversary can use this statistic to evaluate the popularity of a
+hidden service.
+An adversary can also use it to detect big changes in the number of
+visitors of popular hidden services.
+Of course, there will be noise in the statistics since multiple services
+correspond to each directory, but the adversary could reduce the noise
+by observing the same service rotating to different directories, and
+also by examining the statistics of all 6 directories that correspond to
+the service.
+This doesn't seem like a problem that is solvable with simple obfuscation
+of statistics, and we suggest not gathering this statistic at all.
+
+\paragraph{Number of descriptor fetch requests by hidden service identity
+(3.2.2.)}
+
+\subparagraph{Details}
+
+Relays report the distribution of descriptor fetch requests to hidden
+service identities.
+
+\paragraph{Number of descriptor fetch requests for non-existent descriptor
+(3.2.3.)}
+
+\subparagraph{Details}
+
+Relays count the number of fetch requests for hidden service identities
+they don't have in their cache.
+We need to enumerate the reasons why a client would ask for the wrong
+descriptor (but how do we find out?).
+For example: a) clock synchronization issues, b) differing views of the
+network, c) ``the hidden service hasn't published recently'', d) ``the
+hidden service is offline for months''.
+
+\subparagraph{Benefits}
+
+This seems like a statistic that could potentially find bugs in Tor.
+
+\subparagraph{Risks}
+
+This statistic could reveal things that we don't really understand and
+might reveal information about specific services.
+
+\paragraph{Number of descriptors fetched over circuits built with TAP vs.
+nTor (3.2.4.)}
+
+\subparagraph{Details}
+
+Relays remember whether an incoming circuit was built using TAP or nTor.
+Whenever they receive a descriptor fetch request they increment a counter
+for either TAP or nTor.
+See ticket 13466 for details.
+
+\subparagraph{Benefits}
+
+We would learn what fraction of hidden-service clients run older tor
+versions (0.2.3.x or older).
+
+%- How many HSes is the HSDir hosting descriptors for? (harder to do with
+%rend-spec-ng) (assuming that each HS desc is for one HS, this is already
+%covered above.)
+%- How many updates for the same HS desc did the HSDir see? (already covered
+%above, it seems.)
+
+\section{Evaluation}
+
+Adding new statistics to something as sensitive as hidden services has two
+sides: one side is the benefit from gathering data that can be used to
+improve them, but the other side is potential harm to users.
+The following table assigns points to both benefits and risks.
+Each statistic can earn between 0 and 2 benefit points (column B) and
+between 0 and 2 negative risk points (column R).
+The sum of both (column S) provides us with a priority list ranging from
+statistics that make a lot of sense and don't pose much risk to
+statistics that are mostly useless and at the same time very risky.
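+
+As a worked example of this scoring, read $+$ and $++$ as $1$ and $2$
+benefit points and $-$ and $--$ as $-1$ and $-2$ risk points (this
+numeric reading is our interpretation of the symbols).
+A statistic with benefit $++$ and risk $-$ then sums to
+\[
+S = B + R = 2 + (-1) = 1,
+\]
+which the table shows as $+$.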
+
+\begin{longtable}{p{1cm}p{1cm}p{1cm}p{12cm}}
+B    & R    & S    & Statistic \\
+$+$  & $0$  & $+$  & Number of attempts to establish an introduction point
+(1.1.1.) \\
+$0$  & $0$  & $0$  & Time from establishing a circuit to becoming an
+introduction point (1.1.2.) \\
+$+$  & $0$  & $+$  & Number of failed attempts to establish an introduction
+point (1.1.3.) \\
+$+$  & $0$  & $+$  & Lifetime of introduction circuits (1.1.4.) \\
+$+$  & $0$  & $+$  & Reasons for terminating established introduction points
+(1.1.5.) \\
+$+$  & $0$  & $+$  & Number of introduction circuits built with TAP vs. nTor
+(1.1.6.) \\
+$+$  & $-$  & $0$  & Total number of introductions received from clients
+(1.2.1.) \\
+$+$  & $-$  & $0$  & Number of introductions received by established
+introduction point (1.2.2.) \\
+$+$  & $-$  & $0$  & Number of discarded client introductions by reason
+(1.2.3.) \\
+$0$  & $0$  & $0$  & Time between establishing introduction point and receiving
+the first client introduction (1.2.4.) \\
+$+$  & $0$  & $+$  & Number of client introductions coming in via circuits
+built with TAP vs. nTor (1.2.5.) \\
+$+$  & $0$  & $+$  & Number of established rendezvous points (2.1.1.) \\
+$0$  & $0$  & $0$  & Time from circuit creation to establishing rendezvous
+point (2.1.2.) \\
+$+$  & $0$  & $+$  & Number of rendezvous point establishment requests coming
+in via circuits built with TAP vs. nTor (2.1.3.) \\
+$++$ & $0$  & $++$ & Number of server rendezvous (2.2.1.) \\
+$++$ & $0$  & $++$ & Time from establishing a rendezvous point to receiving the
+server rendezvous (2.2.2.) \\
+$+$  & $0$  & $+$  & Number of server rendezvous with unknown rendezvous cookie
+(2.2.3.) \\
+$+$  & $0$  & $+$  & Number of server rendezvous coming in via circuits built
+with TAP vs. nTor (2.2.4.) \\
+$++$ & $0$  & $++$ & Time from server rendezvous to first client data (2.3.1.)
+\\
+$++$ & $-$  & $+$  & Amount of data sent over connected rendezvous circuits in
+either direction (2.3.2.) \\
+$+$  & $-$  & $0$  & Time from first client data to tearing down circuit
+(2.3.3.) \\
+$++$ & $-$  & $+$  & Number of cached descriptors (3.1.1.) \\
+$+$  & $-$  & $0$  & Number of descriptor updates per service (3.1.2.) \\
+$0$  & $0$  & $0$  & Size of hidden service descriptors (3.1.3.) \\
+$+$  & $0$  & $+$  & Number of introduction points contained in descriptors
+(3.1.4.) \\
+$+$  & $0$  & $+$  & Number of descriptors with encrypted introduction points
+(3.1.5.) \\
+$+$  & $0$  & $+$  & Number of descriptors published over circuits built with
+TAP vs. nTor (3.1.6.) \\
+     &      &      & Number of descriptors published to the wrong directory
+(3.1.7.) \\
+$+$  & $--$ & $-$  & Number of descriptor fetch requests (3.2.1.) \\
+$+$  & $--$ & $-$  & Number of descriptor fetch requests by hidden service
+identity (3.2.2.) \\
+$+$  & $0$  & $+$  & Number of descriptor fetch requests for non-existent
+descriptor (3.2.3.) \\
+$+$  & $0$  & $+$  & Number of descriptors fetched over circuits built with TAP
+vs. nTor (3.2.4.) \\
+\end{longtable}
+
+\end{document}
+
diff --git a/2015/hidden-service-stats/tortechrep.cls b/2015/hidden-service-stats/tortechrep.cls
new file mode 120000
index 0000000..4c24db2
--- /dev/null
+++ b/2015/hidden-service-stats/tortechrep.cls
@@ -0,0 +1 @@
+../../tortechrep.cls
\ No newline at end of file

    
