[tor-commits] [tech-reports/master] Rewrite major parts of the hid-serv-stats report.

karsten at torproject.org karsten at torproject.org
Wed Jun 17 18:48:07 UTC 2015


commit ba71cbb86d48b54b3dd10104bdfa12c8b0c17a66
Author: Karsten Loesing <karsten.loesing at gmx.net>
Date:   Sat Nov 29 09:54:35 2014 +0100

    Rewrite major parts of the hid-serv-stats report.
---
 2015/hidden-service-stats/.gitignore               |    1 +
 2015/hidden-service-stats/hidden-service-stats.tex | 1531 ++++++++++----------
 2015/hidden-service-stats/protocol.odg             |  Bin 0 -> 17986 bytes
 2015/hidden-service-stats/protocol.pdf             |  Bin 0 -> 20270 bytes
 4 files changed, 801 insertions(+), 731 deletions(-)

diff --git a/2015/hidden-service-stats/.gitignore b/2015/hidden-service-stats/.gitignore
index 2c5e321..863d44c 100644
--- a/2015/hidden-service-stats/.gitignore
+++ b/2015/hidden-service-stats/.gitignore
@@ -1,3 +1,4 @@
 .DS_Store
 hidden-service-stats.pdf
+*.toc
 
diff --git a/2015/hidden-service-stats/hidden-service-stats.tex b/2015/hidden-service-stats/hidden-service-stats.tex
index 75ce789..7760e83 100644
--- a/2015/hidden-service-stats/hidden-service-stats.tex
+++ b/2015/hidden-service-stats/hidden-service-stats.tex
@@ -1,5 +1,6 @@
 \documentclass{tortechrep}
 \usepackage{url}
+\usepackage{graphicx}
 \usepackage{hyperref}
 \usepackage{longtable}
 
@@ -24,86 +25,335 @@
 %  - Abbreviations are best avoided.
 %  - Code, cell names, etc. are put inside \verb+...+.
 
+\tableofcontents
+
 \begin{abstract}
-This document discusses new hidden-service related statistics to be
-gathered by relays and reported to the directory authorities in their
-extra-info descriptors.
+We have little insight into hidden-service usage in the public Tor
+network.
+In this report we discuss possible statistics that help us understand
+hidden services better and that can be used to make performance
+improvements, find bugs, and possibly even detect attacks.
+We also evaluate whether, and how, statistics could be abused by an
+adversary.
+The main contribution of this report is a comprehensive list of
+hidden-service related statistics with recommendations for or against
+gathering them in the public Tor network.
 \end{abstract}
 
-\section{Motivation}
-
-We have little insight into hidden-service usage in the Tor network.
-The statistics discussed in this document shall help us get a basic
-understanding of hidden-service usage, improve their performance, find
-bugs, etc.
-
-\section{Design}
-
-The statistics discussed here can all be gathered by relays taking one of
-three possible roles in the rendezvous protocol: as 1) introduction point,
-2) rendezvous point, or 3) hidden-service directory.
-All statistics will be reported by relays to the directory authorities in
-their extra-info descriptors, possibly every 24 hours.
-
-General considerations for gathering hidden-service statistics:
-
-\begin{itemize}
-\item Should we report number and type of failures in the protocol, if
-these statistics are not sufficient to actually debug a problem?
-Could be a starting point to look at actual logs from relays.
-But is this what statistics are for?
-\item Should we not report statistics if a relay acted as dir/IPo/RPo for
-less than a certain threshold of clients/services?
-Can we make sure that an adversary doesn't generate traffic on their own
-to push a relay above that threshold and report a tiny number of real
-users?
-\end{itemize}
-
-There are no plans for gathering hidden-service statistics on hidden
-servers or clients, mostly because there is no data-collecting
-infrastructure in place and because privacy implications are even less
-clear in the case of single clients or servers reporting statistics than
-in the case of relays serving dozens or hundreds of hidden services and
-their clients.
-
-Note: there is an evaluation in the next section that can lead to
-positive/negative/neutral recommendations for actually proposing and
-implementing statistics.
-
-\subsection{Statistics from relays acting as introduction points}
-
-The following statistics are related to relays acting as introduction
-points.
-These cover (1) services establishing introduction points
-(\verb+ESTABLISH_INTRO+ cell) and (2) clients sending introductions to
-introduction points (\verb+INTRODUCE1+ cell).% and 3) the server
-%responding to the introduction point (the server does not respond to the
-%introduction point).
-
-\subsubsection{Statistics on hidden services establishing introduction
-points}
-
-\paragraph{Number of attempts to establish an introduction point (1.1.1.)}
+\section{Introduction}
+
+Tor hidden services are a means to provide web services over Tor while
+hiding their physical location.
+Unfortunately (or some would say fortunately), we have little insight into
+hidden-service usage in the public Tor network.
+A basic understanding of hidden-service usage would help us direct efforts
+on hidden-service development.
+These efforts include making performance improvements, finding bugs, and
+possibly even detecting attacks.
+
+In this report we compile a list of hidden-service related statistics and
+evaluate whether and how they could be used to make hidden services
+better.
+% There are no plans for gathering hidden-service statistics on hidden
+% servers or clients, mostly because there is no data-collecting
+% infrastructure in place and because privacy implications are even less
+% clear in the case of single clients or servers reporting statistics than
+% in the case of relays serving dozens or hundreds of hidden services and
+% their clients.
+At the same time we evaluate whether and how these statistics could be
+abused by an adversary by providing them with data from relays they don't
+control.
+The primary purpose of hidden services, to hide their location and the
+location of their users, must not be put at risk.
+Other security properties like their ability to hide their existence, at
+least to some extent, should not be sacrificed for potential improvements.
+As a result we present a list of hidden-service related statistics with
+recommendations for or against deploying them in the public Tor network.
+
+This report is structured as follows:
+the next section gives an overview of the hidden-service protocol with
+special focus on measurement points for hidden-service statistics.
+Section~\ref{sec:criteria} defines evaluation criteria for statistics to
+decide whether they should be considered helpful and/or harmful.
+Section~\ref{sec:list} contains a list of hidden-service related
+statistics, partly with early results obtained from a private Tor network,
+and an evaluation of their helpfulness and harmfulness.
+Section~\ref{sec:recommendation} concludes this report by recommending
+which statistics should be implemented and deployed in the public Tor
+network.
+
+\section{Hidden-service protocol and measurement points}
+\label{sec:protocol}
+
+The hidden-service protocol consists of multiple substeps to make a
+service available in the network and for clients connecting to a service.
+The statistics discussed in this document all rely on relays gathering
+aggregate statistics of their role in the hidden-service protocol.
+These roles are: a) directory, b) introduction point, and c) rendezvous
+point.
+Figure~\ref{fig:protocol} gives an overview of protocol steps.
+
+\begin{figure}
+\centering
+\includegraphics[width=0.8\textwidth]{protocol.pdf}
+\caption{Hidden-service protocol steps as observed by relays.}
+\label{fig:protocol}
+\end{figure}
+
+The protocol substeps for making a service available in the network are as
+follows:
+
+\begin{enumerate}
+\item The service \emph{establishes an introduction point} on one or more
+relays.
+A relay first receives a circuit extension request from another relay to
+become the next relay in a circuit.
+At this time the relay does not recognize that it will become an
+introduction point.
+Then the relay receives a request to establish an introduction point on
+behalf of the service that built the circuit.
+However, the introduction point does not know which service it's serving,
+because the service creates a fresh identity key for each introduction
+point.
+The circuit, now called introduction circuit, will be kept open until the
+service closes it.
+From that time on the relay accepts client introductions for the service
+coming in via other circuits.
+\item The service \emph{publishes a descriptor} to a total number of six
+directories, which are common relays with high-enough uptime of 24 hours
+or more.
+Each of those relays first receives a circuit extension request, followed
+by a request to publish the descriptor.
+The relay can read most parts of the descriptor which includes service
+identity and selected introduction points.
+The circuit that was built by the service is closed immediately after
+transmitting the descriptor.
+\end{enumerate}
+
+The protocol substeps for a client connecting to a service are as
+follows:
+
+\begin{enumerate}
+\setcounter{enumi}{2}
+\item The client \emph{fetches a descriptor} from a directory.
+The relay that acts as directory first receives a circuit extension
+request, followed by a request to fetch the descriptor.
+Once the descriptor is returned, or a response saying that it does not
+exist, the circuit is closed immediately.
+\item The client \emph{establishes a rendezvous point} on a relay.
+The relay observes a circuit extension request, followed by the request to
+become the client's rendezvous point for connecting to a service.
+The relay does not learn which service the client attempts to connect to;
+it only learns a random identifier.
+The circuit, now called rendezvous circuit, is kept open until either
+client or service closes it.
+\item The client \emph{sends an introduction} to one of the service's
+introduction points.
+The introduction point first observes a circuit extension request,
+followed by the introduction message, which it forwards to the service via
+its introduction circuit.
+The client's circuit is torn down immediately after receiving the
+introduction and responding with an acknowledgement.
+\item The service \emph{sends a rendezvous message} to the client's
+rendezvous point.
+The rendezvous point sees a circuit extension request, followed by the
+rendezvous message, which it forwards to the client via its rendezvous
+circuit.
+Both service-side and client-side circuits are kept open until either side
+decides to close them.
+\item Both \emph{client and service send and receive data} along their
+part of the rendezvous circuit.
+The rendezvous point sees cells coming in from either side and forwards
+them to the other side.
+All cells are padded and end-to-end encrypted between client and service,
+so that the rendezvous point only sees encrypted cells of the same size.
+\end{enumerate}
+
+All these protocol steps constitute potential measurement points for
+hidden-service related statistics.
+
+\section{Evaluation criteria for statistics}
+\label{sec:criteria}
+
+Each of the hidden-service related statistics needs to fulfill two
+criteria: first, it needs to serve a concrete purpose for making hidden
+services better; and second, it must not provide an adversary with data
+that would help them locate clients or services.
+
+\subsection{Possible benefits from gathering statistics}
+
+The purpose of the statistics discussed in this report is to learn more
+about hidden services and as a result make them better.
+We can imagine a couple of possible benefits from gathering these statistics
+that we outline in the following.
+For all possible benefits we assess whether we need statistics from the
+public Tor network, or if we could as well obtain statistics in a private
+testing network.
+
+\paragraph{Learn about usage}
+
+We are interested in learning more about hidden service usage to direct
+our development efforts.
+All past design decisions around hidden services were based either on
+assumptions about how hidden services might be used or on our own observations.
+If we had statistics about provided services and their usage, we might
+adapt the design to its actual usage.
+Related to this, having statistics on hidden service usage as compared to
+normal Tor usage might help in getting sponsors and developers interested
+in making hidden services better.
+
+\paragraph{Improve performance}
+
+We want to measure performance of hidden services as a whole and of their
+protocol substeps to identify any bottlenecks.
+We hope that we can perform most of these measurements in private networks
+where we don't put any users at risk.
+But if we want to build a model that resembles reality, we'll need at
+least some real data as input, or we're back at making assumptions which
+may not reflect reality.
+
+\paragraph{Identify bugs}
+
+Statistics can help us detect bugs that cannot be found in private Tor
+networks.
+Obviously, we'd want to fix as many bugs as possible in a private network
+setting.
+But there will always be cases that we'd miss in a test network, possibly
+caused by different software versions or non-standard usage that we didn't
+think of.
+Having some real data indicating problems in the hidden-service protocol
+would serve as good starting point to go bug hunting.
+
+\paragraph{Discover attacks}
+
+We might be able to use hidden-service related statistics to uncover
+ongoing attacks in the network.
+If a reported statistic is off by more than a certain expected threshold
+or against the past trend, that might indicate that an attack is mounted
+on the network.
+This is obviously something we can only find out in the public Tor network
+and not in private testing networks.
+
+\subsection{Possible risks of gathering statistics}
 
-\subparagraph{Details}
-
-A relay counts how many \verb+ESTABLISH_INTRO+ cells it receives during
-the statistics interval.
+% HSDirs threat model notes
+%
+% Hidden Service directories periodically receive HS descriptors from
+% hidden services.  They cache them, and then serve them to any clients
+% that ask for them.
+%
+% Hidden service directories are placed in a hash ring, and each hidden
+% service picks a slice of hidden service directories from that hash ring.
+% Given the address of a hidden service, it's easy to learn which
+% directories are responsible for it.  This makes hidden-service directory
+% statistics dangerous since they can potentially be matched to specific
+% hidden services.
+%
+% Furthermore, each hidden service has 6 directories, and each directory
+% serves a different set of services.  This means that attackers have 6
+% different data points per hidden service every hour that can be used to
+% reduce measurement noise.
+
+The benefits of statistics have been discussed above, so it's clear that
+statistics can be used for good.
+But they can also be used for bad.
+The risk of gathering statistics is that an adversary could misuse them
+for their attacks on clients and/or services.
+All statistics are designed to be publicly available, so an adversary,
+who might already control one or more relays, could use statistics to
+learn something about relays she does not (yet) control.
+In the following we outline aspects of hidden-service usage that we don't
+want to reveal by statistics.
+
+\paragraph{Infer availability or popularity of a specific service}
+
+We want to learn interesting facts about all services together, but we
+want to avoid that statistics can be used to single out any specific
+service and derive its availability or popularity.
+This includes services identified by their service address as well as
+popular services that may be identified by the number of connecting
+clients or handled traffic volume.
+As a matter of fact, it's not difficult for an adversary to link services
+to relays working as directories or introduction points: the six
+directories storing descriptors for that service can be determined easily,
+and the introduction points of a service are listed in its descriptors.
+The adversary can compare statistics reported by all directories or
+introduction points of a service to reduce measurement noise.
+Only the rendezvous point changes for each client connection, so that
+statistics reported by rendezvous points cannot easily be linked to a
+specific service.
+
+\paragraph{Infer activity of a specific client}
+
+Related to the above, we want to learn about activity of all clients, but
+we want to avoid that statistics can be used to single out a specific
+client and learn about its activity.
+This includes power users that access lots of services or transfer large
+data volumes as well as clients which are services themselves, like
+tor2web.
+
+\paragraph{Assess precise number of available services}
+
+We want to learn roughly how many services are available in the network,
+but we want to avoid that these estimates make it easier for an adversary
+to enumerate available services.
+While hiding the existence of a service is not the primary purpose of
+hidden services, it's a security feature we don't want to give up easily.
+
+\subsection{Other aspects of gathering statistics}
+
+There are certain aspects of any given statistic that should be
+considered in order to decide for or against gathering them.
+We list a few of those aspects below.
+
+\paragraph{Robustness against liars}
+
+The statistics discussed in this document would all be reported by relays
+and not confirmed by third parties.
+We must consider cases where a relay operator modifies their source code
+to manipulate reported statistics to their advantage.
+A statistic should be robust against single liars, as long as there is a
+sufficient number of honest relays, possibly run by trusted operators.
+We also should not depend on statistics reported by single relays, if
+possible.
+Though it would be interesting to have statistics indicating adoption of
+protocol changes.
+
+\paragraph{Robustness against protocol changes}
+
+We are planning to improve the hidden-service protocol in the medium term
+by making major protocol changes.%
+\footnote{\url{https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/224-rend-spec-ng.txt}}
+We should not implement a statistic if we know it will soon become
+obsolete because of those protocol changes.
+
+\section{List of hidden-service related statistics}
+\label{sec:list}
+
+\textbf{The descriptions in this section have not been updated yet.  Let's
+wait with cleaning them up until we have a final list of statistics and a
+final section structure.  Otherwise we'd be rewriting this section over
+and over and over.}
+
+In this section we attempt to compile a comprehensive list of
+hidden-service related statistics.
+We start with general circuit related statistics, statistics on
+successfully advertised services and successful service usage,
+performance-related statistics, and finally discuss statistics on protocol
+failures.
 
-\subparagraph{Benefits}
+Not all of the following statistics provide an actual benefit towards
+making hidden services better, and some statistics are outright dangerous
+and should never be deployed in the public Tor network.
+However, they are all worth being listed here, if only to have a reference
+for the future that says \emph{why} they are a bad idea.
 
-We could validate that we have a ``uniform'' random distribution among
-chosen introduction points in the network.
-If not, there might be a problem.
+\subsection{General hidden-service circuit related statistics}
 
-\subparagraph{Risks}
-Considering we have a good randomness meaning every relay has the same
-chance to be picked, there are no obvious risks to share this.
-If not, we don't see a real risk for an attacker to know that a specific
-relay got chosen X times instead of the measured average Y.
+We start with statistics that are not specific to the three roles of
+relays in the hidden-service protocol, but that apply to all of them.
 
-\paragraph{Time from establishing a circuit to becoming an introduction
-point (1.1.2.)}
+\subsubsection{Time from circuit extension to circuit purpose change}
 
 % (the following distinction cannot be made, AFAIK.  here's what happens:
 % we receive a CREATE (?) cell from another relay that establishes the
@@ -116,16 +366,16 @@ point (1.1.2.)}
 % cannibalize it and use it as introduction circuit.  statistics would
 % tell us what fraction of circuits is newly built and what was
 % cannibalized; well, allow guesses about the two cases.)
-
-\subparagraph{Details}
-
+%
+\textbf{Details:}
+%
 A relay measures the time difference between a circuit extension from the
 previous relay in the circuit to receiving an \verb+ESTABLISH_INTRO+ cell.
 A very small time difference implies that the circuit was built/extended
 specifically for use as introduction point, whereas a larger time
 difference hints to the hidden service re-using a pre-built circuit for
 the introduction point.
-
+%
 % [dgoulet]: "if the time difference between those two events is small, we
 % can guess that the client built a new circuit for using us as
 % introduction point, or that she extended an existing circuit by one hop
@@ -155,7 +405,8 @@ the introduction point.
 % relay only sees that the circuit gets extended to it, but it has no
 % information how long the client had the circuit lying around before
 % extending it.
-
+%
+%
 % Newly established circuit.
 % Benefits: Performance reason, this can be useful to know the real cost
 % (on average) of becoming an IP. Can lead to understanding bottle necks
@@ -171,127 +422,307 @@ the introduction point.
 % Risks: Also tricky. That info could tell us clearly if the IP circuit is
 % on a new or already established circuit which changes the traffic
 % timing. Not sure how useful it is to an attacker though.
-
-\subparagraph{Benefits}
-
+%
+\textbf{Benefits:}
+%
 We would learn what fraction of introduction points can be established on
 short notice using pre-built circuits vs. first having to build or extend
 circuits.
 This is something we would measure on hidden services, but given that we
 don't have statistics from those, measuring this on introduction points
 seems like a fine workaround.
-
+%
 % Both of these stats should probably report an average and a variance
 % instead of a <timestamp> + <circ. creation time>, that would be a
 % disaster.  (yes, please, no data about single events; that wouldn't fit
 % into descriptors anyway, and it would reveal far too much detail.)
 % I really wonder if an attacker could use this average to partition part
 % of the network to predict where the circuit can be located?
-
-\subparagraph{Risks}
-
+%
+The time from receiving a circuit creation request to seeing the
+\verb+ESTABLISH_RENDEZVOUS+ cell can help us optimize the rendezvous
+protocol for performance.
+The current implementation either builds a new circuit or extends an
+existing circuit by one hop before sending the \verb+ESTABLISH_RENDEZVOUS+
+cell.
+So, the measured time will be close to zero.
+But if we ever decide to re-use existing circuits for rendezvous without
+extending them by another hop, this metric will give us an idea on the
+adoption of that change.
+Admittedly, this benefit is not huge.
+%
+\textbf{Risks:}
+%
 No obvious risks.  % only talking about aggregate statistics here, not
 % single observations.
 
-\paragraph{Number of failed attempts to establish an introduction point
-(1.1.3.)}
+% [dgoulet]: Reporting a mean/average and with maybe a treshold before we
+% publish like "we need 100 rdv cell before reporting this stat" ?
+% [karsten]: agreed that this may seem useful in general.  but what if the
+% adversary sends 100 cells themselves to help us get past the threshold
+% and report a tiny number of actual user cells?  but I added an item to
+% the section start where we can discuss whether this is a good safeguard
+% in general.
+% [dgoulet]: Right well that's a time statistic and not an amount so if an
+% attacker would establish 100 RP I guess he/she indeed poisoning the
+% stat?...
+% [karsten]: Maybe.  In theory, the stat is not poisoned for the attacker
+% if she knows what values she's contributed to it.  But I agree that this
+% is not the best example.
 
-\subparagraph{Details}
+\subsubsection{Number of circuits built with TAP vs. nTor}
 
-A relay can not decline to be an introduction point.
-However, an \verb+ESTABLISH_INTRO+ cell might be malformed (wrong public
-key, bad signature, etc...).
-The relay would count the number of declined \verb+ESTABLISH_INTRO+ cells
-and report them along with the total number of received
-\verb+ESTABLISH_INTRO+ cells.
-Or it would report successes and failures, rather than totals and
-failures.
+\textbf{Details:}
+%
+Older clients (0.2.3.x) would build/extend circuits using TAP, newer
+clients would use nTor for that.
+Relays can report the number of introduction circuits that were built
+using either of the two methods.
+More precisely, relays would remember for each circuit how it was built,
+and as soon as they receive an \verb+ESTABLISH_INTRO+ cell, they increment
+one of two counters.
+See ticket 13466 for details.
+%
+\textbf{Benefits:}
+%
+We would learn what fraction of clients and what fraction of services run
+older tor versions (0.2.3.x or older).
 
-\subparagraph{Benefits}
+\subsubsection{Time from circuit purpose change to tearing down circuit}
 
-Wrong \verb+ESTABLISH_INTRO+ cells shows either a very bad bug in the code
-or a deliberate action (data mangling, unknown attack, DoS, ...).
+\textbf{Details:}
+%
+Relays report how long it takes from changing the purpose of a circuit to a
+hidden-service specific purpose to tearing down the circuit.
+This statistic only makes sense for circuits which are only built for
+sending a single message, which includes services publishing descriptors,
+clients fetching descriptors, and clients sending introductions.
+In theory, we would expect the circuits to be torn down immediately, but
+in practice circuits might be kept open longer.
 
-% [dgoulet]: After an IRC discussion with arma and asn, I remember that
-% this one could be "cool to have" but without more information that we
-% can't collect for privacy reasons, this stat would not help at all in
-% the end game. The question remains if we should simply keep it or not
-% even if right now we don't see a added value?
-% [karsten:] right, this is a fine question, not only limited to this
-% statistic.  I added a new paragraph to the section start for "general
-% considerations for gathering hidden-service statistics".
+\subsection{Statistics on service advertisement}
 
-\subparagraph{Risks}
+The following statistics are all about (successful) service advertisement.
+Performance-related statistics and failure statistics are covered in a
+later section.
 
-No obvious risks.
+\subsubsection{Number of established introduction points}
 
-\paragraph{Lifetime of introduction circuits (1.1.4.)}
+\textbf{Details:}
+%
+A relay counts how many \verb+ESTABLISH_INTRO+ cells it receives and acts
+upon during the statistics interval.
+%
+\textbf{Benefits:}
+%
+We could validate that we have a ``uniform'' random distribution among
+chosen introduction points in the network.
+If not, there might be a problem.
+%
+\textbf{Risks:}
+Assuming we have good randomness, meaning every relay has the same
+chance to be picked, there are no obvious risks in sharing this.
+If not, we don't see a real risk for an attacker to know that a specific
+relay got chosen X times instead of the measured average Y.
 
-\subparagraph{Details}
+\subsubsection{Time from establishing introduction point to tearing down
+circuit (1.1.4.)}
 
+\textbf{Details:}
+%
 How long did an introduction circuit last?
 Relays would report statistics like mean/median time, variance/IQR, and/or
 percentiles here.
-
-\subparagraph{Benefits}
-
+%
+\textbf{Benefits:}
+%
 The longer introduction circuits last, the better, from a performance POV.
 If many circuits break after a short time period, that indicates that
 services should attempt to make better path-selection decisions for
 building introduction circuits.
+This statistic can also be used to analyze what fraction of services is
+available for a short time only, and what fraction is available most of
+the time.
 
-\paragraph{Reasons for terminating established introduction points
-(1.1.5.)}
+\subsubsection{Number of descriptor publish requests (3.1.1.)}
+
+\textbf{Details:}
+%
+Relays keep a local count of cached hidden-service descriptors.
+Every time they add or remove a descriptor to their cache, relays update
+their counter and record the time of change.
+At the end of the statistics period they calculate statistics like
+minimum, maximum, average number of hosted descriptors during the
+statistics interval.
+(There may be more efficient ways to implement these statistics that avoid
+keeping a full history with timestamps, which are not discussed here.)
+%
+\textbf{Benefits:}
+%
+This is an interesting statistic that would allow us to understand how
+heavily hidden services are used, and also to detect sudden changes in the
+number of services (botnets, chat protocols, etc.).
+Also, learning the number of hidden services per directory will help us
+find bugs in the hash ring code and also understand how loaded directories
+are.
+FWIW, when \verb+rend-spec-ng.txt+ gets implemented, it will be harder for
+hidden service directories to learn the number of served services since
+the descriptor will be encrypted.
+However, directories will still be able to approximate the number of
+services by checking the amount of descriptors received per publishing
+period.
+If this ever becomes a problem we can imagine publishing fake descriptors
+to confuse the directories.
+%
+\textbf{Risks:}
+%
+Publishing this stat would allow someone who is indexing hidden services
+to be able to say ``I have seen 76~\% of all HSes''.
+We would really like to avoid having such an enumeration-facilitating
+property.
+We could be persuaded that with some heavy stats obfuscation (heavier than
+the bridge stats obfuscation), this statistic might be plausible.
+By statistics obfuscation, we mean obfuscating the numbers so that the
+attacker can only say ``I'm somewhere between 60~\% and 75~\% of all
+HSes''.
+This is a bit related to differential privacy as we understand it, but
+much more basic.
 
-\subparagraph{Details}
+\subsubsection{Number of descriptor updates per service (3.1.2.)}
 
-Relays report frequencies of circuit terminations requested by services
-vs. different types of failures.
+\textbf{Details:}
+%
+Relays count how many descriptor updates they see per service.
+Assuming that stats are published daily (which is not necessary), this is
+going to be a number between 1 and 24, since RendPostPeriod is currently
+one hour and services pick a new directory after 24 hours (see
+\verb+rendcommon.c:get_time_period()+).
+%
+\textbf{Risks:}
+%
+Depending on how many HSes are behind each HSDir, this statistic might or
+might not reveal uptime information about specific services.
+Still it doesn't seem like something we want to risk.
+Also, if the result is greater than 24, it means that an HS with modded
+RendPostPeriod was publishing to that HSDir (and that the HSDir doesn't
+have many clients).
+Do we want to reveal that?
+OTOH, it seems to me that if the directory is serving many services, this
+statistic doesn't really provide any insight.
 
-\subparagraph{Benefits}
+\subsubsection{Time between last and first published descriptor with same
+identifier}
 
-If there are more than a small percentage of failures, decide how to make
-things more robust.
+\textbf{Details:}
+%
+Relays report statistics on the time difference between the last and the
+first descriptor published under the same identifier.
+This statistic indicates service uptime, because a service with high
+uptime would periodically re-publish its descriptor whenever its
+introduction points change.
+There is an upper bound on this statistic at 24 hours, because that's when
+descriptor identifiers change.
+
+\subsubsection{Number of introduction points contained in descriptors
+(3.1.4.)}
 
-\subparagraph{Risks}
+\textbf{Details:}
+%
+Relays report the average number of introduction points contained in
+hidden-service descriptors, possibly also percentiles.
+%
+\textbf{Benefits:}
+%
+It would be interesting to know whether services deviate from the default
+number of introduction points.
+It is unclear, though, what we would do with this information.
+This statistic will also be killed by rend-spec-ng.
 
-No obvious risks.
+\subsubsection{Number of descriptors with encrypted introduction points
+(3.1.5.)}
 
-\paragraph{Number of introduction circuits built with TAP vs. nTor
-(1.1.6.)}
+\textbf{Details:}
+%
+Relays can look at published hidden-service descriptors and count
+descriptors with plain-text vs. encrypted introduction point sections.
+%
+\textbf{Benefits:}
+%
+We would learn what fraction of services uses authentication features.
+This statistic won't be available after implementing rend-spec-ng.
 
-\subparagraph{Details}
+\subsection{Statistics on service usage}
 
-Older clients (0.2.3.x) would build/extend circuits using TAP, newer
-clients would use nTor for that.
-Relays can report the number of introduction circuits that were built
-using either of the two methods.
-More precisely, relays would remember for each circuit how it was built,
-and as soon as they receive an \verb+ESTABLISH_INTRO+ cell, they increment
-one of two counters.
-See ticket 13466 for details.
+The following statistics are about service usage;
+performance-related statistics and failure statistics are covered in
+later sections.
 
-\subparagraph{Benefits}
+\subsubsection{Number of descriptor fetch requests (3.2.1.)}
 
-We would learn what fraction of hidden services run older tor versions
-(0.2.3.x or older).
+\textbf{Details:}
+%
+A relay reports the total number of descriptor fetch requests, regardless
+of the requested hidden service identity.
+%
+\textbf{Risks:}
+%
+An adversary can use this statistic to evaluate the popularity of an HS.
+An adversary can also use this stat to detect big changes in the numbers
+of visitors of popular HSes.
+Of course, there will be noise in the statistics since multiple services
+correspond to each directory, but the adversary could reduce the noise
+after observing the same service rotating to different directories, and
+also by examining the statistics of all 6 directories that correspond to
+the service.
+This doesn't seem like a problem that is solvable with simple obfuscation
+of stats, and I suggest we don't do this statistic at all.
 
-\subsubsection{Statistics on clients connecting to introduction points}
+\subsubsection{Number of descriptor fetch requests by service identity
+(3.2.2.)}
 
-\paragraph{Total number of introductions received from clients (1.2.1.)}
+\textbf{Details:}
+%
+Relays report the distribution of descriptor fetch requests to hidden
+service identities.
 
-\subparagraph{Details}
+\subsubsection{Number of established rendezvous points (2.1.1.)}
 
-Relays report how many \verb+INTRODUCE1+ cells they received from clients.
+\textbf{Details:}
+%
+Relays report how many \verb+ESTABLISH_RENDEZVOUS+ cells they received.
+%
+\textbf{Benefits:}
+%
+The number of received \verb+ESTABLISH_RENDEZVOUS+ cells indicates how
+many connection attempts there are by clients to services that are
+running.
+This number is different from the number of descriptor fetches which
+happen when clients don't know yet whether a service is running, which
+will be omitted if clients still have a descriptor cached from a previous
+connection, and which we may not even gather because of privacy concerns.
+We can easily weight the number of \verb+ESTABLISH_RENDEZVOUS+ cells with
+the probability of choosing a relay as rendezvous point to estimate the
+total number of such cells in the network.
+%
+\textbf{Risks:}
+%
+There is no obvious risk from sharing this number if aggregated over a
+large enough time period.
 
-\subparagraph{Benefits}
+\subsubsection{Number of introductions received from clients (1.2.1.)}
 
+\textbf{Details:}
+%
+Relays report how many \verb+INTRODUCE1+ cells they received from clients.
+%
+\textbf{Benefits:}
+%
 This indicates that there is in fact a client trying to reach a hidden
 service thus the amount of cells could give us a rough estimate of how
 many clients are actually connecting and using hidden services.
-
-\subparagraph{Risks}
-
+%
+\textbf{Risks:}
+%
 Unclear.
 On the one hand, this is basically the same risk as the amount of time a
 relay is picked as an introduction point.
@@ -315,241 +746,40 @@ requests a very popular hidden service gets.
 % introduction points every hour, AFAIK.
 % [dgoulet]: No they don't, I confirmed in the code.
 
-\paragraph{Number of introductions received by established introduction
-point (1.2.2.)}
-
-\subparagraph{Details}
+\subsubsection{Number of introductions received by established
+introduction point (1.2.2.)}
 
+\textbf{Details:}
+%
 Relays can serve as introduction point for an arbitrary number of hidden
 services.
 Relays could report statistics (like percentiles) on received
 \verb+INTRODUCE1+ cell by introduction circuit.
-
-\subparagraph{Benefits}
-
+%
+\textbf{Benefits:}
+%
 This statistic would tell us something about usage diversity of hidden
 services.
 A special case would be the number or fraction of established introduction
 points that never sees a single \verb+INTRODUCE1+ cell.
 It's unclear what we'd do with this information, though.
 
-\paragraph{Number of discarded client introductions by reason (1.2.3.)}
-
-\subparagraph{Details}
-
-How many \verb+INTRODUCE1+ cells have been discarded because of unknown
-service/malformed (?)/whatever-can-go-wrong, by introduction point?
-
-\subparagraph{Benefits}
-
-Anything exceeding a small portion of discarded \verb+INTRODUCE1+ cells
-shows either a very bad bug in the code or a deliberate action (data
-mangling, unknown attack, DoS, ...).
-
-% [dgoulet]: That is again a "cool to have" stat but not sure how it would
-% help us investiguate. It can I guess trigger an alarm but apart from
-% that...
-% [karsten]: right, see section start.
-
-\subparagraph{Risks}
-
-No obvious risks.
-More precisely, if absolute numbers are reported, the risk is the same as
-the risk of reporting the number of received \verb+INTRODUCE1+ cells; if
-only fractions are reported, it's not that bad.
-
-\subparagraph{Recommendation}
-
-\paragraph{Time between establishing introduction point and receiving the
-first client introduction (1.2.4.)}
-
-\subparagraph{Details}
-
-Relays report the time between \verb+ESTABLISH_INTRO+ and first
-\verb+INTRODUCE1+ cell.
-
-\subparagraph{Benefits}
-
-This statistic tells us how long it takes for the hidden service to
-include a relay in its descriptor and publish that descriptor, and for the
-first client to fetch that descriptor and use that relay for its
-introduction.
-This may not be very useful, but is listed here for completeness.
-% [dgoulet]: That would basically leak the RendPostPeriod (if IP changes
-% at each upload) of the HS. Not sure how an attacker could use that to
-% his/her advantage but to consider.
-% [karsten]: again, I think you're wrong about introduction points
-% changing at each upload.
-% [dgoulet]: Yup, IP do *NOT* change at each upload.
-
-\subparagraph{Risks}
-
-No obvious risks.
-
-\paragraph{Number of client introductions coming in via circuits built
-with TAP vs. nTor (1.2.5.)}
-
-\subparagraph{Details}
-
-Relays remember whether an incoming circuit was built using TAP or nTor.
-Whenever they receive an INTRODUCE1 cell they increment a counter for
-either TAP or nTor.
-See ticket 13466 for details.
-
-\subparagraph{Benefits}
-
-We would learn what fraction of hidden-service clients run older tor
-versions (0.2.3.x or older).
-
-%3) Stats about the server responding to the introduction point.
-%   (this does not happen in the protocol)
-%
-% - How many INTRODUCE2 replayed cell we've observed?
-%   This can actually be an active attack or a client sending multiple
-%   INTRODUCE1 cell via different introduction points.
-%   - Benefits: Could give us an idea of how many client are misbehaving
-%   (actually need to confirm if the tor client can send multiple
-%   INTRODUCE1 for the same service).
-%   - Risks: No obvious risks.
-%
-%   Note that the amount of valid INTRODUCE2 cell seen should correspond
-%   to the amount of RP circuit launched (might be a cannibalized one).
-%   So, having that stat could be useful
-%   to again simply correlate that stat with the RP amout stat. Finding
-%   out that there is a discrepancy could help us narrow down performance
-%   issue.
-%
-% (removed the following, because receiving INTRODUCE1 triggers an event
-% that ends with responding with INTRODUCE_ACK.)
-% - How many INTRODUCE_ACK were sent to the client?
-%   - Benefits: Can be coupled with the how many INTRODUCE1 we've seen and
-%   look for discrepancy. The difference of intro1 and intro_ack could not
-%   be explained though without a reason why the HS dropped it or if the
-%   HS did receive the intro1 at all. So, this stat can be fun to have but
-%   not really useful for performance tuning I would say.
-%   - Risks: No obvious risks.
-
-\subsection{Statistics from relays acting as rendezvous points}
-
-The following statistics are all related to relays acting as rendezvous
-points.
-These statistics cover the whole process from (1) clients establishing
-rendezvous points, (2) servers connecting to a client's rendezvous point,
-and (3) clients creating streams to the server, exchanging data, and
-tearing down the circuit.
-These phases of the rendezvous protocol are also used to organize the
-statistics below.
-All statistics focus on the number or timing of cells exchanged in the
-rendezvous protocol and underlying OR protocol.
-
-\subsubsection{Statistics on clients establishing rendezvous points}
-
-\paragraph{Number of established rendezvous points (2.1.1.)}
-
-\subparagraph{Details}
-
-Relays report how many \verb+ESTABLISH_RENDEZVOUS+ cells they received.
-
-\subparagraph{Benefits}
-
-The number of received \verb+ESTABLISH_RENDEZVOUS+ cells indicates how
-many connection attempts there are by clients to services that are
-running.
-This number is different from the number of descriptor fetches which
-happen when clients don't know yet whether a service is running, which
-will be omitted if clients still have a descriptor cached from a previous
-connection, and which we may not even gather because of privacy concerns.
-We can easily weight the number of \verb+ESTABLISH_RENDEZVOUS+ cells with
-the probability of choosing a relay as rendezvous point to estimate the
-total number of such cells in the network.
-
-\subparagraph{Risk}
-
-There is no obvious risk from sharing this number if aggregated over a
-large enough time period.
-
-\paragraph{Time from circuit creation to establishing rendezvous point
-(2.1.2.)}
-
-\subparagraph{Details}
-
-Relays report statistics on the time between circuit creation to receiving
-a \verb+ESTABLISH_RENDEZVOUS+ cell.
-
-\subparagraph{Benefits}
-
-The time from receiving a circuit creation request to seeing the
-\verb+ESTABLISH_RENDEZVOUS+ cell can help us optimize the rendezvous
-protocol for performance.
-The current implementation either builds a new circuit or extends an
-existing circuit by one hop before sending the \verb+ESTABLISH_RENDEZVOUS+
-cell.
-So, the measured time will be close to zero.
-But if we ever decide to re-use existing circuits for rendezvous without
-extending them by another hop, this metric will give us an idea on the
-adoption of that change.
-Admitted, this benefit is not huge.
-
-\subparagraph{Risk}
-
-There is no obvious risk related to this statistic.
-
-% [dgoulet]: Reporting a mean/average and with maybe a treshold before we
-% publish like "we need 100 rdv cell before reporting this stat" ?
-% [karsten]: agreed that this may seem useful in general.  but what if the
-% adversary sends 100 cells themselves to help us get past the threshold
-% and report a tiny number of actual user cells?  but I added an item to
-% the section start where we can discuss whether this is a good safeguard
-% in general.
-% [dgoulet]: Right well that's a time statistic and not an amount so if an
-% attacker would establish 100 RP I guess he/she indeed poisoning the
-% stat?...
-% [karsten]: Maybe.  In theory, the stat is not poisoned for the attacker
-% if she knows what values she's contributed to it.  But I agree that this
-% is not the best example.
-
-\subparagraph{Recommendation}
-
-\paragraph{Number of rendezvous point establishment requests coming in via
-circuits built with TAP vs. nTor (2.1.3.)}
-
-\subparagraph{Details}
-
-Relays remember whether an incoming circuit was built using TAP or nTor.
-Whenever they receive an \verb+ESTABLISH_RENDEZVOUS+ cell they increment a
-counter for either TAP or nTor.
-See ticket 13466 for details.
-
-\subparagraph{Benefits}
-
-We would learn what fraction of hidden-service clients run older tor
-versions (0.2.3.x or older).
-
-% How much RP traffic was transfererd through RP circuits?  (see below re:
-% RELAY cells)
-
-% Average traffic transfered through RP circuits?  (see below re: RELAY
-% cells)
-
-\subsubsection{Statistics on servers connecting to a client's rendezvous
-point}
-
-\paragraph{Number of server rendezvous (2.2.1.)}
-
-\subparagraph{Details}
+\subsubsection{Number of server rendezvous (2.2.1.)}
 
+\textbf{Details:}
+%
 Relays report the total number of \verb+RENDEZVOUS1+ cells they receive.
-
-\subparagraph{Benefits}
-
+%
+\textbf{Benefits:}
+%
 The number of received \verb+RENDEZVOUS1+ cells tells us how many
 connection requests are actually accepted by servers.
 This number may be lower than the number of \verb+ESTABLISH_RENDEZVOUS+
 cells, because of failures in connection establishment, authentication
 failures, or other reasons.
-
-\subparagraph{Risks}
-
+%
+\textbf{Risks:}
+%
 There is no obvious risk from this metric, because it's unrelated to any
 given client or server.
 
@@ -567,95 +797,16 @@ given client or server.
 % (one  ESTABLISH_RENDEZVOUS), that's quite an issue (loop that went wrong
 % :).
 
-\paragraph{Time from establishing a rendezvous point to receiving the
-server rendezvous (2.2.2.)}
-
-\subparagraph{Details}
-
-Relays report the time from receiving an \verb+ESTABLISH_RENDEZVOUS+ cell
-to receiving the corresponding \verb+RENDEZVOUS1+ cell.
-
-\subparagraph{Benefits}
-
-The time between receiving an \verb+ESTABLISH_RENDEZVOUS+ cell from the
-client and the corresponding \verb+RENDEZVOUS1+ cell from the server tells
-us a lot about performance of the rendezvous protocol.
-The rendezvous point is the only place in the protocol that witnesses
-events near the beginning and near the end of the connection establishment
-process.
-If we ever want to improve the substeps inbetween, this metric is the only
-way to measure effectiveness of improvements in the deployed network.
-
-\subparagraph{Risks}
-
-Again, there are at least no obvious risks from gathering this statistic.
-
-\paragraph{Number of server rendezvous with unknown rendezvous cookie
-(2.2.3.)}
-
-\subparagraph{Details}
-
-Relays report the number of \verb+RENDEZVOUS1+ cell with unknown
-rendezvous cookie.
-
-\subparagraph{Benefits}
-
-The number of \verb+RENDEZVOUS1+ cell that cannot be matched with a
-previously established rendezvous circuit can be interesting for analyzing
-problems in the protocol.
-We might even distinguish between rendezvous cookies that were previously
-known to the relay and those that seem entirely unrelated.
-The benefit gained from this statistic is not huge though.
-
-\subparagraph{Risk}
-
-No obvious risks.
-
-\paragraph{Number of server rendezvous coming in via circuits built with
-TAP vs. nTor (2.2.4.)}
-
-\subparagraph{Details}
-
-Relays remember whether an incoming circuit was built using TAP or nTor.
-Whenever they receive a \verb+RENDEZVOUS1+ cell they increment a counter
-for either TAP or nTor.
-See ticket 13466 for details.
-
-\subparagraph{Benefits}
-
-We would learn what fraction of hidden services run older tor versions
-(0.2.3.x or older).
-
-\subsubsection{Statistics on clients creating streams to the server,
-exchanging data, and tearing down the circuit}
-
-\paragraph{Time from server rendezvous to first client data (2.3.1.)}
-
-\subparagraph{Details}
-
-Relays report the time from receiving a \verb+RENDEZVOUS1+ cell to seeing
-the first \verb+RELAY+ cell sent from the client.
-
-\subparagraph{Benefits}
-The time from receiving a \verb+RENDEZVOUS1+ cell from the server (and
-relaying it as \verb+RENDEZVOUS2+ cell to the client) and receiving the
-first \verb+RELAY+ cell from the client is another performance indicator
-of the protocol.
 
-\subparagraph{Risks}
-
-There are no obvious risks from learning the time between these two
-substeps in the rendezvous protocol.
-
-\paragraph{Amount of data sent over connected rendezvous circuits in
-either direction (2.3.2.)}
-
-\subparagraph{Details}
+\subsubsection{Number of cells sent over rendezvous circuits in either
+direction (2.3.2.)}
 
+\textbf{Details:}
+%
 Relays report the number of \verb+RELAY+ cells sent in either direction.
-
-\subparagraph{Benefits}
-
+%
+\textbf{Benefits:}
+%
 The number of \verb+RELAY+ cells sent by either client or server can give
 us a detailed view on hidden service usage.
 In contrast to common Tor usage, there is no point in the rendezvous
@@ -667,9 +818,9 @@ peer-to-peer models.
 As a special case, we'd want to know what fraction of circuits has zero
 \verb+RELAY+ cells, which would indicate a connection problem late in the
 process.
-
-\subparagraph{Risks}
-
+%
+\textbf{Risks:}
+%
 In contrast to the cells discussed above, \verb+RELAY+ cells contain
 actual user content.
 The pattern of \verb+RELAY+ cells could also be used to fingerprint a
@@ -678,15 +829,16 @@ While total number of cells by direction aggregated over a certain time
 period should be okay to measure, any statistics going further than that
 need closer analysis.
 
-\paragraph{Time from first client data to tearing down circuit (2.3.3.)}
-
-\subparagraph{Details}
+\subsubsection{Time from first client data to tearing down circuit
+(2.3.3.)}
 
+\textbf{Details:}
+%
 Relays report the time from seeing the first \verb+RELAY+ cell sent by the
 client to tearing down circuit by either client or server.
-
-\subparagraph{Benefits}
-
+%
+\textbf{Benefits:}
+%
 The time between receiving the first \verb+RELAY+ cell to tearing down the
 circuit indicates typical session length of hidden service connections.
 We'd be able to say whether typical hidden-service connections are rather
@@ -694,9 +846,9 @@ short-lived or long-lived.
 This information may help us make educated guesses on the type of
 applications run over hidden services.
 It may also help us improve the selection criteria for rendezvous points.
-
-\subparagraph{Risks}
-
+%
+\textbf{Risks:}
+%
 Session length is quite sensitive data that could be correlated with
 circuit lifetimes at other places in the network.
 Fortunately, the rendezvous point is neither specific to any given client
@@ -711,203 +863,152 @@ Still, this metric needs further analysis.
 % How much time did it take to splice the RP circuit? (#13194)  (you mean
 % time from RENDEZVOUS1 to first RELAY cell?)
 
-\subsection{Statistics from relays acting as hidden-service directories}
-
-% HSDirs threat model notes
-Hidden Service directories periodically receive HS descriptors from hidden
-services.
-They cache them, and then serve them to any clients that ask for them.
-
-Hidden service directories are placed in a hash ring, and each hidden
-service picks a slice of hidden service directories from that hash ring.
-Given the address of a hidden service, it's easy to learn which
-directories are responsible for it.
-This makes hidden-service directory statistics dangerous since they can
-potentially be matched to specific hidden services.
-
-Furthermore, each hidden service has 6 directories, and each directory
-serves a different set of services.
-This means that attackers have 6 different data points per hidden service
-every hour that can be used to reduce measurement noise.
-
-The following statistics are grouped by (1) hidden services publishing
-descriptors and (2) clients fetching descriptors from hidden-service
-directories.
-
-\subsubsection{Statistics on hidden services publishing descriptors to
-hidden-service directories}
-
-\paragraph{Number of cached descriptors (3.1.1.)}
-
-\subparagraph{Details}
-
-Relays keep a local count of cached hidden-service descriptors.
-Every time they add or remove a descriptor to their cache, relays update
-their counter and record the time of change.
-At the end of the statistics period they calculate statistics like
-minimum, maximum, average number of hosted descriptors during the
-statistics interval.
-(There may be more efficient ways to implement these statistics that avoid
-keeping a full history with timestamps, which are not discussed here.)
-
-\subparagraph{Benefits}
+\subsubsection{Number of closed rendezvous circuits without a single data
+cell}
 
-This is an interesting statistic that would allow us to understand how
-used hidden services are, and also detect sudden changes in the number of
-services (botnets, chat protocols, etc.).
-Also, learning the number of hidden services per directory will help us
-find bugs in the hash ring code and also understand how loaded directories
-are.
-FWIW, when \verb+rend-spec-ng.txt+ gets implemented, it will be harder for
-hidden service directories to learn the number of served services since
-the descriptor will be encrypted.
-However, directories will still be able to approximate the number of
-services by checking the amount of descriptors received per publishing
-period.
-If this ever becomes a problem we can imagine publishing fake descriptors
-to confuse the directories.
-
-\subparagraph{Risks}
-
-Publishing this stat would allow someone who is indexing hidden services
-to be able to say ``I have seen 76~\% of all HSes''.
-We would really like to avoid having such an enumeration-facilitating
-property.
-We could be persuaded that with some heavy stats obfuscation (heavier than
-the bridge stats obfuscation), this statistic might be plausible.
-By statistics obfuscation, we mean obfuscating the numbers so that the
-attacker can only say ``I'm somewhere between 60~\% to 75~\% of all
-HSes.''.
-This is a bit related to differential privacy as we understand it, but
-much more basic.
-
-\paragraph{Number of descriptor updates per service (3.1.2.)}
-
-\subparagraph{Details}
-
-Relays count how many descriptor updates they see per service.
-Assuming that stats are published daily (which is not necessary), this is
-going to be a number between 1 and 24 (since RendPostPeriod is currently
-one hour) and services pick a new directory after 24 hours (see
-\verb+rendcommon.c:get_time_period()+).
-
-\subparagraph{Risks}
-
-Depending on how many HSes are behind each HSDir, this statistic might or
-might not reveal uptime information about specific services.
-Still it doesn't seem like something we want to risk.
-Also, if the result is greater than 24, it means that an HS with modded
-RendPostPeriod was publishing to that HSDir (and that the HSDir doesn't
-have many clients).
-Do we want to reveal that?
-OTOH, it seems to me that if the directory is serving many services, this
-statistic doesn't really provide any insight.
-
-\paragraph{Size of hidden service descriptors (3.1.3.)}
-
-\subparagraph{Details}
-
-Relays report the total/average size of received hidden service
-descriptors.
-
-\subparagraph{Benefits}
-
-These statistics are not very helpful if reported by directories that
-serve many services.
-Any bugs or irregularities of one service will be smoothed out by all the
-other services.
-Basically, the only thing we would learn is approximately how much disk
-space descriptors take, and maybe the average number of contained
-introduction points (if we also know the number of services).
-This statistic seems not very useful.
-
-\paragraph{Number of introduction points contained in descriptors
-(3.1.4.)}
-
-\subparagraph{Details}
+\textbf{Details:}
+%
+Relays report the number of rendezvous circuits that have been closed
+before the client or service sent a single data cell.
 
-Relays report average number of introduction points contained in
-hidden-service descriptors, possibly also percentiles.
+\subsection{Performance-related statistics}
 
-\subparagraph{Benefits}
+\subsubsection{Time from establishing introduction point to receiving
+first client introduction (1.2.4.)}
 
-It would be interesting to know whether services deviate from the default
-number of introduction points.
-Though it's unclear what we're going to do with this information.
-This statistic will also be killed by rend-spec-ng.
+\textbf{Details:}
+%
+Relays report the time between \verb+ESTABLISH_INTRO+ and first
+\verb+INTRODUCE1+ cell.
+%
+\textbf{Benefits:}
+%
+This statistic tells us how long it takes for the hidden service to
+include a relay in its descriptor and publish that descriptor, and for the
+first client to fetch that descriptor and use that relay for its
+introduction.
+This may not be very useful, but is listed here for completeness.
+% [dgoulet]: That would basically leak the RendPostPeriod (if IP changes
+% at each upload) of the HS. Not sure how an attacker could use that to
+% his/her advantage but to consider.
+% [karsten]: again, I think you're wrong about introduction points
+% changing at each upload.
+% [dgoulet]: Yup, IP do *NOT* change at each upload.
+%
+\textbf{Risks:}
+%
+No obvious risks.
 
-\paragraph{Number of descriptors with encrypted introduction points
-(3.1.5.)}
+\subsubsection{Time from establishing a rendezvous point to receiving the
+server rendezvous (2.2.2.)}
 
-\subparagraph{Details}
+\textbf{Details:}
+%
+Relays report the time from receiving an \verb+ESTABLISH_RENDEZVOUS+ cell
+to receiving the corresponding \verb+RENDEZVOUS1+ cell.
+%
+\textbf{Benefits:}
+%
+The time between receiving an \verb+ESTABLISH_RENDEZVOUS+ cell from the
+client and the corresponding \verb+RENDEZVOUS1+ cell from the server tells
+us a lot about performance of the rendezvous protocol.
+The rendezvous point is the only place in the protocol that witnesses
+events near the beginning and near the end of the connection establishment
+process.
+If we ever want to improve the substeps in between, this metric is the only
+way to measure effectiveness of improvements in the deployed network.
+%
+\textbf{Risks:}
+%
+Again, there are at least no obvious risks from gathering this statistic.
 
-Relays can look at published hidden-service descriptor and count
-descriptors with plain-text vs. encrypted introduction point sections.
+\subsubsection{Time from server rendezvous to first client data (2.3.1.)}
 
-\subparagraph{Benefits}
+\textbf{Details:}
+%
+Relays report the time from receiving a \verb+RENDEZVOUS1+ cell to seeing
+the first \verb+RELAY+ cell sent from the client.
+%
+\textbf{Benefits:}
+The time between receiving a \verb+RENDEZVOUS1+ cell from the server (and
+relaying it as \verb+RENDEZVOUS2+ cell to the client) and receiving the
+first \verb+RELAY+ cell from the client is another performance indicator
+of the protocol.
+%
+\textbf{Risks:}
+%
+There are no obvious risks from learning the time between these two
+substeps in the rendezvous protocol.
 
-We would learn what fraction of services uses authentication features.
-This statistic won't be available after implementing rend-spec-ng.
+\subsection{Failure statistics}
 
-\paragraph{Number of descriptors published over circuits built with TAP
-vs. nTor (3.1.6.)}
+Should we report number and type of failures in the protocol, if these
+statistics are not sufficient to actually debug a problem?
+They could be a starting point for looking at actual logs from relays.
+But is this what statistics are for?
 
-\subparagraph{Details}
+\subsubsection{Number of failed attempts to establish an introduction
+point (1.1.3.)}
 
-Relays remember whether an incoming circuit was built using TAP or nTor.
-Whenever they receive a descriptor publication request they increment a
-counter for either TAP or nTor.
-See ticket 13466 for details.
+\textbf{Details:}
+%
+A relay cannot decline to be an introduction point.
+However, an \verb+ESTABLISH_INTRO+ cell might be malformed (wrong public
+key, bad signature, etc.).
+The relay would count the number of declined \verb+ESTABLISH_INTRO+ cells
+and report them along with the total number of received
+\verb+ESTABLISH_INTRO+ cells.
+Or it would report successes and failures, rather than totals and
+failures.
+%
+\textbf{Benefits:}
+%
+Wrong \verb+ESTABLISH_INTRO+ cells show either a very bad bug in the code
+or a deliberate action (data mangling, unknown attack, DoS, ...).
+%
+% [dgoulet]: After an IRC discussion with arma and asn, I remember that
+% this one could be "cool to have" but without more information that we
+% can't collect for privacy reasons, this stat would not help at all in
+% the end game. The question remains if we should simply keep it or not
+% even if right now we don't see a added value?
+% [karsten:] right, this is a fine question, not only limited to this
+% statistic.  I added a new paragraph to the section start for "general
+% considerations for gathering hidden-service statistics".
+%
+\textbf{Risks:}
+%
+No obvious risks.
 
-\subparagraph{Benefits}
+\subsubsection{Reasons for terminating established introduction points
+(1.1.5.)}
 
-We would learn what fraction of hidden services run older tor versions
-(0.2.3.x or older).
+\textbf{Details:}
+%
+Relays report frequencies of circuit terminations requested by services
+vs. different types of failures.
+%
+\textbf{Benefits:}
+%
+If there is more than a small percentage of failures, we can decide how to
+make things more robust.
+%
+\textbf{Risks:}
+%
+No obvious risks.
 
-\paragraph{Number of descriptors published to the wrong directory
+\subsubsection{Number of descriptors published to the wrong directory
 (3.1.7.)}
 
-\subparagraph{Details}
-
+\textbf{Details:}
+%
 A relay reports the number of published descriptors that it is not
 responsible for.
 
-\subsubsection{Statistics on clients fetching descriptors from
-hidden-service directories}
-
-\paragraph{Number of descriptor fetch requests (3.2.1.)}
-
-\subparagraph{Details}
-
-A relay reports the total number of descriptor fetch requests, regardless
-of the requested hidden service identity.
-
-\subparagraph{Risks}
-
-An adversary can use this statistic to evaluate the popularity of an HS.
-An adversary can also use this stat to detect big changes in the numbers
-of visitors of popular HSes.
-Of course, there will be noise in the statitics since multiple services
-correspond to each directory, but the adversary could reduce the noise
-after observing the same service rotating to different directories, and
-also by examining the statistics of all 6 directories that correspond to
-the service.
-This doesn't seem like a problem that is solvable with simple obfuscation
-of stats, and I suggest we don't do this statistic at all.
-
-\paragraph{Number of descriptor fetch requests by hidden service identity
-(3.2.2.)}
-
-\subparagraph{Details}
-
-Relays report the distribution of descriptor fetch requests to hidden
-service identities.
-
-\paragraph{Number of descriptor fetch requests for non-existent descriptor
-(3.2.3.)}
-
-\subparagraph{Details}
+\subsubsection{Number of descriptor fetch requests for non-existent
+descriptor (3.2.3.)}
 
+\textbf{Details:}
+%
 Relays count the number of fetch requests for hidden service identities
 they don't have in their cache.
 We need to enumerate the reasons why a client would ask for the wrong
@@ -916,109 +1017,77 @@ descriptor.
 For example: a) clock sync issues, b) different network view between, c)
 ``the hidden service hasn't published recently'', d) ``the hidden service
 is offline for months''.
+%
+\textbf{Benefits:}
+%
+This seems like a statistic that could potentially find bugs in Tor.
+%
+\textbf{Risks:}
+%
+This statistic could reveal things that we don't really understand and
+might reveal information about specific services.
 
-\subparagraph{Benefits}
 
-This seems like a statistic that could potentially find bugs in Tor.
 
-\subparagraph{Risks}
+\subsubsection{Number of discarded client introductions by reason
+(1.2.3.)}
 
-This statistic could reveal things that we don't really understand and
-might reveal information about specific services.
+\textbf{Details:}
+%
+How many \verb+INTRODUCE1+ cells have been discarded because of unknown
+service/malformed (?)/whatever-can-go-wrong, by introduction point?
+%
+\textbf{Benefits:}
+%
+Anything exceeding a small portion of discarded \verb+INTRODUCE1+ cells
+shows either a very bad bug in the code or a deliberate action (data
+mangling, unknown attack, DoS, ...).
+%
+% [dgoulet]: That is again a "cool to have" stat but not sure how it would
+% help us investigate. It can I guess trigger an alarm but apart from
+% that...
+% [karsten]: right, see section start.
+%
+\textbf{Risks:}
+%
+No obvious risks.
+More precisely, if absolute numbers are reported, the risk is the same as
+the risk of reporting the number of received \verb+INTRODUCE1+ cells; if
+only fractions are reported, the risk is lower.
+%
+\subsubsection{Number of server rendezvous with unknown rendezvous cookie
+(2.2.3.)}
 
-\paragraph{Number of descriptors fetched over circuits built with TAP vs.
-nTor (3.2.4.)}
+\textbf{Details:}
+%
+Relays report the number of \verb+RENDEZVOUS1+ cells with an unknown
+rendezvous cookie.
+%
+\textbf{Benefits:}
+%
+The number of \verb+RENDEZVOUS1+ cells that cannot be matched with a
+previously established rendezvous circuit can be interesting for analyzing
+problems in the protocol.
+We might even distinguish between rendezvous cookies that were previously
+known to the relay and those that seem entirely unrelated.
+The benefit gained from this statistic is not huge though.
+%
+\textbf{Risks:}
+%
+No obvious risks.
 
-\subparagraph{Details}
+\section{Recommendation}
+\label{sec:recommendation}
 
-Relays remember whether an incoming circuit was built using TAP or nTor.
-Whenever they receive a descriptor fetch request they increment a counter
-for either TAP or nTor.
-See ticket 13466 for details.
+\section*{Next steps in writing this report}
 
-\subparagraph{Benefits}
-
-We would learn what fraction of hidden-service clients run older tor
-versions (0.2.3.x or older).
-
-%- How many HSes is the HSDir hosting descriptors for? (harder to do with
-%rend-spec-ng) (assuming that each HS desc is for one HS, this is already
-%covered above.)
-%- How many updates for the same HS desc did the HSDir see? (already covered
-%above, it seems.)
-
-\section{Evaluation}
-
-Adding new statistics to something as sensitive as hidden services has two
-sides: one side is the benefit from gathering data that can be used to
-improve them, but the other side is potential harm to users.
-The following table assigns points to both benefits and risks.
-Each statistic can earn between 0 and 2 benefit points and between 0 and 2
-(negative) risk points.
-The sum of both points provides us with a priority list from statistics
-that make a lot of sense and don't pose much risk to statistics that are
-mostly useless and at the same time very risky.
-
-\begin{longtable}{p{1cm}p{1cm}p{1cm}p{12cm}}
-B    & R    & S \\
-$+$  & $0$  & $+$  & Number of attempts to establish an introduction point
-(1.1.1.) \\
-$0$  & $0$  & $0$  & Time from establishing a circuit to becoming an
-introduction point (1.1.2.) \\
-$+$  & $0$  & $+$  & Number of failed attempts to establish an introduction
-point (1.1.3.) \\
-$+$  & $0$  & $+$  & Lifetime of introduction circuits (1.1.4.) \\
-$+$  & $0$  & $+$  & Reasons for terminating established introduction points
-(1.1.5.) \\
-$+$  & $0$  & $+$  & Number of introduction circuits built with TAP vs. nTor
-(1.1.6.) \\
-$+$  & $-$  & $0$  & Total number of introductions received from clients
-(1.2.1.) \\
-$+$  & $-$  & $0$  & Number of introductions received by established
-introduction point (1.2.2.) \\
-$+$  & $-$  & $0$  & Number of discarded client introductions by reason
-(1.2.3.) \\
-$0$  & $0$  & $0$  & Time between establishing introduction point and receiving
-the first client introduction (1.2.4.) \\
-$+$  & $0$  & $+$  & Number of client introductions coming in via circuits
-built with TAP vs. nTor (1.2.5.) \\
-$+$  & $0$  & $+$  & Number of established rendezvous points (2.1.1.) \\
-$0$  & $0$  & $0$  & Time from circuit creation to establishing rendezvous
-point (2.1.2.) \\
-$+$  & $0$  & $+$  & Number of rendezvous point establishment requests coming
-in via circuits built with TAP vs. nTor (2.1.3.) \\
-$++$ & $0$  & $++$ & Number of server rendezvous (2.2.1.) \\
-$++$ & $0$  & $++$ & Time from establishing a rendezvous point to receiving the
-server rendezvous (2.2.2.) \\
-$+$  & $0$  & $+$  & Number of server rendezvous with unknown rendezvous cookie
-(2.2.3.) \\
-$+$  & $0$  & $+$  & Number of server rendezvous coming in via circuits built
-with TAP vs. nTor (2.2.4.) \\
-$++$ & $0$  & $++$ & Time from server rendezvous to first client data (2.3.1.)
-\\
-$++$ & $-$  & $+$  & Amount of data sent over connected rendezvous circuits in
-either direction (2.3.2.) \\
-$+$  & $-$  & $0$  & Time from first client data to tearing down circuit
-(2.3.3.) \\
-$++$ & $-$  & $+$  & Number of cached descriptors (3.1.1.) \\
-$+$  & $-$  & $0$  & Number of descriptor updates per service (3.1.2.) \\
-$0$  & $0$  & $0$  & Size of hidden service descriptors (3.1.3.) \\
-$+$  & $0$  & $+$  & Number of introduction points contained in descriptors
-(3.1.4.) \\
-$+$  & $0$  & $+$  & Number of descriptors with encrypted introduction points
-(3.1.5.) \\
-$+$  & $0$  & $+$  & Number of descriptors published over circuits built with
-TAP vs. nTor (3.1.6.) \\
-     &      &      & Number of descriptors published to the wrong directory
-(3.1.7.) \\
-$+$  & $--$ & $-$  & Number of descriptor fetch requests (3.2.1.) \\ $+$  &
-$--$ & $-$  & Number of descriptor fetch requests by hidden service identity
-(3.2.2.) \\
-$+$  & $0$  & $+$  & Number of descriptor fetch requests for non-existent
-descriptor (3.2.3.) \\
-$+$  & $0$  & $+$  & Number of descriptors fetched over circuits built with TAP
-vs. nTor (3.2.4.) \\
-\end{longtable}
+\begin{itemize}
+\item Add results from private testing network to list of statistics.
+\item Figure out which of the failure statistics actually make sense, by
+looking at the code.
+\item Decide how to evaluate helpfulness and harmfulness of statistics, in
+an objective way, ideally using the stated evaluation criteria.
+\end{itemize}
 
 \end{document}
 
diff --git a/2015/hidden-service-stats/protocol.odg b/2015/hidden-service-stats/protocol.odg
new file mode 100644
index 0000000..729c8eb
Binary files /dev/null and b/2015/hidden-service-stats/protocol.odg differ
diff --git a/2015/hidden-service-stats/protocol.pdf b/2015/hidden-service-stats/protocol.pdf
new file mode 100644
index 0000000..352b4ef
Binary files /dev/null and b/2015/hidden-service-stats/protocol.pdf differ




