commit f586165e9f9e0f5e1c2b62c85c1ac776248837b6
Author: A. Johnson aaron.m.johnson@nrl.navy.mil
Date: Tue Dec 23 10:00:21 2014 -0600

    adding section on obfuscation techniques
---
 2015/hidden-service-stats/hidden-service-stats.tex | 60 +++++++++++++++++++-
 1 file changed, 59 insertions(+), 1 deletion(-)
diff --git a/2015/hidden-service-stats/hidden-service-stats.tex b/2015/hidden-service-stats/hidden-service-stats.tex
index 7760e83..d4b586e 100644
--- a/2015/hidden-service-stats/hidden-service-stats.tex
+++ b/2015/hidden-service-stats/hidden-service-stats.tex
@@ -353,7 +353,8 @@ for the future that says \emph{why} they are a bad idea. We start with statistics that are not specific to the three roles of relays in the hidden-service protocol, but that apply to all of them.

-\subsubsection{Time from circuit extension to circuit purpose change}
+\subsubsection{Time from circuit extension to circuit purpose change}
+\label{subsubsec:time_circ_ext_to_purpose_change}

 % (the following distinction cannot be made, AFAIK. here's what happens:
 % we receive a CREATE (?) cell from another relay that establishes the
@@ -489,6 +490,7 @@ We would learn what fraction of clients and what fraction of services run older tor versions (0.2.3.x or older).

 \subsubsection{Time from circuit purpose change to tearing down circuit}
+\label{subsubsec:time_circ_purpose_change_to_teardown}

 \textbf{Details:}
 %
@@ -527,6 +529,7 @@ relay got chosen X times instead of the measured average Y.

 \subsubsection{Time from establishing introduction point to tearing down circuit (1.1.4.)}
+\label{subsubsec:time_intro_to_teardown}

 \textbf{Details:}
 %
@@ -545,6 +548,7 @@ available for a short time only, and what fraction is available most of the time.
 \subsubsection{Number of descriptor publish requests (3.1.1.)}
+\label{subsubsec:num_descriptor_publish}
 \textbf{Details:}
 %
@@ -589,6 +593,7 @@ This is a bit related to differential privacy as we understand it, but much more basic.

 \subsubsection{Number of descriptor updates per service (3.1.2.)}
+\label{subsubsec:num_decriptor_updates}

 \textbf{Details:}
 %
@@ -1076,6 +1081,59 @@ The benefit gained from this statistic is not huge though.
 % No obvious risks.
+\section{Obfuscation methodology}
+The published statistics shouldn't reveal private information to an
+adversary when combined with plausible background knowledge. We will use
+techniques that create uncertainty about any specific hidden service,
+client, or connection, while maintaining good accuracy in the aggregate
+statistics. These techniques include:
+\begin{itemize}
+\item Releasing aggregate statistics over time, such as total counts or
+averages in a given period
+\item Adding noise (i.e. random inaccuracy)
+\item Limiting accuracy to a certain granularity via rounding (aka
+``binning'')
+\item Adding a time delay to the release of statistics such that the
+output doesn't reveal information about ongoing activity
+\item Using cryptographic techniques to hide the source of information,
+such as anonymizing reports from individual relays
+\end{itemize}
+
+
+\subsection{Adversary knowledge}
+We can expect that the adversary may know things such as:
+\begin{itemize}
+\item The addresses of a large number of publicly available services
+(e.g. by crawling the Web)
+\item A minimum amount of traffic received by a given hidden service
+(e.g. because the adversary sent that traffic itself)
+\item The introduction points of a service (by obtaining the descriptor)
+\item The availability of the service (by attempting to connect
+periodically)
+\item The approximate number of client connections and amount of client
+traffic (possibly leaked by the service itself, e.g. a web forum)
+\end{itemize}
+
+\subsection{Counts}
+
+\subsection{Distributions}
+For many statistics, it would be very helpful to understand the
+distribution of values. For example, such information about descriptor
+fetches could reveal whether most hidden services are never used or
+whether a few hidden services account for most HS activity.
+Releasing information about the distribution of values could be useful
+for the following statistics:
+\begin{itemize}
+\item Time from circuit extension to circuit purpose change
+(Sec.~\ref{subsubsec:time_circ_ext_to_purpose_change})
+\item Time from circuit purpose change to tearing down circuit
+(Sec.~\ref{subsubsec:time_circ_purpose_change_to_teardown})
+\item Time from establishing introduction point to tearing down
+circuit (Sec.~\ref{subsubsec:time_intro_to_teardown})
+\item Number of descriptor updates per service
+(Sec.~\ref{subsubsec:num_decriptor_updates})
+\end{itemize}
+
 \section{Recommendation}
 \label{sec:recommendation}