Anonymity-preserving collection of usage data of a hidden service authoritative directory

Karsten Loesing karsten.loesing at gmx.net
Sun Apr 29 15:09:55 UTC 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I am coming back to the discussion on anonymity-preserving collection
of usage data of a hidden service authoritative directory.

It took me two days, but I think the implementation is finally done now.
:D I uploaded an SVN patch with the necessary changes here:

    http://88.84.144.63/hsusage-patch

It is my first C code, so when reviewing it, could you please tell me
what I could do better next time? I tested my code (of course
using PuppeTor :) ), but maybe there is still a memory leak or
something similar. Thanks!

The refined specification and some implementation notes are at the
bottom of this mail.

Btw: When reading the code I found the command-line option
"--ignore-missing-torrc" that I didn't find in the docs.

- --Karsten



- --- specification ---

This proposal contains a specification and an implementation for
collecting usage data from hidden service authoritative directories.

This data could be vital for designing a decentralized storage of hidden
service descriptors. Are there 10 or 1000 hidden services running at a
time? Are fetch requests distributed equally over all hidden services or
are there hot spots? Those questions cannot be answered without some
real data.

Obviously, such a collection needs to be done in an anonymity-preserving
way. Though the anonymity of hidden services does not rely primarily on
the integrity of the directory operator, it plays a role. The operator
can find out which hidden service is online or attack its introduction
points.

The proposal is to add a new status file "hsusage" that is written at
regular intervals by hidden service authoritative directories to their
data directory. It contains status information comparable to the network
status with entries that are built like write-history and read-history
in server descriptors. For each entry there is one aggregated value per
interval of 900 seconds (15 minutes) for a total of 96 intervals (1 day):

"publish-total-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM... NL
   total number of valid publish requests observed in the interval

"publish-novel-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM... NL
   total number of valid publish requests that contain a novel
   descriptor, i.e. one with a currently unknown service ID

"publish-top-5-percent-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM... NL
"publish-top-10-percent-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM... NL
"publish-top-20-percent-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM... NL
   total number of valid publish requests containing a descriptor for
   one of the top 5 (10, 20) percent of all services (ordered by number
   of publish requests); can help to figure out which share of publish
   requests (probably non-novel publish requests) comes from the top
   available services

"fetch-total-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM... NL
   total number of valid fetch requests observed in the interval

"fetch-successful-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM... NL
   total number of valid and successful fetch requests observed in the
   interval

"fetch-top-5-percent-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM... NL
"fetch-top-10-percent-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM... NL
"fetch-top-20-percent-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM... NL
   total number of valid fetch requests asking for one of the top 5 (10,
   20) percent of all services (ordered by number of fetch requests);
   can help to figure out whether there are hot spots among the services

"desc-total-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM... NL
   total number of current descriptors at the end of the interval
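
To make the top-5/10/20-percent entries more concrete, here is a minimal
sketch of how such a value could be computed from per-service request
counts. It is not taken from the patch: the function name
top_percent_requests and the decision to round the number of "top"
services up are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Order two request counts from largest to smallest (for qsort). */
static int
compare_counts_desc(const void *a, const void *b)
{
  unsigned long x = *(const unsigned long *)a;
  unsigned long y = *(const unsigned long *)b;
  if (x < y)
    return 1;
  if (x > y)
    return -1;
  return 0;
}

/* Hypothetical helper: return the number of requests that went to the
 * busiest <percent> percent of services. counts[i] is the number of
 * requests seen for service i in one interval.
 * Assumption: the number of "top" services is rounded up, so a
 * non-empty set always contributes at least one service.
 * Returns 0 if there are no services or if allocation fails. */
unsigned long
top_percent_requests(const unsigned long *counts, size_t n_services,
                     unsigned percent)
{
  unsigned long *sorted;
  unsigned long sum = 0;
  size_t n_top, i;

  assert(percent >= 1 && percent <= 100);
  if (n_services == 0)
    return 0;

  /* Sort a copy so the caller's array stays untouched. */
  sorted = malloc(n_services * sizeof(*sorted));
  if (!sorted)
    return 0;
  memcpy(sorted, counts, n_services * sizeof(*sorted));
  qsort(sorted, n_services, sizeof(*sorted), compare_counts_desc);

  /* ceil(n_services * percent / 100) in integer arithmetic. */
  n_top = (n_services * percent + 99) / 100;
  for (i = 0; i < n_top; i++)
    sum += sorted[i];

  free(sorted);
  return sum;
}
```

Comparing this value with the matching total for the same interval
gives the share of requests that goes to the busiest services. Note
that only the aggregated sum would be written out, not the per-service
counts.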

- --- implementation ---

The implementation was done in a way that keeps changes to the current
code small and collects most of the extensions in a single place.

These are the changes to existing code:

- - Report fetch requests (directory.c)
- - Report publish requests (rendcommon.c)
- - Look up descriptor cache size (rendcommon.c)
- - Write statistics to disk at regular intervals (main.c)
- - Initialize new statistics when initializing reputation history
  (rephist.c)
- - Function prototypes (or.h)

The major part of the implementation is appended to the existing code in
rephist.c.
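
For readers who do not want to open the patch, the following is a rough
sketch of the kind of bookkeeping described above: a ring of 96 interval
totals, a counter for the open interval, and a formatter for one history
line. All names (hs_usage_history_t, hs_usage_note_event, and so on) are
hypothetical and not the ones used in the patch. Two further
assumptions: the values are comma-separated as in write-history, and the
timestamp is the end of the most recent finished interval.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define HS_USAGE_INTERVAL_LEN 900  /* seconds per interval (15 minutes) */
#define HS_USAGE_NUM_INTERVALS 96  /* intervals kept (1 day) */

/* One history entry: a ring of per-interval totals plus the running
 * count for the interval that is still open. */
typedef struct hs_usage_history_t {
  unsigned long totals[HS_USAGE_NUM_INTERVALS];
  int next_idx;          /* slot the next finished interval goes into */
  int num_filled;        /* number of finished intervals stored so far */
  unsigned long current; /* events seen in the open interval */
} hs_usage_history_t;

void
hs_usage_init(hs_usage_history_t *h)
{
  memset(h, 0, sizeof(*h));
}

/* Called from the reporting hooks, once per observed request. */
void
hs_usage_note_event(hs_usage_history_t *h)
{
  h->current++;
}

/* Called every HS_USAGE_INTERVAL_LEN seconds: store the finished
 * interval, overwriting the oldest one once the ring is full. */
void
hs_usage_close_interval(hs_usage_history_t *h)
{
  h->totals[h->next_idx] = h->current;
  h->next_idx = (h->next_idx + 1) % HS_USAGE_NUM_INTERVALS;
  if (h->num_filled < HS_USAGE_NUM_INTERVALS)
    h->num_filled++;
  h->current = 0;
}

/* Write "<keyword> YYYY-MM-DD HH:MM:SS (900 s) NUM,NUM,...\n" into buf,
 * oldest interval first. Returns the line length, or -1 if the time
 * cannot be formatted or buf is too small. */
int
hs_usage_format_line(const hs_usage_history_t *h, const char *keyword,
                     time_t interval_end, char *buf, size_t buflen)
{
  char when[32];
  const struct tm *tm = gmtime(&interval_end);
  size_t off;
  int i, n;

  assert(h && keyword && buf);
  if (!tm || !strftime(when, sizeof(when), "%Y-%m-%d %H:%M:%S", tm))
    return -1;
  n = snprintf(buf, buflen, "%s %s (%d s) ", keyword, when,
               HS_USAGE_INTERVAL_LEN);
  if (n < 0 || (size_t)n >= buflen)
    return -1;
  off = (size_t)n;

  for (i = 0; i < h->num_filled; i++) {
    /* Start at the oldest stored interval and walk forward. */
    int idx = (h->next_idx - h->num_filled + i + HS_USAGE_NUM_INTERVALS)
              % HS_USAGE_NUM_INTERVALS;
    n = snprintf(buf + off, buflen - off, "%s%lu", i ? "," : "",
                 h->totals[idx]);
    if (n < 0 || (size_t)n >= buflen - off)
      return -1;
    off += (size_t)n;
  }
  n = snprintf(buf + off, buflen - off, "\n");
  if (n < 0 || (size_t)n >= buflen - off)
    return -1;
  return (int)(off + (size_t)n);
}
```

In this sketch the hooks in directory.c and rendcommon.c would only
call hs_usage_note_event(), while the periodic callback in main.c would
call hs_usage_close_interval() and then write the formatted lines to
the "hsusage" file.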

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGNLVD0M+WPffBEmURAoMYAJ94JeEiuCNg32l9y9BCND/2dQeqSQCgoISr
aywUhJ1gML4OYnhtiV6ety4=
=aLIT
-----END PGP SIGNATURE-----


