[tor-project] Constructing a real-world dataset for studying website fingerprinting

Tom Ritter tom at ritter.vg
Fri Apr 21 01:34:48 UTC 2023


On Thu, 20 Apr 2023 at 17:16, Jansen, Robert G CIV USN NRL (5543)
Washington DC (USA) via tor-project <tor-project at lists.torproject.org>
wrote:

> The primary information being measured is the directionality of the first
> 5k cells sent on a measurement circuit, and a keyed-HMAC of the first
> domain name requested on the circuit.


I suppose this is kind of a non-question, since you wouldn't be doing it
otherwise, but I am surprised that associating the traffic patterns to a
single key, that of the first domain name, is sufficient.  Every page or
query made to that domain (e.g. duckduckgo) will have the same key, with
potentially a lot of entirely disparate traffic patterns.
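
To make the concern concrete, here is a minimal sketch of the labeling as I
understand it. The key, the choice of SHA-256, and the domain_label helper
are my own assumptions for illustration, not the actual measurement code:

```python
import hashlib
import hmac
from urllib.parse import urlsplit

# Hypothetical key: the real key and HMAC parameters are not given in this thread.
KEY = b"example-measurement-key"


def domain_label(url: str) -> str:
    """Return a keyed HMAC of only the hostname in url (assumed labeling scheme)."""
    host = urlsplit(url).hostname or ""
    return hmac.new(KEY, host.encode("utf-8"), hashlib.sha256).hexdigest()


# Two very different pages on the same domain collapse to one label.
search = domain_label("https://duckduckgo.com/?q=tor")
settings = domain_label("https://duckduckgo.com/settings")
other = domain_label("https://example.org/")

print(search == settings)  # same domain, same label
print(search == other)     # different domain, different label
```

Under these assumptions, a search results page and a settings page both end up
under the duckduckgo.com label, even though their cell sequences could look
nothing alike.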

Obviously this is limited by what you can technically achieve in this
scenario: you have the plaintext DNS requests, and everything else is going
to be TLS-encrypted. The alternative would be to instrument a Tor
client/browser and find volunteers to opt in to data collection.

-tom