Running a high-performance pluggable transports Tor bridge (FOCI 2023 short paper)

Linus Nordberg and I wrote a short paper that was presented at FOCI 2023. The topic is how to use all the available CPU capacity of a server running a Tor relay. This is how the Snowflake bridges are set up. It might also be useful for anyone running a relay that is bottlenecked on the CPU. If you have ever run multiple relays on one IP address for better scaling (for example, if you are one of the relay operators affected by the recent AuthDirMaxServersPerAddr change), you might want to experiment with this setup. The difference is that all the instances of Tor have the same relay fingerprint, so they operate like one big relay instead of many small relays.
https://www.bamsoftware.com/papers/pt-bridge-hiperf/

On Mon, Sep 04, 2023 at 02:09:50AM -0600, David Fifield wrote:
The workshop presentation video (22 minutes) of this paper has just become available on YouTube. The paper homepage has a copy of the video too. https://www.youtube.com/watch?v=UkUQsAJB-bg&list=PLWSQygNuIsPc8bOJ2szOblMK4i... The other FOCI 2023 issue 2 videos are online as well: https://www.youtube.com/playlist?list=PLWSQygNuIsPc8bOJ2szOblMK4i6T79S1m

Hi David
Thank you for the paper and the presentation. Chapter 3 (Multiple Tor processes) shows the structure:
mypt - HAproxy = multiple tor services
At the end of chapter 3.1 it is written
the loss of country- and transport-specific metrics
How will the metrics data be pulled out of the multiple tor services so that *all* of it is collected? Or will only one of them be looked at, giving an incomplete picture? I ask primarily about an obfs4 setup, but the answer might apply to snowflake and friends too.
--
Cheers, Felix

On Mon, Dec 11, 2023 at 08:13:17PM +0100, Felix wrote:
The key is that every instance of tor must have a different nickname. That way, even though they all have the same relay identity key, Tor Metrics knows to count all the descriptors separately. So, for instance, on one snowflake bridge (identity 2B280B23E1107BB62ABFC40DDCC8824814F80A72), we use nicknames:
flakey1, flakey2, …, flakey12
and on another bridge (identity 8838024498816A039FCBBAB14E6F40A0843051FA) we use nicknames:
crusty1, crusty2, …, crusty12

Instructions for setting up nicknames can be found at
https://gitlab.torproject.org/tpo/anti-censorship/team/-/wikis/Survival-Guid...

It used to be the case that Tor Metrics did not understand the descriptors of this kind of multi-instance bridge. If you had N instances, it would count only 1 of them per time period. But Tor Metrics has now known about this kind of bridge (multiple descriptors per time period with the same identity key but different nicknames) for more than a year:
https://gitlab.torproject.org/tpo/network-health/metrics/website/-/issues/40...
https://gitlab.torproject.org/tpo/network-health/metrics/website/-/merge_req...

Relay Search still does not know about multi-instance bridges, though. If you look up such a bridge, it will display one of the multiple instances more or less at random. In the case of the current snowflake bridges, you have to multiply the numbers on Relay Search pages by 12 to get the right numbers.
https://metrics.torproject.org/rs.html#details/2B280B23E1107BB62ABFC40DDCC88...
https://metrics.torproject.org/rs.html#details/8838024498816A039FCBBAB14E6F4...

There's a special repository for making graphs of snowflake users. This was necessary in the time before Tor Metrics natively understood multi-instance bridges, and I still use it because it offers some extra flexibility over what metrics.torproject.org provides. With some small changes, the same code could work for other pluggable transports, or even single bridges.
https://gitlab.torproject.org/dcf/snowflake-graphs

This is a sample of the graph output:
https://forum.torproject.org/t/snowflake-daily-operations-november-2023-upda...
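
The counting described above can be sketched roughly in Python. This is only an illustration, not the code used by Tor Metrics or by the snowflake-graphs repository: the tuple layout, the function names, and the user counts are all invented for the example. It shows why keying on (identity, nickname) keeps every instance's numbers, and how a figure from a single instance might be scaled up when only one instance is displayed.

```python
from collections import defaultdict

# Hypothetical per-instance user counts for one time period.
# Each tuple is (identity fingerprint, nickname, estimated users).
# The counts are invented for illustration only.
DESCRIPTORS = [
    ("2B280B23E1107BB62ABFC40DDCC8824814F80A72", "flakey1", 100),
    ("2B280B23E1107BB62ABFC40DDCC8824814F80A72", "flakey2", 110),
    ("2B280B23E1107BB62ABFC40DDCC8824814F80A72", "flakey3", 90),
]


def total_users(descriptors):
    """Sum users per identity, counting each distinct nickname once."""
    per_instance = {}
    for identity, nickname, users in descriptors:
        # Keying on (identity, nickname) keeps the instances separate.
        # Keying on identity alone would retain only one of them.
        per_instance[(identity, nickname)] = users

    totals = defaultdict(int)
    for (identity, _nickname), users in per_instance.items():
        totals[identity] += users
    return dict(totals)


def scaled_estimate(users_on_one_instance, num_instances):
    """Scale a single instance's figure by the number of instances.

    This assumes load is spread roughly evenly across instances,
    so it is an approximation rather than an exact total.
    """
    return users_on_one_instance * num_instances


if __name__ == "__main__":
    print(total_users(DESCRIPTORS))
    print(scaled_estimate(100, 12))
```

The scaled estimate is only as good as the assumption that the load balancer spreads users evenly, which is why summing the per-instance descriptors is preferable when they are all available.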