Hi,
Many people already replied, but here are my (late) two cents.
> 1) If a full IPv4 /24 Class C was available to host Tor relays, what are some optimal ways to allocate bandwidth, CPU cores and RAM to maximize utilization of the IPv4 /24 for Tor?
"Optimal" depends on your preferences and goals. Some examples:
- IP address efficiency: run 8 relays per IPv4 address.
- Use the best ports: 256 relays (443) or 512 relays (443+80).
- Lowest kernel/system congestion: 1 locked relay per core/SMT thread combination, ideally on high clocked CPUs.
- Easiest to manage: as few relays as possible.
- Memory efficiency: only run middle relays on very high clocked CPUs (4-5 Ghz).
- Cost efficiency: run many relays on 1-2 generations old Epyc CPUs with a high core count (64 or more).
There are always constraints. The hardware/CPU/memory and bandwidth/routing capability available to you are probably not infinite. Also the Tor Project maximizes bandwidth contributions to 20% and 10% for exit relay and overall consensus weight respectively.
With 256 IP addresses on modern hardware, it will be very hard to not run in to one of these limitations long before you can make it 'optimal'. Hardware wise, one modern/current gen high performance server only running exit relays will easily push enough Tor traffic to do more than half of the total exit bandwidth of the Tor network.
My advice would be:
1) Get the fastest/best hardware with current-ish generation CPU IPC capabilities you can get within your budget. To lower complexity with keeping congestion in control, one socket is easier to deal with than a dual socket system.
(tip for NIC: if your switch/router has 10 Gb/s or 25 Gb/s ports, get some of the older Mellanox cards. They are very stable (more so than their Intel counterparts in my experience) and extremely affordable nowadays because of all the organizations that throw away their digital sovereignty and privacy of their employees/users to move to the cloud).
3) Start with 1 Tor relay per physical core (ignoring SMT). When the Tor relays have ramped up (this takes 2-3 months for guard relays) and there still is considerable headroom on the CPU (Tor runs extremely poorly at scale sadly, so this would be my expectation) then move to 1 Tor relay per thread (SMT included).
(tip: already run/'train' some Tor relays with a very limited bandwidth (2 MB/s or something) parallel to your primary ones and pin them all to 1-2 cores to let them ramp up in parallel to your primary ones. This makes it *much* less cumbersome to scale up your Tor contribution when you need/want/can do that in the future).
4) Assume at least 1 GB of RAM per relay on modern CPUs + 32 GB additionally for OS, DNS, networking and to have some headroom for DoS attacks. This may sound high, especially considering the advice in the Tor documentation. But on modern CPUs (especially with a high clockspeed) guard relays can use a lot more than 512 MB of RAM, especially when they are getting attacked. Middle and exit relays require less RAM.
Don't skimp out on system memory capacity. DDR4 RDIMMs with decent clockspeeds are so cheap nowadays. For reference: we ran our smaller Tor servers (16C@3.4Ghz) with 64 GB of RAM and had to upgrade it to 128 GB because during attacks RAM usage exceeded the amount available and killed processes.
5) If you have the IP space available, use one IPv4 address per relay and use all the good ports such as 443. If IP addresses are more scarce, it's also not bad to run 4 or 8 relays per IP address. Especially for middle and exit relays the port doesn't matter (much). Guard relays should ideally always run on a generally used (and generally unblocked) port.
> 2) If a full 10 Gbps connection was available for Tor relays, how many CPU cores, RAM and IPv4 addresses would be required to saturate the 10 Gbps connection?
That greatly depends on the CPU and your configuration. I can offer 3 references based on real world examples. They all run a mix of guard/middle/exit relays.
1) Typical low core count (16+SMT) with higher clockspeed (3.4 Ghz) saturates a 10 Gb/s connection with ~18.5 physical cores + SMT.
2) Typical higher core count (64+SMT) with lower clockspeed (2.25 Ghz) saturates a 10 Gb/s connection with ~31.5 physical cores + SMT.
3) Typical energy efficient/low performance CPU with low core count (16) with very low clockspeed (2.0 Ghz) used often in networking appliances saturates a 10 Gb/s connection with ~75 physical cores (note: no SMT).
The amount of IP addresses required also depends on multiple factors. But I'd say that you would need between the amount and double the amount of relays of the mentioned core+SMT count in order to saturate 10 Gb/s. This would be 37-74, 63-126 and 75-150 relays respectively. So between 5 and 19 IPv4 addresses would be required at minimum, depending on CPU performance level.
RAM wise the more relays you run, the more RAM overhead you will have. So in general it's better to run less relays at a higher speed each than run many at a low clock speed. But since Tor scales so badly you need more Relays anyway so optimizing this isn't easy in practice.
> 3) Same for a 20 Gbps connection, how many CPU cores, RAM and IPv4 addresses are required to saturate?
Double the amount compared to 10 Gb/s.
Good luck with your Tor adventure. And let us know your findings with achieving 10 Gb/s when you get there :-).
Cheers,
tornth
Feb 3, 2025, 18:14 by tor-relays@lists.torproject.org:
Hi All,
Looking for guidance around running high performance Tor relays on Ubuntu.
Few questions:
1) If a full IPv4 /24 Class C was available to host Tor relays, what are some optimal ways to allocate bandwidth, CPU cores and RAM to maximize utilization of the IPv4 /24 for Tor?
2) If a full 10 Gbps connection was available for Tor relays, how many CPU cores, RAM and IPv4 addresses would be required to saturate the 10 Gbps connection?
3) Same for a 20 Gbps connection, how many CPU cores, RAM and IPv4 addresses are required to saturate?
Thanks!