Looking for feedback on my Tor relay configuration decision:
Assuming both options cost the same and provide the same aggregate bandwidth—and by “better,” I mean improved advertised bandwidth, reliability, and diversity—is it preferable to deploy:
Option A: One hardware server on a 40Gbps link, or
Option B: Four hardware servers, each on a 10Gbps link, located in different geographic areas and/or with different upstream providers?
I lean toward Option B because it offers advantages in advertised bandwidth, reliability, and diversity. Additionally, does the ideal choice depend on whether the relays serve as exit nodes or as guard/middle nodes? Also, is there a more comprehensive way to define “better” for this context?
I appreciate any suggestions or refinements!
Hi,
Option B will indeed be more network diverse, and assuming you run fewer relays per server in this scenario, it's also more system resource efficient because there is less overhead/congestion on each server/network. As said before, Tor at scale runs extremely inefficiently because of Tor's architecture, so running fewer relays per system will reduce system overhead/congestion. The bigger the system/the more relays, the sooner you run into congestion issues, which may introduce you to the wondrous world of CPU/NIC thread pinning, crypto offloading and other complexities. This may be a boon or not to you ;).
In addition, having four geographically separated locations improves geographical diversity (making taps more difficult), but do note that if they are all situated in the US, this is more of a benefit to latency/network diversity (assuming four different autonomous systems as well) and not so much a benefit to legal position, since they would all be in the same jurisdiction.
And also, when a server dies, having four of them is better because 75% of the relays will keep going instead of 100% dying. This is also great when rebooting for kernel updates or restarting relays for security patches :). Otherwise you could (in the worst-case scenario) instantly disconnect 20% of the users/lower network capacity by 20% at once. On the other hand, Tor is designed so that clients reconnect automatically, so this shouldn't matter much.
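As a rough illustration of the failure arithmetic above, here is a minimal Python sketch. The function names are made up for this example, and it assumes every server in an option carries an equal share of the bandwidth (40 Gbps on one box versus 10 Gbps on each of four), which is the scenario from the question rather than a measured result.

```python
def surviving_capacity_gbps(servers: int, link_gbps: float, failed: int = 1) -> float:
    """Aggregate link capacity still online after `failed` servers go down.

    Assumes all servers have identical links (hypothetical simplification).
    """
    if not 0 <= failed <= servers:
        raise ValueError("failed must be between 0 and the number of servers")
    return (servers - failed) * link_gbps


def surviving_fraction(servers: int, failed: int = 1) -> float:
    """Fraction of the operator's capacity that stays up after a failure."""
    if servers <= 0 or not 0 <= failed <= servers:
        raise ValueError("invalid server or failure count")
    return (servers - failed) / servers


# Option A: one server on a 40 Gbps link; a single failure takes everything down.
print(surviving_capacity_gbps(1, 40.0), surviving_fraction(1))  # 0.0 0.0

# Option B: four servers on 10 Gbps links; a single failure leaves 75% online.
print(surviving_capacity_gbps(4, 10.0), surviving_fraction(4))  # 30.0 0.75
```

The same functions cover planned maintenance: rebooting one of four servers at a time keeps 30 Gbps of link capacity available throughout.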
And of course it's harder to shut the relays down if they are spread out.
But this is where the advantages end, I think. For the downsides (with some nuance):
1. Managing four physical locations can be a pain (depending on physical distance between data centers).
2. Managing four different servers increases time spent on system administration, which may be a disadvantage. It's also technologically more challenging so it may offer a better learning experience :).
3. Assuming the same hardware generation/type, one system will practically always be much more power efficient than multiple systems. Any high-end server platform has quite a significant base power consumption. In my country electricity is extremely expensive, so this would be an important factor to consider. But if you live in Finland, Sweden or some of the low-power-cost parts of the US, this shouldn't be too important.
4. Running in four data centers is probably more expensive than running in one.
5. Running in four data centers requires you to keep up good relations with four data centers. Maintaining relationships (especially when they are getting abuse notices, legal threats etc. because of you) can be challenging, and the fewer you have, the easier it will be. On the other hand, it's also an advantage: if you have put all your eggs in one basket (one DC/AS) and they want to get rid of you, the migration is more difficult than when you have three other DCs where you can (temporarily) put your banned server or migrate your relay keys to.
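To make the base-power point (item 3 above) concrete, here is a small Python sketch. All wattages are hypothetical placeholders, not measurements, and the model assumes that the load-dependent power draw is the same in total however the work is split, so only the fixed per-machine base draw multiplies with the number of servers.

```python
def monthly_energy_kwh(
    servers: int,
    base_watts: float,
    load_watts_total: float,
    hours: float = 730.0,  # roughly one month
) -> float:
    """Monthly energy use for `servers` machines sharing one total workload.

    base_watts:        idle draw of each machine (hypothetical figure)
    load_watts_total:  extra draw caused by the workload, summed over all
                       machines (assumed independent of how it is split)
    """
    total_watts = servers * base_watts + load_watts_total
    return total_watts * hours / 1000.0


# Hypothetical figures: 150 W base draw per server, 400 W of load-dependent draw.
print(monthly_energy_kwh(1, 150.0, 400.0))  # 401.5
print(monthly_energy_kwh(4, 150.0, 400.0))  # 730.0
```

With these placeholder numbers the four-server setup uses almost twice the energy for the same work. Whether that matters depends on the local electricity price and on whether the data center bills power separately.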
Then, as to whether the relay type matters for A versus B: I don't think it matters much, but here it goes:
1. For exit relays, DNS query latency is important, so this would be easier with option A since you can then run one high-performance DNS recursor locally. With option B you would need to run (if latency is valued) at least four DNS recursors (decreasing cache hits considerably, in turn decreasing DNS privacy of users), but this has the added bonus of DNS redundancy/failover when you load balance (in failover mode) them between the four servers.
Also for B with exit relays, the 'rules' or 'limitations' imposed by the data center/autonomous system you run in may differ. For example, one of them could restrict certain ports while another is okay with using those ports for Tor. You could base your exit policy on the most restrictive DC/AS, or differentiate per DC/AS/server.
2. For guard relays, some headroom for (D)DoS attacks is good to have, and four servers with ample system memory will probably (generally, at least) have more total headroom than one big server. Also, adversaries would need to attack four different autonomous systems instead of one, which would probably make an attack a bit more expensive.
3. And for middle relays I can't think of anything that would (really) matter between A and B.
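For the exit-policy point under item 1, one possible way to derive a single policy from the most restrictive DC/AS is to intersect the ports each provider allows. The Python sketch below does that and prints torrc `ExitPolicy` lines. The data center names and port sets are invented for illustration, and a real exit policy would need more care (address ranges, port ranges, the reduced exit policy) than this shows.

```python
from typing import Dict, Iterable, List


def common_allowed_ports(allowed_by_dc: Dict[str, Iterable[int]]) -> List[int]:
    """Ports that every data center permits (intersection of all allow-lists)."""
    port_sets = [set(ports) for ports in allowed_by_dc.values()]
    if not port_sets:
        return []
    return sorted(set.intersection(*port_sets))


def render_exit_policy(ports: Iterable[int]) -> str:
    """Render torrc lines that accept the given ports and reject everything else."""
    lines = [f"ExitPolicy accept *:{port}" for port in ports]
    lines.append("ExitPolicy reject *:*")
    return "\n".join(lines)


# Hypothetical provider rules: dc-west does not want IMAPS (993) exits,
# dc-east does not want SSH (22) exits.
allowed_by_dc = {
    "dc-east": {80, 443, 993},
    "dc-west": {22, 80, 443},
}

print(render_exit_policy(common_allowed_ports(allowed_by_dc)))
# ExitPolicy accept *:80
# ExitPolicy accept *:443
# ExitPolicy reject *:*
```

To differentiate per DC/AS/server instead, call `render_exit_policy` on each provider's own port set and deploy the result only to the relays hosted there.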
I hope this provides some food for thought!
Cheers,
tornth
Mar 31, 2025, 10:42 by tor-relays@lists.torproject.org:
Looking for feedback on my Tor relay configuration decision:
Assuming both options cost the same and provide the same aggregate bandwidth—and by “better,” I mean improved advertised bandwidth, reliability, and diversity—is it preferable to deploy:
Option A: One hardware server on a 40Gbps link, or
Option B: Four hardware servers, each on a 10Gbps link, located in different geographic areas and/or with different upstream providers?
I lean toward Option B because it offers advantages in advertised bandwidth, reliability, and diversity. Additionally, does the ideal choice depend on whether the relays serve as exit nodes or as guard/middle nodes?
Also, is there a more comprehensive way to define “better” for this context?
I appreciate any suggestions or refinements!
Amazing response!
Agreed, and I am going with Option B. Geographic and datacenter diversity seems worth the additional administrative overhead.
To add some pricing: from my recent research and some diligent negotiations over the last 30 days, both Option A and Option B are roughly $2,000/mo in the US. Before negotiations, list and opening-conversation prices are 3-5x higher, $6,000/mo to $10,000/mo.
Had ChatGPT summarize into a blog post with a link back to this email thread: https://1aeo.com/blog/high-capacity-relays-single-vs-multiple-servers.html
Will send across some more questions / decision points and some research I've done for feedback as well on separate email threads.
Thanks!
On Monday, March 31st, 2025 at 3:15 AM, mail--- via tor-relays tor-relays@lists.torproject.org wrote: