commit bc6855ecce9af8335329bc39f9196c1e6e1ede01
Author: Tom van der Woerdt <info@tvdw.eu>
Date:   Mon Oct 12 20:05:51 2015 +0200

    Add proposal for load-balancing hidden services
---
 proposals/000-index.txt             |   2 +
 proposals/255-hs-load-balancing.txt | 157 +++++++++++++++++++++++++++++++++++
 2 files changed, 159 insertions(+)

diff --git a/proposals/000-index.txt b/proposals/000-index.txt
index c1f42e2..9debfba 100644
--- a/proposals/000-index.txt
+++ b/proposals/000-index.txt
@@ -175,6 +175,7 @@ Proposals by number:
 252 Single Onion Services [DRAFT]
 253 Out of Band Circuit HMACs [DRAFT]
 254 Padding Negotiation [DRAFT]
+255 Controller features to allow for load-balancing hidden services [DRAFT]
 
 
 Proposals by status:
@@ -197,6 +198,7 @@ Proposals by status:
    252 Single Onion Services
    253 Out of Band Circuit HMACs
    254 Padding Negotiation
+   255 Controller features to allow for load-balancing hidden services
  NEEDS-REVISION:
    190 Bridge Client Authorization Based on a Shared Secret
  OPEN:
diff --git a/proposals/255-hs-load-balancing.txt b/proposals/255-hs-load-balancing.txt
new file mode 100644
index 0000000..eaab035
--- /dev/null
+++ b/proposals/255-hs-load-balancing.txt
@@ -0,0 +1,157 @@
+Filename: 255-hs-load-balancing.txt
+Title: Controller features to allow for load-balancing hidden services
+Author: Tom van der Woerdt
+Created: 2015-10-12
+Status: draft
+
+1. Overview and motivation
+
+To address scaling concerns with the onion web, we want to be able to
+spread the load of hidden services across multiple machines.
+OnionBalance is a great stab at this, and it can currently give us 60x
+the capacity by publishing 6 separate descriptors, each with 10
+introduction points, but more is better. This proposal aims to address
+hidden service scaling up to a point where we can handle millions of
+concurrent connections.
+
+The basic idea involves splitting the 'introduce' from the
+'rendezvous', in the tor implementation, and adding new events and
+commands to the control specification to allow intercepting
+introductions and transmitting them to different nodes, which will then
+take care of the actual rendezvous. External controller code could
+relay the data to another node or a pool of nodes, all which are run by
+the hidden service operator, effectively distributing the load of
+hidden services over multiple processes.
+
+By cleverly utilizing the current descriptor methods through
+OnionBalance, we could publish up to sixty unique introduction points,
+which could translate to many thousands of parallel tor workers after
+implementing this proposal. This should allow hidden services to go
+multi-threaded with a few small changes, and continue scaling for a
+long time.
+
+
+2. Specification
+
+We propose two additions to the control specification, of which one is
+an event and the other is a new command. We also introduce two new
+configuration options.
+
+
+2.1. HiddenServiceAutomaticRendezvous configuration option
+
+The syntax is:
+    "HiddenServiceAutomaticRendezvous" SP [1|0] CRLF
+
+This configuration option is defined to be a boolean toggle which, if
+zero, stops the tor implementation from automatically doing a rendezvous
+when an INTRODUCE2 cell is received. Instead, an event will be sent to
+the controllers. If no controllers are present, the introduction cell
+should be dropped, as acting on it instead of dropping it could open a
+window for a DoS.
+
+This configuration option can be specified on a per-hidden service
+level, and can be set through the controller for ephemeral hidden
+services as well.
+
+
+2.2. HiddenServiceTag configuration option
+
+The syntax is:
+    "HiddenServiceTag" SP [a-zA-Z0-9] CRLF
+
+To identify groups of hidden services more easily across nodes, a
+name/tag can be given to a hidden service. Defaults to the storage path
+of the hidden service (HiddenServiceDir).
+
+
+2.3. The "INTRODUCE" event
+
+The syntax is:
+    "650" SP "INTRODUCE" SP HSTag SP RendezvousData CRLF
+
+    HSTag = the tag of the hidden service
+    RendezvousData = implementation-specific, but must not contain
+                     whitespace, must only contain human-readable
+                     characters, and should be no longer than 2048 bytes
+
+The INTRODUCE event should contain sufficient data to allow continuing
+the rendezvous from another Tor instance. The exact format is left
+unspecified and left up to the implementation. From this follows that
+only matching versions can be used safely to coordinate the rendezvous
+of hidden service connections.
+
+
+2.4. "PERFORM-RENDEZVOUS" command
+
+The syntax is:
+    "PERFORM-RENDEZVOUS" SP HSTag SP RendezvousData CRLF
+
+This command allows a controller to perform a rendezvous using data
+received through an INTRODUCE event. The format of RendezvousData is
+not specified other than that it must not contain whitespace, and
+should be no longer than 2048 bytes.
+
+
+2.5. The RendezvousData blob
+
+The "RendezvousData" blob is opaque to the controller, however the tor
+implementation should of course know how to deal with it. Its contents
+is the minimal amount of data required to process the INTRODUCE2 cell
+on another machine.
+
+Before proposal 224 is implemented, this could consist of the
+INTRODUCE2 cell payload, the key to decrypt the cell with if the cell
+is not already decrypted (which may be preferable, for performance
+reasons), and data necessary for other machines to recognize what to do
+with the cell.
+
+After proposal 224 is implemented, the blob would contain any
+additional keys needed to perform the rendezvous handshake.
+
+Implementations do not need to handle blobs generated by other versions
+of the software. Because of this, it is recommended to include a
+version number which can be used to verify that the blob is from a
+compatible implementation.
+
+
+3. Compatibility and security
+
+The implementation of these methods should, ideally, not change
+anything in the network, and all control changes are opt-in, so this
+proposal is fully backwards compatible.
+
+Controllers handling this data must be careful to not leak rendezvous
+data to untrusted parties, as it could be used to intercept and
+manipulate hidden services traffic.
+
+
+4. Example
+
+Let's take an example where a client (Alice) tries to contact Bob's
+hidden service. To do this, Bob follows the normal hidden service
+specification, except he sets up ten servers to do this. One of these
+publishes the descriptor, the others have this disabled. When the
+INTRODUCE2 cell arrives at the node which published the descriptor, it
+does not immediately try to perform the rendezvous, but instead outputs
+this to the controller. Through an out-of-band process this message is
+relayed to a controller of another node of Bob's, and this transmits
+the "PERFORM-RENDEZVOUS" command to that node. This node finally
+performs the rendezvous, and will continue to serve data to Alice,
+whose client will now not have to talk to the introduction point
+anymore.
+
+
+5. Other considerations
+
+We have left the actual format of the rendezvous data in the control
+protocol unspecified, so that controllers do not need to worry about
+the various types of hidden service connections, most notably proposal
+224.
+
+The decision to not implement the actual cell relaying in the tor
+implementation itself was taken to allow more advanced configurations,
+and to leave the actual load-balancing algorithm to the implementor of
+the controller. The developer of the tor implementation should not
+have to choose between a round-robin algorithm and something that could
+pull CPU load averages from a centralized monitoring system.
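
The controller-side glue described in section 4 of the proposal might look
roughly like the Python sketch below. It is not part of the commit and is
only an illustration under stated assumptions: the helper names
(parse_introduce_event, perform_rendezvous_command, dispatch), the worker
labels, and the "v1:..." blob contents are all hypothetical, and round-robin
is just one of the balancing strategies the proposal leaves to the
controller author. The sketch only builds the command bytes; how they reach
the other node's control port is left out.

```python
import itertools
from typing import Iterable, Iterator, NamedTuple, Optional, Sequence, Tuple

MAX_BLOB_LEN = 2048  # "should be no longer than 2048 bytes" (sections 2.3, 2.4)


class Introduction(NamedTuple):
    hs_tag: str
    rendezvous_data: str  # opaque to the controller (section 2.5)


def parse_introduce_event(line: str) -> Optional[Introduction]:
    """Parse '650 INTRODUCE <HSTag> <RendezvousData>'; return None otherwise."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 4 or parts[0] != "650" or parts[1] != "INTRODUCE":
        return None
    hs_tag, blob = parts[2], parts[3]
    # Only enforce the limits the proposal states: printable, no whitespace,
    # at most 2048 bytes. The blob itself is never interpreted.
    for field in (hs_tag, blob):
        if not field or not field.isascii() or not field.isprintable():
            return None
    if len(blob) > MAX_BLOB_LEN:
        return None
    return Introduction(hs_tag, blob)


def perform_rendezvous_command(intro: Introduction) -> bytes:
    """Build the PERFORM-RENDEZVOUS command line for another tor instance."""
    line = f"PERFORM-RENDEZVOUS {intro.hs_tag} {intro.rendezvous_data}\r\n"
    return line.encode("ascii")


def dispatch(
    lines: Iterable[str], workers: Sequence[str]
) -> Iterator[Tuple[str, bytes]]:
    """Yield (worker, command) pairs, assigning introductions round-robin."""
    if not workers:
        raise ValueError("at least one worker is required")
    pool = itertools.cycle(workers)
    for line in lines:
        intro = parse_introduce_event(line)
        if intro is not None:  # ignore unrelated control-port events
            yield next(pool), perform_rendezvous_command(intro)


if __name__ == "__main__":
    events = [
        "650 INTRODUCE shop v1:AAAA\r\n",
        "650 CIRC 5 BUILT\r\n",
        "650 INTRODUCE shop v1:BBBB\r\n",
    ]
    for worker, command in dispatch(events, ["worker-1", "worker-2"]):
        print(worker, command)
```

Keeping the blob as an uninterpreted string mirrors the proposal's intent
that controllers need not know about the underlying hidden service
protocol; swapping itertools.cycle for a load-aware picker would change
only the dispatch function.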