[tor-dev] Load Balancing in 2.7 series - incompatible with OnionBalance ?

Thu Oct 22 18:55:09 UTC 2015

On 22 Oct (16:30:55), Alec Muffett wrote:
> 
> info at tvdw.eu wrote:
> 
> > Hi Alec,
> 
> Hi Tom! I love your proposal, BTW. :-)
> 
> > Most of what you said sounds right, and I agree that caching needs TTLs (not just here, all caches need to have them, always).
> 
> Thank you!
> 
> > However, you mention that one DC going down could cause a bad experience for users. In most HA/DR setups I've seen there should be enough capacity if something fails, is that not the case for you? Can a single data center not serve all Tor traffic?
> 
> It's not the datacentre which worries me - we already know how to deal with those - it's the failure-based resource contention for the limited introduction-point space that is afforded by a maximum (?) of six descriptors each of which cites 10 introduction points.
> 
> A cap of 60 IPs is a clear protocol bottleneck which - even with your excellent idea - could break a service deployment.
> 
> Yes, in the meantime the proper solution is to split the service three ways, or even four, but that's administrative burden which less well-resourced organisations might struggle with.
> 
> Many (most?) will have a primary site and a single failover site, and it seems perverse that they could bounce just ONE of those sites and automatically lose 50% of their Onion capacity for up to 24 hours UNLESS they also take down the OTHER site for long enough to invalidate the OnionBalance descriptors.
> 
> Such is not the description of a high-availability (HA) service, and it might put people off.
> 
> > If that is a problem, I would suggest adding more data centers to the pool. That way if one fails, you don't lose half of the capacity, but a third (if N=3) or even a tenth (if N=10).
> 
> ...but you lose it for 1..24 hours, even if you simply reboot the Tor daemon.
> 
> > Anyway, such a thing is probably off-topic. To get back to the point about TTLs, I just want to note that retrying failed nodes until all fail is scary:
> 
> I find that worrying, also. I'm not sure what I think about it yet, though.
> 
> > what will happen if all ten nodes get a 'rolling restart' throughout the day? Wouldn't you eventually end up with all the traffic on a single node, as it's the only one that hadn't been restarted yet?
> 
> Precisely.
> 
> > As far as I can see the only thing that can avoid holes like that is a TTL, either hard coded to something like an hour, or just specified in the descriptor. Then, if you do a rolling restart, make sure you don't do it all within one TTL length, but at least two or three depending on capacity.
> 
> Concur.
> 
> 
> desnacked at riseup.net wrote:
> 
> > Please see rend_client_get_random_intro_impl(). Clients will pick a random intro point from the descriptor which seems to be the proper behavior here.
> 
> That looks great!
> 
> > I can see how a TTL might be useful in high availability scenarios like the one you described. However, it does seem like something with potential security implications (like, set TTL to 1 second for all your descriptors, and now you have your clients keep on making directory circuits to fetch your descs).
> 
> Okay, so, how about:
> 
> IDEA: if ANY descriptor introduction point connection fails AND the descriptor's ttl has been exceeded THEN refetch the descriptor before trying again?
> 
> It strikes me (though I may be wrong?) that the degenerate case for this would be someone with an onion killing their IP in order to force the user to refetch a descriptor - which is what I think would happen anyway?
> 
> At very least this proposal would add a work factor.
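
A minimal sketch of the rule proposed in the IDEA above, in Python rather
than tor's C. The names (CachedDescriptor, should_refetch,
MIN_TTL_SECONDS) and the one-hour floor are hypothetical and not part of
tor; the floor only illustrates one possible way to blunt the "TTL of 1
second" concern raised earlier in the thread:

```python
from dataclasses import dataclass
from typing import List

# Hypothetical floor so a service cannot force constant refetches.
MIN_TTL_SECONDS = 3600.0


@dataclass
class CachedDescriptor:
    intro_points: List[str]
    fetched_at: float  # seconds, same clock as `now` below
    ttl: float         # seconds, as advertised by the service (assumed field)


def should_refetch(desc: CachedDescriptor, intro_failed: bool, now: float) -> bool:
    """Refetch only if an intro point failed AND the TTL has been exceeded."""
    effective_ttl = max(desc.ttl, MIN_TTL_SECONDS)
    age = now - desc.fetched_at
    return intro_failed and age > effective_ttl
```

With this rule, a failure inside the TTL window just moves the client on
to another introduction point from the cached descriptor.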

Something I also mentioned on IRC: with a TTL, the circuit creation
behavior changes quite a bit.

For instance, if FB's descriptor has a TTL of 2 hours, this means that
there will be an HSDir fetch every two hours followed by an IP+RP dance.
It seems that this makes me, a malicious client Guard, better able to
identify every client going to Facebook, considering that you guys are
the only ones using a TTL of 2 hours.

Let's use your idea of "if one IP fails and TTL expired then re-fetch".
This could also make it "easier" to identify people connecting to
Facebook. As your client guard, I see you do the fetch + IP/RP dance (3
circuits in short period of time where two are killed). I wait 2 hours
and then kill all circuits passing through me from you. If I can see
again that distinctive HS pattern (3 circuits), I get closer to knowing
that you are accessing FB. (I can repeat that several times to
confirm.)

All in all, a TTL in the descriptor changes things enough, imo, to
reveal *which* descriptor a client is using, since as your guard I can
induce your client to behave according to the TTL and reveal patterns.

Seems like we need a common behavior for all HS clients here, and that
would be "if _any_ IP/RP fails, re-fetch", but that's going to be quite
heavy on the network I think.

But let's keep thinking about crazy ideas here like "client keeps a
circuit to the HSDir until rotation and, if an IP/RP dies, asks whether
the descriptor has changed by sending a hash of its current desc.; if
so, fetch, else keep going with the current set of IPs." ? (with netflow
padding, this will be much harder for a malicious guard to recognize)
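
A rough Python sketch of that "ask the HSDir whether the descriptor
changed" idea. Everything here is hypothetical: HSDirStub stands in for
the HSDir end of a kept-open circuit, and SHA-256 is only an assumed
choice of hash, not something any Tor spec defines for this purpose:

```python
import hashlib


def descriptor_digest(desc: bytes) -> str:
    # Assumed hash; the thread does not say which one would be used.
    return hashlib.sha256(desc).hexdigest()


class HSDirStub:
    """Toy stand-in for an HSDir holding the latest descriptor."""

    def __init__(self, current: bytes) -> None:
        self.current = current
        self.full_fetches = 0

    def has_changed(self, client_digest: str) -> bool:
        return descriptor_digest(self.current) != client_digest

    def fetch(self) -> bytes:
        self.full_fetches += 1
        return self.current


def on_intro_failure(cached: bytes, hsdir: HSDirStub) -> bytes:
    """On an IP/RP failure, fetch only if the HSDir's copy differs."""
    if hsdir.has_changed(descriptor_digest(cached)):
        return hsdir.fetch()
    return cached  # unchanged: keep using the current set of IPs
```

The point of the digest round-trip is that an unchanged descriptor costs
one small query instead of a full fetch.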

(IMO, this is definitely a problem that we need to solve for load
balancing and performance so let's keep throwing ideas until we get to
something useful we could use to draft a proposal.)

Cheers!
David

> 
> > For this reason I'd be interested to see this specified in a formal Tor proposal (or even as a patch to prop224). It shouldn't be too big! :)
> 
> I would hesitate to add it to Prop 224 which strikes me as rather large and distant.  I'd love to see this by Christmas :-P
> 
> 
> teor2345 at gmail.com wrote:
> 
> > Do we connect to introduction points in the order they are listed in the descriptor? If so, that's not ideal, there are surely benefits to a random choice (such as load balancing).
> 
> Apparently not (re: George) :-)
> 
> > That said, we believe that rendezvous points are the bottleneck in the rendezvous protocol, not introduction points.
> 
> Currently, and in most current deployments, yes.
> 
> > However, if you were to use proposal #255 to split the introduction and rendezvous to separate tor instances, you would then be limited to:
> > - 6*10*N tor introduction points, where there are 6 HSDirs, each receiving 10 different introduction points from different tor instances, and N failover instances of this infrastructure competing to post descriptors. (Where N = 1, 2, 3.)
> > - a virtually unlimited number of tor servers doing the rendezvous and exchanging data (say 1 server per M clients, where M is perhaps 100 or so, but ideally dynamically determined based on load/response time).
> > In this scenario, you could potentially overload the introduction points.
> 
> Exactly my concern, especially when combined with overlong lifetimes of mostly-zombie descriptors.
> 
> - alec
> 
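
For reference, the capacity arithmetic quoted above can be written out
as a small Python sketch. The constants are the figures used in this
thread (6 descriptors, 10 introduction points each); the function names
are made up for illustration:

```python
HSDIRS = 6                # descriptors/HSDirs, as assumed in the thread
IPS_PER_DESCRIPTOR = 10   # introduction points cited per descriptor


def max_intro_points(n_instances: int = 1) -> int:
    """Upper bound 6*10*N on introduction points for N failover instances."""
    return HSDIRS * IPS_PER_DESCRIPTOR * n_instances


def capacity_lost(n_sites: int, failed_sites: int = 1) -> float:
    """Fraction of capacity stuck behind stale descriptors when sites fail,
    assuming introduction points are spread evenly across sites."""
    return failed_sites / n_sites
```

For example, max_intro_points() gives the cap of 60 mentioned earlier,
and capacity_lost(2) gives the 50% loss for a primary plus one failover
site.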
