[tor-dev] Proposal XXX: FlashFlow: A Secure Speed Test for Tor (Parent Proposal)

teor teor at riseup.net
Sun May 17 02:43:21 UTC 2020


> On 16 May 2020, at 16:05, Mike Perry <mikeperry at torproject.org> wrote:
> 
>> On 4/23/20 1:48 PM, Matt Traudt wrote:
>> 
>> 5.4 Other Changes/Investigations/Ideas
>> 
>> - How can FlashFlow data be used in a way that doesn't lead to poor
>>  load balancing given the following items that lead to non-uniform
>>  client behavior:
>>    - Guards that high-traffic HSs choose (for 3 months at a time)
>>    - Guard vs middle flag allocation issues
>>    - New Guard nodes (Guardfraction)
>>    - Exit policies other than default/all
>>    - Directory activity
>>    - Total onion service activity
>>    - Super long-lived circuits
>> - What is the explanation for dennis.jackson's scary graphs in this [2]
>>  ticket?  Was it because of the speed test? Why? Will FlashFlow produce
>>  the same behavior?
> 
> It will also be wise to provide a way for relays to signify that they
> are on the same machine. I bet concurrent machine deployments are one of
> the top contributors to the long tail of bad perf we saw caused by the
> FlashFlow experiment[2]. If FlashFlow measures each such relay as having
> the full link capacity instead of a shared fraction, this is obviously
> going to result in overload on those relays, leading to a long tail of
> bad perf when they are chosen and are also overloaded. It is unlikely
> that we can deploy a FlashFlow that has this long tail perf problem
> without fixing this and related balancing issues (though hopefully most
> will be smoothed over by sbws).
> 
> This is a little tricky, because we might not want rogue relays joining
> each other's "machines" (similar to the Family problem), but for testing,
> something as simple as how MyFamily works would be great. Ideally,
> though, relays would ask or detect that they are concurrently running in
> nearby IP space and either warn the operator to set the flag, or set it
> automatically.
> 
> We actually have this work included in a future performance funding
> proposal, but the timeline on that getting approved (or even rejected)
> is so far out that we should figure out a way to do this before that,
> especially if FlashFlow development is going to begin soon.

We could assume that relays on the same IPv4 /24 or IPv6 /48 share a
network link, and re-do the experiment.
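
A minimal sketch of that grouping heuristic, using only Python's standard
ipaddress module. The function names (link_key, group_by_link) and the
(nickname, address) input shape are hypothetical, chosen for illustration;
nothing here is taken from FlashFlow or sbws code.

```python
import ipaddress
from collections import defaultdict

def link_key(addr, v4_prefix=24, v6_prefix=48):
    """Return the covering network used as the assumed 'shared link' key."""
    ip = ipaddress.ip_address(addr)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its covering network.
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)

def group_by_link(relays, v4_prefix=24, v6_prefix=48):
    """Group (nickname, address) pairs by their assumed shared link."""
    groups = defaultdict(list)
    for nickname, addr in relays:
        groups[link_key(addr, v4_prefix, v6_prefix)].append(nickname)
    return dict(groups)

if __name__ == "__main__":
    # Documentation-range addresses, not real relays.
    example = [
        ("a", "192.0.2.10"),
        ("b", "192.0.2.200"),
        ("c", "198.51.100.1"),
        ("d", "2001:db8:1::1"),
        ("e", "2001:db8:1:ffff::2"),
    ]
    for net, names in group_by_link(example).items():
        print(net, names)
```

Relays that land in the same group would then be treated as sharing one
link when interpreting the measurements.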

Then we could tweak the network size based on those results. We'd
need to compromise between "false sharing" and "missed sharing".
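
One way to compare candidate network sizes, sketched under the assumption
that we have some ground truth about which relay pairs really share a link
(for example, from operators who tell us). Here I'm reading "false sharing"
as pairs the heuristic groups that don't actually share, and "missed
sharing" as pairs that do share but aren't grouped. The names
(sharing_errors, truly_shared) are hypothetical.

```python
import ipaddress
from itertools import combinations

def same_prefix(addr_a, addr_b, v4_prefix, v6_prefix):
    """True if both addresses fall in the same network of the given size."""
    ip_a = ipaddress.ip_address(addr_a)
    ip_b = ipaddress.ip_address(addr_b)
    if ip_a.version != ip_b.version:
        return False
    prefix = v4_prefix if ip_a.version == 4 else v6_prefix
    net = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ip_b in net

def sharing_errors(relays, truly_shared, v4_prefix=24, v6_prefix=48):
    """Count (false_sharing, missed_sharing) pairs for one prefix choice.

    relays: dict of nickname -> address
    truly_shared: set of frozenset({nick1, nick2}) known to share a link
    """
    false_sharing = 0
    missed_sharing = 0
    for a, b in combinations(sorted(relays), 2):
        predicted = same_prefix(relays[a], relays[b], v4_prefix, v6_prefix)
        actual = frozenset((a, b)) in truly_shared
        if predicted and not actual:
            false_sharing += 1
        elif actual and not predicted:
            missed_sharing += 1
    return false_sharing, missed_sharing
```

Running this over a range of prefix lengths would show where the two error
counts trade off against each other.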

Then individual operators could fine-tune that initial heuristic using the
"same network link" config.
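
A rough sketch of how operator declarations could be layered on top of the
heuristic. This assumes the "same network link" config only ever adds
sharing (as MyFamily adds family members) and is represented as sets of
nicknames; both are assumptions, since no such option exists yet.
merge_groups and link_key are hypothetical names.

```python
import ipaddress

def link_key(addr, v4_prefix=24, v6_prefix=48):
    """Return the covering network used as the assumed 'shared link' key."""
    ip = ipaddress.ip_address(addr)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)

def merge_groups(relays, declared):
    """Union heuristic groups with operator-declared groups.

    relays: dict of nickname -> address
    declared: iterable of sets of nicknames declared to share a link
    Returns a sorted list of sorted nickname lists.
    """
    parent = {name: name for name in relays}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Step 1: heuristic grouping by network prefix.
    first_in_net = {}
    for name, addr in relays.items():
        key = link_key(addr)
        if key in first_in_net:
            union(name, first_in_net[key])
        else:
            first_in_net[key] = name

    # Step 2: operator declarations (unknown nicknames are ignored).
    for group in declared:
        members = [m for m in group if m in relays]
        for other in members[1:]:
            union(members[0], other)

    merged = {}
    for name in relays:
        merged.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in merged.values())
```

Letting operators split a group that the heuristic wrongly merged would
need a different mechanism, which this sketch doesn't cover.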

(This is similar to how MyFamily works: Tor assumes that relays in the
same IPv4 /16 and IPv6 /32 have the same network operator. Then
individual relay operators can declare extra families using MyFamily.)

T

