[tor-dev] Proposal 328: Make Relays Report When They Are Overloaded

David Goulet dgoulet at torproject.org
Wed Mar 3 13:14:08 UTC 2021


On 02 Mar (20:58:43), Mike Perry wrote:
> 
> 
> On 3/2/21 6:01 PM, George Kadianakis wrote:
> > 
> > David Goulet <dgoulet at torproject.org> writes:
> > 
> >> Greetings,
> >>
> >> Attached is a proposal from Mike Perry and me. Merge request is here:
> >>
> >> https://gitlab.torproject.org/tpo/core/torspec/-/merge_requests/22
> >>
> > 
> > Hello all,
> > 
> > while working on this proposal I had to change it slightly to add a few
> > more metrics and also to simplify some engineering issues that we would
> > encounter. You can find the changes here:
> >            https://gitlab.torproject.org/asn/torspec/-/commit/b57743b9764bd8e6ef8de689d14483b7ec9c91ec
> > 
> > Mike, based on your comments in the #40222 ticket, I would appreciate
> > comments on the way the DNS issues will be reported. David argued that
> > they should not be part of the "overload-general" line because they are
> > not an overload and it's not the fault of the network in any way. This
> > is why we added them as separate lines. Furthermore, David suggested we
> > turn them into a threshold "only report if 25% of the total requests
> > have timed out" instead of "only report if at least one timeout has
> > occurred" since that would be more useful.
> 
> I'm confused by this confusion. There's pretty clear precedent for
> treating packet drops as a sign of network capacity overload. We've also
> seen it experimentally specifically with respect to DNS, during Rob's
> experiment. We discussed this on Monday.
> 
> However, I agree there's a chance that a single packet drop can be
> spurious, and/or could be due to ephemeral overload of the kind that TCP
> congestion causes. But 25% is waaaaaaaaaay too high. Even 1% is high IMO, but is
> more reasonable. We should ask some exits what they see now. The fact
> that our DNS scanners are not currently seeing this at all, and the
> issue appeared only for the exact duration of Rob's experiment, suggests
> that DNS packet drops are extremely rare in healthy network conditions.

Ok, likely 25% is way too high indeed.

The idea behind this was simply that a network hiccup or a temporarily faulty
DNS server should not move traffic away from the Exit for a 72h period
(reminder that the "overload-general" line sticks for 72h in the extrainfo
once hit).
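
To make the threshold idea concrete, here is a rough sketch of the check being
discussed. The function name, its parameters and the 1% default are
assumptions for illustration only (1% is the figure Mike floated above, 25%
was the earlier suggestion); nothing here is taken from the tor code base.

```python
def dns_timeout_exceeded(timeouts: int, total: int, threshold: float = 0.01) -> bool:
    """Return True if the fraction of timed-out DNS requests reaches the threshold.

    Hypothetical helper: `timeouts` and `total` are counters a relay would
    keep over some measurement period; `threshold` is a fraction (0.01 == 1%).
    """
    if total <= 0:
        # No requests seen in the period, so there is nothing to report.
        return False
    return timeouts / total >= threshold


# A single timeout out of 1000 requests stays below a 1% threshold,
# while 20 out of 1000 crosses it.
print(dns_timeout_exceeded(1, 1000))
print(dns_timeout_exceeded(20, 1000))
```

With a ratio like this, one spurious drop does not trigger the 72h sticky
line, yet a sustained failure rate still does.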

> 
> Furthermore, revealing the specific type of overload condition
> increases the adversary's ability to use this information for
> various attacks. I'd rather it be combined in all cases, so that the
> specific cause is not visible. In all cases, the reaction of our systems
> should be the same: direct less load to relays with this line. If we
> need to dig, that's what MetricsPort is for.
> 
> In fact, this DNS packet drop signal may be particularly useful in
> traffic analysis attacks. Its reporting, and likely all of this overload
> reporting, should probably be delayed until something like the top of
> the hour after it happens. We may even want this delay to be a consensus
> parameter. Something like "Report only after N minutes", or "Report only
> N minute windows", perhaps?

Yes, definitely, and I would even add a random component to this so that
relays do not all report an overload in a predictable timeframe, which would
prevent the "if the line appears, I know it was hit N hours ago" type of
calculation.
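
A minimal sketch of what a delayed, randomized report time could look like,
combining Mike's "top of the hour" idea with a random offset. The function
name, the one-hour window and the 30-minute maximum jitter are all
assumptions for illustration; in practice the window might be the consensus
parameter mentioned above.

```python
import random


def earliest_report_time(event_ts, window_s=3600, max_jitter_s=1800, rng=None):
    """Return a Unix timestamp at which an overload event may first be reported.

    Hypothetical helper: the event is held back until the next window
    boundary after it happened (e.g. the top of the next hour), then a
    random offset of 0..max_jitter_s seconds is added so that the report
    time does not reveal when the event occurred.
    """
    # A real implementation would want a cryptographically strong RNG,
    # e.g. random.SystemRandom(); a seedable one is accepted here for testing.
    rng = rng or random.SystemRandom()
    next_boundary = (event_ts // window_s + 1) * window_s
    return next_boundary + rng.randrange(max_jitter_s + 1)


# An event 100 seconds into the second hour is reported no earlier than
# the start of the third hour, plus up to 30 minutes of jitter.
t = earliest_report_time(3700)
print(7200 <= t <= 7200 + 1800)
```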

Cheers!
David

-- 
QlSpNB+aSzOYvM3E0etjbW84Wyx4/7PrwKfWOtmEgE0=