[tor-dev] Proposal Waterfilling

Mon Mar 5 22:30:00 UTC 2018

Hello,

I recently took the time to read the waterfilling paper. I’m not sure it’s a good idea, even for the goal of increasing the cost of traffic correlation attacks. It depends on whether it is easier for an adversary to run many small relays of total weight x or a few large relays of total weight y, where x = y*c with c the fraction of a Guard-flagged relay used in the guard position (I believe that c=2/3 currently, as Wgg=7268 and Wmg=2732). Just to emphasize it: waterfilling requires *less bandwidth* to achieve a given guard probability than is needed in Tor currently.

Based on prices I’ve seen (~$2/IP/month vs. ~$500/Gbps/month), it’s significantly cheaper to add a new relay than it is to add bandwidth commensurate with the highest-bandwidth relays. If an adversary finds it easier to compromise machines, then waterfilling might help as it lowers the guard probability of high-bandwidth relays. However, for adversaries with the resources to possess zero-day vulnerabilities against the well-run high-bandwidth relays, it seems to me that those adversaries would easily also have the resources to run relays instead, and in fact it would probably be cheaper for them to run relays as zero-days are expensive. Adversaries with botnets, which have many IPs but generally low bandwidth, would benefit from waterfilling, as it would increase the number of clients choosing them as guards that they can then attack. Waterfilling doesn’t clearly make things better or worse against network-level adversaries.
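
As a rough illustration of the comparison above, here is a minimal Python sketch. The two prices and c=2/3 are the figures quoted in this message; the per-relay bandwidth `relay_gbps`, the function names, and the assumption that the only costs are IPs and bandwidth are hypothetical and chosen purely for illustration. It also assumes every small relay sits below the water level, so that all of its weight counts in the guard position.

```python
import math

IP_COST_USD = 2.0      # per IP per month (rough price quoted above)
GBPS_COST_USD = 500.0  # per Gbps per month (rough price quoted above)


def monthly_cost(num_relays, total_gbps):
    """Monthly cost of num_relays relays (one IP each) sharing total_gbps."""
    return num_relays * IP_COST_USD + total_gbps * GBPS_COST_USD


def small_relay_cost(y_gbps, c, relay_gbps):
    """Cost of matching, under waterfilling, the guard probability that
    y_gbps of large-relay bandwidth obtains in Tor today.

    Hypothetical model: the adversary needs x = y * c of bandwidth, split
    across relays of relay_gbps each, all assumed below the water level.
    """
    x_gbps = y_gbps * c
    num_relays = math.ceil(x_gbps / relay_gbps)
    return monthly_cost(num_relays, x_gbps)


# One large 1 Gbps relay today vs. many 10 Mbps relays under waterfilling.
large = monthly_cost(1, 1.0)
small = small_relay_cost(1.0, 2 / 3, 0.01)
print(f"large relay: ${large:.2f}/month, small relays: ${small:.2f}/month")
```

In this model the bandwidth term dominates, so the extra IPs cost little compared with the bandwidth saved by needing only x = y*c.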

Thus, it doesn’t seem to me that waterfilling protects Tor’s users against their likely adversaries, and in fact it is likely to make things less secure in a few important cases.

Best,
Aaron

> On Jan 31, 2018, at 5:01 PM, teor <teor2345 at gmail.com> wrote:
> 
> 
> On 1 Feb 2018, at 07:15, Florentin Rochet <florentin.rochet at uclouvain.be> wrote:
> 
>>> On 18/01/18 01:03, teor wrote:
>>> 
>>>> I've added this concern within the 'unanswered questions' section. This
>>>> proposal assumes relay measurements are reliable (consensus weight).
>>> How reliable?
>>> 
>>> Current variance is 30% - 40% between identical bandwidth authorities, and
>>> 30% - 60% between all bandwidth authorities.
>>> 
>>> Sources:
>>> https://tomrittervg.github.io/bwauth-tools/#apples-to-apples-comparison
>>> https://tomrittervg.github.io/bwauth-tools/#updated-01
>>> 
>>> Is this sufficient?
>> 
>> My apologies, I was not specific enough: we assume bandwidth
>> measurements are reliable as a hypothesis to make the claim that
>> Waterfilling is not going to reduce or improve performance. If these
>> measurements are not reliable enough, then Waterfilling might make
>> things better, worse or both compared to the current bandwidth-weights
>> in some unpredictable way.
> 
> This variance is measurement error. In this case, discretization error is
> less than 1%.
> 
> We need to know whether measurement inaccuracy makes the network
> weights converge or diverge under your scheme.
> 
> It looks like they converge on the current network with the current
> bandwidth authorities. This is an essential property we need to keep.
> 
>> All of this depends on the bandwidth
>> authority. Anyway, I willingly agree that we need some kind of tools
>> able to report on such modification. Besides, those tools could be
>> reused for any new proposal impacting the path selection, such as
>> research protecting against network adversaries or even some of the
>> changes you already plan to do (such as Prop 276).
> 
> Yes, we are hoping to introduce better tools over time.
> 
>>> <skip>
>>>> …
>>>> 
>>>> - The upper bound in (a) is huge, and would be welcomed by an
>>>> adversary running relays. The adversary could manage to set up relays with
>>>> almost 2 times the consensus weight of the water level, and still be
>>>> used at 100% in the entry position. This would greatly reduce the benefits
>>>> of this proposal, right?
>>> I do not know how much the benefits of the proposal depend on the exact
>>> water level, and how close relays are to the water level.
>>> 
>>> …
>>> 
>>> How much variance will your proposal tolerate?
>>> Because current variance is 30% - 60% anyway (see above).
>> 
>> The variance is not a problem if the water level is adapted
>> (re-computed) at each consensus.
> 
> I'm not sure we're talking about the same thing here.
> The variance I am talking about here is measurement error and
> discretization error. Re-computation doesn't change the error.
> (And going from relay measurement to consensus bandwidth can take hours.)
> 
> See my comment above about convergence: we need to converge in
> the presence of discretization error, too.
> 
>>> …
>>> 
>>>> With your explanations below (weight change on clients), and given that
>>>> the consensus diff size is a thing, I am leaning to believe that the
>>>> weight calculation should be done on clients. Anyway, I have added a
>>>> remark about this possibility within the proposal.
>>> Another alternative is to apply proposal 276 weight rounding to these
>>> weights as well.
>>> 
>>> https://gitweb.torproject.org/torspec.git/tree/proposals/276-lower-bw-granularity.txt
>>> 
>>> I think this may be our best option. Because running all these divisions on
>>> some mobile clients will be very slow and cost a lot of power.
>> 
>> Added this to the proposal. We might also "divide" the algorithm: what
>> about computing the weights on dirauths but broadcasting only the pivot
>> (the index of the relay at the water level)? Clients can then resume the
>> computation and produce the weights themselves with a reduced cost.
>> Strength:
>>  - The weight calculation would be O(n) on clients (n being the size of
>> the guard set) instead of O(n*log(n))
>>  - No impact on the consensus diff (well, except 1 line, the pivot value).
>> Weakness:
>>  - We still have O(n) divisions on the client, each time we download a
>> new consensus.
> 
> Why not list the waterfilling level on a single line in the consensus?
> 
> That way:
> * authorities do the expensive calculation
> * clients can re-weight relays using a simple calculation:
> 
> if the relay's weight is less than or equal to the waterfilling level:
>  use the relay's weight as its guard weight
>  use 0 as its middle weight
> otherwise:
>  use the waterfilling level as the relay's guard weight
>  use the relay's weight minus the waterfilling level as its middle weight
> 
> This is O(n) and requires one comparison and one subtraction per relay in the worst case.
> 
> T
> _______________________________________________
> tor-dev mailing list
> tor-dev at lists.torproject.org
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
>
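
The client-side re-weighting rule teor describes in the quoted message can be sketched in Python as follows. This is only an illustration of that rule as written: the function names, the dictionary layout, and the example weights are hypothetical, and nothing here reflects how Tor actually parses or stores the consensus.

```python
def split_weight(relay_weight, water_level):
    """Return (guard_weight, middle_weight) for a single relay.

    Relays at or below the water level are used entirely in the guard
    position; heavier relays are capped at the water level as guards and
    the remainder of their weight goes to the middle position.
    """
    if relay_weight <= water_level:
        return relay_weight, 0
    return water_level, relay_weight - water_level


def reweight(consensus_weights, water_level):
    """Apply split_weight to every relay: one pass, O(n) overall.

    consensus_weights is a hypothetical {relay_name: weight} mapping.
    """
    return {
        name: split_weight(weight, water_level)
        for name, weight in consensus_weights.items()
    }


# Made-up example weights, with a made-up water level of 100.
example = {"small": 40, "at_level": 100, "large": 250}
for name, (guard, middle) in reweight(example, 100).items():
    print(name, "guard:", guard, "middle:", middle)
```

Each relay costs one comparison and at most one subtraction, with no divisions on the client, which is the property the quoted message is after.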