[tor-dev] Proposal Waterfilling

Wed Mar 7 22:12:19 UTC 2018

Hello,

On 2018-03-07 14:31, Aaron Johnson wrote:
> Hello friends,
>
>> 1) The cost of IPs vs. bandwidth is definitely a function of market
>> offers. Your $500/Gbps/month seems quite expensive compared to what
>> can be found on OVH (which is hosting a large number of relays): they
>> ask ~3 euros/IP/month, including unlimited 100 Mbps traffic. If we
>> assume that wgg = 2/3 and a water level at 10Mbps, this means that,
>> if you want to have 1Gbps of guard bandwidth,
>> - the current Tor mechanisms would cost you 3 * 10 * 3/2 = 45 euros/month
>> - the waterfilling mechanism would cost you 3 * 100 = 300 euros/month
>
> The question of what the cheapest attack is can indeed be estimated by
> looking at market prices for the required resources. Your cost
> estimate of 3.72 USD/Gbps/month for bandwidth seems off by two orders
> of magnitude.
>

Let me merge your second answer here:

> I see that I misread your cost calculation, and that you estimated
> $37.20/Gbps/month instead of $3.72/Gbps/month. This still seems low by
> an order of magnitude. Thus, my argument stands: waterfilling would
> appear to decrease the cost to an adversary of getting guard
> probability compared to Tor’s current weighting scheme.

There is still something wrong.  Let's assume the adversary wants to run
1 Gbps of real guard bandwidth.

With vanilla Tor, the cheapest (considering only OVH) is:

VPS SSD 1 (https://www.ovh.com/fr/vps/vps-ssd.xml): You need 10 of them
to reach 1Gbps of bandwidth, but you need 15 of them to actually relay 1
Gbps in the guard position (due to wgg = 2/3 roughly). This is our
calculation above: 3*10*3/2 = 45 euros/month (or a few more dollars).

With Waterfilling, we assume above a water level of 10 Mbps, so we need:

100 VPS SSD 1 to relay 1 Gbps in the guard position, which brings the
cost to 3*100 = 300 euros/month.
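
As a rough sketch of the arithmetic above (the function and constant
names are only illustrative and not part of the proposal; the prices,
wgg = 2/3 and the 10 Mbps water level are the assumptions already stated
in this thread):

```python
import math
from fractions import Fraction

PRICE_PER_VPS_EUR = 3        # ~3 euros/IP/month, as quoted for OVH
VPS_BANDWIDTH_MBPS = 100     # "unlimited 100 Mbps" per VPS
WGG = Fraction(2, 3)         # assumed share of a guard's bandwidth used as guard
WATER_LEVEL_MBPS = 10        # assumed water level


def vanilla_cost(target_mbps):
    """Relays and euros/month needed with the current weighting."""
    # Each VPS only spends a WGG fraction of its bandwidth in the guard position.
    guard_mbps_per_vps = VPS_BANDWIDTH_MBPS * WGG
    n = math.ceil(target_mbps / guard_mbps_per_vps)
    return n, n * PRICE_PER_VPS_EUR


def waterfilling_cost(target_mbps):
    """Relays and euros/month needed when guard use is capped at the water level."""
    guard_mbps_per_vps = min(VPS_BANDWIDTH_MBPS, WATER_LEVEL_MBPS)
    n = math.ceil(target_mbps / guard_mbps_per_vps)
    return n, n * PRICE_PER_VPS_EUR


print(vanilla_cost(1000))       # (15, 45)
print(waterfilling_cost(1000))  # (100, 300)
```

Under these assumptions the per-relay cap, rather than the raw bandwidth
price, is what drives the adversary's cost.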

> The numbers I gave ($2/IP/month and $500/Gbps/month) are the amounts
> currently charged by my US hosting provider. At the time that I
> shopped around (which was in 2015), it was by far the best bandwidth
> cost that I was able to find, and those costs haven’t changed much
> since then.
>
> Currently on OVH the best I could find for hosting just now was
> $93.02/per month for 250Mbps unlimited
> (https://www.ovh.co.uk/dedicated_servers/hosting/1801host01.xml). This
> yields $372.08/Gbps/month. I am far from certain that this is the best
> price that one could find - please do point me to better pricing if
> you have it!
>
> I also just looked at Hetzner - another major Tor-friendly hosting
> provider. The best I could find was 1Gbps link capped at 100TB/month
> for $310.49 (https://wiki.hetzner.de/index.php/Traffic/en). 1Gbps
> sustained upload is 334.8Terabytes (i.e. 1e12 bytes) for a 31-day
> month. If you exceed that limit, you can arrange to pay $1.24/TB.
> Therefore I would estimate the cost to be $601.64/Gbps/month. Again, I
> may be missing an option more tailored to a high-traffic server, and I
> would be happy to be pointed to it :-)
>
> Moreover, European bandwidth costs are among the lowest in the world.
> Other locations are likely to have even higher bandwidth costs
> (Australia, for example, has notoriously high bandwidth costs).
>
>> We do not believe that this is conclusive, as the market changes, and
>> there certainly are dozens of other providers.
>
> I do agree that the market changes, and in fact I expect the cost of
> IPs to plummet as the shift to IPv6 becomes pervasive. The current
> high cost of IPv4 addresses is due to their recent scarcity. In any
> case, a good question to ask would be how Tor should adjust to changes
> in market pricing over time.
>
>> The same applies for 0-day attacks: if you need to buy them just for
>> attacking Tor, then they are expensive. If you are an organization in
>> the business of handling 0-day attacks for various other reasons,
>> then the costs are very different. And it may be unclear to determine
>> if it is easier/cheaper to compromise 1 top relay or 20 mid-level relays.
>
> I agree that the cost of compromising machines is unclear. However, we
> should guess, and the business of 0-days has provided some signals of
> their value in terms of their price. 0-days for the Tor software stack
> are expensive, as, for security reasons, (well-run) Tor relays run few
> services other than the tor process. I haven’t seen great data on
> Linux zero-days, but recently a Windows zero-day (Windows being the
> second most-common Tor relays OS) appeared to cost $90K
> (https://www.csoonline.com/article/3077447/security/cost-of-a-windows-zero-day-exploit-this-one-goes-for-90000.html).
> Deploying a zero-day does impose a cost, as it increases the chance of
> that exploit being discovered and its value lost. Therefore, such
> exploits are likely to be deployed only on high-value targets. I would
> argue that Tor relays are unlikely to be such a target because it is
> so much cheaper to simply run your own relays. An exception could be a
> specific targeted investigation in which some suspect is behind a
> known relay (say, a hidden service behind a guard), because running
> other relays doesn’t help dislodge the target from behind its existing
> guard.
>
>> And we are not sure that the picture is so clear about botnets
>> either: bots that can become guards need to have high availability
>> (in order to pass the guard stability requirements), and such high
>> availability bots are also likely to have a bandwidth that is higher
>> than the water level (abandoned machines in university networks,
>> ...). As a result, waterfilling would increase the number of high
>> availability bots that are needed, which is likely to be hard.
>
> This doesn’t seem like a good argument to me: “bots that become guards
> must have high availability, and thus they likely have high
> bandwidth”. How many bots would become guards in the first place? And
> why would availability (by which I understand you to mean uptime)
> imply bandwidth?

Our argument is speculative, but here it is: many computers in botnets
have diurnal behavior (home computers) and so could not get the Guard
flag. Computers with good uptime tend to be located in fairly large
companies or universities, because people are too lazy to turn them off
or the internal sysadmins don't want them to. Those organizations also
have good connectivity, so such machines would probably be *above* the
water level.

Anyway, I think both of our opinions about botnets are speculative, and
either of us might be right, wrong, or a bit of both. I suggest we debate
the other points :)

> The economics matter here, and I don’t know too much about botnet
> economics, but my impression is that they generally include many
> thousands of machines and that each bot is generally quickly shut down
> by its service provider once it starts spewing traffic (i.e. acting as
> a high-bandwidth Tor relay). Thus waterfilling could benefit botnets
> by giving them more clients to attack while providing a small amount
> of bandwidth that falls below the radar of their ISP. This is a
> speculative argument, I admit, but seems to me to be somewhat more
> logical than the argument you outlined.
>
>> 2) Waterfilling makes it necessary for an adversary to run a larger
>> number of relays. Apart from the costs of service providers, this
>> large number of relays need to be managed in an apparently
>> independent way, otherwise they would become suspicious to community 
>> members, like nusenu who is doing a great job spotting all anomalies.
>> It seems plausible that running 100 relays in such a way that they
>> look independent is at least as difficult as doing that with 10 relays.
>
> Why is running a large number of relays more noticeable than running a
> high-bandwidth relay? Actually, it seems, if anything, *less*
> noticeable. An attacker could even indicate that all the relays are in
> the same family, and there is no Tor policy that would kick them out
> of the network for being “too large” of a family. If Tor wants to
> limit the size of single entities, then they would have to kick out
> some large existing families (Team Cymru, torservers.net
> <http://torservers.net>, and the Chaos Communication Congress come to
> mind), and moreover such a policy could apply equally well to total
> amounts of bandwidth as to total number of relays.
>

That depends on the kind of policy that the Tor network could put in
place. If we decide that large families become a threat in end
positions, we may just aggregate all the bandwidth of the family and
apply Waterfilling to it. That would not kick them out, but would create
a kind of 'quarantine'. This is the same kind of suggestion as the one
just below.
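
To make the idea concrete, here is a minimal sketch, not a worked-out
design. It assumes that a family's guard bandwidth is capped at the
water level as a whole and that the cap is shared among its relays in
proportion to their bandwidth. The function name, the tuple layout and
the proportional split are all my assumptions for illustration; the
bandwidth above the cap would stay available for the middle position.

```python
from collections import defaultdict


def cap_families(relays, water_level):
    """Return {relay_name: guard_bandwidth} with each family capped as a whole.

    relays is a list of (name, family, bandwidth) tuples; a relay whose
    family is None is treated as a family of one.
    """
    # First pass: aggregate the bandwidth of each family.
    totals = defaultdict(float)
    for name, family, bw in relays:
        totals[family or name] += bw

    # Second pass: scale down relays of families above the water level.
    guard_bw = {}
    for name, family, bw in relays:
        total = totals[family or name]
        if total > water_level:
            guard_bw[name] = bw * water_level / total
        else:
            guard_bw[name] = bw
    return guard_bw


relays = [("A1", "famA", 100), ("A2", "famA", 300), ("B1", None, 8)]
print(cap_families(relays, water_level=10))
# {'A1': 2.5, 'A2': 7.5, 'B1': 8}
```

In this toy example the two relays of famA together get exactly the
water level (10), the same guard bandwidth a single relay at the water
level would get, while the small independent relay is untouched.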

>> 3) The question of the protection from relays, ASes or IXPs is
>> puzzling, and we do not have a strong opinion about it. We focused on
>> relays because they are what is available to any attacker, compared
>> to ASes or IXPs which are more specific adversaries. But, if there is
>> a consensus that ASes or IXPs should rather be considered as the main
>> target, it is easy to implement waterfilling at the AS or IXP level
>> rather than at the IP level: just aggregate the bandwidth relayed per
>> AS or IXP, and apply the waterfilling level computation method to
>> them. Or we could mix the weights obtained for all these adversaries,
>> in order to get some improvement against all of them instead of an
>> improvement against only one and being agnostic about the others.
>
> This suggestion of applying waterfilling to individual ASes is
> intriguing, but would require a more developed design and
> argument. Would the attacker model be one that has a fixed cost to
> compromise/observe a given AS?

Yes, we agree that this specific point requires more research. We just
laid out those possibilities in our paper, but we have not explored this
avenue in detail yet. So, to answer your question, I *think* the attacker
model should be something other than a budget, as it is not clear to
me what budget is needed to take control of an AS. But if you have a
different opinion, I would be interested to hear it :) In our paper, we
suggest using the guessing entropy to evaluate the number of ASes needed
to compromise your path. That's still something, but probably not sound
enough by itself.
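
For reference, a small sketch of what such a metric could look like. It
assumes the usual definition of guessing entropy (the expected number of
guesses when candidates are tried from most to least likely) applied to
per-AS selection weights. The exact formulation in the paper may differ,
and the AS names and weights below are made up.

```python
def guessing_entropy(weights):
    """Expected number of guesses over a {candidate: weight} distribution.

    Weights are normalized to probabilities, sorted in decreasing order,
    and the i-th most likely candidate contributes i * p_i.
    """
    total = sum(weights.values())
    probs = sorted((w / total for w in weights.values()), reverse=True)
    return sum(rank * p for rank, p in enumerate(probs, start=1))


# Hypothetical per-AS guard weights (aggregated over the relays in each AS).
skewed = {"AS1": 2, "AS2": 1, "AS3": 1}
uniform = {"AS1": 1, "AS2": 1, "AS3": 1, "AS4": 1}

print(guessing_entropy(skewed))   # 1.75
print(guessing_entropy(uniform))  # 2.5
```

A flatter distribution over ASes gives a higher value, i.e. an adversary
picking ASes in the best order needs more of them on average.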

>
>>
>> 4) More fundamentally, since the fundamental idea of Tor is to mix
>> traffic through a large number of relays, it seems to be a sound
>> design principle to make the choice of the critical relays as uniform
>> as possible, as Waterfilling aims to do. A casual Tor user may be
>> concerned to see that his traffic is very likely to be routed through
>> a very small number of top relays, and this effect is likely to
>> increase as soon as a multi-core-capable implementation of Tor
>> arrives (rust dev). Current top relays which suffer from the main CPU
>> bottleneck will probably be free to relay even more bandwidth than
>> they already do, and gain an even more disproportionate consensus
>> weight. Waterfilling might prevent that, and keep those useful relays
>> doing their job at the middle position of paths.
>
> I disagree that uniform relay selection is a sound design principle.
> Instead, one should consider various likely attackers and consider
> what design maximizes the attack cost (or maybe maximizes the minimum
> design cost among likely attackers). In the absence of detailed
> attacker information, a good design principle might be for clients to
> choose “diverse” relays

This is what Waterfilling does: it increases the cost for a well-defined
attacker and lets clients choose from a more "diverse" network.

Thanks again for all your opinions and arguments,

Florentin

> , where diversity should take into account country, operator,
> operating system, AS, IXP connectivity, among other things.
>
> Best,
> Aaron