On 1 Sep 2015, at 07:45, Philipp Winter <phw@nymity.ch> wrote:

We sometimes see attacks from relays that are hosted on cloud platforms.
I have been wondering if the benefit of having cloud-hosted relays
outweighs the abuse we see from them.

To get an idea of the benefit, I analysed the bandwidth that is
contributed by cloud-hosted relays. I first obtained the network blocks
owned by three cloud providers (Amazon AWS, Google Cloud, Microsoft
Azure), and then determined the percentage of bandwidth their relays
contributed in July 2015. The results show that there were typically
~200 cloud-hosted relays online:
<https://nymity.ch/sybilhunting/png/cloud-hosted_relays_2015-07.png>
The spike shortly after hour 200 was caused by a large number of Amazon
relays named "DenkoNet". The spike at the very beginning was caused by
a number of relays that, judging by their uptime pattern, might very
well belong together, too.
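
The netblock matching described above can be sketched in a few lines of
Python. This is only an illustration, not the script used for the
analysis: the function name is made up, and the netblocks are
placeholder documentation ranges rather than real provider data.

```python
import ipaddress

# Placeholder netblocks (RFC 5737 documentation ranges), NOT real
# provider data. A real run would load the published netblocks instead.
CLOUD_NETBLOCKS = {
    "amazon": ["192.0.2.0/24"],
    "google": ["198.51.100.0/24"],
    "microsoft": ["203.0.113.0/24"],
}


def cloud_provider(address, netblocks=CLOUD_NETBLOCKS):
    """Return the provider whose netblock contains address, or None."""
    ip = ipaddress.ip_address(address)
    for provider, blocks in netblocks.items():
        for block in blocks:
            if ip in ipaddress.ip_network(block):
                return provider
    return None
```

Running cloud_provider() over the relay addresses of every consensus in
a month would give the per-hour relay counts shown in the plot.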

What counts, however, is bandwidth. Here's the total bandwidth fraction
contributed by cloud-hosted relays over July 2015:
<https://nymity.ch/sybilhunting/png/cloud-hosted_bandwidth_2015-07.png>
No Google Cloud relays contributed any bandwidth. Amazon AWS-powered
relays contributed the majority of the bandwidth, followed by Microsoft
Azure-powered relays. Here's a summary of the time series, in percent
of total bandwidth:

Min.    Mean    Median  Max.
0.2%    0.8%    0.79%   1.5%

In an average consensus in July 2015, cloud-hosted relays contributed
only around 0.8% of bandwidth. Note, however, that this is just a lower
bound: the netblocks I used for the analysis could have changed, and I
didn't consider providers other than Google, Amazon, and Microsoft.
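
For illustration, the per-consensus fraction and the summary above
could be computed roughly as follows. This is a sketch under
assumptions: each consensus is taken to be a list of (address,
bandwidth) pairs, the netblock is a placeholder, and the function names
are hypothetical. Which bandwidth value to use (for example consensus
weight or advertised bandwidth) is left open here.

```python
import ipaddress
import statistics

# Placeholder netblock (documentation range), not real provider data.
CLOUD_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def cloud_fraction(relays, networks=CLOUD_NETWORKS):
    """Fraction of bandwidth contributed by relays inside the networks.

    relays is an iterable of (address, bandwidth) pairs taken from one
    consensus.
    """
    relays = list(relays)
    total = sum(bw for _, bw in relays)
    if total == 0:
        return 0.0
    cloud = sum(
        bw
        for addr, bw in relays
        if any(ipaddress.ip_address(addr) in net for net in networks)
    )
    return cloud / total


def summarise(fractions):
    """Min, mean, median and max of a series of per-consensus fractions."""
    return {
        "min": min(fractions),
        "mean": statistics.mean(fractions),
        "median": statistics.median(fractions),
        "max": max(fractions),
    }
```

Calling cloud_fraction() once per hourly consensus and passing the
resulting list to summarise() yields a table of the same shape as the
one above.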

There are also cloud-hosted bridges. Tor Cloud, however, has shut down,
and the number of EC2 bridges is declining:
<https://metrics.torproject.org/cloudbridges.html?graph=cloudbridges&start=2015-01-01&end=2015-07-31>

Can we preserve cloud-hosted bridges independently of whatever we
decide to do about cloud-hosted relays?

The harm caused by cloud-hosted relays is more difficult to quantify.
Rejecting them also wouldn't stop any attacks; at best, attackers would
have to jump through more hoops.

If we were to decide to permanently reject cloud-hosted relays, we would
have to obtain the netblocks that are periodically published by all
three (and perhaps more) cloud providers:
<https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html>
<https://msdn.microsoft.com/en-us/library/azure/Dn175718.aspx>
<https://cloud.google.com/appengine/kb/general?hl=en#static-ip>

Note that the netblocks would have to be fetched again periodically
because they are subject to change.
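
As a rough sketch of what such a periodic job might do for Amazon, the
snippet below extracts the EC2 netblocks from the JSON document
described on the AWS page linked above. The field names ("prefixes",
"ip_prefix", "service") follow that published format; the function name
is hypothetical, and the other providers publish their ranges in
different formats that would need their own parsers.

```python
import ipaddress
import json


def ec2_netblocks(ip_ranges_json):
    """Return the collapsed IPv4 networks tagged as EC2.

    ip_ranges_json is the text of an AWS ip-ranges.json document. How
    and how often it is downloaded is left to the caller.
    """
    data = json.loads(ip_ranges_json)
    networks = [
        ipaddress.ip_network(prefix["ip_prefix"])
        for prefix in data.get("prefixes", [])
        if prefix.get("service") == "EC2"
    ]
    # Merge adjacent and overlapping prefixes into the smallest set.
    return list(ipaddress.collapse_addresses(networks))
```

Comparing the result with the list from the previous run would show
whether the netblocks have changed and the reject list needs updating.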