We sometimes see attacks from relays that are hosted on cloud platforms. I have been wondering if the benefit of having cloud-hosted relays outweighs the abuse we see from them.
To get an idea of the benefit, I analysed the bandwidth that is contributed by cloud-hosted relays. I first obtained the network blocks owned by three cloud providers (Amazon AWS, Google Cloud, Microsoft Azure), and determined the percentage of bandwidth that relays in those blocks contributed in July 2015. There were typically ~200 cloud-hosted relays online:
https://nymity.ch/sybilhunting/png/cloud-hosted_relays_2015-07.png
The spike shortly after hour 200 was caused by a large number of Amazon relays named "DenkoNet". The spike at the very beginning was caused by a number of relays that, judging by their uptime pattern, might very well belong together, too.
What counts, however, is bandwidth. Here's the total bandwidth fraction contributed by cloud-hosted relays over July 2015:
https://nymity.ch/sybilhunting/png/cloud-hosted_bandwidth_2015-07.png
There were no Google Cloud relays, so Google Cloud contributed no bandwidth. Amazon AWS-powered relays contributed the majority of the cloud-hosted bandwidth, followed by Microsoft Azure-powered relays. Here's a summary of the time series in percent:
Min.   Mean   Median   Max.
0.2%   0.8%   0.79%    1.5%
In an average consensus in July 2015, cloud-hosted relays contributed only around 0.8% of bandwidth. Note, however, that this is just a lower bound. The netblocks I used for the analysis could have changed, and I didn't consider providers other than Google, Amazon, and Microsoft.
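For anyone who wants to repeat the measurement, here is a minimal sketch of how the per-consensus fraction and the summary above could be computed. It is not the script I used. The netblocks are placeholder documentation ranges, and the names `CLOUD_NETBLOCKS`, `provider_of`, `cloud_fraction`, and `summarise` are made up for illustration. Relays are assumed to be available as (address, bandwidth) pairs per consensus.

```python
import ipaddress
from statistics import mean, median

# Hypothetical netblocks (RFC 5737 documentation ranges) standing in for
# the lists published by the cloud providers.
CLOUD_NETBLOCKS = {
    "amazon": [ipaddress.ip_network("192.0.2.0/24")],
    "azure": [ipaddress.ip_network("198.51.100.0/24")],
}


def provider_of(address, netblocks=CLOUD_NETBLOCKS):
    """Return the provider whose netblock contains `address`, or None."""
    ip = ipaddress.ip_address(address)
    for provider, networks in netblocks.items():
        if any(ip in net for net in networks):
            return provider
    return None


def cloud_fraction(relays, netblocks=CLOUD_NETBLOCKS):
    """Fraction of total bandwidth contributed by cloud-hosted relays.

    `relays` is an iterable of (address, bandwidth) pairs taken from a
    single consensus.
    """
    total = 0
    cloud = 0
    for address, bandwidth in relays:
        total += bandwidth
        if provider_of(address, netblocks) is not None:
            cloud += bandwidth
    return cloud / total if total else 0.0


def summarise(fractions):
    """Min, mean, median, and max over a series of per-consensus fractions."""
    return {
        "min": min(fractions),
        "mean": mean(fractions),
        "median": median(fractions),
        "max": max(fractions),
    }
```

Calling `cloud_fraction` once per hourly consensus and passing the resulting list to `summarise` gives the kind of table shown above. Any relay in a netblock that is missing from the list counts as non-cloud, which is why the result is only a lower bound.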
There are also cloud-hosted bridges. Tor Cloud, however, has shut down, and the number of EC2 bridges is declining:
https://metrics.torproject.org/cloudbridges.html?graph=cloudbridges&start=2015-01-01&end=2015-07-31
The harm caused by cloud-hosted relays is more difficult to quantify. Getting rid of them also wouldn't mean getting rid of any attacks. At best, attackers would have to jump through more hoops.
If we were to decide to permanently reject cloud-hosted relays, we would have to obtain the netblocks that are periodically published by all three (and perhaps more) cloud providers:
https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html
https://msdn.microsoft.com/en-us/library/azure/Dn175718.aspx
https://cloud.google.com/appengine/kb/general?hl=en#static-ip
Note that this should be done periodically because the netblocks are subject to change.
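As a rough illustration of what the periodic step could look like for Amazon, here is a sketch that parses a document in the JSON layout AWS describes on the page above (a `prefixes` list whose entries carry `ip_prefix`, `region`, and `service`) and checks a relay address against it. The sample document is made up and uses documentation ranges. The function names are hypothetical. Fetching the file and handling the other providers' formats are left out.

```python
import ipaddress
import json


def parse_aws_ranges(document, service="EC2"):
    """Extract the IPv4 netblocks for one service from an AWS-style JSON string."""
    data = json.loads(document)
    return [
        ipaddress.ip_network(entry["ip_prefix"])
        for entry in data.get("prefixes", [])
        if entry.get("service") == service
    ]


def is_cloud_hosted(address, networks):
    """True if `address` falls inside any of the given netblocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in networks)


# Made-up sample in the AWS layout; the prefixes are documentation ranges,
# not real Amazon netblocks.
SAMPLE = """
{
  "syncToken": "0",
  "createDate": "2015-07-01-00-00-00",
  "prefixes": [
    {"ip_prefix": "192.0.2.0/24", "region": "us-east-1", "service": "EC2"},
    {"ip_prefix": "198.51.100.0/24", "region": "GLOBAL", "service": "ROUTE53"}
  ]
}
"""

ec2_networks = parse_aws_ranges(SAMPLE)
```

Re-running the parse on a schedule and comparing the result against the previous list would show when netblocks were added or removed.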
Cheers,
Philipp