In order to find an answer to the question "Why not just use cloudflare?" for mirrors, I ran one of my mirrors on cloudflare for a week.
Attached are the results of the test. The week included a new release of tor browser, which generally creates some load on my mirror.
The results are that using cloudflare doesn't offload the binaries, which are what make up the bulk of traffic on the mirror. From past web log analysis, the vast majority of traffic is binary downloads, linked from third-party forums and sites around the world. It seems almost no one uses my mirrors to read documentation or look at anything else on the website. Cloudflare would be great if everyone wanted to look at the documentation or other HTML pages.
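For anyone who wants to repeat the web log analysis on their own mirror, here is a minimal sketch of how the split between binaries and everything else could be measured. It assumes Apache/nginx "combined" format access logs; the `BINARY_EXTENSIONS` tuple, the function names, and the sample log lines are made up for illustration and are not taken from the mirror in question.

```python
import re
from collections import Counter

# Assumption: "combined" log format, e.g.
# 203.0.113.5 - - [date] "GET /path HTTP/1.1" 200 1234 "referer" "agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)

# Assumption: which file extensions count as "binaries" on a mirror.
BINARY_EXTENSIONS = (".exe", ".dmg", ".apk", ".zip", ".tar.gz", ".tar.xz")


def bytes_by_kind(lines):
    """Sum response bytes per kind ("binary" or "other") over log lines."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None or match["bytes"] == "-":
            continue  # unparsable line, or a response without a body
        path = match["path"].split("?", 1)[0].lower()
        kind = "binary" if path.endswith(BINARY_EXTENSIONS) else "other"
        totals[kind] += int(match["bytes"])
    return totals


def binary_share(totals):
    """Fraction of all bytes that went to binary downloads."""
    total = sum(totals.values())
    return totals["binary"] / total if total else 0.0


# Made-up sample lines, only to show the shape of the input.
SAMPLE = [
    '203.0.113.5 - - [09/Sep/2014:12:00:00 +0000] '
    '"GET /dist/example-installer.exe HTTP/1.1" 200 30000000 "-" "-"',
    '203.0.113.6 - - [09/Sep/2014:12:00:05 +0000] '
    '"GET /docs/documentation.html HTTP/1.1" 200 20000 "-" "-"',
    '203.0.113.7 - - [09/Sep/2014:12:00:09 +0000] '
    '"GET /index.html HTTP/1.1" 304 - "-" "-"',
]

if __name__ == "__main__":
    totals = bytes_by_kind(SAMPLE)
    print(dict(totals), round(binary_share(totals), 4))
```

On a real mirror you would pass an open log file instead of `SAMPLE`; a share close to 1.0 would mean a cache that only handles HTML pages offloads very little.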
I've started to look at CDN providers to see if there are affordable services which can offload the entire site itself.
Constructive thoughts and pointers to CDN providers are welcome.
Andrew Lewman:
In order to find an answer to the question "Why not just use cloudflare?" for mirrors, I ran one of my mirrors on cloudflare for a week.
Do you know about their data retention policies? Do they log IP addresses? How long do they keep the data? Can we trust what they would say?
Also, the relay FAQ discourages the use of services such as cloudflare because oppressive regimes often block these services.
On September 9, 2014 7:08:50 PM EEST, Lunar lunar@torproject.org wrote:
Andrew Lewman:
In order to find an answer to the question "Why not just use cloudflare?" for mirrors, I ran one of my mirrors on cloudflare for a week.
Do you know about their data retention policies? Do they log IP addresses? How long do they keep the data? Can we trust what they would say?
-- Lunar lunar@torproject.org
tor-mirrors mailing list tor-mirrors@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-mirrors
On 09/09/2014 12:08 PM, Lunar wrote:
Do you know about their data retention policies? Do they log IP addresses? How long do they keep the data? Can we trust what they would say?
I assume they log everything and keep it forever.
Andrew Lewman:
On 09/09/2014 12:08 PM, Lunar wrote:
Do you know about their data retention policies? Do they log IP addresses? How long do they keep the data? Can we trust what they would say?
I assume they log everything and keep it forever.
As far as I know, this would be different from the current policy of www.torproject.org
I think protecting users' privacy should be in the equation.
On 09/09/2014 11:36 PM, Lunar wrote:
Do you know about their data retention policies? Do they log IP addresses? How long do they keep the data? Can we trust what they would say?
I assume they log everything and keep it forever.
As far as I know, this would be different from the current policy of www.torproject.org
I think protecting users' privacy should be in the equation.
With the current structure of mirrors, we already rely on 3rd parties with whatever policy they have. I don't think Andrew is actually suggesting to move the main site or main mirrors over there.
Personally, I would do everything to avoid these services, especially since they regularly block Tor users, centralize the Internet, often provide bad SSL gateways (with changing certs etc), and I hate to have to allow active content coming from some generic CDN. I am clearly against using commercial CDNs for anything, but if someone wants to run a mirror there, and even if that someone happens to be Andrew, I can't really argue against it. It might actually allow some users to reach a mirror for which other mirrors are blocked.
On 09/09/2014 06:09 PM, Moritz Bartl wrote:
With the current structure of mirrors, we already rely on 3rd parties with whatever policy they have. I don't think Andrew is actually suggesting to move the main site or main mirrors over there.
Correct. The point was to run an experiment and see what cloudflare will actually cache and serve. Lots of mirror ops have asked about using cloudflare.
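Mirror operators who want to see what cloudflare caches and serves for their own site could tally responses by the `CF-Cache-Status` response header that Cloudflare adds. The following is only a sketch under stated assumptions: the function names are hypothetical, only the value `HIT` is counted as served from the edge, and the sample headers and sizes are invented, not measurements from this experiment.

```python
def cache_status(headers):
    """Return the CF-Cache-Status value in upper case, or None if absent."""
    for name, value in headers.items():
        if name.lower() == "cf-cache-status":  # header names are case-insensitive
            return value.strip().upper()
    return None


def offload_report(responses):
    """Sum bytes answered by the edge cache versus passed to the origin.

    `responses` is an iterable of (headers, body_size_in_bytes) pairs.
    Assumption: only a status of HIT counts as offloaded; anything else,
    including a missing header, is attributed to the origin server.
    """
    report = {"edge": 0, "origin": 0}
    for headers, size in responses:
        where = "edge" if cache_status(headers) == "HIT" else "origin"
        report[where] += size
    return report


# Invented sample data: one cached HTML page, one uncached binary,
# and one response that carries no Cloudflare header at all.
SAMPLE = [
    ({"CF-Cache-Status": "HIT"}, 20_000),
    ({"cf-cache-status": "MISS"}, 30_000_000),
    ({"Content-Type": "text/html"}, 5_000),
]

if __name__ == "__main__":
    print(offload_report(SAMPLE))
```

In practice the header dictionaries would come from real requests against the mirror, for example made with `urllib.request`, once for an HTML page and once for a binary.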
I'll highlight the key sentence in my email:
"The results are that using cloudflare doesn't offload the binaries, which are what make up the bulk of traffic on the mirror."
Since the binaries are served up by the actual site and not cloudflare, there isn't much point in using cloudflare. The only real point I see is what Moritz highlights:
"It might actually allow some users to reach a mirror for which other mirrors are blocked."
Unless some company or country is going to block all of cloudflare or a CDN, our mirrors can still be reachable. This is the same idea that David Fifield is counting on with the meek transport using Google App Engine. Blocking all of Google seems a huge cost vs the gain of stopping some tor users.
The alternative is that cloudflare/cdn/google kick us off their systems to avoid being blocked.
On Tue, Sep 09, 2014 at 09:05:21PM -0400, Andrew Lewman wrote:
The alternative is that cloudflare/cdn/google kick us off their systems to avoid being blocked.
Or they do what Livejournal does -- run the site on a second address too, and migrate all the sketchy users over to that one, so it's easy for censoring governments to block the 'sketchy user' address and leave the normal one alone.
https://www.usenix.org/conference/foci14/workshop-program/presentation/ander...
--Roger
On Tue, Sep 09, 2014 at 09:05:21PM -0400, Andrew Lewman wrote:
Unless some company or country is going to block all of cloudflare or a CDN, our mirrors can still be reachable. This is the same idea that David Fifield is counting on with the meek transport using Google App Engine. Blocking all of Google seems a huge cost vs the gain of stopping some tor users.
On that note, it's worth looking at what GreatFire.org is doing for some of their mirror sites: https://github.com/greatfire/wiki.
Here is one of the URLs:
https://a248.e.akamai.net/f/1/1/1/dci.download.akamai.com/35985/159415/1/f/
This URL is from an Akamai reseller, http://cachesimple.com/, who have a plan starting at $50/month. The long URL is an explicit form of what normally happens implicitly through SNI at the Akamai CDN (see page 5 of https://research.microsoft.com/en-us/um/people/ratul/akamai/freeflow.pdf for Akamai URL structure).
The important thing is that all the blockable content is encrypted in the path component. The censor only gets to see the domain name a248.e.akamai.net, which is some kind of magic Akamai HTTPS domain that's used for tons of stuff. I think a mirror like this would be very hard to block.
I know of another Akamai reseller that would probably work, http://www.hpcloud.com/products-services/cdn. That one apparently gives you URLs that look like https://a248.e.akamai.net/cdn.hpcloudsvc.com/.... This one would also certainly serve the files itself from HP's cloud storage.
Other GreatFire URLs are:
https://objects.dreamhost.com/freeweibo/index.html
https://edgecastcdn.net/00107ED/g/
The blockable information is hidden in the path component behind the generic shared-SSL domains objects.dreamhost.com and edgecastcdn.net.
As far as I know,
https://fw2.azurewebsites.net/
https://d1stdkq55ggsv7.cloudfront.net/
don't have the same claim to unblockability because the important information is in the domain. I guess the rationale here is it's easy to get a new name when an old one gets blocked.
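To make the path-versus-domain distinction concrete, here is a small sketch that splits each of the URLs above into the part a network observer of an HTTPS connection can see (the hostname, via DNS and SNI) and the part that travels encrypted (the path). The function names and the `SHARED_HOSTS` set are assumptions for illustration; which hostnames count as shared is taken only from the description in this message.

```python
from urllib.parse import urlsplit

# Assumption, from the text above: hostnames shared by many unrelated
# sites, so that blocking them would cause collateral damage.
SHARED_HOSTS = {"a248.e.akamai.net", "objects.dreamhost.com", "edgecastcdn.net"}


def observer_view(url):
    """Split a URL into what is visible on the wire and what is encrypted."""
    parts = urlsplit(url)
    return {
        "visible_host": parts.hostname,       # seen in DNS lookups and SNI
        "encrypted_path": parts.path or "/",  # sent inside the TLS session
    }


def identifying_part(url):
    """Return "path" if the mirror hides behind a shared host, else "domain"."""
    return "path" if urlsplit(url).hostname in SHARED_HOSTS else "domain"


URLS = [
    "https://a248.e.akamai.net/f/1/1/1/dci.download.akamai.com/35985/159415/1/f/",
    "https://objects.dreamhost.com/freeweibo/index.html",
    "https://edgecastcdn.net/00107ED/g/",
    "https://fw2.azurewebsites.net/",
    "https://d1stdkq55ggsv7.cloudfront.net/",
]

if __name__ == "__main__":
    for url in URLS:
        view = observer_view(url)
        print(identifying_part(url), view["visible_host"], view["encrypted_path"])
```

URLs classified as "path" can only be blocked by blocking the whole shared hostname; those classified as "domain" can be blocked individually, which is why they rely on getting a new name when an old one gets blocked.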
David Fifield
On Sat, Sep 13, 2014 at 11:00:25PM -0700, David Fifield wrote:
On Tue, Sep 09, 2014 at 09:05:21PM -0400, Andrew Lewman wrote:
Unless some company or country is going to block all of cloudflare or a CDN, our mirrors can still be reachable. This is the same idea that David Fifield is counting on with the meek transport using Google App Engine. Blocking all of Google seems a huge cost vs the gain of stopping some tor users.
On that note, it's worth looking at what GreatFire.org is doing for some of their mirror sites: https://github.com/greatfire/wiki.
Here is one of the URLs: https://a248.e.akamai.net/f/1/1/1/dci.download.akamai.com/35985/159415/1/f/ This URL is from an Akamai reseller, http://cachesimple.com/, who have a plan starting at $50/month. The long URL is an explicit form of what normally happens implicitly through SNI at the Akamai CDN (see page 5 of https://research.microsoft.com/en-us/um/people/ratul/akamai/freeflow.pdf for Akamai URL structure). The important thing is that all the blockable content is encrypted in the path component. The censor only gets to see the domain name a248.e.akamai.net, which is some kind of magic Akamai HTTPS domain that's used for tons of stuff. I think a mirror like this would be very hard to block.
I found out that the a248.e.akamai.net domain name has been DNS-poisoned in China since late September 2014:
https://en.greatfire.org/https/a248.e.akamai.net
(Click on one of the calendar dates to see details.)
Their wiki page https://github.com/greatfire/wiki replaced Akamai with Level 3: https://secure.footprint.net/pingfan/fw
David Fifield