[tor-mirrors] Space needed for a mirror

Dave Warren dw at thedave.ca
Wed Jul 28 00:32:32 UTC 2021


(Responding to a couple messages in one, since it is all quoted here anyway)

On Fri, Jul 23, 2021, at 18:12, hackerncoder wrote:
> On 7/23/21 11:29 PM, Roger Dingledine wrote:
> > On Fri, Jul 23, 2021 at 10:40:30AM -0600, Dave Warren wrote:
> >> https://2019.www.torproject.org/docs/running-a-mirror.html.en indicates that
> >> the website and distribution directory currently require 30GB and to expect
> >> up to 50GB.
> >>
> >> dist is currently over 80GB. Is this normal/expected?
> > 
> > Yes. It depends mainly on how many versions of Tor Browser are on dist
> > at once, and with a stable version and an alpha version, and new releases
> > coming out, sometimes there are quite a few versions published at once.
> > 
> >> Just wondering if this is temporary, or if I should provision a bit more
> >> disk space?
> > 
> > It's worse than that -- the running-a-mirror page that you point to is
> > on the old website, and there is no equivalent on the new website. We
> > have no plans currently for how to make good use of third-party website
> > mirrors.

I'm aware it was outdated, but it was the most recent thing I could find, so it seemed a reasonable starting point for discussion (and it would certainly prompt someone to correct me if there was a more recent reference).

I've never really seen the point of full website mirrors, but /dist/ download mirrors seemed like it could have value. And at least at one point there was enough traffic to justify it. The website mirror took approximately zero extra resources, so it didn't make sense to not set it up too.

When I needed a bunch of disk space for another project some moons ago, I ended up proxying requests for my /dist/ through to the official site, with a small local cache. This worked pretty well, and at the time it was pulling "enough" downloads per month that it felt worth keeping, although there is no way to know how many were "real" (otherwise censored) users. Once things went back to normal I returned to mirroring normally, and I don't currently track anything closely enough to get any indication of utilization.

Does anyone running a mirror have useful bandwidth/hits/something analytics? I could start tracking requested URLs and number of bytes transferred again easily enough myself too, but maybe someone has done the work.
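
For what it's worth, here is a minimal sketch of that kind of tracking. It assumes the web server writes access logs in Common or Combined Log Format, which is an assumption on my part since nothing above says which server or log format is in use. The function name tally_dist and the "/dist/" prefix default are illustrative only.

```python
import re
from collections import Counter

# Matches the request line, status code and byte count of a
# Common/Combined Log Format entry (assumed log format).
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def tally_dist(lines, prefix="/dist/"):
    """Count hits and bytes sent per requested path under `prefix`.

    `lines` is any iterable of access-log lines. Lines that do not
    parse, or whose path is outside `prefix`, are skipped.
    """
    hits = Counter()
    sent = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if m is None:
            continue
        path = m.group("path")
        if not path.startswith(prefix):
            continue
        size = m.group("size")
        hits[path] += 1
        # A "-" byte count means no body was sent (e.g. a 304 response).
        sent[path] += 0 if size == "-" else int(size)
    return hits, sent
```

Feeding it an open log file (for example, tally_dist(open(path_to_access_log))) gives two counters keyed by URL. Note that this counts requests, not people, so it still says nothing about how many downloads come from otherwise censored users.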


> As you said, github, gitlab, archive.org are probably more scalable, and 
> maybe harder to block (it's practically domain fronting). Not only that, 
> but they aren't run by random people. And the Tor Project controls 
> updates... for good... or bad [1].

I must admit, with the list of mirrors being public anyway, I've wondered why someone actively trying to block Tor wouldn't just pull the mirror list and automate it into their firewall. On the other hand, there was legitimate traffic.


> What I see is a nearly gone thing. No maintainer, outdated website, 
> better?/other ways for distribution. I personally think, and I say this 
> hosting a mirror [2], it should be shut down for good. People will 
> probably continue to create new mirrors... I did. Is it worth their time 
> and effort?

Sadly this is probably the case.

I was working on setting up a ClamAV mirror when they pulled the plug on that one and switched to Cloudflare, which makes sense for their use case. I still run a SpamAssassin rule mirror, and I'm surprised they haven't done the same, although I think the pool of mirror operators is reasonably stable.

Tor needs to be distributed a bit more widely than any single provider or CDN, but I certainly wouldn't start a project to recruit dozens of small, independent mirrors if it didn't already exist. On the other hand, since the infrastructure and volunteers are already here, I'm not sure it makes sense to pull the plug. But the list of mirrors should be utilized somewhere, somehow.

It also occurs to me that if I were building something new today, some mechanism for Tor nodes themselves to proxy HTTP and HTTPS requests from the public internet would be relatively straightforward to implement. That would create a wide network of sources for the files without requiring individual mirror operators, replication, or disk space. But again, it is probably more trouble than it would actually be worth at this point.

