[tor-talk] Making a Site Available as both a Hidden Service and on the www - thoughts?

Ben ben at gerbil.it
Sun May 17 11:26:41 UTC 2015

Hi all,

I've got a (www) site that I'm debating making available as a Hidden
Service, and I was wondering what people's thinking on doing this is.

It's a publicly available site, so it's not privacy for me (as the
operator) that I'm looking for by setting up a HS - it's got a
reasonable readership and it's occurred to me that it'd be good to give
readers the option of accessing via a HS so that they can make their own
choices regarding anonymity.

Some of the content could be considered controversial in some of the
stranger jurisdictions of the world; it's not impossible that it might
even be blocked in some of the even-stranger ones.

The site in question can already be accessed via a tor exit node, but
that brings a third party into the mix (the exit operator). Although the
site uses HTTPS there's still always the outside chance. Using a HS cuts
that out (and frees up a little exit capacity to boot).

So I've been scrawling a few perceived challenges and wondered whether
anyone can think of anything I've missed (or has suggestions on those I
have).

Site uses HTTPS

I'll hit this one first as it ties in with some other bits - the site
uses HTTPS and will redirect requests on port 80 to 443. Obviously the
certificate is going to be invalid if you're hitting a .onion.

So, the basic setup I was thinking of is to create a reverse proxy for the
.onion which then sends the requests on to the HTTPS main site (same
server) with the Host header set to the site's FQDN.

So the HS can be accessed over port 80 without triggering the redirects.
The Tor client and reverse proxy would be running on the same server, so
the plaintext bit would be over loopback.
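A minimal sketch of the header handling such a proxy would need, in Python purely for illustration. PUBLIC_FQDN, UPSTREAM and the function name are placeholders, not the real configuration, and an off-the-shelf reverse proxy would normally do this job instead:

```python
# Hypothetical values - substitute the site's real FQDN.
PUBLIC_FQDN = "www.example.com"
UPSTREAM = "https://127.0.0.1:443"  # main HTTPS site on the same server

# Headers that only describe a single hop and should not be passed on.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}


def build_upstream_request(path, headers):
    """Return (url, headers) for the request sent on to the main site.

    `path` is the request target received on the .onion listener and
    `headers` is a dict of the client's request headers.
    """
    forwarded = {
        name: value
        for name, value in headers.items()
        if name.lower() not in HOP_BY_HOP and name.lower() != "host"
    }
    # Present the public name so the main site serves the right vhost
    # and does not see the .onion address.
    forwarded["Host"] = PUBLIC_FQDN
    return UPSTREAM + path, forwarded
```

One thing this leaves out: the certificate is issued for the FQDN, not for 127.0.0.1, so the proxy's TLS client would need to check the upstream certificate against the FQDN rather than the loopback address.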

Duplicate Content Penalty

If Google crawls the site through the public FQDN, and then manages to
hit the HS via tor2web (or a similar service) - I'll probably get stung
in their indexes with a duplicate content penalty. 

Thinking it might be best to have the HS serve a different robots.txt to
avoid/mitigate this.
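One way that could look, assuming the HS-side proxy answers /robots.txt itself instead of forwarding it to the main site (the function name and the deny-all policy are illustrative choices, not a settled decision):

```python
# Deny-all policy served only on the hidden service side.
ONION_ROBOTS = "User-agent: *\nDisallow: /\n"


def hs_local_response(path):
    """Return (status, body) if the HS proxy should answer the request
    itself, or None if it should be forwarded to the main site."""
    # Ignore any query string when matching the path.
    if path.split("?", 1)[0] == "/robots.txt":
        return 200, ONION_ROBOTS
    return None
```

Because the decision is made on the HS listener, it doesn't depend on what Host header a tor2web-style gateway happens to send.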

Anti-abuse scripts

There are some off-the-shelf protections built into the site. Given they
were designed for the www, they can (and do) ban any IP that's seen as a
repeat offender. 

Either an exclusion needs to be made, or the HS will sometimes show
'nice' visitors a potentially rude message :)
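If an exclusion is the route taken, a rough sketch might be as below. It assumes all HS traffic reaches the site from the loopback address (which follows from running Tor and the proxy on the same server), and `ip_ban_applies` is a made-up name, not a hook in any particular anti-abuse package:

```python
import ipaddress


def ip_ban_applies(remote_addr):
    """Return True if IP-based banning should be considered for this client.

    Requests arriving from loopback are assumed to come from the local
    HS reverse proxy, so banning that address would hit every HS visitor.
    """
    return not ipaddress.ip_address(remote_addr).is_loopback
```

The trade-off is that repeat offenders coming via the HS can no longer be banned by IP at all.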

Whether or not an exclusion is made, things that don't _need_ to be
available via the HS can be blocked off at the reverse proxy (for
example management back-ends).
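A sketch of the sort of check the proxy could apply. The prefixes are hypothetical stand-ins for wherever the back-ends actually live, and the path is normalised first so that tricks like `/blog/../admin` or `/%61dmin` don't slip past:

```python
import posixpath
from urllib.parse import unquote

# Hypothetical locations of management back-ends.
BLOCKED_PREFIXES = ("/admin", "/manage")


def is_blocked(raw_path):
    """Return True if the request target falls under a blocked prefix."""
    # Drop the query string, decode percent-escapes once, then collapse
    # duplicate slashes and ".." segments before comparing.
    path = unquote(raw_path.split("?", 1)[0])
    path = posixpath.normpath("/" + path.lstrip("/")).lower()
    return any(
        path == prefix or path.startswith(prefix + "/")
        for prefix in BLOCKED_PREFIXES
    )
```

This only decodes one level of percent-encoding, on the assumption that the main site does the same.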

There are other sections of the site that could, potentially, be blocked
off to lessen the attack surface available to an adversary - for example
are you likely to use the shop section (physical items) if you're using
a HS? You don't want me to see your IP, so it seems unlikely you'd give
me a shipping address and payment info?

This is probably the part I'm most undecided on - I don't want to
un-necessarily restrict access to certain areas, but I also want to make
sure that the existing protections aren't weakened too much (in case I
or someone else makes a mistake).

Minor Tweaks might be needed

There are some base assumptions that have already been made within the
site - Javascript has been used sparingly if at all, but setting up a HS
brings a few extra (potential) complications.

All (internal) URL references will need to be relative - including the
URLs issued in redirects (where applicable).

Basically, care would need to be taken to make sure readers aren't
accidentally directed off the HS and onto the www site.
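As a safety net for redirects the site itself gets wrong, the HS proxy could rewrite Location headers on the way back out. A minimal sketch, with PUBLIC_FQDN again a placeholder and the function name invented for the example:

```python
from urllib.parse import urlsplit

PUBLIC_FQDN = "www.example.com"  # hypothetical - the site's real FQDN


def relativize_location(location):
    """Turn an absolute redirect to the public site into a relative one.

    Redirects to any other host, and redirects that are already
    relative, are returned unchanged.
    """
    parts = urlsplit(location)
    if not parts.hostname or parts.hostname.lower() != PUBLIC_FQDN:
        return location
    # Rebuild path + query + fragment, forcing a single leading slash so
    # the result can't be read as a "//host/..." network-path reference.
    relative = "/" + parts.path.lstrip("/")
    if parts.query:
        relative += "?" + parts.query
    if parts.fragment:
        relative += "#" + parts.fragment
    return relative
```

This only covers the Location header; absolute links inside page bodies would still need fixing at source.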

Sessions won't be transferable

This is a good thing in my book, but if a user hits (for example) the HS
first and then visits the www site (for whatever reason), the domain
name will have changed so the session cookie won't be sent.

That's not a bad thing, but to avoid getting dinged with support
requests, probably worth making very clear somewhere on the site.

Analytics Could Get Fun

This one won't affect me, but I'll include it anyway - any back-end
analytics software based on source IP will (potentially) show a lot of
traffic/hits from a single IP. Depending on how popular the HS option is
(i.e. how many who were accessing via www switch to the HS) it might
completely invalidate the data (though the same can be said, to an
extent, for anyone coming via an exit node).

Potentially helping an adversary

This is a pretty minor thing, it's more a point of principle than
anything else.

By having a service available over both the www and a HS, my site
becomes a potential target for anyone looking to test a method of
identifying where a HS is hosted (as they can verify their findings
against the www service).

Now, obviously, there's no reason they couldn't set up their own service
to do the exact same thing, it's more the principle of not wanting to
help someone de-anonymise.

It goes without saying that the server won't be hosting any HS that are
intended only to be HSes.

I'll probably look at setting a HS up with authentication required for
testing, but I'm trying to think it through first to avoid dropping a

What I definitely don't want to do is set up, announce and then find
I've missed something and have to disable the HS until I can fix the
issue :)

Anyone have any thoughts of what else there is to trip up on?
