[tor-talk] Making a Site Available as both a Hidden Service and on the www - thoughts?
alecm at fb.com
Mon May 18 23:58:42 UTC 2015
Sorry I have not been able to get back to this thread until now, but I wanted to jump in with some observations, notes and a couple of corrections.
Given the length of the thread I shall try to be brief, though I shall also try to run chronologically from the start of the thread:
== Kludging your onion site to use HTTPS? ==
There are a bunch of reasons to use HTTPS atop your onion site. The reasons could be addressed rationally, but to do so "properly" would require some combination of:
a) fixing all the browsers to understand "Onion >= SSL"
b) fixing all the CMSes to understand that "Onion >= SSL"
...both of which seem complicated, so instead we went with the third option:
c) implement SSL over Onion, using a legitimate certificate.
This led to a bunch of benefits; see the "tor-tips" link below for extended details. (TL;DR - secure cookies, CSP, fewer legacy code mods...)
== Duplicate Content Penalty Risk ==
Wow - this is a novel consideration that we missed! We effectively solved this by liaising with the Tor2web team (props!), eventually deciding that the best way to minimize crossover between the two universes was to block access to the Facebook onion site via Tor2web. This was announced by Tor2web here:
- and the block is easy: Tor2web already issues an "X-Tor2web" header which you can detect, and then reject the connection with a helpful message. If you choose this route then the risk is mitigated somewhat.
Securedrop have done the same:
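As a sketch of how the detect-and-reject step might look - the function names and message are my illustration, not Facebook's actual code - detection amounts to checking for the header's presence:

```python
def should_block_tor2web(headers):
    # Tor2web gateways add an "X-Tor2web" header to the requests they
    # forward, so its mere presence identifies gateway traffic.
    return any(name.lower() == "x-tor2web" for name in headers)

def handle(headers):
    # Reject gateway traffic with a helpful message rather than serving
    # content that search engines could index as a duplicate.
    if should_block_tor2web(headers):
        return 403, "Please access this site directly over Tor via its .onion address."
    return 200, "OK"
```

Plug this check in at whatever layer sees request headers first (load balancer, middleware, etc.).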
Re: the suggested fix - yes, one could certainly serve a different "robots.txt" from the onion site - very cute idea! - but if you have users with logins then there is a MITM risk which is perhaps best reduced by separating the universes properly. Hence the block.
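For completeness, here is a minimal sketch of the robots.txt approach, branching on the requesting hostname (the hostnames in the test below are hypothetical):

```python
def robots_txt(host):
    # On the onion host, forbid all crawling so any gateway-crawled
    # copies never compete with the canonical www pages for ranking;
    # on the www host, serve a permissive file as usual.
    if host.endswith(".onion"):
        return "User-agent: *\nDisallow: /\n"
    return "User-agent: *\nDisallow:\n"
```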
== Faking an origin IP address ==
As observed elsewhere, we tell our infrastructure that any traffic inbound from the Facebook onion site is sourced from the link-local network (169.254.0.0/16 - the range hosts self-assign when DHCP fails).
This is a hack which may or may not work for you, but it is sweet for us - 169.254 is address space "halfway down the stairs", having the advantage of being a non-routable network yet not in the RFC1918 address space that might be used/hardcoded elsewhere internally.
It also means that our anti-abuse mechanisms have an otherwise sane IP address to work with.
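One way to synthesize such addresses - the hashing scheme here is my illustration, not necessarily what Facebook does - is to map each inbound circuit to a stable address inside that range:

```python
import hashlib
import ipaddress

def synthetic_source_ip(circuit_id):
    # Deterministically map an onion circuit to an address inside
    # 169.254.0.0/16: non-routable, yet outside RFC1918 space, so it
    # cannot collide with addresses hard-coded elsewhere internally.
    # Stability per circuit gives anti-abuse tooling a sane "IP" to track.
    net = ipaddress.ip_network("169.254.0.0/16")
    digest = int(hashlib.sha256(str(circuit_id).encode()).hexdigest(), 16)
    return str(net[digest % net.num_addresses])
```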
== Relative URLs ==
Facebook generates a _lot_ of absolute URLs but we also mostly comprise dynamically-rendered content; traffic which comes from the onion site is tagged with a flag to denote that "when you are rendering a URL for this request, use the '.onion' address rather than the '.com' one".
The primary exception to this process is when you are stringifying a URL which is used to access an internal machine for the purposes of (say) a database lookup. You generally don't want to onionify those.
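A toy sketch of that per-request flag, with hypothetical hostnames standing in for the real ones:

```python
def render_url(path, via_onion,
               www_host="www.example.com",           # hypothetical hosts,
               onion_host="examplexxxxxxxxx.onion"):  # not Facebook's real ones
    # The per-request flag decides which hostname a browser-facing URL
    # gets; internal URLs (database lookups etc.) bypass this entirely.
    host = onion_host if via_onion else www_host
    return "https://" + host + path
```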
Software plug: XHP for dynamic PHP HTML rendering = Awesome Benefits. https://github.com/facebook/xhp-lib
== Redirections "offsite" from onion site to WWW site ==
Here's a cute idea which we haven't tried yet, but are considering: if you are running with "real" SSL on your onion site you can enable "Content Security Policy" (CSP)
…and it may be possible to configure CSP on your onion site such that any link-clicks that go to your WWW/non-onion site are reported (via POST) to an onion endpoint, permitting you to (ideally) go fix the URL-rendering leak. Not tried it yet, though.
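A sketch of what such a policy header might look like - the report endpoint path is hypothetical, and (as above) this is untried:

```python
def onion_csp_header(onion_host):
    # Allow only the onion origin, and report violations (such as an
    # accidentally rendered clearnet URL) to an onion endpoint, so the
    # URL-rendering leak can be found and fixed.
    return ("default-src https://{0}; report-uri https://{0}/csp-report"
            .format(onion_host))
```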
== Sessions won't be transferable ==
Alas. But probably a good thing, see above re: Tor2web.
== Analytics shows a single, heavily-trafficked IP, laden with badness ==
True, but actually... it wasn't so bad.
== "Onion is better than SSL" ==
It does depend on your threat model, and if the threat model of your interlocutor is themed around "problems which are most intuitively solved by introducing a certificate authority", then I wish you courage, fortitude and patience. My position now is "why not have both?" - and I expend time trying to fix internet policy to permit both in the widest range of circumstances.
This strikes me as wise because Mozilla:
...will soon be gating new features to work _only_ on HTTPS.
== if someone connects to you from Tor, redirect them to the Onion ==
I recall Roger's comment in this blogpost, quote:
<<< As an addendum to that optimism, I would be really sad if Facebook added a hidden service, had a few problems with trolls, and decided that they should prevent Tor users from using their old [...] address. So we should be vigilant in helping Facebook continue to allow Tor users to reach them through either address. >>>
…where I subsequently commented:
<<< We currently see no benefit in intentionally blocking legitimate access via Tor exit nodes, not least because Tor exit nodes are publicly listed and easily identified and there appears to be little value in prioritising one form of Tor-sourced traffic over another. It is possible that this stance may change in future but I find it difficult to comprehend what benefit would come from doing so. >>>
...which still stands. Consider carefully any activity which might surprise people who access your site over Tor.
Perhaps a more autonomy-enhancing way to address this would be for Tor to big-up plugins which inform the user of their options?
== Help! I am foo.onion but my static resources are www.foo.com! ==
This is why the onion site for Facebook now uses subdomains; we have essentially modified our code to swap the ".com" address to the ".onion" one when stringifying any URIs that will be rendered by a browser.
The TL;DR here is that supporting "cdn.foo.onion" is hardly more complex than supporting "cdn.foo.fr" or "cdn.foo.de" in a consistent manner. We prune the rightmost two elements of the hostname and graft them with their onion equivalent quite efficiently; but see the database note under "Relative URLs", above.
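The prune-and-graft step might be sketched like so (the onion hostname is a hypothetical placeholder):

```python
def onionify_host(host, onion_base="examplexxxxxxxxx.onion"):
    # Prune the rightmost two labels (e.g. "example.com") and graft on
    # the onion equivalent, preserving subdomains such as "cdn".
    labels = host.split(".")
    return ".".join(labels[:-2] + [onion_base])
```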
== Scalability ==
Facebook runs three Onion addresses - Web, CDN and SBX - which mirror Facebook's public domains on a (mostly) 1:1 basis.
We are currently running an experiment where each Onion address is implemented by two separate tor daemons (with the same key material) in separate datacentres. We're seeking failover functionality and increased functional bandwidth; this work is inspired in part by working with Ceysun Sucu and Steven Murdoch at UCL.
It is really too early to draw conclusions from the results, but I think today's observations are worth mentioning: two of the onion addresses are currently splitting their traffic about 60-40 between the two datacentres, whereas the third is more like 90-10. These proportions change over time. We plan some more tests to determine how long it would take, after a server failure, to heal / migrate all traffic over to the other site.
We post operational status updates at:
...if you would like to follow along with what's happening.
== Reducing 3 hops to 1 for a non-hidden onion site ==
I am working on this very sporadically. I'd welcome help, not only coding, but also in consideration of potential attacks (e.g.: ones comparable to DNS cache poisoning) which would be a nuisance to address. It would be unfortunate to trade speed for significant risk. Hat-tip to the Cloudflare team at the RWC conference in London earlier this year for poking me to consider this / related issues.
== Risk of non-Tor browsers stupidly trying to DNS resolve .onion? ==
I'm collaborating with Jake/others regarding this:
== "One thing: do you want to hide the origin server for the hidden service" ==
Oh, there are many, *many* more reasons to have an onion site for your website than just that. :-)
* tor-tips: https://storify.com/AlecMuffett/tor-tips