<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">So, for fun, a real conversation that happened about an hour after we launched:</div><div class=""><div class=""><br class=""></div><div class="">- "This is great - the box hosting the Onion is only 4.6% busy!"</div><div class="">- "That's not so good."</div><div class="">- "Wat?"</div><div class="">- "The box has 20 cores and Tor is basically single-threaded." </div><div class="">- "Oh. Right."</div><div class=""><br class=""></div><div class="">...i.e. the single core that Tor could use was about 92% busy (4.6% x 20 cores), but everything worked out okay in the end. :-)</div><div class=""><br class=""></div><div class="">We launched with totally stock, unmodified 2.6 Tor code and ran it for a year. </div><div class=""><br class=""></div><div class="">This was adequately performant, though the user experience was noticeably affected by latency. </div><div class=""><br class=""></div><div class="">For clarity's sake, we actually run daemons for three onion addresses - one serves the "www" role, another is "cdn" and the third is "sbx", for uploads.</div><div class=""><br class=""></div><div class="">The basic maths is correct: we actually run 3 addresses x 2 datacentres = 6 daemons, all on separate hardware, so that the servers that run the Tor daemons don't have to think about very much at all. 
Mailing-list software tends to swallow attachments, so instead there's a diagram of how this is all laid out in my (slightly out of date) notes at <a href="https://storify.com/AlecMuffett/tor-tips" class="">https://storify.com/AlecMuffett/tor-tips</a></div><div class=""><br class=""></div><div class="">All the Tor daemons have to do is pass HTTPS traffic outbound to a VIP which fans out to our SSL termination tier.</div><div class=""><br class=""></div></div><div class=""><div class="">About halfway through the year the onion site was impacted by scheduled DR-testing; this led to a "how do we fix this?" discussion, and we decided "why not just run two copies of each onion, in separate datacentres? what's the worst that could happen?" - and that's what we still do. </div><div class=""><br class=""></div><div class="">Running replica daemons seems to mostly work. Clients receive and cache a descriptor for one or the other datacentre, and then use it for a while, yielding a coarse load-balancing effect.  If one datacentre goes offline, the other eventually picks up the slack.</div></div><div class=""><br class=""></div><div class="">A few months ago, we integrated RSOS (Rendezvous Single Onion Services) into 2.6.10 (thanks, teor!) and deployed it. </div><div class=""><br class=""></div><div class="">With RSOS the latency issues in the UX are much reduced, and it's arguable that the onion site runs as-fast-as-or-perhaps-marginally-faster-than the .com site when accessed over normal Tor. The argument for why the onion site might be a little faster is essentially "same number of hops but no resource-contention for exit-node-usage". 
It's a general impression from usage rather than a scientific claim about performance.</div><div class=""><br class=""></div><div class=""><br class=""></div><div><blockquote type="cite" class=""><div class="">On 28 Jan 2016, at 03:19, Mike Tigas &lt;<a href="mailto:mike@tig.as" class="">mike@tig.as</a>&gt; wrote:</div><div class="">...<br class="">Before settling on a proxy, I thought of the ways I could maybe handle this.<br class=""><br class="">1) You update your application to generates .onion URIs when it sees<br class="">that a request is coming from the onion service.<br class=""></div></blockquote><div><br class=""></div><div><br class=""></div><div>This is what we do; when a request is inbound to our reverse-proxy tier (see "proxygen" in the diagram linked in the storify above) and is sourced from the servers which handle the Tor daemons, inbound "Host:" headers are rewritten from "onionaddress.onion" to "sitename.com" (preserving the subdomain) and an extra "magic" header is injected to denote that the responses to this request need "Onionification".</div><div><br class=""></div><div>Then when the request is actually handled by the web tier, essentially everything proceeds as normal for the rest of the site. When a URI/cookie/JS is being rendered to send back to the requester, the "magic" header is checked for and (if found) the ".onion" TLD is used, rather than the ".com" one.  
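</div><div><br class=""></div><div>A rough sketch of that two-step flow in Python, purely for illustration: this is not our production code, and the header name "X-Onionify", the two domain constants and both function names are invented for the example. It also assumes the proxy tier discards any client-supplied copy of the header, and it matches header names case-sensitively for brevity.</div><div><br class=""></div><pre class="">
```python
# Hypothetical names throughout; none of them come from a real codebase.
ONION_DOMAIN = "onionaddress.onion"  # placeholder onion address
CLEAR_DOMAIN = "sitename.com"        # placeholder clearnet domain
MAGIC_HEADER = "X-Onionify"          # invented name for the "magic" header


def rewrite_inbound(headers, from_tor_tier):
    """Proxy tier: map an onion Host header to the clearnet one and flag the request."""
    # Assumption: never trust a copy of the magic header sent by a client.
    out = {k: v for k, v in headers.items() if k != MAGIC_HEADER}
    if not from_tor_tier:
        return out
    host = out.get("Host", "")
    if host == ONION_DOMAIN or host.endswith("." + ONION_DOMAIN):
        # Keep the subdomain part ("www.", "cdn.", or nothing at all).
        prefix = host[: len(host) - len(ONION_DOMAIN)]
        out["Host"] = prefix + CLEAR_DOMAIN
        out[MAGIC_HEADER] = "1"
    return out


def render_host(host, request_headers):
    """Web tier: emit the onion form of a site hostname if the request was flagged."""
    if request_headers.get(MAGIC_HEADER) != "1":
        return host
    if host == CLEAR_DOMAIN or host.endswith("." + CLEAR_DOMAIN):
        prefix = host[: len(host) - len(CLEAR_DOMAIN)]
        return prefix + ONION_DOMAIN
    # Hostnames outside the site are left alone.
    return host
```
</pre><div>In this sketch the web tier never needs to know about Tor; it only asks "was this request flagged?" whenever it emits a hostname.</div><div><br class=""></div><div>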
There are a couple of gotchas - e.g. don't onionify URIs which are used for internal data fetches necessary to serve the request - but generally the code is remarkably straightforward.</div><div><br class=""></div><div>It's just like serving a different TLD in a consistent manner.</div><div><br class=""></div><div><br class=""></div><blockquote type="cite" class=""><div class="">2) An HTTP proxy at the onion service rewrites your application's<br class="">responses to turn your clearnet URIs into onion URIs.<br class=""></div></blockquote><div><br class=""></div><div><br class=""></div><div>This was what we did as a proof of concept; I fired up an instance and built "mitmproxy" (<a href="http://mitmproxy.org" class="">mitmproxy.org</a>) on it, and did something like:</div><div><br class=""></div><div><div>  # configure Tor to forward Hidden Service to localhost:443</div><div>  # for inspiration only, this probably won't work/is a bad idea/go read the manual</div><div>  # mitmproxy filters: ~hq request headers, ~hs response headers, ~bs response body, ~t content type, ~s any response</div><div>  SITE="domain.com"</div><div>  ONION="somekindofonion.onion"</div><div>  mitmproxy -p 443 -P "https://www.${SITE}" --anticache \</div><div>    --replace ":~hq:${ONION}:${SITE}" \</div><div>    --replace ":~hs:${SITE}:${ONION}" \</div><div>    --replace ":~bs ~t \"application/json\":${SITE}:${ONION}" \</div><div>    --replace ":~bs ~t \"application/x-javascript\":${SITE}:${ONION}" \</div><div>    --replace ":~bs ~t \"text/css\":${SITE}:${ONION}" \</div><div>    --replace ":~bs ~t \"text/html\":${SITE}:${ONION}" \</div><div>    --replace ":~s:${SITE}:${ONION}"</div></div><div><br class=""></div><div>...the idea being to listen on port 443 locally, connecting that onwards to the backend site.</div><div><br class=""></div><div>It took an afternoon to test, and I was impressed by how much stuff "just worked".  
Maybe 90% of the site.</div><div><br class=""></div><div>Given our volumes we felt it would not be viable for us to do rewriting for every request, hence fixing the codebase as per "solution-1", above.  YMMV.</div><div><br class=""></div><div>Some day I would like to adapt WordPress to support such "solution-1" rewriting.  WordPress strikes me as a good target platform for Onion-enablement.  Maybe a plugin would work.</div><div><br class=""></div><div>== Increasing Aggregate Bandwidth ==</div><div><br class=""></div><div>So, at the moment, we are running 2x daemons with RSOS per onion.</div><div><br class=""></div><div>Upcoming plans are approximately:</div><div><br class=""></div><div>1) build an "RSOS Onionbalance Appliance" - 10 RSOS daemons, with random-ish Onion addresses, on a single box (use more of those cores!) and wrap them in OnionBalance to publish a unified descriptor for them. Deploy and test that.</div><div><br class=""></div><div>2) deploy a second replica Onionbalance Appliance for coarse load-balancing/failover.</div></div><div class=""><br class=""></div><div class="">3) build an RSOS Onionbalance-NG cluster - up to 60 daemons across several servers, using upcoming UCL-inspired OnionBalance features to publish up to 6 distinct descriptors at different points in the HSDir. 
(NB: 6 descriptors * 10 IntroPoints per descriptor = 60 daemon limit)</div><div class=""><br class=""></div><div class="">4) Talk more to Tom about Rendezvous-Callback Handoff, and integrate that into one of the above architectures when it eventually lands.</div><div class=""><br class=""></div><div class=""><div class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">—</div><div class="">Alec Muffett</div><div class="">Security Infrastructure</div><div class="">Facebook Engineering</div><div class="">London</div><div class=""><br class=""></div></div></div></div></div></body></html>