[ Trying to keep the scope of my reply limited. nb: Linux centric since that's what I know/have/wrote. ]
On 07/05/2018 06:46 PM, Tom Ritter wrote: [snip]
I think there's a lot of focus in this document on applying sandbox policies by the OS before/as the process starts, rather than following the 'drop privileges' model. But dropping privileges is much more flexible. I think some confusion with this model is the notion that the process is 'sandboxing itself' - and of course one can't trust a process that is simultaneously compromised and attempting to perform security operations, so that model must be broken - but this notion is incorrect. The process - before it is able to be compromised by attacker input, before it processes anything from the web - instructs the OS to apply the sandbox to itself, and cannot later opt out of that restriction. It's true that something could go wrong during sandbox application and result in the process staying elevated - but I think that is easy to code defensively for, and even to test at runtime.
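[ For concreteness, the pattern being described looks roughly like the minimal Go sketch below. golang.org/x/sys/unix and github.com/seccomp/libseccomp-golang are my choices for illustration, and the syscall whitelist is made up; this is a sketch of the model, not what Firefox (or anything else) actually does. ]

    package main

    import (
        "log"

        seccomp "github.com/seccomp/libseccomp-golang"
        "golang.org/x/sys/unix"
    )

    // dropPrivileges is called exactly once, before any attacker-controlled
    // input is read.  After it returns, the process cannot regain what it
    // has given up, even if it is later compromised.
    func dropPrivileges() error {
        // Ensure execve() can never grant new privileges (setuid binaries, etc).
        if err := unix.Prctl(unix.PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); err != nil {
            return err
        }

        // Install a seccomp-bpf whitelist: anything not explicitly allowed
        // fails with EPERM.  A real filter would be considerably longer.
        filter, err := seccomp.NewFilter(seccomp.ActErrno.SetReturnCode(int16(unix.EPERM)))
        if err != nil {
            return err
        }
        for _, name := range []string{"read", "write", "futex", "exit_group"} {
            sc, err := seccomp.GetSyscallFromName(name)
            if err != nil {
                return err
            }
            if err := filter.AddRule(sc, seccomp.ActAllow); err != nil {
                return err
            }
        }
        return filter.Load()
    }

    func main() {
        if err := dropPrivileges(); err != nil {
            // Code defensively: if the sandbox can't be applied, refuse to run.
            log.Fatalf("failed to apply sandbox: %v", err)
        }
        // ... only now start processing untrusted input ...
    }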
Well, yeah. Matt's architecture is derived from sandboxed-tor-browser, whose design assumptions were along the lines of "Firefox's codebase is a gigantic mess of kludges upon kludges that shouldn't be trusted, or altered if at all possible".
Having the sandbox be an entirely separate component:
a) Happened to mesh nicely with the way sandboxing is typically implemented on my target platform.
b) Enabled using a "safe" language.
c) Enabled far more rapid development than otherwise possible.
d) Made it easier to validate correctness.
e) Slowed my inevitable descent into insanity by keeping my interaction with the Firefox code to a minimum.
Our weaknesses, as I see them:
- We're tracking ESR for now
- Un-upstreamed patches are painful for us
- We are not well suited to undertake large browser architecture projects, especially if it diverges from Firefox
- We are neither browser architecture experts, nor sandboxing experts. In particular this makes us ill-suited to predict feature breakage or performance degradation from a hypothesized sandbox/architecture change.
The number of upstream (Firefox) changes that needed to be made to get the Linux sandbox to work was exactly 0. There was one fix I backported from a more current Firefox release, and two upstream Firefox bugs that I worked around (all without altering the Firefox binary at all).
The design and code survived more or less intact from 7.0.x through at least the entirety of the 7.5 stable series. (I don't run alpha; I assume there are some changes required for 8.0, but the code is deprecated and I can't be bothered to check. It would have survived if the time and motivation were available.)
Spelled out:
- The FF Parent Process talks to and controls the Content Processes
(using existing IPC mechanisms) and maybe/probably interfaces with the Networking process and other Helper Processes in unknown future ways. The content processes probably talk to the Network process directly; they might also talk to the other helper processes.
- The Networking process talks to Tor using SOCKS with (probably) a
domain socket or named pipe
- Tor Control requests are sent from the Parent Process to the broker
which filters them and then passes them to Tor over the control port.
- The broker is most likely the least sandboxed process and may
provide additional functionality to the parent process; for example, perhaps it passes back a writable file handle for a file in a particular directory so the user can save a download.
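The control port filtering mentioned above is the conceptually simple part of that. A filtering broker is, at its core, just something like the Go sketch below; the socket paths and the command whitelist are invented for illustration, and authentication and error handling are ignored entirely.

    package main

    import (
        "bufio"
        "io"
        "log"
        "net"
        "strings"
    )

    // Illustrative paths only.
    const (
        listenPath  = "/run/sandbox/control" // what the browser is allowed to see
        controlPath = "/run/tor/control"     // tor's real control port
    )

    // The only commands that will ever be forwarded to tor.
    var allowedCommands = []string{
        "SIGNAL NEWNYM",
        "GETINFO net/listeners/socks",
    }

    func allowed(line string) bool {
        for _, cmd := range allowedCommands {
            if strings.EqualFold(strings.TrimSpace(line), cmd) {
                return true
            }
        }
        return false
    }

    func handle(client net.Conn) {
        defer client.Close()
        tor, err := net.Dial("unix", controlPath)
        if err != nil {
            log.Printf("failed to connect to tor: %v", err)
            return
        }
        defer tor.Close()

        // Responses from tor go back to the browser unmodified.
        go io.Copy(client, tor)

        // Commands from the browser are forwarded only if whitelisted.
        r := bufio.NewReader(client)
        for {
            line, err := r.ReadString('\n')
            if err != nil {
                return
            }
            if !allowed(line) {
                // 510 is the control protocol's "unrecognized command" status.
                client.Write([]byte("510 Command filtered\r\n"))
                continue
            }
            tor.Write([]byte(line))
        }
    }

    func main() {
        l, err := net.Listen("unix", listenPath)
        if err != nil {
            log.Fatal(err)
        }
        for {
            conn, err := l.Accept()
            if err != nil {
                log.Fatal(err)
            }
            go handle(conn)
        }
    }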
How do you envision updates to work in this model? Having the sandbox be externalized and a separate component makes it marginally more resilient to the updater/updates being malicious (though I would also agree that it merely shifts the risks onto the sandbox update mechanism).
It is also not clear to me how to do things like "peek at the executable's ELF header to only bind mount the minimum number of shared libraries required for the executable to run" from within the executable itself.
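For comparison, from outside the process the "peeking" amounts to nothing more than the Go sketch below (the binary path is invented for illustration, and recursively resolving the libraries' own dependencies is left out):

    package main

    import (
        "debug/elf"
        "fmt"
        "log"
    )

    func main() {
        // Illustrative path; the real launcher points at the bundle's firefox binary.
        f, err := elf.Open("/path/to/tor-browser/firefox")
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()

        // DT_NEEDED entries: the shared libraries the dynamic linker will load.
        // These (plus their own dependencies) are the only libraries that need
        // to be bind mounted read-only into the sandbox.
        libs, err := f.ImportedLibraries()
        if err != nil {
            log.Fatal(err)
        }
        for _, lib := range libs {
            fmt.Println(lib)
        }
    }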
As with all things sandboxing, we need to be sure there are no IPC mechanisms to a privileged process that bypass the sandbox restrictions. On the networking side, there is https://searchfox.org/mozilla-central/source/dom/network/PUDPSocket.ipdl
- which I think is used by WebRTC.
The old sandboxed-tor-browser code doesn't care about such things.
- Disk Avoidance
In general, I'd like #7449 mitigated.
[snip]
What about restricting write access to temp directories? That seems like the quickest and most compatible option (although it wouldn't catch every possible occurrence of this issue).
The old Linux code used mount namespaces to limit writes to:
* A subset of the profile directory.
* tmpfs that will get torn down and discarded upon exit.
* The Downloads/Desktop directories.
This catches most cases, though the browser can still mess up its own profile dir. There was an option that took this a step further and shadowed the profile directory into another tmpfs filesystem, rendering the system amnesiac except for the Downloads/Desktop directories (which was what I used day to day), but for obvious reasons that was disabled by default.
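The mechanism behind the amnesiac variant is just a private mount namespace with a tmpfs and a couple of bind mounts, conceptually like the Go sketch below. The paths are made up, it assumes it already has the necessary privileges (in practice, inside a user namespace), and the real code has to deal with /proc, /dev, the profile directory, and a pile of other details.

    package main

    import (
        "log"
        "os"
        "path/filepath"

        "golang.org/x/sys/unix"
    )

    // Illustrative paths; the real code derives these from the environment.
    const (
        realHome    = "/home/user"
        stagingHome = "/tmp/sandbox-home"
    )

    func setupFilesystem() error {
        // New, private mount namespace: nothing done below leaks to the host.
        if err := unix.Unshare(unix.CLONE_NEWNS); err != nil {
            return err
        }
        if err := unix.Mount("", "/", "", unix.MS_REC|unix.MS_PRIVATE, ""); err != nil {
            return err
        }

        // Build a throwaway home on tmpfs; it is discarded when the namespace dies.
        if err := os.MkdirAll(stagingHome, 0700); err != nil {
            return err
        }
        if err := unix.Mount("tmpfs", stagingHome, "tmpfs", 0, "size=512m"); err != nil {
            return err
        }

        // Only Downloads and Desktop are bind mounted through to the real disk.
        for _, dir := range []string{"Downloads", "Desktop"} {
            src := filepath.Join(realHome, dir)
            dst := filepath.Join(stagingHome, dir)
            if err := os.MkdirAll(dst, 0700); err != nil {
                return err
            }
            if err := unix.Mount(src, dst, "", unix.MS_BIND, ""); err != nil {
                return err
            }
        }

        // Shadow the real home directory with the tmpfs copy.
        return unix.Mount(stagingHome, realHome, "", unix.MS_BIND|unix.MS_REC, "")
    }

    func main() {
        if err := setupFilesystem(); err != nil {
            log.Fatalf("sandbox filesystem setup failed: %v", err)
        }
        // ... drop further privileges and exec the browser here ...
    }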
- Cross-Origin Fingerprinting Unlinkability
[snip]
So the Browser already brokers access to features on the Web Platform and can be thought of as (and really is) a 'sandbox' for webpages. You're only fingerprintable based on what the Web Platform provides. If it provides something fingerprintable, we need to either fix that fingerprint or block access to it at that layer.
Extra things that the Linux sandbox did:
* Included an extension whitelist, so that users can't add new extensions.
* Disabled addon auto updates, because addons.mozilla.org is not to be trusted.
* Forced software rendering (though WebGL in that configuration is busted on certain systems for unrelated reasons).
* Disabled access to the audio subsystem unless configured to allow it.
* (Probably other things, I don't remember.)
I would argue that the role sandboxing should provide here is fingerprinting protection in the face of code execution in the content process. Assume an attacker who has a goal: identify what person is using Tor Browser. Their actions are going to be goal oriented. A website cannot get your MAC address. But if a website exploits a content process and is evaluating the best cost/benefit tradeoff for the next step in their exploit chain, the lowest cost is going to be 'zero' if the content process does not restrict access to an OS/computer feature that would identify the user.
The goal should be to provide fingerprint protection in the face of "code execution anywhere in the Tor Browser".
So the goal of sandboxing in this area would be to restrict access to any hardware/machine/OS identifiers like OS serial number, MAC address, device ids, serial numbers, etc. After that (the 'cookies' of device identifiers, if you will), the goal would be to restrict access to machine-specific features that create a unique fingerprint: like your GPU (which I illustrate because it can render slightly unique canvas data) or your audio system (which I illustrate because it can apparently generate slightly unique web audio data).
The filesystem also provides a considerable amount of identifying information about a user. It's not much of a sandbox if the adversary can just exfiltrate the contents of the user's home directory over tor.
Anyway, as far as I can tell, the difference between what you're suggesting and the existing/proposed architecture boils down to "how much of Firefox should be trusted?". To this day, I remain in the "as little as possible" camp, but "nie mój cyrk, nie moje małpy" ("not my circus, not my monkeys").
Regards,