Proposal: GETINFO controller option for connection information

Damian Johnson atagar1 at gmail.com
Wed Apr 14 16:16:04 UTC 2010


Time to take the defibrillator paddles to this proposal once again. As per
Nick's request this is a bit more focused on the motivation for getting
connection related information. The proposed use cases are just some naive
examples I've come up with. If anyone with a stronger security background
(which wouldn't take much...) has the time I'd love comments like "WTF?!?
This idiot's looking for the completely wrong things! This is obviously
worthless if he doesn't look for X."

Also, could we move forward on the other (less controversial) items? For
instance, bandwidth totals tend to be a very highly requested piece of
information and pipe's already provided a nice patch to get it (
http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html). For
reference, here's the not-so-controversial GETINFO options I proposed:

  "info/relay/bw-limit" -- Effective relayed bandwidth limit (currently
    RelayBandwidthRate if set, otherwise BandwidthRate).

  "info/relay/burst-limit" -- Effective relayed burst limit.

  "info/relay/read-total" -- Total bytes relayed (download).

  "info/relay/write-total" -- Total bytes relayed (upload).

  "info/uptime-process" -- Total uptime of the tor process (in seconds).

  "info/uptime-reset" -- Time since last reset (startup or sighup signal, in
    seconds).

  "info/descriptor-used" -- Count of file descriptors used.

  "info/descriptor-limit" -- File descriptor limit (getrlimit results).

  "ns/authority" -- Router status info (v2 directory style) for all
    recognized directory authorities, joined by newlines.

I'm not planning on converting the following to the customary 80-character
width until it's at least past being a first draft for a couple reasons:
  1. I find editing fixed-width documents to be a time consuming pain in the
ass.
  2. I've yet to hear why we do this. Is it just to cater to mail clients
too dumb to know how to line wrap?

That said, I'm keeping my fingers crossed that this starts going somewhere!
-Damian

PS. For previous discussions of this proposal see:
http://marc.info/?t=126101683100002&r=1&w=1

----------------------------------------

Filename: xxx-connection-getinfo-option.txt
Title: GETINFO controller option for connection information
Author: Damian Johnson
Created: 14-Apr-2010
Status: Draft

Overview:

    This details an additional GETINFO option for tor controllers that would
provide information concerning a relay's current connections.

Motivation:

    All Internet facing applications (tor included) are possible vectors for
attack on the operator's system. With hundreds of connections to relatively
unknown destinations tor is already the bane of any network based IDS, and
unless tor can be proved infallible and bug free (which would be quite a
feat!) it cannot be blindly trusted.

    While it is impossible to guard against every potential future
vulnerability, controllers can attempt to mitigate this threat by both
auditing tor's behavior and providing indicators of its activity to savvy
users. Connection related information is a useful tool for both of these
purposes.

    In terms of auditing, the following are some conditions controllers can
check for with connection information:
      - Persistent unestablished circuits. For instance, a circuit that has an
outbound connection without a corresponding inbound counterpart. If such a
connection was active (had substantial traffic) this would be troubling
enough to alert the user.
      - Relatively asymmetric traffic on circuits. I.e., if the controller
sees 10 KB/s inbound on a circuit and 5 MB/s outbound this could be a good
indicator that someone's using tor to issue a DoS, fetch data from the local
system, etc.
      - Any connections to the local network when ExitPolicyRejectPrivate is
set, indicating that tor's being used to proxy connections to the local LAN.
      - Peculiar patterns of connections, for instance numerous outbound
connections to a single IP, or 99% of all bandwidth belonging to a single
circuit.
      - Scrubbed connection data limits our ability to check for obedience
to the exit policy, but for strictly non-exit relays we can still alert the
user if any non-relay outbound connections occur.
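
    As a rough illustration of the first two checks, here is a minimal
sketch in Python. It assumes the controller has already turned each
connection into a dict with hypothetical keys (circ_id, direction, read,
write); the function name, the thresholds, and the choice to compare total
bytes written against total bytes read per circuit are all assumptions made
for this example, not part of the proposal.

```python
from collections import defaultdict

def audit_circuits(conns, ratio=100, min_bytes=1_000_000):
    """Return (circ_id, reason) warnings for suspicious circuits.

    conns: iterable of dicts with the hypothetical keys
      circ_id (str), direction ("I" or "O"), read (int), write (int).
    """
    by_circ = defaultdict(list)
    for conn in conns:
        if conn["circ_id"] != "0":  # "0" means no circuit
            by_circ[conn["circ_id"]].append(conn)

    warnings = []
    for circ_id, members in sorted(by_circ.items()):
        directions = {conn["direction"] for conn in members}
        total_read = sum(conn["read"] for conn in members)
        total_written = sum(conn["write"] for conn in members)

        # Active outbound connection with no inbound counterpart.
        if ("O" in directions and "I" not in directions
                and total_read + total_written >= min_bytes):
            warnings.append((circ_id, "outbound without inbound"))

        # Far more data leaving on this circuit than arriving on it.
        if (total_written >= min_bytes
                and total_written > ratio * max(total_read, 1)):
            warnings.append((circ_id, "asymmetric traffic"))
    return warnings
```

    A real controller would likely want to work from rates over a sliding
window rather than lifetime totals, but the shape of the check is the same.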

    Of course if we're working from the assumption that tor has been
compromised, then the information provided from the control port cannot be
blindly trusted. Hence connection data should be verifiable against the
system's connection querying utilities (netstat, ss, lsof, etc - which are
more likely to be under a host based IDS, if present). This requires that
the system's been completely compromised (elevated permissions) before
controllers can be tricked, rather than just tor.
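
    One way such a cross-check could look is sketched below in Python. It
assumes both sources have already been reduced to (ip, port, local_port)
tuples of strings; parsing the output of netstat, ss, or lsof is left out,
and the function name is hypothetical. Since scrubbed entries (reported as
"0") can't be matched by address, this sketch only compares their count.

```python
def cross_check(controller_conns, system_conns):
    """Compare connections reported by tor against the system's view.

    Both arguments are iterables of (ip, port, local_port) string tuples.
    Returns (missing_from_system, unexplained_count).
    """
    controller_conns = list(controller_conns)
    system = set(system_conns)

    # Entries tor reported with a real address, and a count of scrubbed ones.
    visible = {conn for conn in controller_conns if conn[0] != "0"}
    scrubbed = sum(1 for conn in controller_conns if conn[0] == "0")

    # Tor claims these exist, but the system doesn't know about them.
    missing_from_system = visible - system

    # The system sees these, but tor didn't report an address for them.
    # Scrubbed entries may account for some, so only the excess is suspect.
    unreported = system - visible
    unexplained_count = max(len(unreported) - scrubbed, 0)

    return missing_from_system, unexplained_count
```

    Either a non-empty first result or a non-zero second result would be
worth flagging to the user.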

    While automated detection is handy for detecting known behavior that
might indicate issues, visualization gives us the possibility of finding
much more thanks to our tinfoil hat wearing user base. A clear display of
tor's current behavior gives assurance that tor's functioning as it should,
plus a level of transparency desirable from anyone with even the slightest
bit of paranoia. Tor is a guest process in the system of relay operators and
we should not hide what it does without legitimate reason.

    Another (albeit unintended) benefit of visualizing tor's behavior is
that it becomes a helpful tool in puzzling out how tor works. For instance,
tor spawns numerous client connections at startup (even if unused as a
client). As a newcomer to tor these asymmetric (outbound only) connections
mystified me for quite a while until Roger explained their use to me.
The proposed TYPE_FLAGS would let controllers clearly label them as being
client related, making their purpose a bit clearer.

    At the moment connection data can only be retrieved via commands like
netstat, ss, and lsof. However, fetching it via the control port provides
several advantages:

      - scrubbing for private data
          Raw connection data has no notion of what's sensitive and what is
not. The relay's flags and cached consensus can be used to take educated
guesses concerning which connections could possibly belong to client or exit
traffic, but this is both difficult and inaccurate.

      - additional information
          All connection querying commands strictly provide the IP address
and port of connections, and nothing else. However, for auditing and
visualization the far more interesting attributes are the connection's
bandwidth usage, uptime, and the circuit to which it belongs.

      - improved performance
          Querying connection data is an expensive activity, especially for
busy relays or low end processors (such as mobile devices). Tor already
internally knows its circuits and connections, allowing for vastly quicker
lookups.

      - cross platform capability
          The connection querying utilities mentioned above not only aren't
available under Windows, but differ widely among different *nix platforms.
FreeBSD in particular takes a very unique approach, dropping important
options from netstat and assigning ss to a spreadsheet application instead.
A controller interface, however, would provide a uniform means of retrieving
this information.

Security Implications:

    The original version of this proposal left the responsibility of
scrubbing connection data with client applications (vidalia, arm, etc).
However, this was deemed unacceptable by Sebastian and Nick in previous
discussions. The proposal now includes dropping the IP address/port of
client and exit connections from the controller's response. That said, I
think it's a mistake to drop those connections entirely since some of their
attributes *are* of legitimate usefulness:

    - Existence
      At the very least it'd be nice if Tor indicated their existence (ie,
I'd say "yea, an exit connection exists on this circuit but we won't tell
you where it goes."). This would be useful, for instance, if the relay
operator has misconfigured their firewall to block some of the outbound
ports permitted by their exit policy (arm would show this as RELAY -> YOU ->
UNESTABLISHED, and provide a warning to indicate the issue).

    - Bandwidth
      For auditing the most interesting attribute of connections, imho, is
the bandwidth. If, say, 10 KB/s is coming in and 1 MB/s is going out on a
circuit that's a good indicator that something is *very* wrong (I'd start
suspecting a security issue, personally). If we rounded all bandwidth
measurements (say, to the nearest KB) would this be sufficient to prevent
entry/exits from correlating this data to attack anonymity?

    - Uptime
      If connections are being cycled abnormally quickly (say, all
connection longevity is under thirty seconds) this could indicate that the ISP
(or another middleman like the Great Firewall) is sending reset packets to
kill the relay's attempts to make exit connections.
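
    To make the rounding and uptime ideas concrete, here is a small Python
sketch. The helper names are hypothetical, and treating a KB as 1024 bytes
and using a thirty second threshold are assumptions for illustration.
Whether rounding at this granularity is actually enough to prevent
correlation is exactly the open question above; the code doesn't answer it.

```python
def round_to_kb(byte_count, unit=1024):
    """Round a byte count to the nearest multiple of unit (halves round up)."""
    return ((byte_count + unit // 2) // unit) * unit

def cycling_abnormally(uptimes, threshold=30):
    """True if there are connections and every one is younger than threshold.

    uptimes: connection uptimes in seconds.
    """
    uptimes = list(uptimes)
    return bool(uptimes) and all(uptime < threshold for uptime in uptimes)
```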

Specification:

   The following addition would be made to the control-spec's GETINFO
section:

  "conn/<Circuit identity>/<Connection identity>" -- Provides entry for the
    associated connection, formatted as:
      CONN_ID CIRC_ID OR_ID IP PORT L_PORT TYPE_FLAGS READ WRITE UPTIME

    None of the parameters contain whitespace, and additional results must be
    ignored to allow for future expansion. Parameters are defined as follows:
      CONN_ID - Unique identifier associated with this connection.
      CIRC_ID - Unique identifier for the circuit this belongs to (0 if this
        doesn't belong to any circuit). At most there may be two connections
        (one inbound, one outbound) with any given CIRC_ID except in the case
        of exit connections.
      OR_ID - Relay fingerprint, 0 if connection doesn't belong to a relay.
      IP/PORT - IP address and port used by the associated connection, 0 if
        connection is used for relaying client or exit traffic.
      L_PORT - Local port used by the connection, 0 if connection is used for
        relaying client or exit traffic.
      TYPE_FLAGS - Single character flags indicating directionality and type
        of the connection (consists of one from each category, may become
        longer for future expansion).
          Connection Directionality:
            I: inbound, i: listening (unestablished inbound),
            O: outbound, o: unestablished outbound
          Usage Type:
            C: client traffic, R: relaying traffic,
            X: control, H: hidden service, D: directory
          Destination:
            T: inter-tor connection, t: outside the tor network
        For instance, "IRt" would indicate that this was an established
        1st-hop (or bridged) relay connection.
      READ/WRITE - Total bytes read/written over the life of this connection.
      UPTIME - Time the connection's been established in seconds.
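
    For controller authors, parsing an entry in the format above could look
roughly like the following Python sketch. The class and function names are
hypothetical, the identifiers are kept as strings because the proposal
doesn't fix their type, and the example values in the usage note are made
up. Trailing fields and extra flag characters are ignored, as the text above
requires.

```python
from typing import NamedTuple

class ConnEntry(NamedTuple):
    conn_id: str
    circ_id: str
    or_id: str
    ip: str
    port: int
    l_port: int
    direction: str    # I, i, O, or o
    usage: str        # C, R, X, H, or D
    destination: str  # T or t
    read: int
    write: int
    uptime: int

def parse_conn_entry(line):
    """Parse one whitespace-separated connection entry."""
    fields = line.split()
    if len(fields) < 10:
        raise ValueError("expected at least 10 fields: %r" % line)
    # Anything past the tenth field is ignored for forward compatibility.
    (conn_id, circ_id, or_id, ip, port, l_port,
     flags, read, write, uptime) = fields[:10]
    if len(flags) < 3:
        raise ValueError("expected at least 3 type flags: %r" % flags)
    return ConnEntry(conn_id, circ_id, or_id, ip, int(port), int(l_port),
                     flags[0], flags[1], flags[2],
                     int(read), int(write), int(uptime))

def parse_conn_listing(reply):
    """Parse a newline-separated listing of entries, skipping blank lines."""
    return [parse_conn_entry(line)
            for line in reply.splitlines() if line.strip()]
```

    For example, parse_conn_entry("12 7 0 0 0 0 IRt 10240 5242880 312")
yields an entry whose direction is "I", usage is "R", and destination is
"t", with the scrubbed address fields left as 0.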

  "conn/all" -- Newline separated listing of all current connections.