Hi all. This proposal doesn&#39;t seem to be going anywhere, so I thought I should give it one last nudge before moving on to more worthwhile work. The sticking point seems to be a difference of opinion about what constitutes relay evilness. Nick, Jake, and Sebastian all take a hard-line stance against any retrieval of connection information (netstat, lsof, etc). I disagree, and think this is harmless unless the data is stored or communicated. Unless this can be resolved I think it&#39;s obvious the proposal isn&#39;t going anywhere.<br>

<br>Please note that I&#39;m discussing relay-to-relay connections at the moment. If we can&#39;t even agree on that then client and exit connections are a moot point (and besides, I agree they should definitely be hidden from relay operators - personally I think it&#39;s the responsibility of client applications like Vidalia and arm to scrub this data, but that&#39;s a different discussion...).<br>

<br>Just to be clear, I agree this proposal should be killed if it poses a threat to Tor users. However, I don&#39;t believe it does and have yet to hear an example of any sort of threat it aggravates. Without that I&#39;m a bit puzzled at the source of the objections. If the chief issue is legal or not wanting to risk the appearance of supporting snooping that&#39;s fine (strikes me as political posturing if there are no actual benefits to users, but c&#39;est la vie).<br>

<br>Contrary to Nick&#39;s impression of my last response I&#39;m not a Scooby-Doo villain laughing maniacally as I scheme against Tor&#39;s users. I think transient connection data is good for auditing and transparency, but welcome correction if it&#39;s dangerous (before including it in arm I&#39;d tried to ask about risks and objections at Toorcamp but no one seemed interested...). As for this proposal, I think it has some tasty benefits that could help arm quite a bit including:<br>

- better performance<br>- added information, juiciest from an auditing perspective being bandwidth measurements and association of connections to circuits<br>- the ability to discern client and exit connections so they can be scrubbed (I&#39;ve tried correlating against consensus data to do this, but that was pretty inaccurate)<br>
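<br>A minimal sketch of the scrubbing idea in the last bullet, in Python, assuming the entry layout from the revised proposal quoted below (CONN_ID CIRC_ID OR_ID IP PORT L_PORT TYPE_FLAGS READ WRITE UPTIME BUFF). The names ConnEntry, parse_entry and scrub are made up for illustration, and treating every non-control entry flagged with t (outside the tor network) as a client or exit connection is only one reading of the flags, not something the proposal spells out.<br>
<pre>
```python
from typing import NamedTuple


class ConnEntry(NamedTuple):
    """One entry in the proposed format (hypothetical container type)."""
    conn_id: str
    circ_id: str
    or_id: str
    ip: str
    port: str
    l_port: str
    type_flags: str
    read: int
    written: int
    uptime: int
    buffered: int


def parse_entry(line: str) -> ConnEntry:
    # The proposal says parameters contain no whitespace and that additional
    # results must be ignored, so keep only the first eleven fields.
    fields = line.split()[:11]
    if len(fields) != 11:
        raise ValueError("expected 11 fields: %r" % line)
    return ConnEntry(*fields[:7], *(int(value) for value in fields[7:]))


def scrub(entry: ConnEntry) -> ConnEntry:
    # Assumption: "t" (outside the tor network) on anything other than a
    # control connection ("X") marks a client or exit connection.
    flags = entry.type_flags
    if "t" in flags and "X" not in flags:
        return entry._replace(ip="[scrubbed]", port="0")
    return entry
```
</pre>
A controller could run every entry through scrub before displaying it, so the hidden-by-default behaviour lives in one place.<br>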

<br>My bias is toward safety for relay operators and I&#39;m glad to see others biased toward user privacy pushing back. Hopefully we&#39;ll be able to find something acceptable to all parties concerned but if not it won&#39;t be the end of the world. Cheers! -Damian<br>

<br><div class="gmail_quote">On Sun, Dec 20, 2009 at 2:24 PM, Damian Johnson <span dir="ltr">&lt;<a href="mailto:atagar1@gmail.com">atagar1@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Hi Sebastian, thanks for the feedback!<div class="im"><br><br><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">As always, I&#39;m very uncomfortable with giving away users&#39;/destinations&#39;

ip addresses or ports. I do realize that the same information can be

obtained from netstat and friends, but I still think we should actively

discourage the use and acquisition of this data. I realize that this is

against the intentions of this proposal, but I hope that it is still

useful even without client/destination identifying information.<br></blockquote><br></div>Disagree for the following reasons:<br>- As mentioned on IRC: all Internet facing applications (browsers, email clients, tor) are attack vectors for my system. Tor&#39;s developers are good, but I&#39;m not so sure that they&#39;re infallible (sorry Nick) and hence the process can&#39;t be blindly trusted - that&#39;s why I think transparency is the best way to go. With hundreds of connections to relatively unknown destinations tor is already the bane of network based IDS so it would be nice if we could provide some accounting to system administrators that tor is behaving as it should. For instance say the tor process claims a big outbound connection taking 90% of your bandwidth that can&#39;t be accounted for as belonging to a circuit. If you aren&#39;t using it as a client that would be... bad.<br>


<br>- I agree that for correlation attacks this data is of concern in the event that numerous relays store or share this information. However, for an individual relay operator having this data shouldn&#39;t pose *any* threat to tor users (if it does... we have an issue). From what I can tell this proposal doesn&#39;t do anything that makes correlation attacks more dangerous since netstat running in a cron job is all they need (assuming they own a big chunk of the relays).<br>


<br>- Tor was designed with a certain level of distrust of relays. Beyond that the best we can do is discourage them from risky behaviour (i.e., running outdated versions, looking at exit traffic, sharing connection data, etc). By including connection types, controllers will have the opportunity to tell relay operators &quot;Oi! Please don&#39;t look at these exit connections unless you have a damn good reason.&quot;. As it stands I don&#39;t have a way of telling them apart, and hence can&#39;t even hide them by default.<br>


<br>- As you mentioned we can&#39;t (and imho shouldn&#39;t) prevent relay operators from seeing the connections made to/from their own system. This proposal doesn&#39;t seem to exacerbate any privacy issues while providing some nice benefits (performance and some handy bits of extra data that&#39;ll make security anomalies far easier to detect).<div class="im">
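<br>A rough sketch of the accounting check from the first point (a large outbound connection that doesn&#39;t belong to any circuit), in Python, assuming the conn/all output of the revised proposal quoted below. The function name unaccounted_outbound and the 50% threshold are hypothetical; the only thing taken from the proposal is that a CIRC_ID of 0 means the connection belongs to no circuit.<br>
<pre>
```python
def unaccounted_outbound(lines, threshold=0.5):
    """Return CONN_IDs of outbound connections with no circuit that carry
    at least `threshold` of all bytes seen (hypothetical helper)."""
    rows = []
    for line in lines:
        # Field order per the proposal:
        # CONN_ID CIRC_ID OR_ID IP PORT L_PORT TYPE_FLAGS READ WRITE UPTIME BUFF
        fields = line.split()[:11]
        if len(fields) != 11:
            continue  # skip anything that does not look like an entry
        conn_id, circ_id, flags = fields[0], fields[1], fields[6]
        total_bytes = int(fields[7]) + int(fields[8])
        rows.append((conn_id, circ_id, flags, total_bytes))

    grand_total = sum(row[3] for row in rows)
    if grand_total == 0:
        return []

    return [
        conn_id
        for conn_id, circ_id, flags, total_bytes in rows
        # "O" is an established outbound connection, "0" means no circuit.
        if circ_id == "0" and "O" in flags
        and total_bytes / grand_total >= threshold
    ]
```
</pre>
This compares lifetime byte totals rather than current rates, so it is a coarse check at best.<br>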

<br>

<br><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">This is, I think, a misunderstanding of what a connection is. More below.<br></blockquote>


<br></div>No, the hidden service question isn&#39;t. I&#39;m assuming that when hosting hidden services there are some connections dedicated to providing that service. If so, a TYPE_FLAG should probably be included since they don&#39;t really belong to any of the other groups. Changed proposal to include one till someone tells me this is wrong.<br>


<br><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">Here, the connection identity needs to either include the CIRC_ID, or this is ambiguous...<br>


</blockquote><br>Thanks for the catch! Made the following three corrections:<br><br>- Changed signature to &quot;conn/&lt;Circuit identity&gt;/&lt;Connection identity&gt;&quot; to avoid ambiguity. I&#39;m assuming that in general people will use the &quot;conn/all&quot; to discover the circuit/connection ids (actually, can&#39;t think of a use for getting a single connection - just including it to conform with other control-spec GETINFO options).<br>


<br>- Noted that more than two connections could have the same circuit ID in the case of exit connections.<br><br>- Including a L_PORT (local port) parameter - wasn&#39;t mentioned but definitely an oversight.<div class="im">

<br><br><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">

These flags seem to be mostly redundant. Again, they don&#39;t necessarily

work because a connection can be used for many things. As for the Ee

flag, I don&#39;t really see the purpose, we certainly shouldn&#39;t look at

exit traffic going through the connection to decide if it is encrypted

or not.</blockquote><br></div>Yeah, I wasn&#39;t sure if they should be like argument flags (given a default if excluded) or always explicitly stated. Opted for the latter since in general explicit is better than implicit, and this way implementers (like TorCtl) won&#39;t need to hard code any defaults. Both are minor points and I&#39;m glad to discuss more if people disagree.<br>


<br>Yes, if this was only associated with a connection it wouldn&#39;t work, but circuit/connection combinations should be unique so the issue is fixed there.<br><br>As for the Ee flag, I suspect it would be useful for client connections since any unencrypted traffic there is sniffable. This isn&#39;t important to the use cases I care about so we can drop it if others think it&#39;s a bad idea.<br>
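<br>To illustrate the circuit/connection pairing, here is a small Python sketch that indexes conn/all style lines by (CIRC_ID, CONN_ID) and builds the matching GETINFO key. The helper names index_entries and getinfo_key are hypothetical, and the assumption that each pair appears on its own line in conn/all is a guess at how the listing would look.<br>
<pre>
```python
def index_entries(lines):
    """Index entries by (CIRC_ID, CONN_ID); hypothetical helper."""
    index = {}
    for line in lines:
        # Extra trailing fields are ignored, as the proposal requires.
        fields = line.split()[:11]
        if len(fields) != 11:
            raise ValueError("expected 11 fields: %r" % line)
        conn_id, circ_id = fields[0], fields[1]
        key = (circ_id, conn_id)
        if key in index:
            # The pair is supposed to be unique, so treat a repeat as an error.
            raise ValueError("duplicate circuit/connection pair: %r" % (key,))
        index[key] = fields
    return index


def getinfo_key(circ_id, conn_id):
    # Mirrors the revised signature "conn/CIRC_ID/CONN_ID".
    return "conn/%s/%s" % (circ_id, conn_id)
```
</pre>
Because tor multiplexes many circuits over one connection, the same CONN_ID can show up under several CIRC_IDs; keying on the pair keeps those rows apart.<br>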


<br>Here&#39;s the revised proposal:<br><br>-------------------------------------------------------------------------------<br><br>  &quot;conn/&lt;Circuit identity&gt;/&lt;Connection identity&gt;&quot; -- Provides entry for the<br>


    associated connection, formatted as:<br>      CONN_ID CIRC_ID OR_ID IP PORT L_PORT TYPE_FLAGS READ WRITE UPTIME BUFF<div class="im"><br><br>    none of the parameters contain whitespace, and additional results must be<br>

    ignored to allow for future expansion. Parameters are defined as follows:<br>

      CONN_ID - Unique identifier associated with this connection.<br>      CIRC_ID - Unique identifier for the circuit this belongs to (0 if this<br>        doesn&#39;t belong to any circuit). At most there may be two connections<br>

</div>

        (one inbound, one outbound) with any given CIRC_ID except in the case<br>        of exit connections.<div class="im"><br>      OR_ID - Relay fingerprint, 0 if connection doesn&#39;t belong to a relay.<br>      IP/PORT - IP address and port used by the associated connection.<br>

</div>

      L_PORT - Local port used by the connection.<div class="im"><br>      TYPE_FLAGS - Single character flags indicating directionality and type<br>        of the connection (consists of one from each category, may become<br>

        longer for future expansion).<br>

          I: inbound, i: listening (unestablished inbound),<br>            O: outbound, o: unestablished outbound<br></div>          C: client related, R: relay related, X: control, H: hidden service,<div class="im"><br>

            D: directory<br>

          T: inter-tor connection, t: outside the tor network<br>          E: encrypted traffic, e: unencrypted traffic<br>        For instance, &quot;IRtE&quot; would indicate that this was an established<br>        1st-hop (or bridged) relay connection.<br>

</div><div class="im">

      READ/WRITE - Total bytes read/written over the life of this connection.<br>      UPTIME - Time the connection&#39;s been established in seconds.<br>      BUFF - Bytes of data buffered for this relay connection.<br>

<br>

  &quot;conn/all&quot; -- Newline separated listing of all current connections.<br><br>  &quot;info/relay/bw-limit&quot; -- Effective relayed bandwidth limit (currently<br>    RelayBandwidthRate if set, otherwise BandwidthRate).<br>


<br>  &quot;info/relay/burst-limit&quot; -- Effective relayed burst limit.<br><br>  &quot;info/relay/read-total&quot; -- Total bytes relayed (download).<br><br>  &quot;info/relay/write-total&quot; -- Total bytes relayed (upload).<br>


<br>  &quot;info/relay/buffer-cap&quot; -- Maximum buffer size for relay connections.<br><br>  &quot;info/uptime-process&quot; -- Total uptime of the tor process (in seconds).<br><br>  &quot;info/uptime-reset&quot; -- Time since last reset (startup or sighup signal, in<br>


    seconds).<br><br>  &quot;info/descriptor-used&quot; -- Count of file descriptors used.<br><br>  &quot;info/descriptor-limit&quot; -- File descriptor limit (getrlimit results).<br><br>  &quot;ns/authority&quot; -- Router status info (v2 directory style) for all<br>


    recognized directory authorities, joined by newlines.<br><br></div>-------------------------------------------------------------------------------<br><br>Cheers! -Damian<div><div></div><div class="h5"><br><br><div class="gmail_quote">

On Sat, Dec 19, 2009 at 11:43 PM, Sebastian Hahn <span dir="ltr">&lt;<a href="mailto:hahn.seb@web.de" target="_blank">hahn.seb@web.de</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Hi Damian,<br>

<br>

please find my comments inline below.<br>

<br>

On Dec 17, 2009, at 3:24 AM, Damian Johnson wrote:<br>

<br>

[snip]<br>

<div>&gt;  - Anything dangerous? Doubt it, but the bandwidth measurements should probably<br>

&gt;  either be rounded or provided occasionally (say, every second) to address<br>

&gt;  correlation attacks. I&#39;m sure Sebastian will enthusiastically sink some<br>

&gt;  paranoia into this later. ;)<br>

<br>

</div>As always, I&#39;m very uncomfortable with giving away users&#39;/destinations&#39; ip addresses or ports. I do realize that the same information can be obtained from netstat and friends, but I still think we should actively discourage the use and acquisition of this data. I realize that this is against the intentions of this proposal, but I hope that it is still useful even without client/destination identifying information.<br>


<div><br>

&gt; - When hosting hidden services I&#39;d imagine some connections are dedicated to<br>

&gt;  them. If so, lets add a flag to indicate them.<br>

<br>

</div>This is, I think, a misunderstanding of what a connection is. More below.<br>

<br>

[snip]<br>

<div>&gt;    &quot;conn/&lt;Connection identity&gt;&quot; -- Provides entry for the associated<br>

&gt;      connection, formatted as:<br>

&gt;        CONN_ID CIRC_ID OR_ID IP PORT TYPE_FLAGS READ WRITE UPTIME BUFF<br>

&gt;<br>

&gt;      none of the parameters contain whitespace, and additional results must be<br>

&gt;      ignored to allow for future expansion. Parameters are defined as follows:<br>

&gt;        CONN_ID - Unique identifier associated with this connection.<br>

&gt;        CIRC_ID - Unique identifier for the circuit this belongs to (0 if this<br>

&gt;          doesn&#39;t belong to any circuit). At most their may be two connections<br>

&gt;          (one inbound, one outbound) with any given CIRC_ID.<br>

<br>

</div>Here, the connection identity needs to either include the CIRC_ID, or this is ambiguous. Tor multiplexes many circuits over the same connection, so there is no way to infer the circuit id from a connection id. Also, for exit connections, there may be more than two connections with the same circuit id. What this means: We either want a separate query to learn about circuits, or we want the conn_id to list all the circuits that it has attached, or we want to only allow queries of this kind when circ id and conn id are both known to the controller.<br>


<div><br>

&gt;        OR_ID - Relay fingerprint, 0 if connection doesn&#39;t belong to a relay.<br>

&gt;        IP/PORT - IP address and port used by the associated connection.<br>

&gt;        TYPE_FLAGS - Single character flags indicating directionality and type<br>

&gt;          of the connection (consists of one from each category, may become<br>

&gt;          longer for future expansion).<br>

&gt;            I: inbound, i: listening (unestablished inbound),<br>

&gt;              O: outbound, o: unestablished outbound<br>

&gt;            C: client related, R: relay related, X: control, D: directory<br>

&gt;            T: inter-tor connection, t: outside the tor network<br>

&gt;            E: encrypted traffic, e: unencrypted traffic<br>

&gt;          For instance, &quot;IRtE&quot; would indicate that this was an established<br>

&gt;          1st-hop (or bridged) relay connection.<br>

<br>

</div>These flags seem to be mostly redundant. Again, they don&#39;t necessarily work because a connection can be used for many things. As for the Ee flag, I don&#39;t really see the purpose, we certainly shouldn&#39;t look at exit traffic going through the connection to decide if it is encrypted or not.<br>


<div><br>

&gt;        READ/WRITE - Total bytes read/written over the life of this connection.<br>

&gt;        UPTIME - Time the connection&#39;s been established in seconds.<br>

&gt;        BUFF - Bytes of data buffered for this relay connection.<br>

&gt;<br>

&gt;    &quot;conn/all&quot; -- Newline separated listing of all current connections.<br>

&gt;<br>

&gt;    &quot;info/relay/bw-limit&quot; -- Effective relayed bandwidth limit (currently<br>

&gt;      RelayBandwidthRate if set, otherwise BandwidthRate).<br>

&gt;<br>

&gt;    &quot;info/relay/burst-limit&quot; -- Effective relayed burst limit.<br>

&gt;<br>

&gt;    &quot;info/relay/read-total&quot; -- Total bytes relayed (download).<br>

&gt;<br>

&gt;    &quot;info/relay/write-total&quot; -- Total bytes relayed (upload).<br>

&gt;<br>

&gt;    &quot;info/relay/buffer-cap&quot; -- Maximum buffer size for relay connections.<br>

&gt;<br>

&gt;    &quot;info/uptime-process&quot; -- Total uptime of the tor process (in seconds).<br>

&gt;<br>

&gt;    &quot;info/uptime-reset&quot; -- Time since last reset (startup or sighup signal, in<br>

&gt;      seconds).<br>

&gt;<br>

&gt;    &quot;info/descriptor-used&quot; -- Count of file descriptors used.<br>

&gt;<br>

&gt;    &quot;info/descriptor-limit&quot; -- File descriptor limit (getrlimit results).<br>

&gt;<br>

&gt;    &quot;ns/authority&quot; -- Router status info (v2 directory style) for all<br>

&gt;      recognized directory authorities, joined by newlines.<br>

&gt;<br>

<br>

</div>These all sound sane.<br>

<font color="#888888"><br>

<br>

Sebastian</font></blockquote></div><br>

</div></div></blockquote></div><br>