[tor-dev] Proposal: Controller events to better understand connection/circuit usage

Thu Feb 28 11:34:03 UTC 2013

[Nick, can you merge my torspec proposal218 branch, please?  Thanks!]

On 2/28/13 2:08 AM, Rob Jansen wrote:
> On Mon, Feb 25, 2013 at 10:28 AM, Karsten Loesing <karsten at torproject.org>
>  wrote:
> 
>> On 2/23/13 11:20 PM, Rob Jansen wrote:
>>> On Feb 23, 2013 4:22 PM, "Karsten Loesing" <karsten at torproject.org>
>> wrote:
>>>> Your understanding of n_circ_id and p_circ_id matches mine, but are you
>>>> sure there's a UID for circuits other than origin circuits?  I think you
>>>> mean origin_circuit_t->global_identifier.  But there's no such field for
>>>> or_circuit_t or circuit_t.  Or do you mean something else?
>>>>
>>> Ok. Though I thought my original patch moved the global Id to the base
>>> circuit struct. Perhaps I didn't. Anyway, I'm not sure it matters...
>>
>> Your original patch did move the ID to circuit_t, but I thought we
>> wanted to avoid numbering non-origin circuits (mostly because that
>> affects relays in non-TestingTorNetwork mode and could lead to busy
>> relays running out of IDs at some point), which is why I took this
>> change out.
>>
> 
> Right, I remember this now. I could imagine ways to get around the 'running
> out of ids' problem, like resetting the id counter after a given timeframe.
> In fact, we could just let the counter overflow and loop back to 0 on its
> own (assuming its a unsigned int type) since we really only need them to be
> unique approximately for the life of a circuit. If we see the ID come up
> again after an hour, we can assume its a new circuit.

I'd rather not want to change behavior in non-simulation mode.  If we
really need a UID for non-origin circuits, we could add a separate
uint64_t global_identifier to circuit_t and only use that in simulating
mode.  I'm not sure that we need it though.

>> Also, even if we moved the field to circuit_t, the new CELL_STATS events
>> would be the only ones using these UIDs, because all other events are
>> for origin circuits only.  I don't see how these IDs would help us.  We
>> never learn what UID the next or previous node in a circuit picks for a
>> given circuit.
> 
> Not currently, but couldn't we implement the functionality where relays log
> or export their circuit UIDs and next/prev IDs over the control port?
> Though I'm not sure if you'd want this in mainline Tor if its only useful
> in simulation mode...

We could have relays log local circuit UIDs together with p_circ_id,
p_conn_id, n_circ_id, and n_conn_id in simulation mode.  But I don't see
how that would facilitate circuit tracking compared to the current
approach using ORCONN and CELL_STATS events.

Here's an example:

- fileclient creates a circuit with
  - UID 14,
  - n_conn_id 15, and
  - n_circ_id 19403.

- tokenglobal is the first hop in this circuit and reports
  - (new) UID 12345,
  - p_conn_id 32,
  - p_circ_id 19403,
  - n_conn_id 18, and
  - n_circ_id 6710.

- tokenrelay is the second hop and reports
  - (new) UID 23456,
  - p_conn_id 17,
  - p_circ_id 6710,
  - n_conn_id 16,
  - n_circ_id 34402.

- exit2 is the third and final hop and reports
  - (new) UID 34567,
  - p_conn_id 15, and
  - p_circ_id 34402.

We need ORCONN events to know that fileclient's n_conn_id 15 is the same
connection as tokenglobal's p_conn_id 32.  How would the new UIDs make
this easier?

>>>> tl;dr: I _think_ it's possible to reconstruct circuits from ORCONN and
>>>> CELL_STATS events as they are currently specified in proposal 218.
>>>>
>>> Great, but do we really expect every Tor controller parser to get this
>>> right? It seems complicated enough that there should be an easier way.
>>> Maybe it's just wishful thinking on my part.
>>
>> I agree that reconstructing circuits from ORCONN and CELL_STATS events
>> is far from trivial.  I don't really see how to make it simpler though.
>
> What about linking all of the IDs and UIDs as described above?

(See above.)

>> From an earlier mail in this thread:
>>>> Finally, Rob, should I look into CIRC_BW events you suggested a while
>>>> ago?  If yes, what did you have in mind how that event would look like,
>>>> and when/by whom would it be emitted?
>>>
>>> If we want to do this, it would likely be an aggregation of all STREAM_BW
>>> events for a circuit, but only during the time when those streams
>> belonged
>>> to that circuit. I don't think it makes sense to emit it for every
>>> STREAM_BW event though, so what if we aggregate and emit it once per
>>> second? A format similar to the STREAM_BW format should work fine.
>>
>> Done.  I specified and implemented such a CIRC_BW event.
>>
>>
> Great!
> 
> 
>>
>> Here's the updated proposal 218 (Nick, please don't merge this yet):
>>
>>
>> https://gitweb.torproject.org/user/karsten/torspec.git/blob/refs/heads/proposal218:/proposals/218-usage-controller-events.txt
> 
> 
> In section 5.3/5.4/5.5, these events are emitted in the
> second_elapsed_callback(), right? I wanted to verify that a relay who
> hasn't sent anything in a few seconds and then starts sending again will
> emit the event at the end of the second after which it resumed sending,
> rather than the first bytes after it resumed sending.

Yes, all these events are emitted in second_elapsed_callback().

> The last word in 5.4 should be 'reading' instead of 'read'.

Fixed.

> Is the specification of 5.5 and 5.6 complex enough to warrant including
> example outputs?

Sure, can't hurt.  Added a few examples.

> In 5.6, were we planning on explaining how buckets can go negative and how
> that affects the reporting of the TB_EMPTY events?

I tried a better explanation:

"""
  ReadBucketEmpty (WriteBucketEmpty) is the time in millis that the read
  (write) bucket was empty since the last refill.  LastRefill is the
  time in millis since the last refill.

  If a bucket went negative and if refilling tokens didn't make it go
  positive again, there will be multiple consecutive TB_EMPTY events for
  each refill interval during which the bucket contained zero tokens or
  less.  In such a case, ReadBucketEmpty or WriteBucketEmpty are capped
  at LastRefill in order not to report empty times more than once.
"""

> Here's the tor branch:
>>
>>
>> https://gitweb.torproject.org/karsten/tor.git/shortlog/refs/heads/morestats3
>>
>> Here's a Shadow log file containing all new events:
>>
>> https://people.torproject.org/~karsten/volatile/scallion.log.gz
>>
>> Here's the Java program that I used to parse the Shadow log file:
>>
>> https://people.torproject.org/~karsten/volatile/ParseProposal218Events.java
>>
>>
> Ugh. I really don't look forward to writing parsers for this. I guess there
> may be few projects that actually require this information, and those that
> do can use your code:)

I hope that some of the complexity goes away once we use a parsing
library like Stem.  And I think we should provide parsing code to other
people looking into this.

>> And finally, here's the output, which should be easier to understand now:
>>
>> https://people.torproject.org/~karsten/volatile/scallion.txt
>>
>> Search for "Circuit [fileclient-60.1.0.0]:14" to find the circuit I
>> mentioned earlier in this thread.
>>
>>
>> Can you review the proposal changes and tell me if they make sense to you?
>>
>> Best,
>> Karsten
>>
>>
> See comments above. Overall, it looks good. I'm still wondering about the
> UIDs / IDs issue. It may be that we don't want to include that for other
> reasons, in which case the current implementation is fine.

Okay, in that case let's consider proposal 218 done for the moment,
unless we come up with a better idea to solve circuit-tracking thing.

I asked Nick above to merge my changes, so that I can put proposal 218
on the sponsor F year 3 wiki page as one result of the February 28
milestone.  (That doesn't mean we cannot make it even better after
February.)

Thanks!
Karsten