[tor-scaling] Inprocess Cell Time Tracing

David Goulet dgoulet at torproject.org
Wed Aug 7 18:51:03 UTC 2019


Greetings!

In order to start tracking down
https://trac.torproject.org/projects/tor/ticket/31340, I've started doing some
measurements within the tor process itself.

The idea is to try to find "latency patterns" by measuring the transit time
of a cell from wire to wire or, in more technical tor terms, from the inbuf to
the kernel (flushed from the outbuf). (Everything is in-process; we are not
touching the network here.)

A while back, while pastly/rob were working on the KIST cell scheduler, we
worked on a tracing branch in tor that did exactly this job. I basically
re-wrote it to fit our needs and hooked up the LTTng userspace tracer
(lttng.org) to it. Tracing cell timings on a 25MB/sec Guard node requires a
very, very high performance tracer, the least intrusive one we can get.
Fortunately, LTTng is quite good there ;).

I'll skip the technical details here, but one thing to know: this is not
trivial to do. My hope is that we can upstream the tor changes so they can be
maintained.

On to the good stuff. I ended up with 5 tracepoints, each recording the cell
command and circuit ID, and the LTTng tracer gives us a nanosecond-precision
timestamp for each event (a small parsing sketch follows the list):

    - inbuf
      When a cell is put from the kernel into the inbuf (inbound cell).

    - queue
      When the cell is put in a circuit queue from the inbuf.

    - recognized
      When the cell is recognized by tor, meaning the cell stops here.

    - outbuf
      When the cell is put from the circuit queue into the outbuf. At this
      stage, if the cell comes from the inbuf, we are relaying that cell.

    - flush
      When a cell is taken from the outbuf to the kernel (wire).
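
For reference, here is a rough sketch of the kind of parsing pass that turns
such a trace into per-cell latencies. It uses the babeltrace (1.x) Python
bindings; the tracepoint names ("tor:cell_inbuf", "tor:cell_flush"), the
field names and the trace path are made up for illustration, they are not
necessarily the names used in my branch:

    # Sketch only: pair each inbuf event with a later flush event on the
    # same circuit (naive FIFO pairing) to get a wire-to-wire latency.
    from collections import defaultdict, deque
    import babeltrace

    col = babeltrace.TraceCollection()
    col.add_trace('/path/to/lttng-trace', 'ctf')   # hypothetical path

    pending = defaultdict(deque)   # circ_id -> inbuf timestamps (ns)
    latencies_ms = []              # one entry per relayed cell

    for event in col.events:
        if event.name == 'tor:cell_inbuf':
            pending[event['circ_id']].append(event.timestamp)
        elif event.name == 'tor:cell_flush':
            q = pending[event['circ_id']]
            if q:
                # Latency from inbuf to flush, converted ns -> msec.
                latencies_ms.append((event.timestamp - q.popleft()) / 1e6)

    print("cells:", len(latencies_ms))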

I ran this for approximately 120 seconds on a fast Guard relay (25MB/sec),
which gave me ~3M data points (cells) and an LTTng trace of ~200MB. I did this
twice: once with tor using the KIST scheduler and once with the Vanilla
scheduler.

For now, in this email, I'll concentrate on the timings of *RELAY* cells from
inbuf to kernel. The reason is that I wanted to try to find patterns for cells
we relay continuously.

We'll dive right into the graphs. For all of them, the X axis is the elapsed
time in seconds, so 0 is the start of the data collection, and the Y axis is
the in-process time from inbuf to kernel in msec. All the stats you'll see are
in msec.
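
(The plotting itself is nothing fancy. A minimal sketch, assuming a list of
(timestamp_ns, latency_ms) pairs like the one built by the parsing pass above:)

    import matplotlib.pyplot as plt

    t0 = samples[0][0]                             # first event = time 0
    xs = [(ts - t0) / 1e9 for ts, _ in samples]    # elapsed seconds
    ys = [lat for _, lat in samples]               # inbuf-to-kernel msec

    plt.scatter(xs, ys, s=1)
    plt.yscale('log')
    plt.xlabel('elapsed time (s)')
    plt.ylabel('inbuf to kernel (msec, log scale)')
    plt.savefig('kist-log.png')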

- log10() scale of the timings for KIST scheduler.
  https://people.torproject.org/~dgoulet/volatile/scaling/inbuf_to_kernel/kist-log.png

  Kept events: 2968379 [100.00%]
  Min: 0.14 - 1st Q.: 5.18 - Median: 7.93 - Mean: 67.60 - 3rd Q.: 10.51 - Max: 113045.6

- log10() scale of the timings for Vanilla scheduler.
  https://people.torproject.org/~dgoulet/volatile/scaling/inbuf_to_kernel/vanilla-log.png

  Kept events: 3168174 [100.00%]
  Min: 0.02 - 1st Q.: 0.13 - Median: 0.23 - Mean: 63.68 - 3rd Q.: 0.47 - Max: 113985.23

Obvious clusters can be seen but, overall, there is one strong cluster on the
fast end. The stats seem to suggest the two schedulers aren't far from each
other in terms of timings.

Now, some more worrying things.

I wanted to see how the busiest circuit of the data collection behaves, so I
isolated it and graphed its data points:

- log10() scale with KIST scheduler
  https://people.torproject.org/~dgoulet/volatile/scaling/inbuf_to_kernel/kist-log-busy-circ.png

  Total number of cells: 200703
  Min: 0.47 - 1st Q.: 5.43 - Median: 7.81 - Mean: 8.09 - 3rd Q.: 10.14 - Max: 379.62

- log10() scale with Vanilla scheduler
  https://people.torproject.org/~dgoulet/volatile/scaling/inbuf_to_kernel/vanilla-log-busy-circ.png

  Total number of cells: 235068
  Min: 0.03 - 1st Q.: 0.15 - Median: 0.22 - Mean: 0.80 - 3rd Q.: 0.36 - Max: 246.48

So this is way more interesting to me because we can clearly see a smooth
pattern with KIST and ~3 clusters with Vanilla. Furthermore, on Vanilla, you
see a spike in delays roughly every 45 seconds.

Finally, for now, I've isolated the 1st quartile of circuits based on the
number of cells seen per circuit, in other words the very quiet circuits (the
cutoff was ~4 RELAY cells seen). I've also taken the 3rd quartile, which means
the busiest circuits.
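
(How the split is computed, roughly: count cells per circuit, then use the 1st
and 3rd quartile of those counts as cutoffs. A minimal sketch, assuming a list
of (circ_id, latency_ms) pairs:)

    from collections import Counter
    import numpy as np

    # Number of RELAY cells seen on each circuit during the collection.
    counts = Counter(circ for circ, _ in cells)
    q1, q3 = np.percentile(list(counts.values()), [25, 75])

    quiet = [lat for circ, lat in cells if counts[circ] <= q1]   # 1st Q
    busy  = [lat for circ, lat in cells if counts[circ] >= q3]   # 3rd Q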

I've put them together in a single graph. Orange is 1st Q and Grey is 3rd Q:

- log10() with KIST scheduler
  https://people.torproject.org/~dgoulet/volatile/scaling/inbuf_to_kernel/kist-log-q1-q3.png

  1st Q:
    Total number of cells: 3078
    Min: 0.63 - 1st Q.: 49.48 - Median: 329.58 - Mean: 4174.12 - 3rd Q.: 2445.33 - Max: 106639.76
  3rd Q:
    Total number of cells: 2950371
    Min: 0.14 - 1st Q.: 5.18 - Median: 7.91 - Mean: 55.14 - 3rd Q.: 10.48 - Max: 101803.27

- log10() with Vanilla scheduler
  https://people.torproject.org/~dgoulet/volatile/scaling/inbuf_to_kernel/vanilla-log-q1-q3.png

  1st Q:
    Total number of cells: 3876
    Min: 0.03 - 1st Q.: 45.39 - Median: 246.91 - Mean: 3497.09 - 3rd Q.: 1436.36 - Max: 113985.23
  3rd Q:
    Total number of cells: 3148495
    Min: 0.02 - 1st Q.: 0.13 - Median: 0.23 - Mean: 50.65 - 3rd Q.: 0.46 - Max: 105282.47

Notice the _much_ higher latency on quiet circuits (1st Q). This means that,
for some reason, cells on quiet circuits stay in-process much longer than
cells on busy ones, which is VERY surprising to me because, in theory, EWMA
should show the exact opposite.

So yeah, would love to hear anyone's opinion on all this.

One of the next things I want to learn is _where_ all that time is spent:
inbuf->queue, queue->outbuf, or outbuf->flush. My instinct tells me
outbuf->flush, especially with KIST, considering TCP congestion.
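
The plan is simply to compute the delta between consecutive tracepoints for
each relayed cell instead of only the end-to-end time. A sketch, assuming the
per-cell grouping of events is already done by the parsing pass:

    def stage_deltas_ms(ts):
        """ts: dict mapping tracepoint name -> timestamp (ns) for one cell."""
        return {
            'inbuf->queue':  (ts['queue']  - ts['inbuf'])  / 1e6,
            'queue->outbuf': (ts['outbuf'] - ts['queue'])  / 1e6,
            'outbuf->flush': (ts['flush']  - ts['outbuf']) / 1e6,
        }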

Food for thought here! Please let me know if you would like to see specific
things; I now have a set of scripts to analyze the tracing data and plot it.

(Raw data: https://people.torproject.org/~dgoulet/volatile/scaling/inbuf_to_kernel/data/)

(If you want the LTTng raw traces, they are ~200MB each compressed, but you
can use the Python bindings to parse/analyze them, which is what I do. It's
not very complicated; they have superb docs. Just let me know.)

(The tor branch that produces those traces is heavily in development and not
exactly trivial... but I'm very happy to share it if need be!)

Cheers!
David
