<div dir="ltr"><div>Hi all, <br></div><div><br></div><div>I've been doing some data analysis this week, using both the torperf dataset and a more recent, higher-resolution dataset from Arthur Edelstein. Tomorrow I plan to email tor-scaling a writeup with links to the scripts and compressed datasets so everyone can explore. However, I wanted to share these graphs early, as I think they answer the 2015 question at least. <br></div><div><br></div><div><b>Graph 1: Latency heat map by measurement server. (One dot is one measurement; exit-node circuits only.)</b><br></div><div><div><img src="cid:ii_jxpaqsx50" alt="measurements_big_three.png" style="margin-right: 0px;" width="570" height="570"></div><div><b>Graph 2: As before, zoomed in to 12 months around Jan 2015. </b><br></div><div><div><img src="cid:ii_jxparu0e1" alt="2015.png" width="570" height="570"></div><div>So I think siv's ISP change might have had more impact than previously thought, as neither of the other two measurement servers shows any real delta. <br></div><div>DDoS attacks also show up pretty clearly on Graph 1, and there's some strange discrete banding in the early days. <br></div><div><br></div><div>More to follow tomorrow! <br></div><div><br></div><div>Best,</div><div>Dennis <br></div></div> </div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jul 4, 2019 at 7:08 AM George Kadianakis <<a href="mailto:desnacked@riseup.net">desnacked@riseup.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Mike Perry <<a href="mailto:mikeperry@torproject.org" target="_blank">mikeperry@torproject.org</a>> writes:<br>

<br>

> At Mozilla All Hands, we hoped to find a correlation between the amount<br>

> of load on the Tor network and its historical performance.<br>

><br>

> Unfortunately, while there did appear to be periods of time where this<br>

> correlation held, we discovered a major historical discontinuity in this<br>

> correlation. We have some guesses that we need to investigate:<br>

> <a href="https://lists.torproject.org/pipermail/tor-scaling/2019-July/000053.html" rel="noreferrer" target="_blank">https://lists.torproject.org/pipermail/tor-scaling/2019-July/000053.html</a><br>

><br>

<br>

You mean the "start of 2015" artifact, right? It would be nice to see<br>

some more zoomed-in graphs. For example, did the change happen over a single<br>

day? Is the R code for these graphs somewhere online?<br>

<br>

I'd like to add "changes to bw auth code, nodes or bandwidth weights" as<br>

another possible guess. For example, I think that's when maatuska got shut down,<br>

according to this graph: <a href="https://metrics.torproject.org/totalcw.html?start=2014-12-20&end=2015-03-10" rel="noreferrer" target="_blank">https://metrics.torproject.org/totalcw.html?start=2014-12-20&end=2015-03-10</a><br>

<br>

I also tried to check the onion service traffic for those days, and I<br>

noticed that we introduced those graphs at almost exactly that time. Could<br>

there have been some change in the metrics infrastructure around then?<br>

<a href="https://metrics.torproject.org/hidserv-rend-relayed-cells.html?start=2014-04-05&end=2015-07-04" rel="noreferrer" target="_blank">https://metrics.torproject.org/hidserv-rend-relayed-cells.html?start=2014-04-05&end=2015-07-04</a><br>

<br>

> So, how can we tell what factors actually really contribute to the<br>

> performance of the Tor network? Let's use statistics.<br>

><br>

> Let's start off by calling Tor performance our dependent variable.<br>

><br>

<br>

By "Tor performance" here you mean "latency" and "throughput", which does<br>

not take "reliability" into account. As a separate investigation, I think<br>

it would be interesting to see how the "independent variables" below<br>

impact timeout and failure graphs like this one:<br>

<a href="https://metrics.torproject.org/torperf-failures.html?start=2012-04-05&end=2019-07-04&server=public&filesize=50kb" rel="noreferrer" target="_blank">https://metrics.torproject.org/torperf-failures.html?start=2012-04-05&end=2019-07-04&server=public&filesize=50kb</a><br>

<br>

> Based on the brainstorming at Mozilla, and in the meeting on Friday, we<br>

> have a few candidate independent variables that influence performance:<br>

>   1. Total Utilization<br>

>   2. Bottleneck Utilization (Exit or Guard, whichever is scarce)<br>

>   3. Total Capacity<br>

>   4. Exit Capacity<br>

>   5. Load Balancing<br>

><br>

<br>

I think capacity and utilization based metrics are a big part of the<br>

equation here, but they assume that Tor is a perfect byte-pushing<br>

network of pipes. Seeing how these pipes get chosen (load balancing/path<br>

selection) and how well they get used (scheduler and other<br>

implementation details like bugs) also seems important.<br>

<br>

The first four variables here seem well defined, but what is "Load<br>

balancing"? How do we define this in a way that is robust and rankable?<br>

<br>

Perhaps one way could be to play with the utilization concept again but<br>

go per-relay this time, and see how well utilized individual relays are<br>

over time. How do utilization levels differ between slow and fast<br>

relays? What about different relay types?<br>
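
<br>

A minimal sketch of that per-relay comparison, assuming each relay is<br>

described by a hypothetical (name, used bandwidth, capacity) tuple and that<br>

"slow" and "fast" are split at the median capacity. The tuple layout, the<br>

function names and the sample numbers are illustrative only and do not<br>

come from any Tor Metrics format:<br>

<pre>
```python
from statistics import mean, median


def utilization(used_bw, capacity_bw):
    """Fraction of a relay's capacity in use (0.0 when capacity is 0)."""
    return used_bw / capacity_bw if capacity_bw else 0.0


def utilization_by_speed_class(relays):
    """Split relays at the median capacity and average utilization per class.

    `relays` is a list of (name, used_bw, capacity_bw) tuples. This layout
    is an assumption made for illustration.
    """
    cutoff = median(capacity for _, _, capacity in relays)
    classes = {"slow": [], "fast": []}
    for _, used, capacity in relays:
        label = "fast" if capacity >= cutoff else "slow"
        classes[label].append(utilization(used, capacity))
    # Skip a class that ended up empty so mean() never sees an empty list.
    return {label: mean(vals) for label, vals in classes.items() if vals}


# Synthetic sample data, not real measurements.
sample = [("a", 25, 100), ("b", 75, 100), ("c", 250, 500), ("d", 500, 500)]
print(utilization_by_speed_class(sample))  # {'slow': 0.5, 'fast': 0.75}
```
</pre>

Running this once per day (or per consensus) and plotting the two averages<br>

over time would be one possible way to turn "load balancing" into a<br>

rankable number; the same grouping could be done by relay type instead.<br>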

<br>

---<br>

<br>

Interesting stuff all around! We indeed have tons of data from our<br>

network over more than a decade. We should learn to put more of it<br>

to good use.<br>


</blockquote></div>