[tor-scaling] Nov 20 meeting recap: metrics analysis needed

Karsten Loesing karsten at torproject.org
Tue Jan 21 15:54:58 UTC 2020


On 2020-01-14 03:23, Mike Perry wrote:
> On 12/19/19 4:11 AM, Karsten Loesing wrote:
>> Hi!
>>
>> On 2019-12-02 20:37, Mike Perry wrote:
>>> In particular, I would like to look at 8 hour snapshots from 8/5 to
>>> 8/15, broken out by relay flags, of CDF-TTFB and CDF-DL from
>>> https://trac.torproject.org/projects/tor/wiki/org/roadmaps/CoreTor/PerformanceMetrics
>>
>> This is a very interesting experiment, and I'd like to help with the
>> analysis, also in preparation for providing better tools for future
>> analyses like this one.
>>
>> However, I'm having trouble understanding what graphs you have in mind
>> here. I made a very first graph, even though I think it's just the
>> beginning of an interactive process (that should probably not happen on
>> this mailing list but on a ticket).
>>
>> https://people.torproject.org/~karsten/onionperf-cdf-ttfb-2019-12-19.pdf
>>
>> Some thoughts:
>>
>>  - I'm not sure if CDFs are the best way to visualize the data here.
>> CDFs work great for visualizing *one* distribution or for comparing a
>> handful of distributions when plotting them as separate lines. But we
>> have a few dozen distributions here, and we can't plot all these lines
>> into a single coordinate system.
> 
> Well, because this graph set is just one metric, these are all the same
> distribution with the same coordinates. The main problem we appear to be
> having is clipping/scale.
> 
> If we got the axis clipping to be reasonable, we could conceivably just
> overlay everything on fewer graphs. Perhaps all of the graphs from the
> same times of day could be combined as one overlaid pile of CDF lines,
> and we could use colors to represent which CDF lines were for "ON" vs
> "OFF" for the experiment dates. I.e., green CDF lines for Aug 9 - Aug 13
> and red CDF lines for all other dates for this experiment.
> 
>>  - Maybe we could try out different subsets of percentiles in a common
>> line plot to see how the experiment affects TTFB in this case. Something
>> like five-number summary or seven-number summary or whichever handful of
>> percentiles we want to see.
> 
> Worst case, I am fine with having a page of graphs for each metric.
> We're not bound by publication lengths when we do our own analysis.
> However, if we can find a compact representation we like that has all of
> this info, that will be helpful for academia. So it is worthwhile to
> brainstorm.
> 
> If we lose resolution by taking a few percentile ranges, we may miss out
> on some lumpiness that means something, though. So I would like to
> reserve that as a last resort (or perhaps as a way to capture just the
> long tail past 99%).
> 
>>  - I didn't understand your idea to break out data by relay flags. These
>> are requests over three-hop circuits. How would we split these up by
>> relay flags?
> 
> Ah you are right. The flag separation only makes sense for the balancing
> metrics. Sorry for the confusion.
> 
>>  - This graph shows just the data from a single OnionPerf source. If we
>> add multiple sources, it gets even more overloaded. But we can't really
>> mix numbers from different sources, as they have their very own
>> connection characteristics that would skew results.
>>
>> Happy to make more graphs. It might help to see a sketch or longer
>> description of what you expect to see. Thanks!
> 
> Hrm, I am not a data visualization expert, but what is most important
> for us to understand is the nature of the variance of performance,
> including the length of the long tail.
> 
> From your above plots, it looks like the experiment primarily negatively
> impacted the long tail of perf, and maybe even 95-99% perf, but not
> average perf. But I agree, even this much is hard to tell due to the
> scale needed to display the full tail in CDF form. Perhaps this means a
> clip at like 5-10 seconds for all graphs, to keep the X axis the same
> length, and then some additional way to quantify the length and quantity
> of the tail beyond the clip.
> 
> Basically, we want to be able to see if 0-99% CDF slope became wider or
> got additional lumps, and we want to see if the 1% tail got longer or
> shorter (and ideally also check if it has similar membership and data
> points over time in terms of participant relays and time values, for
> bug-hunting analysis).
> 
> We should definitely play around with a few different graphing methods
> though, to compare various ways of capturing this info.

Alright, I made a set of new graphs, taking your comments above into
account:

https://people.torproject.org/~karsten/onionperf-cdf-ttfb-2020-01-21.pdf

I included explanations and thoughts on these graphs in the graph
captions. Not what captions were made for, but it seemed useful to have
these ideas close to the graphs in this case.
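
For anyone who wants to experiment with the clipping idea discussed
above, here is a minimal sketch in Python. It is not the code behind the
graphs; the function names, the 5-second clip, and the sample values are
all made up for illustration. It builds an empirical CDF that stops at
the clip, reports how many measurements fall beyond it, and computes a
handful of nearest-rank percentiles.

```python
from bisect import bisect_right


def ecdf_clipped(samples, clip_seconds):
    """Empirical CDF points (xs, ys) for values <= clip_seconds.

    ys are fractions of *all* samples, so the curve ends below 1.0
    whenever some measurements exceed the clip.
    """
    ordered = sorted(samples)
    n = len(ordered)
    kept = bisect_right(ordered, clip_seconds)
    xs = ordered[:kept]
    ys = [(i + 1) / n for i in range(kept)]
    return xs, ys


def tail_summary(samples, clip_seconds):
    """Count, fraction, and maximum of the values beyond the clip."""
    ordered = sorted(samples)
    tail = ordered[bisect_right(ordered, clip_seconds):]
    return {
        "count": len(tail),
        "fraction": len(tail) / len(ordered),
        "max": tail[-1] if tail else None,
    }


def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


# Hypothetical time-to-first-byte values in seconds (not real data).
ttfb_seconds = [0.4, 0.5, 0.6, 0.7, 0.9, 1.2, 1.5, 2.0, 12.0, 30.0]
CLIP_SECONDS = 5.0  # assumed clip, taken from the 5-10 second suggestion

xs, ys = ecdf_clipped(ttfb_seconds, CLIP_SECONDS)
tail = tail_summary(ttfb_seconds, CLIP_SECONDS)
summary = {p: percentile(ttfb_seconds, p) for p in (25, 50, 75, 90, 99)}

print(list(zip(xs, ys)))
print(tail)
print(summary)
```

Keeping the y values as fractions of all measurements means every
clipped curve shares the same axes, and the gap between the end of a
curve and 1.0 shows directly how much of the distribution sits in the
tail.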

Curious to hear your thoughts!

All the best,
Karsten
