Hi Karsten,
we still believe that the statistics are useful. However, we also agree with Rob that, as more and more relays report data, the scatter plot becomes confusing. I think some kind of aggregation would be helpful.
In one of our previous papers we assessed the performance impact of simultaneous TCP connections in overlay networks [1]. There we found that the resulting performance degradation is hard(er) to solve when connections are used bidirectionally -- which is the motivation for these statistics. From the current results I would presume that bidirectional use is the case in Tor most of the time.
The statistics could also become helpful in future developments.
Cheers, Florian.
[1] D. Marks, F. Tschorsch, and B. Scheuermann, "Unleashing Tor, BitTorrent & Co.: How to relieve TCP deficiencies in overlays," in 35th IEEE Conference on Local Computer Networks (LCN'10), 2010, pp. 320–323.
On 16/12/13 05:34, Rob Jansen wrote:
Hey Karsten,
I think the statistics could be useful, though I don't currently utilize them. I think the current presentation is somewhat confusing. Perhaps we can try to brainstorm some alternative ways to present the data if the decision is that we should keep it around.
Best, Rob
On Dec 9, 2013, at 12:43 PM, Karsten Loesing wrote:
Björn, Florian,
a few years back (in 2010, to be precise) we added statistics to little-t-tor reporting what fraction of connections is used uni-/bidirectionally. Quoting dir-spec.txt:
"conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL [At most once]
Number of connections, split into 10-second intervals, that are used uni-directionally or bi-directionally as observed in the NSEC seconds (usually 86400 seconds) before YYYY-MM-DD HH:MM:SS. Every 10 seconds, we determine for every connection whether we read and wrote less than a threshold of 20 KiB (BELOW), read at least 10 times more than we wrote (READ), wrote at least 10 times more than we read (WRITE), or read and wrote more than the threshold, but not 10 times more in either direction (BOTH). After classifying a connection, read and write counters are reset for the next 10-second interval.
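For illustration, the classification rule quoted above could be sketched roughly as follows. This is a minimal sketch based only on the spec text, not the little-t-tor code. The function names and the sample line are hypothetical, and it assumes that the 20 KiB threshold applies to the sum of bytes read and written, which the spec wording leaves open.

```python
# Sketch of the per-interval classification described in the quoted spec text.
# All names here are hypothetical and chosen for illustration only.

THRESHOLD_BYTES = 20 * 1024  # 20 KiB threshold from the spec text
FACTOR = 10                  # "10 times more" from the spec text


def classify_interval(bytes_read, bytes_written):
    """Classify one connection for one 10-second interval."""
    # Assumption: the threshold is compared against read + written combined.
    if bytes_read + bytes_written < THRESHOLD_BYTES:
        return "BELOW"
    if bytes_read >= FACTOR * bytes_written:
        return "READ"
    if bytes_written >= FACTOR * bytes_read:
        return "WRITE"
    return "BOTH"


def parse_conn_bi_direct(line):
    """Parse a "conn-bi-direct" line into its fields.

    Expected shape, per the quoted spec:
    conn-bi-direct YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH
    """
    parts = line.split()
    if len(parts) != 6 or parts[0] != "conn-bi-direct":
        raise ValueError("not a conn-bi-direct line: %r" % line)
    # parts[3] looks like "(86400" and parts[4] like "s)"
    nsec = int(parts[3].lstrip("("))
    below, read, write, both = (int(x) for x in parts[5].split(","))
    return {
        "end": parts[1] + " " + parts[2],
        "nsec": nsec,
        "below": below,
        "read": read,
        "write": write,
        "both": both,
    }


if __name__ == "__main__":
    # Made-up sample values, not taken from any real descriptor.
    sample = "conn-bi-direct 2013-12-09 12:43:00 (86400 s) 10,20,30,50"
    print(parse_conn_bi_direct(sample))
    print(classify_interval(50000, 40000))
```

Once parsed, the four counters could be aggregated per day across relays instead of plotting one point per relay.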
These statistics are disabled by default, but when they are enabled, relays publish them in their extra-info descriptors. And quite a few relays do that. Here's a (bad) visualization (that used to be slightly less bad when fewer relays published these statistics):
https://metrics.torproject.org/performance.html#connbidirect
Here's the question: Is there still value in having these statistics? I recall that they were useful in 2010, but will that still be the case in 2013?
If the answer is "yes", never mind.
If the answer is "no", I'd create a ticket and submit a patch to remove code parts from little-t-tor, and I'd remove the not-really-useful graph from the metrics website.
Cc'ing Rob, Aaron, and Roger as the people who typically have an interest in these kinds of statistics. If other tor-dev@ people have an opinion on this, please raise your voice!
All the best, Karsten