Hi,
This is actually a question for the BWauth ops.
Even though the new onionoo "measured" flag data is not yet completely rolled out (it has to be deployed for more than a week before the flag is assigned to all relays), I have a question, since the data says that all relays in the previously detected AWS family [1] are unmeasured.
I'm wondering whether this is a pure coincidence, on purpose (by the BWauths), or due to some other reason yet to be found?
thanks!
[1] https://lists.torproject.org/pipermail/tor-talk/2015-July/038623.html
The "spikes" in the following table brought me to that group again.
The table shows unmeasured running relays by the month they joined the network (note that this is based on preliminary onionoo data).
The "spikes" are caused by relays mentioned in [1].
+-----------+---------+
| joinmonth | #relays |
+-----------+---------+
| 2015-07   |       3 |
| 2015-06   |       4 |
| 2015-05   |       2 |
| 2015-04   |      12 | <<< AWS family part II
| 2015-03   |       4 |
| 2015-02   |      35 | <<< AWS family part I
| 2015-01   |       2 |
| 2014-12   |       2 |
| 2014-11   |       8 |
| 2014-10   |       1 |
| 2014-09   |       3 |
| 2014-08   |       3 |
| 2014-07   |       2 |
| 2014-06   |       2 |
| 2014-05   |       1 |
| 2014-04   |       4 |
| 2014-03   |       1 |
| 2014-01   |       2 |
| 2013-11   |       1 |
| 2013-02   |       1 |
| 2012-04   |       1 |
| 2012-02   |       1 |
+-----------+---------+
onionoo data from 2015-08-18 18:00:00 (thecthulhu instance)
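A minimal sketch of how a table like this could be derived. It assumes relay entries shaped like Onionoo details documents, with the `running`, `measured` and `first_seen` fields; using `first_seen` as the "join month" is an assumption, and the sample entries are made up for illustration, not real relays.

```python
from collections import Counter

def unmeasured_by_join_month(relays):
    """Count running, unmeasured relays per YYYY-MM of their first_seen date."""
    counts = Counter()
    for relay in relays:
        if not relay.get("running"):
            continue
        # Only count relays explicitly reported as unmeasured; a missing
        # "measured" field is treated as unknown and skipped.
        if relay.get("measured") is not False:
            continue
        # first_seen is assumed to look like "2015-02-03 10:00:00"
        counts[relay["first_seen"][:7]] += 1
    # Newest month first, as in the table above
    return dict(sorted(counts.items(), reverse=True))

# Made-up sample entries (not real relay data)
sample = [
    {"running": True, "measured": False, "first_seen": "2015-02-03 10:00:00"},
    {"running": True, "measured": False, "first_seen": "2015-02-17 08:00:00"},
    {"running": True, "measured": True, "first_seen": "2015-02-20 08:00:00"},
    {"running": False, "measured": False, "first_seen": "2015-04-01 00:00:00"},
    {"running": True, "measured": False, "first_seen": "2015-04-09 12:00:00"},
    {"running": True, "first_seen": "2015-04-10 12:00:00"},
]

for month, n in unmeasured_by_join_month(sample).items():
    print(month, n)
```

While the flag is still being rolled out, skipping relays without a `measured` field avoids counting "not yet known" as "unmeasured".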
I'm wondering whether this is a pure coincidence, on purpose (by the BWauths), or due to some other reason yet to be found?
I guess I found the answer by looking at another spike (2014-11-01).
Since these relays came back all together (they are a group, after all) and probably had no measurement within the last 3 days, they are unmeasured. That would make sense.
I'm wondering why the second group has no exit flag: https://atlas.torproject.org/#search/TorExitNode
Do exits have to have a measurement and/or certain uptime besides the 2 out of 3 ports rule (80,443,6667) to earn the exit flag?
We're not punishing on purpose, that's for sure. DirAuths may decline to vote for a relay in order to exclude it from the consensus, or may vote to give it BadExit, but BWAuths have no such mechanism, and I guess you'll just have to take the word of the individual operators that they won't do something as evil as trying to muck with the measurements manually.
The code here: https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority has a new scanner, scanner 9, for measuring unmeasured relays specifically. It takes at least 2-3 days for a scanner to loop around and measure an entire partition.
This problem _may_ have been with me. Since we're operating so few bwauths, one abstaining causes an issue. I'm not positive but I think mine may have been malfunctioning and not running scanners 1 and 9 correctly. (9 for the past 5 days, 1 for considerably longer.) I don't know why, nor am I certain it was occurring. I kicked things just now, so hopefully in 2 days everything is entirely back to normal and it doesn't occur again...?
Independent of this failure, one theory we have about the bwauths needs to be proven out. We believe that as relays move between scanner partitions (which are partitioned by speed), they may go one or two cycles without measurement. (Example: relay Alice falls into the range covered by scanner1. When scanner1 completes and is ready to start over, Alice falls into the range covered by scanner2 and not scanner1, so scanner1 ignores it. When scanner2 completes at a different time than scanner1, Alice falls into the range covered by scanner1, and therefore scanner2 ignores it. Repeat for as long as you feel unlucky.)
There are likely other failure cases.
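The partition-hopping theory above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how torflow actually schedules its scanners: each scanner is assumed to decide which relays to measure once, at the start of a pass, and the pass start times and the relay's drift between partitions are invented for the example.

```python
def measurements(pass_starts, partition_at):
    """Return the (time, scanner) pairs at which the relay gets picked up.

    pass_starts:  {scanner_name: [start times of its passes]}
    partition_at: function mapping a time to the name of the scanner
                  whose speed range contains the relay at that time
    """
    picked = []
    for scanner, starts in pass_starts.items():
        for t in starts:
            # Assumption: a scanner only looks at relays that are in its
            # range at the moment the pass begins.
            if partition_at(t) == scanner:
                picked.append((t, scanner))
    return sorted(picked)

# Hypothetical schedule in hours; the two scanners finish at different times.
pass_starts = {
    "scanner1": [0, 48, 96, 144],
    "scanner2": [24, 72, 120, 168],
}

def unlucky(t):
    # Alice sits in scanner2's range whenever scanner1 starts a pass,
    # and in scanner1's range whenever scanner2 starts one.
    return "scanner2" if (t // 24) % 2 == 0 else "scanner1"

def stable(t):
    # A relay that never leaves scanner1's range.
    return "scanner1"

print(measurements(pass_starts, unlucky))
print(measurements(pass_starts, stable))
```

In this model the unlucky relay is never picked up over the whole week, while the stable one is measured on every scanner1 pass.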
Something that could be done is taking the individual bwauth votes (mine are at https://bwauth.ritter.vg/bwauth/ ) and doing some analysis on them. Unfortunately, the files don't contain the ID of the scanner that processed them (and I'm much too scared to edit the format in case it breaks parsing somewhere else...). But one could still look at the time between measurements for every individual relay, and get the median and standard deviation. Look at outliers and see if they are multiples of the median. One could also look at the median time it takes for an unmeasured relay to become measured (this would require consensus data). If this technique works* and we amend the output format to include the scanner ID, it could even become an alert script that says "Hey, looks like scanner4 is stuck!" or something.
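A rough sketch of the gap analysis described above. It assumes the measurement times for one relay have already been extracted from the vote files as plain numbers (hours here); the parsing step, the function name `gap_report` and the `tolerance` parameter are all hypothetical, and the sample times are invented.

```python
from statistics import median, pstdev

def gap_report(timestamps, tolerance=0.25):
    """Summarise the gaps between consecutive measurements of one relay.

    A gap is reported as an outlier when it is close to an integer
    multiple (2x, 3x, ...) of the median gap, which would be consistent
    with one or more skipped scanner cycles.
    """
    times = sorted(set(timestamps))
    gaps = [b - a for a, b in zip(times, times[1:])]
    if len(gaps) < 2:
        return None  # not enough data to say anything
    med = median(gaps)
    outliers = []
    for gap in gaps:
        ratio = gap / med
        nearest = round(ratio)
        if nearest >= 2 and abs(ratio - nearest) <= tolerance:
            outliers.append((gap, nearest))
    return {"median": med, "stdev": pstdev(gaps), "outliers": outliers}

# Invented measurement times in hours: one cycle was skipped between 96 and 192.
report = gap_report([0, 48, 96, 192, 240, 288])
print(report["median"], report["outliers"])
```

With these invented times the median gap is 48 hours and the single 96-hour gap shows up as a 2x outlier. Real data would be noisier, so the tolerance would need tuning.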
* Big if. Something that would definitely throw off measurements is when a bwauth is restarted, like I just restarted mine. But mine ran pretty solid for
-tom