[metrics-bugs] #32265 [Metrics/Exit Scanner]: MS: Format an exit list from a previous exit list and exitmap output

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Nov 20 13:09:42 UTC 2019


#32265: MS: Format an exit list from a previous exit list and exitmap output
----------------------------------+--------------------------------
 Reporter:  irl                   |          Owner:  irl
     Type:  task                  |         Status:  needs_revision
 Priority:  Medium                |      Milestone:
Component:  Metrics/Exit Scanner  |        Version:
 Severity:  Normal                |     Resolution:
 Keywords:                        |  Actual Points:
Parent ID:  #29654                |         Points:
 Reviewer:  karsten               |        Sponsor:
----------------------------------+--------------------------------

Comment (by irl):

 Replying to [comment:5 karsten]:
 > Glad to see that the rewrite is progressing so quickly!
 >
 > Couple remarks/questions:
 >  - Why 48 hours and not 24 hours? Doesn't the current exit scanner keep
 scan results for 24 hours? I might be wrong, though. Let's use whatever
 the current scanner does.

 https://2019.www.torproject.org/tordnsel/exitlist-spec.txt

 Per the spec above, the current scanner discards relays that have not
 appeared in a consensus in the last 48 hours.
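
 A minimal sketch of that 48-hour rule, assuming we keep a mapping from
 relay fingerprint to the time the relay was last seen in a consensus. The
 names `prune_stale` and `MAX_AGE` are made up for illustration and are
 not taken from the scanner code:

```python
from datetime import datetime, timedelta, timezone

# Assumption: 48 hours, as described for the current scanner above.
MAX_AGE = timedelta(hours=48)


def prune_stale(last_seen, now):
    """Return only the relays seen in a consensus within MAX_AGE of `now`.

    `last_seen` maps a relay fingerprint to a timezone-aware datetime.
    """
    return {fp: seen for fp, seen in last_seen.items() if now - seen <= MAX_AGE}


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    relays = {
        "RELAY_A": now - timedelta(hours=3),   # hypothetical fingerprint
        "RELAY_B": now - timedelta(hours=50),  # hypothetical fingerprint
    }
    print(sorted(prune_stale(relays, now)))
```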

 >  - Rather than downloading exit lists from CollecTor, wouldn't it be
 sufficient to just read the latest exit list previously written by this
 scanner? And if there's none, just assume that no previous scans have
 happened. In theory, this should be all we need to learn.

 Probably, but this was a handy way to get test data, and I wanted to try
 out the new Stem functionality. It would be nice to have a method to
 bootstrap a new scanner, but this could just mean manually downloading the
 latest exit list and putting it in the right place.
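
 Reading the previous list locally could look roughly like the sketch
 below. It assumes the `ExitNode` / `Published` / `LastStatus` /
 `ExitAddress` line format from the exit list spec and treats a missing
 file as "no previous scans". The function names and the returned
 dictionary layout are hypothetical:

```python
from pathlib import Path


def parse_exit_list(text):
    """Parse exit list text into {fingerprint: entry} dictionaries.

    Lines before the first ExitNode line (for example header lines) and
    unknown keywords are ignored.
    """
    entries = {}
    current = None
    for line in text.splitlines():
        keyword, _, rest = line.partition(" ")
        rest = rest.strip()
        if keyword == "ExitNode":
            current = {"Published": None, "LastStatus": None, "ExitAddress": []}
            entries[rest] = current
        elif current is None:
            continue
        elif keyword in ("Published", "LastStatus"):
            current[keyword] = rest
        elif keyword == "ExitAddress":
            # "ExitAddress <address> <date> <time>"
            address, _, when = rest.partition(" ")
            current["ExitAddress"].append((address, when))
    return entries


def load_previous(path):
    """Return the parsed previous exit list, or {} if there is none yet."""
    p = Path(path)
    if not p.is_file():
        return {}
    return parse_exit_list(p.read_text())
```

 Bootstrapping a new scanner would then just mean placing a downloaded
 exit list at whatever path is passed to `load_previous`.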

 >  - It seems that `LastStatus` is only taken from exit lists downloaded
 from CollecTor but never set by new measurements. We should make a plan
 what to do with this field. Take it out? Populate it with consensus valid-
 after times?

 Right, this is the tricky bit. Do you know if anything consumes the
 `LastStatus` or `Published` timestamps? Ideally we could just drop these,
 but for now I'm synthesizing them from the timestamp of the last
 measurement, which could be close enough for the consumers.
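
 As a rough illustration of that synthesis (not the actual patch), the
 sketch below writes one exit list entry and fills both `Published` and
 `LastStatus` with the time of the most recent measurement. The function
 name and the `(address, datetime)` input shape are assumptions:

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def format_entry(fingerprint, measurements):
    """Format one exit list entry from (address, datetime) measurements.

    Published and LastStatus are both synthesized from the latest
    measurement time, since no new value is measured for them.
    """
    latest = max(when for _, when in measurements)
    stamp = latest.strftime(TIME_FORMAT)
    lines = [
        f"ExitNode {fingerprint}",
        f"Published {stamp}",
        f"LastStatus {stamp}",
    ]
    # List the addresses oldest first.
    for address, when in sorted(measurements, key=lambda m: m[1]):
        lines.append(f"ExitAddress {address} {when.strftime(TIME_FORMAT)}")
    return "\n".join(lines) + "\n"
```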

 >  - Does exitmap with the plugin use previous scans as input to decide
 which relays to scan? I believe that it uses some logic to avoid scanning
 relays too frequently. This has two effects: it doesn't generate more load
 on the network and on single relays than necessary, and it ensures that
 new relays are scanned sooner. As a result, the new scanner could be run
 once or twice per hour, rather than every 2 or 3 hours (at 45 minutes
 runtime).

 No. It scans the entire network every time. It does this asynchronously
 and doesn't try to prioritize anything: whichever circuits are built
 first are tested first. I was even thinking it could run continuously.
 If exit relays cannot cope with two HTTP requests an hour, perhaps they
 shouldn't be exit relays.
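
 To illustrate the "first built, first tested" behaviour only: this is not
 exitmap's code, just a generic asyncio sketch in which `probe` stands in
 for building a circuit and making a request, and the delays are invented:

```python
import asyncio


async def probe(relay, build_delay):
    """Stand-in for building a circuit through `relay` and testing it."""
    await asyncio.sleep(build_delay)
    return relay


async def scan_all(relays):
    """Start every probe at once and collect results in completion order.

    `relays` maps a relay name to a simulated circuit build delay in
    seconds. Nothing is prioritized; faster circuits simply finish first.
    """
    tasks = [asyncio.create_task(probe(r, d)) for r, d in relays.items()]
    order = []
    for finished in asyncio.as_completed(tasks):
        order.append(await finished)
    return order


if __name__ == "__main__":
    print(asyncio.run(scan_all({"slow": 0.2, "fast": 0.01, "medium": 0.1})))
```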

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/32265#comment:6>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

