[tor-bugs] #28547 [Core Tor/sbws]: Monitor relays that are not measured by each sbws instance

Fri Nov 30 01:29:01 UTC 2018

#28547: Monitor relays that are not measured by each sbws instance
-------------------------------------------------+-------------------------
 Reporter:  teor                                 |          Owner:  (none)
     Type:  defect                               |         Status:  new
 Priority:  Medium                               |      Milestone:  sbws:
                                                 |  1.1.x-final
Component:  Core Tor/sbws                        |        Version:
 Severity:  Normal                               |     Resolution:
 Keywords:  tor-bwauth, sbws-1.0-must-           |  Actual Points:
  moved-20181128                                 |
Parent ID:  #25925                               |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------

Comment (by teor):

 Replying to [comment:10 juga]:
 > Replying to [comment:5 teor]:
 > > Replying to [comment:4 juga]:
 > > > The KeyValues that depend on the scanner (not the generator) would
 only be written once a day, so we would need to look at the bandwidth
 files generated at 00:35 UTC. Any potential problem with that?
 > >
 > > Why can't we report those values every hour, for the last hour?
 >
 > because if we want to report them when running generate, generate reads
 the files that the scanner produces, which are created only once a day.

 How does generate create a new bandwidth file every hour, if the scanner
 only dumps results once a day?
 Is there a document that explains this design?

 > Two solutions to this:
 > 1. change the callback that dumps the result files to do it every hour.
 >    - pro: easy change
 >    - con: the new KeyValues we want to add must first be written to the
 results files (which means creating new error types for them), then read
 back by generate

 - pro: generate can produce accurate results, even if the scanner crashes
 or is restarted during the day

 > 2. generate could be another thread that runs every hour, instead of a
 separate process, so that it can access the results without needing
 to read them back from the results files.
 >    - pro: eliminates the need to run an external command, and to write
 the results files first and then read them again
 >    - con: bigger change

 - con: if sbws restarts, some results for the day are lost
 - con: is this a breaking change? Does the command-line interface to
 generate change?

 > I'm a bit more inclined towards 2, because that would ease further
 refactorings for
 > - not having all bandwidth values triplicated in v3bwfile, relaylist and
 resultdump. I can explain more about this

 Why is this a problem?

 > - not having to create new ResultError classes to monitor the relays

 sbws should make it easy to add new keys. If it's not easy, we should
 redesign the code so it is easier.

 Here's what I think:

 * sbws needs to persist the results every hour, so that it can read them
 after a restart or crash. Otherwise, we lose a day of data every time sbws
 restarts.
 * as long as sbws can resume from the last hour's results, the
 implementation doesn't matter. Do the easy, simple thing.
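
 To make the requirement above concrete, here is a minimal sketch of
 hourly persistence with resume-on-restart. It is not sbws code: the
 names HourlyPersister, dump_results and load_results, the JSON file
 format, and the injectable clock are all hypothetical, chosen only for
 illustration. The write goes to a temporary file that is then renamed
 over the old one, so a crash in the middle of a dump should leave the
 previous hour's file intact.

```python
import json
import os
import tempfile
import time

HOUR = 60 * 60  # seconds


def dump_results(results, path):
    """Atomically write the in-memory results to `path` as JSON."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename it
    # over the target, so readers never see a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(results, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_results(path):
    """Return the last persisted results, or an empty dict on first run."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


class HourlyPersister:
    """Hypothetical helper: keeps results in memory, dumps them hourly."""

    def __init__(self, path, interval=HOUR, clock=time.time):
        self.path = path
        self.interval = interval
        self.clock = clock
        # Resume from whatever was persisted before a restart or crash.
        self.results = load_results(path)
        self.last_dump = clock()

    def record(self, fingerprint, result):
        """Store one measurement result and dump if an hour has passed."""
        self.results.setdefault(fingerprint, []).append(result)
        self.maybe_dump()

    def maybe_dump(self):
        """Dump to disk if at least `interval` seconds have elapsed."""
        now = self.clock()
        if now - self.last_dump >= self.interval:
            dump_results(self.results, self.path)
            self.last_dump = now
            return True
        return False
```

 With something like this, a restarted process would lose at most the
 results recorded since the last dump (up to one hour), and generate
 could keep reading the same file from a separate process, so its
 command-line interface would not need to change.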

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/28547#comment:11>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online