[metrics-bugs] #30216 [Metrics/Library]: Add bandwidth file parser to metrics-lib

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Apr 24 18:47:19 UTC 2019


#30216: Add bandwidth file parser to metrics-lib
-------------------------------------------------+-------------------------
 Reporter:  irl                                  |          Owner:  karsten
     Type:  enhancement                          |         Status:
                                                 |  needs_review
 Priority:  High                                 |      Milestone:
Component:  Metrics/Library                      |        Version:
 Severity:  Normal                               |     Resolution:
 Keywords:  tor-bwauth,tor-dirauth,metrics-      |  Actual Points:
  roadmap-2019-q2                                |
Parent ID:  #21378                               |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------
Changes (by karsten):

 * status:  needs_revision => needs_review


Comment:

 Thanks for the review!

 Replying to [comment:7 irl]:
 > {{{
 >       In version 1.0.0, Header List ends when the first relay bandwidth
 >       is found conforming to the next section.
 > }}}
 >
 > but
 >
 > {{{
 > switch (spaceSeparatedLineParts.length) {
 > }}}
 >
 > is used to determine if you've reached the end of the headers. This
 should instead search the line for "bw=" as future header lines may
 contain spaces.
 >
 > {{{
 >       Additional header Lines MUST NOT use any keywords specified in the
 >       relay measurements format.
 > }}}
 >
 > To be safer, it should be looking for either start of line or space
 preceeding bw=. In PCRE, `(^| )bw=`. This test should also be disabled if
 we've already seen a version declared that expects a terminator.

 Right, I thought about this, too, and you have a point that bandwidth-
 file-spec does not ''explicitly'' state whether future header lines can
 ever contain spaces or not. However,
 [https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n2134 dir-
 spec] ''implicitly'' rules that out.

 The only place where future header lines could contain a space is between
 two or more KeyValue elements, which would basically be two or more header
 lines combined. But why would anybody want to introduce such lines if they
 can as well have two or more separate header lines?

 How about we ask the bandwidth-file-spec authors to clarify whether this
 is planned in the future? If there are no such plans, that is, header
 lines with 1.x versions can never have spaces, and this is stated
 explicitly in the spec, I'd like to keep this simple and efficient parser
 implementation. Otherwise we can add a check like you suggested, which
 certainly makes the parser more complex, but which would address this
 case.

 > When we are parsing country lists, ISO 2-alpha *can* be more than 2
 characters, and we're looking at using that functionality for describing
 CDNs in an extensible way in #30196. Codes will only be longer than 2
 characters if they start "OO".

 Hmm, [https://gitweb.torproject.org/torspec.git/tree/bandwidth-file-
 spec.txt#n125 bandwidth-file-spec] explicitly says that country codes are
 exactly two characters long.

 It's certainly easy to change the regex here, it's just not following the
 spec.

 In fact, I'm not even sure if the authors can change the spec here for 1.x
 versions, as a correct parser that can parse a valid 1.4 version with
 exactly two country code characters wouldn't parse a 1.5 version with more
 characters. This change might require new `CountryCodeExt` and
 `CountryCodeExtList` nonterminals in newly defined header lines.

 > Tests and checks are OK.

 Great! My suggestion would be to merge and release this code as a valid
 1.0 to 1.4 parser, ask spec authors for clarifications/amendments, and
 possibly release any changes in the next metrics-lib version. It would be
 good to unblock further code to archive bandwidth files.

 What do you think?

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/30216#comment:8>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the metrics-bugs mailing list