[tor-bugs] #20548 [Metrics]: Handle bad input more consistently in metrics code bases

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue Nov 15 10:27:49 UTC 2016


#20548: Handle bad input more consistently in metrics code bases
-------------------------+---------------------
 Reporter:  karsten      |          Owner:
     Type:  enhancement  |         Status:  new
 Priority:  Medium       |      Milestone:
Component:  Metrics      |        Version:
 Severity:  Normal       |     Resolution:
 Keywords:               |  Actual Points:
Parent ID:               |         Points:
 Reviewer:               |        Sponsor:
-------------------------+---------------------

Comment (by karsten):

 Replying to [comment:1 iwakeh]:
 > Some thoughts:
 >
 > One step is unifying the parsing process by replacing all parsing code
 > with metrics-lib provided parsing (which is already under way for
 > CollecTor).

 Agreed.

 > This addresses goal number one in the description above.

 Hmm, I'm not sure which goals you refer to.  What I described above were
 different use cases, not goals.  Nevertheless, unifying the parsing
 process seems worthwhile.

 > Goal number two (of the bullet point list in the description above) is
 > fine, too, as descriptors are separate data units and failure of parsing
 > one should not influence parsing and storing of subsequent descriptors
 > only because these happened to be stored in the same file temporarily.

 Agreed.
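
 To illustrate the point about descriptors being separate data units, here
 is a minimal sketch in Python (not CollecTor or metrics-lib code).  The
 names `parse_all` and `toy_parse` are hypothetical, and the toy parser only
 stands in for a real descriptor parser.

```python
def parse_all(raw_descriptors, parse_one):
    """Parse each descriptor on its own; collect failures instead of aborting."""
    parsed, failed = [], []
    for raw in raw_descriptors:
        try:
            parsed.append(parse_one(raw))
        except ValueError as e:
            # A bad descriptor is recorded, and the loop moves on to the next one.
            failed.append((raw, str(e)))
    return parsed, failed


def toy_parse(raw):
    # Hypothetical stand-in for a real parser: accepts "router <nickname>" only.
    if not raw.startswith(b"router "):
        raise ValueError("missing router line")
    return raw.split()[1].decode()


parsed, failed = parse_all([b"router a", b"junk", b"router b"], toy_parse)
```

 Here the descriptor in the middle fails, but the ones before and after it
 are still parsed and can be stored.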

 > Regarding the second list: privacy and client expectation, i.e. topics
 > 3. and 4., are the most important.
 >
 > One way to combine storing-of-all-that-is-seen with privacy and client
 > expectation, would be to store invalid descriptors separately.  [...]

 Hmmmm.  Those are two big disadvantages there. :)

 How about we do the following instead:
  - If we attempt to parse a relay descriptor in CollecTor (use cases 1 and
 2) and cannot determine its descriptor type, publication time, or digest,
 we append the raw bytes to a new local file per execution, say,
 `bad/2016-11-15-10-23-55`, and log a warning.  The operator can then look
 at that file, possibly reconfigure or fix the parsing code, and move the
 file back into some `in/` subdirectory to have it parsed again.
  - If we attempt to parse a bridge descriptor in CollecTor (use case 3)
 and encounter anything that prevents us from sanitizing it, we log a
 warning that includes the tarball file name.  The operator can look at the
 tarball, get the parsing code fixed or extended, and remove the tarball's
 line from the parse history file, so that the tarball will be parsed again
 on the next run.
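
 The two steps above could be sketched roughly as follows.  This is an
 illustration in Python, not CollecTor code: the function names
 (`bad_file_path`, `quarantine`, `forget_tarball`) are hypothetical, and the
 parse history file is assumed to hold one tarball file name per line, which
 may not match the real format.

```python
import logging
import os
from datetime import datetime

log = logging.getLogger("collector-sketch")  # hypothetical logger name


def bad_file_path(bad_dir, started):
    """Build the per-execution file name, e.g. bad/2016-11-15-10-23-55."""
    return os.path.join(bad_dir, started.strftime("%Y-%m-%d-%H-%M-%S"))


def quarantine(raw, bad_dir, started):
    """Append raw bytes of an unparseable relay descriptor to this run's file."""
    os.makedirs(bad_dir, exist_ok=True)
    path = bad_file_path(bad_dir, started)
    with open(path, "ab") as f:  # append mode: one file per execution
        f.write(raw)
    log.warning("Cannot determine type, publication time, or digest; "
                "appended %d bytes to %s", len(raw), path)
    return path


def forget_tarball(history_path, tarball_name):
    """Remove a tarball's line from the parse history so it is parsed again.

    Assumes one tarball file name per line.  Returns the number of lines
    removed.
    """
    with open(history_path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    kept = [line for line in lines if line != tarball_name]
    with open(history_path, "w", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in kept)
    return len(lines) - len(kept)
```

 An operator-facing run would call `quarantine()` once per bad relay
 descriptor with the same start time, so that all bad input of one execution
 ends up in a single file, and `forget_tarball()` after the bridge parsing
 code has been fixed.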

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/20548#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

