[tor-bugs] #22141 [Metrics/metrics-lib]: Deprecate `DescriptorFile` and add relevant information to `Descriptor`

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue Jun 13 07:56:24 UTC 2017


#22141: Deprecate `DescriptorFile` and add relevant information to `Descriptor`
---------------------------------+-----------------------------------
 Reporter:  karsten              |          Owner:  karsten
     Type:  enhancement          |         Status:  assigned
 Priority:  Medium               |      Milestone:  metrics-lib 1.9.0
Component:  Metrics/metrics-lib  |        Version:
 Severity:  Normal               |     Resolution:
 Keywords:                       |  Actual Points:
Parent ID:                       |         Points:
 Reviewer:                       |        Sponsor:
---------------------------------+-----------------------------------

Comment (by karsten):

 I'm afraid I found two flaws in the design above and the suggestion on
 #22196:

  1. We're using the `Iterable<Descriptor>` and `InvalidDescriptor` as one
 possible returned element as channel for all kinds of exceptions thrown
 while reading descriptor files.  But we don't have such a channel in
 `DescriptorCollector#collectDescriptors` which returns, well, `void`.
 That means that the design above doesn't solve anything of
 [https://trac.torproject.org/projects/tor/ticket/16225#comment:7 this
 issue raised on the related ticket #16225].  And ideally we'd handle
 exceptions in the same way in all descriptor sources.
  1. The idea of replacing the parse history with a `minLastModified`
 timestamp doesn't handle I/O problems very well.  For example, in the
 current implementation, if there's a network problem with downloading a
 potentially large file from CollecTor, we would not include that in the
 parse history and retry collecting and reading it next time.  But with
 only a timestamp we would simply skip that descriptor file, which is
 pretty bad.

 New plan:
  - We rename `InvalidDescriptor` to `UnparseableDescriptor` and let it
 return the `DescriptorParseException` that made it unparseable in our
 view.  This instance ''might'' be useful to the application, at least by
 containing the raw descriptor `byte[]` or descriptor `File` reference.
 And if the application produced the input descriptor itself, like
 CollecTor, knowing about it being unparseable is more important than for a
 consumer, like Onionoo!
  - We leave the parse history in place and postpone the idea of stateless,
 overloaded methods.  Sad.
  - We add a method `removeFromHistoryFile(File)` that can be called by the
 application if it wants to reprocess a descriptor file containing an
 `UnparseableDescriptor`.  This would not be necessary for I/O exceptions,
 because those descriptor files would not go into the parse history and
 retried in the next execution anyway.
  - We log any exceptions on `warn` level that we caught while collecting
 or reading or parsing descriptors.  The application can't ''do'' anything
 to handle these issues anyway, except for telling the operator that
 something went wrong.  `DescriptorParseException`s thrown while parsing
 invalid/unparseable descriptors are exempt from this and will only be
 logged on info level.
  - We ''could'' introduce an `ExceptionListener` for exceptions are than
 `DescriptorParseException`, or the "ParseExceptionsLog" that you mentioned
 (even though I'm not entirely certain what that would be).  I don't think
 it's necessary though, because we wouldn't it use that ourselves, and we
 don't know whether it would be used by anyone.

 Before I write more code (or longer comments!), what do you think? :)

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/22141#comment:13>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list