[tor-bugs] #20521 [Metrics/metrics-lib]: Deprecate `DescriptorReader.setExcludeFiles()` and add two separate methods for loading and saving a history file

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue Nov 1 14:37:20 UTC 2016


#20521: Deprecate `DescriptorReader.setExcludeFiles()` and add two separate methods
for loading and saving a history file
-------------------------------------+-------------------------------
     Reporter:  karsten              |      Owner:  karsten
         Type:  enhancement          |     Status:  new
     Priority:  Medium               |  Milestone:  metrics-lib 1.6.0
    Component:  Metrics/metrics-lib  |    Version:
     Severity:  Normal               |   Keywords:
Actual Points:                       |  Parent ID:
       Points:                       |   Reviewer:
      Sponsor:                       |
-------------------------------------+-------------------------------
 The history file implementation in `DescriptorReader` has a design bug for
 much too long that I'd finally want to fix.

 The issue is that `DescriptorReader` writes the history file passed in
 `setExcludeFiles()` immediately after reading and parsing the last
 descriptor and putting it into the queue, regardless of whether the
 application has finished processing those descriptors.

 If the application fails after the history file is written, it may not be
 able to process descriptors in the next execution that have still been in
 the queue at the time of failing.

 One way to reduce that gap would be to have `BlockingIteratorImpl` tell
 `DescriptorReader` when the application has first learned that `hasNext()`
 returned `false` or that `next()` threw a `NoSuchElementException`.  The
 benefit of that solution would be that no interface change would be
 required.  The downside would be that the application might not have fully
 cleaned up at the time of learning that there are no further descriptors
 available.  In one example, the application would close a database import
 file, perform the database import, and only write the history file after
 learning that the database import has succeeded.

 A better solution would be to decouple saving the history file to disk
 from completing the descriptor read operation.  We would replace the
 `setExcludeFiles()` method by a `setHistoryFile()` and a
 `saveHistoryFile()` method.  Applications would use `setHistoryFile()`
 before starting to read descriptors, process all descriptors, perform any
 cleaning up, and then call `saveHistoryFile()`.

 Here's an example of the code that this would save in applications that
 currently work around this known design bug by loading and saving history
 files themselves:

 {{{
 diff --git
 a/modules/hidserv/src/org/torproject/metrics/hidserv/Parser.java
 b/modules/hidserv/src/org/torproject/metrics/hidserv/Parser.java
 index 2ef404e..215b0c9 100644
 --- a/modules/hidserv/src/org/torproject/metrics/hidserv/Parser.java
 +++ b/modules/hidserv/src/org/torproject/metrics/hidserv/Parser.java
 @@ -96,33 +96,7 @@ public class Parser {
    public void readParseHistory() {
      if (this.parseHistoryFile.exists()
          && this.parseHistoryFile.isFile()) {
 -      SortedMap<String, Long> excludedFiles =
 -          new TreeMap<String, Long>();
 -      try {
 -        BufferedReader br = new BufferedReader(new FileReader(
 -            this.parseHistoryFile));
 -        String line;
 -        while ((line = br.readLine()) != null) {
 -          try {
 -            /* Each line is supposed to contain the last-modified time
 and
 -             * absolute path of a descriptor file. */
 -            String[] parts = line.split(" ", 2);
 -            excludedFiles.put(parts[1], Long.parseLong(parts[0]));
 -          } catch (NumberFormatException e) {
 -            System.err.printf("Illegal line '%s' in parse history.  "
 -                + "Skipping line.%n", line);
 -          }
 -        }
 -        br.close();
 -      } catch (IOException e) {
 -        System.err.printf("Could not read history file '%s'.  Not "
 -            + "excluding descriptors in this execution.",
 -            this.parseHistoryFile.getAbsolutePath());
 -      }
 -
 -      /* Tell the descriptor reader to exclude the files contained in the
 -       * parse history file. */
 -      this.descriptorReader.setExcludedFiles(excludedFiles);
 +      this.descriptorReader.setHistoryFile(this.parseHistoryFile);
      }
    }

 @@ -130,33 +104,7 @@ public class Parser {
     * and absolute paths to the parse history file to avoid parsing these
     * files again, unless they change until the next execution. */
    public void writeParseHistory() {
 -
 -    /* Obtain the list of descriptor files that were either parsed now or
 -     * that were skipped in this execution from the descriptor reader. */
 -    SortedMap<String, Long> excludedAndParsedFiles =
 -        new TreeMap<String, Long>();
 -    excludedAndParsedFiles.putAll(
 -        this.descriptorReader.getExcludedFiles());
 -
 excludedAndParsedFiles.putAll(this.descriptorReader.getParsedFiles());
 -    try {
 -      this.parseHistoryFile.getParentFile().mkdirs();
 -      BufferedWriter bw = new BufferedWriter(new FileWriter(
 -          this.parseHistoryFile));
 -      for (Map.Entry<String, Long> e
 -          : excludedAndParsedFiles.entrySet()) {
 -        /* Each line starts with the last-modified time of the descriptor
 -         * file, followed by its absolute path. */
 -        String absolutePath = e.getKey();
 -        long lastModifiedMillis = e.getValue();
 -        bw.write(String.valueOf(lastModifiedMillis) + " " + absolutePath
 -            + "\n");
 -      }
 -      bw.close();
 -    } catch (IOException e) {
 -      System.err.printf("Could not write history file '%s'.  Not "
 -          + "excluding descriptors in next execution.",
 -          this.parseHistoryFile.getAbsolutePath());
 -    }
 +    this.descriptorReader.saveHistoryFile(this.parseHistoryFile);
    }

    /** Set of all reported hidden-service statistics.
 }}}

 I'll push a branch to my repository in a moment.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/20521>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list