[tor-bugs] #16596 [Metrics/ExoneraTor]: Change database queries towards making only a single query per request

Tor Bug Tracker & Wiki blackhole at torproject.org
Fri Sep 8 20:22:15 UTC 2017


#16596: Change database queries towards making only a single query per request
--------------------------------+-----------------------------
 Reporter:  karsten             |          Owner:  karsten
     Type:  enhancement         |         Status:  merge_ready
 Priority:  Medium              |      Milestone:
Component:  Metrics/ExoneraTor  |        Version:
 Severity:  Normal              |     Resolution:
 Keywords:                      |  Actual Points:
Parent ID:                      |         Points:
 Reviewer:                      |        Sponsor:
--------------------------------+-----------------------------
Changes (by iwakeh):

 * status:  needs_review => merge_ready


Comment:

 Ok, I also took the time to make some changes, some of which I also found
 necessary for the reviewing itself.
 More inline ...

 Replying to [comment:9 karsten]:
 > ...
 > > Hmm, right. My idea was that we'll later copy over this servlet to
 metrics-web where it would make a little more sense to make the host name
 configurable. But on second thought that's not really the case. I'll just
 hard-code it.

 Fine.

 >
 > ...
 > > > In 'step 3' I see some problems with `null` values, for example,
 `"".equals(null)` would evaluate to false (line 147 following).
 > >
 > > Note that we're using `null` for invalid parameter values and `""` for
 empty parameter values. Do you see actual problems or potential problems
 there?
 > >
 > > In the latter case (potential problems), maybe we can resolve them by
 documenting things a bit better, or by making things clearer in subsequent
 commits.

 Maybe, think about a clearer way of documenting.  There is not actual
 problem considering the checks before the call to `writeForm`.

 > > ...
 > > > For most exceptions caught the error message should be logged; and,
 it might be time to switch to slf4j-api and logback, now.
 > >
 > > Yes, we can switch to slf4j, though we should do that in a separate
 commit on top of these, probably in a new ticket.

 Please find a commit on my task branch for that (link below).

 > ...
 >
 > > > QueryServlet: Helper methods ` private String
 parseTimestampParameter` and `private String parseIpParameter` should be
 `public static` and tests should be added.  Similarly, `private String
 convertIpV*ToHex`, which could be combined by simply adding another
 argument, i.e., `public static String convertIpV4ToHex(String relayIp,
 boolean isIpV4)`.
 > >
 > > Agreed, though let's save that for another ticket and not overload
 this ticket with too many improvements all at once. I know it's tempting.
 Stay strong, we'll eventually fix these things.

 Ok, new ticket.

 > >
 > > > A `MILLISEC_IN_DAY = 24L * 60L * 60L * 1000L` constant would be
 useful.
 >
 > Fixed.

 Fine.

 >
 > >
 > > > Other: Using the object `Boolean exit` for a triplet state is a bit
 too much of a hack.  There could be a simple enum, as simple as, for
 example, `public enum Exit { U, Y, N; }`.
 > >
 > > Ugh, that would turn the JSON field into a string, which doesn't seem
 very intuitive for "is an exit". And whoever processes this JSON will need
 to check for `null` anyway, regardless of boolean or string. In fact, I
 think that we're using the boolean field exactly in the way it's supposed
 to be used: `true` means "is an exit", `false` means "is not an exit", and
 `null` means "we don't know". I think I'd like to leave this one
 unchanged. It's not a hack.

 This is what I added to the third commit in my branch.  The U,N,Y
 abbreviations are handy and more readable.
 In addition, setting a value to `null` will make the default Gson omit the
 field from the json output.  That is not that intuitive.

 > >
 > > > Regarding SQL:
 > > > Order-by statement should be using names not numbers.
 > >
 > > Hmm, now that you mention that, I wonder if we even need results to be
 sorted at all. I'll try to take that out. Otherwise I'd change numbers to
 names in a subsequent commit, because it's not directly related to this
 ticket.
 >
 > Fixed.

 Fine, not ordering should also save some query time.

 >
 > > > Couldn't the exit-information be part of the query already?
 > >
 > > Yes, but I'd like to save database changes for a later ticket. There's
 so, so much more we can do to make the database schema more efficient.
 (I'd have to look at my notes from last year, but I believe that we're
 storing exit information directly in the database rather than the entire
 raw status entry.)
 > >
 > > > (I could also look into providing some patches, if we agree on the
 changes and you don't have the time?)
 > >

 The changes I added in
 [https://gitweb.torproject.org/user/iwakeh/exonerator.git/log/?h=task-16596
 this branch] start adding some tests also for the JSON/Gson related code.
 The tests document what JSON format to expect.  I moved all Gson related
 into QueryResponse.  If we ever switch to some other JSON generator/parser
 it will be easier to accomplish.
 The version information is also all moved there.  Exit is an enum now and
 `relevantStatuses` a primitive.  The VERSION constant is the version
 implemnted by QueryServlet and 'version' what got read (default VERSION).
 The test changes are split in two commits: one for the restructuring and
 one for the actual tests.  Please check the tests carefully, as I simply
 made everything pass as it is right now.
 Another commit streamlines all logging as slf4j and logback were already
 part of ExoneraTor.

 The first commit adapted .gitignore and the fifth sets Java 8, which works
 fine.

 Ready for merge hopefully with tests :-)

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/16596#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list