[metrics-bugs] #25103 [Metrics/Library]: Improve webstats performance

Tor Bug Tracker & Wiki blackhole at torproject.org
Fri Feb 2 18:02:12 UTC 2018

#25103: Improve webstats performance
 Reporter:  iwakeh           |          Owner:  metrics-team
     Type:  enhancement      |         Status:  needs_review
 Priority:  Medium           |      Milestone:
Component:  Metrics/Library  |        Version:
 Severity:  Normal           |     Resolution:
 Keywords:                   |  Actual Points:
Parent ID:  #25100           |         Points:
 Reviewer:                   |        Sponsor:
Changes (by iwakeh):

 * status:  merge_ready => needs_review


 Please review [https://gitweb.torproject.org/user/iwakeh/metrics-
 this commit] (based on the branch used above).

 This optimization of the log lines' memory footprint will also benefit
 other API users when processing log descriptors.  'WebServerAccessLogLine'
 collects the distinct values of a field and stores only references to
 them.  This is applied to fields that cannot become an enum type, e.g.,
 there are usually only 'HTTP/1.0' and 'HTTP/1.1' as protocols, but we
 chose to also accept others as valid.
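
 For illustration only, the idea of collecting distinct values and handing
 out shared references can be sketched as below.  This is not the code from
 the commit; the class name `ValueCache` and its methods are hypothetical,
 and the actual 'WebServerAccessLogLine' implementation may differ.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of a value cache: keeps one canonical instance per
 * distinct value so that many log lines can share the same object.
 */
final class ValueCache<T> {

  // Maps each distinct value to its canonical (first seen) instance.
  private final Map<T, T> canonical = new ConcurrentHashMap<>();

  /**
   * Returns the canonical instance equal to the given value.
   * The value must not be null, because ConcurrentHashMap rejects null keys.
   */
  T intern(T value) {
    T previous = canonical.putIfAbsent(value, value);
    return previous == null ? value : previous;
  }

  /** Number of distinct values seen so far. */
  int size() {
    return canonical.size();
  }
}
```

 With such a cache, two log lines parsed with protocol 'HTTP/1.1' would
 hold the same String reference instead of two equal copies, and unusual
 protocol values would still be accepted, which an enum would not allow.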

 I still have tests running that import CollecTor webstats logs.  An example
 of the gain these changes offer:  A heap dump of approx. 7.5G containing
 `73*10^6`  'WebServerAccessLogLine's only contains about
 `51*10^3`  String instances and fewer than `10^4` LocalDate instances.  As
 the heap is from importing logs with CollecTor, most of the LocalDate
 instances are parts of paths, and the dateList contains just the
 327 dates of the days available for import.

 I'll post more regarding CollecTor's webstat module on #25100.

 There is also [https://gitweb.torproject.org/user/iwakeh/metrics-
 another commit] providing hashCode and equals implementations for future
 use, but they are not used currently.

Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25103#comment:7>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
