[tor-bugs] #25196 [Metrics/Statistics]: Cut off recent dates from several CSV files

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Mar 7 14:47:45 UTC 2018


#25196: Cut off recent dates from several CSV files
--------------------------------+------------------------------
 Reporter:  karsten             |          Owner:  karsten
     Type:  defect              |         Status:  needs_review
 Priority:  Medium              |      Milestone:
Component:  Metrics/Statistics  |        Version:
 Severity:  Normal              |     Resolution:
 Keywords:                      |  Actual Points:
Parent ID:                      |         Points:
 Reviewer:  iwakeh              |        Sponsor:
--------------------------------+------------------------------

Comment (by iwakeh):

 Regarding webstats our
 [https://metrics.torproject.org/web-server-logs.html#n-discarding-non-matching-lines spec]
 states in section 4.1:

     In addition, log lines are treated differently according to the date
     they contain:

     During an import process the sanitizer takes all log line dates into
     account and determines the reference interval as stretching from the
     oldest date to the youngest date encountered. Depending on the
     reference interval log lines are not yet processed, if their date is
     on the edges of the reference interval, i.e., the date is not at least
     a day younger than the older endpoint or the date is only LIMIT days
     older than the younger endpoint, where LIMIT is initially set to two,
     but this might change if necessary.
     If the younger endpoint of the reference interval coincides with the
     current system date, the day before is used as the new younger
     reference interval endpoint, which ensures that the sanitizer won't
     publish logs prematurely, i.e., before there is a chance that they are
     complete. Thus, processing of log lines carrying such date is
     postponed.
     All log lines with dates for which the sanitizer already published a
     log file are discarded in order to avoid altering published logs.

 This means that logs are published, at the earliest, for dates two days
 before today, i.e., two days before the current system date.
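
 The quoted rule can be sketched roughly as follows. This is only an
 illustration of one reading of the spec text, not the sanitizer's actual
 code: the function name `classify_dates`, its parameters, and the result
 keys are hypothetical. In particular it assumes that "only LIMIT days
 older than the younger endpoint" means "fewer than LIMIT days older";
 under a different reading of that boundary the cutoff moves by a day.

```python
from datetime import date, timedelta

LIMIT = 2  # days; the spec notes this value might change


def classify_dates(dates, today, published=frozenset(), limit=LIMIT):
    """Split log line dates into 'process', 'postpone' and 'discard' sets.

    Hypothetical helper: one possible reading of the quoted spec section.
    """
    result = {"process": set(), "postpone": set(), "discard": set()}
    dates = set(dates)
    if not dates:
        return result

    # Reference interval: oldest to youngest date encountered.
    older, younger = min(dates), max(dates)
    # If the younger endpoint is the current date, use the day before.
    if younger == today:
        younger = today - timedelta(days=1)

    for d in dates:
        if d in published:
            # Never alter logs that were already published.
            result["discard"].add(d)
        elif d < older + timedelta(days=1):
            # Not at least a day younger than the older endpoint.
            result["postpone"].add(d)
        elif d > younger - timedelta(days=limit):
            # Assumed reading: fewer than `limit` days older than the
            # younger endpoint.
            result["postpone"].add(d)
        else:
            result["process"].add(d)
    return result
```

 For example, given log line dates 2018-03-01 through 2018-03-06, a system
 date of 2018-03-07, and nothing published yet, this reading processes
 2018-03-02 through 2018-03-04 and postpones the remaining dates.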

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25196#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

