[tor-bugs] #12676 [Metrics Data Processor]: Bridge descriptors in CollecTor's recent/ directory contain many duplicates

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue Jul 22 07:53:49 UTC 2014


#12676: Bridge descriptors in CollecTor's recent/ directory contain many duplicates
------------------------------------+---------------------
 Reporter:  karsten                 |          Owner:
     Type:  defect                  |         Status:  new
 Priority:  minor                   |      Milestone:
Component:  Metrics Data Processor  |        Version:
 Keywords:                          |  Actual Points:
Parent ID:                          |         Points:
------------------------------------+---------------------
 The `recent/` directory should only contain new descriptors, and ideally
 no duplicates.  I just found that the latter is not the case:

 {{{
 $ grep -c "@type" recent/bridge-descriptors/server-descriptors/2014-07-22-07-04-02-server-descriptors
 18175
 $ grep -c "@type" recent/bridge-descriptors/extra-infos/2014-07-22-07-04-02-extra-infos
 9723
 }}}

 Compare this to relay descriptors:

 {{{
 $ grep -c "@type" recent/relay-descriptors/server-descriptors/2014-07-22-07-05-52-server-descriptors
 931
 $ grep -c "@type" recent/relay-descriptors/extra-infos/2014-07-22-07-05-52-extra-infos
 930
 $ grep -c "@type" recent/relay-descriptors/microdescs/micro/2014-07-22-07-05-52-micro
 30
 }}}

 The reason is that only novel relay descriptors are downloaded and
 stored to disk, whereas the bridge descriptor tarballs we parse are full
 snapshots of Tonga's cached descriptor files.  We need to add a check for
 whether we already have a given sanitized bridge descriptor and store it
 only if we do not.
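
 The check described above could be sketched roughly as follows.  This is
 only an illustration, not CollecTor's actual code: the function names, the
 in-memory `seen_digests` set, and the choice of SHA-1 over the sanitized
 descriptor text are all assumptions made for the example.  A real
 implementation would need to persist the set of known digests between
 runs.

```python
import hashlib


def split_descriptors(text):
    """Split a concatenated descriptor file into single descriptors.

    Assumes every descriptor starts with an "@type" annotation line,
    which is what the grep commands in this ticket count.
    """
    descriptors, current = [], []
    for line in text.splitlines(keepends=True):
        if line.startswith("@type") and current:
            descriptors.append("".join(current))
            current = []
        current.append(line)
    if current:
        descriptors.append("".join(current))
    return descriptors


def store_if_new(descriptor, seen_digests, store):
    """Call store(descriptor) only if it has not been seen before.

    seen_digests is a hypothetical set of hex digests of descriptors
    that were already written.  Returns True if the descriptor was stored.
    """
    digest = hashlib.sha1(descriptor.encode("utf-8")).hexdigest()
    if digest in seen_digests:
        return False
    seen_digests.add(digest)
    store(descriptor)
    return True


# Illustrative input: the same two-descriptor snapshot processed twice.
snapshot = (
    "@type bridge-server-descriptor 1.0\nrouter A\n"
    "@type bridge-server-descriptor 1.0\nrouter B\n"
)
seen, written = set(), []
for _ in range(2):
    for desc in split_descriptors(snapshot):
        store_if_new(desc, seen, written.append)
```

 With a check like this, processing the same snapshot a second time writes
 nothing new, so `recent/` would only receive descriptors that have not
 been stored before.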

 Priority is minor because the only effect is some additional load on
 clients that parse the same descriptors more than once.  Other than that,
 it is mostly harmless.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/12676>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

