[tor-bugs] #18910 [Metrics/CollecTor]: distributing descriptors across CollecTor instances

Tor Bug Tracker & Wiki blackhole at torproject.org
Sun Jun 26 11:28:15 UTC 2016


#18910: distributing descriptors across CollecTor instances
-------------------------------+-----------------------------------
 Reporter:  iwakeh             |          Owner:  iwakeh
     Type:  enhancement        |         Status:  needs_information
 Priority:  Medium             |      Milestone:
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:  ctip               |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+-----------------------------------

Comment (by iwakeh):

 Some thoughts:

 === The CollecTor side
 Maybe CollecTor (or the Metrics Team) needs a data collection and handling
 policy?
 (Or is there already something like that which I haven't found yet, other
 than the license and, of course, the Tor-wide privacy goals?)

 In general, CollecTor shouldn't try to improve the data it receives by
 dropping unwanted things, at least not without some defined process.
 Collected data should only be changed when there is a reason for
 obfuscation or when it is enhanced (e.g., by adding the @source tag).

 === Handling of //unwanted// data
 Incomplete unreferenced server descriptors could be stored differently:
 * referenced server descriptors can be stored the way they are now, and
 * unreferenced ones can be kept, but stored separately.

 The sync process could then concentrate on the referenced descriptors
 first.
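
 The split described above could be sketched roughly as follows. This is
 only an illustration, not CollecTor code: the function names are
 hypothetical, descriptors are assumed to be held in a dict keyed by digest,
 and "referenced" is assumed to mean that the digest appears in some set of
 digests gathered elsewhere (e.g. from consensuses or votes).

```python
def split_by_reference(descriptors, referenced_digests):
    """Partition descriptors into (referenced, unreferenced) dicts.

    descriptors: dict mapping digest -> raw descriptor text (assumed layout)
    referenced_digests: set of digests known to be referenced (assumption)
    """
    referenced, unreferenced = {}, {}
    for digest, raw in descriptors.items():
        target = referenced if digest in referenced_digests else unreferenced
        target[digest] = raw
    return referenced, unreferenced


def sync_order(descriptors, referenced_digests):
    """Return digests in the order a sync run could process them:
    referenced descriptors first, unreferenced ones afterwards."""
    referenced, unreferenced = split_by_reference(descriptors, referenced_digests)
    return sorted(referenced) + sorted(unreferenced)
```

 Nothing is dropped here; the unreferenced descriptors are only moved to a
 separate bucket and handled later.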

 === Regarding the repeated uploads:
 What is the reason for all these server descriptors gabelmoo received?
 Is there some benign explanation for the uploads?

 There are two routers that together uploaded more than 5000
 [https://collector.torproject.org/recent/relay-descriptors/server-descriptors/2016-06-23-11-05-13-server-descriptors server-descriptors]
 in less than an hour:

 {{{
 router ThePuppetMasterIN 94.23.181.19 9001 0 9030
 router ThePuppetMasterMID 94.23.181.18 9001 0 9030

 grep -c "PuppetMasterIN" /tmp/2016-06-23-11-05-13-server-descriptors
 1800
 grep -c "PuppetMasterMID" /tmp/2016-06-23-11-05-13-server-descriptors
 3596
 }}}

 These two routers shouldn't upload descriptors again and again.
 According to
 [https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n336 dir-spec],
 the descriptors do not differ in any relevant field.
 Isn't this a problem that should be tackled on the Tor side?


 Maybe we should actually search the old data for more upload frenzies
 like the one that triggered this discussion?
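
 Such a search could start from something like the sketch below, which
 counts "router" lines per nickname and address in a descriptors file and
 reports those above a threshold. The function names and the threshold are
 hypothetical; the only thing taken from the data is that each server
 descriptor contains a line of the form "router <nickname> <address> ...",
 as in the two lines quoted above.

```python
from collections import Counter


def count_uploads(lines):
    """Count descriptors per (nickname, address) by looking at 'router' lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Only lines that start with the 'router' keyword identify a descriptor;
        # lines such as 'router-signature' are skipped.
        if line.startswith("router ") and len(fields) >= 3:
            counts[(fields[1], fields[2])] += 1
    return counts


def upload_frenzies(counts, threshold):
    """Return ((nickname, address), count) pairs above threshold, largest first."""
    hits = [(key, n) for key, n in counts.items() if n > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

 Unlike grep -c, this counts only the "router" lines themselves, so a
 nickname that also shows up elsewhere in a descriptor is not counted
 twice. It could be run over each archived server-descriptors file by
 passing the open file object as lines.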

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18910#comment:7>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

