[tor-bugs] #18910 [Metrics/CollecTor]: distributing descriptors across CollecTor instances

Tor Bug Tracker & Wiki blackhole at torproject.org
Thu Sep 15 09:51:32 UTC 2016


#18910: distributing descriptors across CollecTor instances
-------------------------------+-----------------------------------
 Reporter:  iwakeh             |          Owner:  iwakeh
     Type:  enhancement        |         Status:  needs_information
 Priority:  High               |      Milestone:  CollecTor 1.1.0
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:  ctip               |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+-----------------------------------
Changes (by iwakeh):

 * priority:  Medium => High


Comment:

 The following is a summary of the discussion above and elsewhere, and
 should give an overview of the first sync-version functionality.

 == Functionality and design of descriptor distribution in CollecTor 1.1.0
 === Configuration
 1. General settings
  Add a SyncManager configuration in the Scheduler section of the
 properties file.
  The property `SyncFolder` contains the path for storing the downloaded
 descriptors.
 1. Choice of sync-sources
  Add a configuration property `SyncSources` containing an array of strings
 specifying a source name and source URL for each CollecTor instance to
 retrieve descriptors from. This setup is similar to the current torperf
 configuration.
 1. Choice of descriptors
  Add a configuration property `SyncDescriptorLists`, which will contain
 space-separated entries. Each entry is a comma-separated list consisting
 of a source name defined in `SyncSources` followed by descriptor
 designations.
 1. Backup of replaced local files
  If `KeepReplaceBackup` is set to true, keep a copy of the old local
 descriptors in `BackupFolder`.
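
 To illustrate the properties listed above, here is a minimal sketch of how
 they might be read. The value syntax is an assumption, not a decided
 format: `SyncSources` is taken to be a flat comma-separated list of name,
 URL pairs, and each space-separated entry of `SyncDescriptorLists` starts
 with a source name followed by descriptor designations. The class
 `SyncConfigSketch`, its methods, the source names, the mirror URL and the
 designation names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper for illustration; not part of CollecTor. */
class SyncConfigSketch {

  /** Parses "name1, url1, name2, url2" into an ordered name -> URL map. */
  static Map<String, String> parseSyncSources(String value) {
    String[] parts = value.trim().split("\\s*,\\s*");
    if (parts.length % 2 != 0) {
      throw new IllegalArgumentException("SyncSources needs name, URL pairs.");
    }
    Map<String, String> sources = new LinkedHashMap<>();
    for (int i = 0; i < parts.length; i += 2) {
      sources.put(parts[i], parts[i + 1]);
    }
    return sources;
  }

  /** Parses "src1,typeA,typeB src2,typeC" into source name -> designations. */
  static Map<String, List<String>> parseDescriptorLists(String value) {
    Map<String, List<String>> lists = new LinkedHashMap<>();
    for (String entry : value.trim().split("\\s+")) {
      String[] parts = entry.split(",");
      List<String> designations = new ArrayList<>();
      for (int i = 1; i < parts.length; i++) {
        designations.add(parts[i]);
      }
      lists.put(parts[0], designations);
    }
    return lists;
  }

  /** Builds an example configuration; all values are made up. */
  static Properties exampleProperties() {
    String text = "SyncFolder = my-sync-folder\n"
        + "SyncSources = main, https://collector.torproject.org,"
        + " mirror, https://collector.example.org\n"
        + "SyncDescriptorLists = main,exit-lists,relay-descriptors"
        + " mirror,bridge-descriptors\n"
        + "KeepReplaceBackup = true\n"
        + "BackupFolder = my-backup-folder\n";
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }
}
```

 With this layout, every source name used in `SyncDescriptorLists` can be
 checked against the keys parsed from `SyncSources` before a sync run
 starts.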
 === SyncManager
 The SyncManager module will be started by the Scheduler according to the
 configuration defined above.
 Each SyncManager run will perform the following steps:
 a. Retrieve descriptors from the CollecTor instances defined in
 `SyncSources`.  These descriptors are stored in `SyncFolder` under the
 host part of the instance's URL, e.g.
 {{{my-sync-folder/collector.torproject.org/recent/exit-lists}}} for exit
 lists from the main instance.
 b. Following retrieval, the fetched descriptors are examined:
   i. discard descriptor files that do not contain what they should (see
 comment:11) and log a warning with sync-source info and reason (see
 criteria).
   i. move valid descriptors (see criteria) without a pre-existing local
 copy to the localstore.
   i. if there is a local copy already, decide which copy to keep (see
 criteria).
     I. local copy is kept: log debug message with source and reason and
 delete fetched descriptor.
     I. local and fetched are identical: log debug message with source and
 reason and delete fetched descriptor.
     I. fetched copy should replace local descriptor: if
 `KeepReplaceBackup` is true, first move the local copy to `BackupFolder`;
 then move the fetched copy to main storage, replacing the local copy. In
 all cases log debug message with source and reason.
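
 The steps above can be sketched roughly as follows. This is only an
 illustration under assumptions: the class `SyncStepSketch`, the `Outcome`
 enum and the two small interfaces `Validity` and `Preference` are
 hypothetical stand-ins for the criteria, and logging is left to the
 caller, which can derive the message from the returned outcome.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

/** Hypothetical sketch of one SyncManager step; not actual CollecTor code. */
class SyncStepSketch {

  /** Result of examining one fetched file; the caller logs accordingly. */
  enum Outcome { DISCARDED, ADDED, KEPT_LOCAL, IDENTICAL, REPLACED }

  /** Assumed stand-in: does the file contain what it should? */
  interface Validity { boolean isValid(byte[] fetched); }

  /** Assumed stand-in: should the fetched copy replace the local one? */
  interface Preference { boolean preferFetched(byte[] local, byte[] fetched); }

  /** Step a: a source's folder is SyncFolder plus the host part of its URL. */
  static Path sourceFolder(Path syncFolder, String sourceUrl) {
    return syncFolder.resolve(URI.create(sourceUrl).getHost());
  }

  /** Step b: examine one fetched descriptor file. */
  static Outcome process(Path fetched, Path local, Path backupFolder,
      boolean keepReplaceBackup, Validity validity, Preference preference)
      throws IOException {
    byte[] fetchedBytes = Files.readAllBytes(fetched);
    if (!validity.isValid(fetchedBytes)) {
      Files.delete(fetched);            // caller logs a warning
      return Outcome.DISCARDED;
    }
    if (!Files.exists(local)) {
      Path parent = local.getParent();
      if (parent != null) {
        Files.createDirectories(parent);
      }
      Files.move(fetched, local);       // no local copy yet
      return Outcome.ADDED;
    }
    byte[] localBytes = Files.readAllBytes(local);
    if (Arrays.equals(localBytes, fetchedBytes)) {
      Files.delete(fetched);
      return Outcome.IDENTICAL;
    }
    if (!preference.preferFetched(localBytes, fetchedBytes)) {
      Files.delete(fetched);
      return Outcome.KEPT_LOCAL;
    }
    if (keepReplaceBackup) {
      // Keep the old local copy before it is overwritten.
      Files.createDirectories(backupFolder);
      Files.move(local, backupFolder.resolve(local.getFileName()),
          StandardCopyOption.REPLACE_EXISTING);
    }
    Files.move(fetched, local, StandardCopyOption.REPLACE_EXISTING);
    return Outcome.REPLACED;
  }
}
```

 Returning an outcome instead of logging inside the method keeps the file
 handling separate from the choice of log level, source info and reason
 text.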

 === Replacement criteria
 As the replacement criteria are not fully defined yet and it is very
 likely that there will be more criteria in the future, a modular/pluggable
 approach seems useful, i.e.:
 1. define `KeepCriterium` and `ReplaceCriterium` interfaces
 1. register implementing classes with the SyncManager, which will apply
 these for the selection steps described above.
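
 One possible shape of this pluggable approach is sketched below. Only the
 interface names `KeepCriterium` and `ReplaceCriterium` come from the
 proposal; the method signatures, the class `MoreSignaturesCriterium`, the
 registry `CriteriaRegistrySketch` and the way votes are combined (all keep
 criteria must accept, any replace criterion may prefer the fetched copy)
 are assumptions for illustration. The example criterion counts
 `directory-signature` lines, which is how signatures appear in a
 consensus document.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Decides whether a fetched descriptor is acceptable at all. */
interface KeepCriterium {
  boolean keep(String designation, byte[] fetched);
}

/** Decides whether a fetched descriptor should replace the local copy. */
interface ReplaceCriterium {
  boolean replace(byte[] local, byte[] fetched);
}

/** Example criterion: prefer the consensus carrying more signatures. */
class MoreSignaturesCriterium implements ReplaceCriterium {

  /** Counts lines that start with the "directory-signature" keyword. */
  static int countSignatures(byte[] consensus) {
    int count = 0;
    String text = new String(consensus, StandardCharsets.US_ASCII);
    for (String line : text.split("\n")) {
      if (line.startsWith("directory-signature ")) {
        count++;
      }
    }
    return count;
  }

  @Override
  public boolean replace(byte[] local, byte[] fetched) {
    return countSignatures(fetched) > countSignatures(local);
  }
}

/** Hypothetical registry the SyncManager could consult. */
class CriteriaRegistrySketch {
  private final List<KeepCriterium> keepCriteria = new ArrayList<>();
  private final List<ReplaceCriterium> replaceCriteria = new ArrayList<>();

  void registerKeep(KeepCriterium criterium) {
    keepCriteria.add(criterium);
  }

  void registerReplace(ReplaceCriterium criterium) {
    replaceCriteria.add(criterium);
  }

  /** Assumption: every registered keep criterion must accept the file. */
  boolean isValid(String designation, byte[] fetched) {
    for (KeepCriterium criterium : keepCriteria) {
      if (!criterium.keep(designation, fetched)) {
        return false;
      }
    }
    return true;
  }

  /** Assumption: one criterion preferring the fetched copy is enough. */
  boolean shouldReplace(byte[] local, byte[] fetched) {
    for (ReplaceCriterium criterium : replaceCriteria) {
      if (criterium.replace(local, fetched)) {
        return true;
      }
    }
    return false;
  }
}
```

 New criteria would then only need to implement one of the two interfaces
 and be registered, without changes to the selection steps themselves.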

 == Open Questions
 A. Which `KeepCriterium` and `ReplaceCriterium` classes should be
 implemented initially?
  Currently there are
  1. a `ReplaceCriterium`: keep the consensus with more signatures, and
  1. a `KeepCriterium`: only keep descriptors that contain what they claim
 to be.
  1. Are there more criteria that should be implemented with release 1.1.0?
 A. Should the applied criteria be configurable?  E.g. this could be done
 by listing the classes in collector.properties, but we already have more
 than fifty config settings, which is a lot.
 A. The data combination mentioned in comment:11 part two is not yet
 considered, but the design will allow adding it later.
  Some questions anyway: What kind of data enhancement could there be? What
 about descriptor signatures?

 -----
 Priority set to high in order to resolve the open questions quickly.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18910#comment:13>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

