[tor-bugs] #18910 [Metrics/CollecTor]: distributing descriptors across CollecTor instances
Tor Bug Tracker & Wiki
blackhole at torproject.org
Thu Sep 15 09:51:32 UTC 2016
#18910: distributing descriptors across CollecTor instances
-------------------------------+-----------------------------------
Reporter: iwakeh | Owner: iwakeh
Type: enhancement | Status: needs_information
Priority: High | Milestone: CollecTor 1.1.0
Component: Metrics/CollecTor | Version:
Severity: Normal | Resolution:
Keywords: ctip | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------+-----------------------------------
Changes (by iwakeh):
* priority: Medium => High
Comment:
The following is a summary of the discussion above and elsewhere, and
should give an overview of the first sync-version functionality.
== Functionality and design of descriptor distribution in CollecTor 1.1.0
=== Configuration
1. General settings
Add a SyncManager configuration in the Scheduler section of the
properties file.
Property `SyncFolder` contains the path for storing the downloaded
descriptors.
1. Choice of sync-sources
Add a configuration property `SyncSources` containing an array of strings
specifying a source name and source URL for each CollecTor instance to
retrieve descriptors from. This setup is similar to the current torperf
configuration.
1. Choice of descriptors
Add a configuration property `SyncDescriptorLists`, which will contain
space-separated lists; each list is comma-separated and consists of a
source name defined in `SyncSources` followed by descriptor designations.
1. Backup of replaced local files
If `KeepReplaceBackup` is set to true, keep a copy of the old local
descriptors in `BackupFolder`.
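To make the intended shape of these settings more concrete, here is a minimal Java sketch that reads them from a properties fragment. The property names are the ones proposed above. Everything else is an assumption for illustration only: the class name `SyncConfigSketch`, the folder names, the second source `mirror` with its example.org URL, and in particular the value syntax of `SyncSources` (space-separated `name,url` entries), which has not been decided in this ticket.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

class SyncConfigSketch {

  /** Hypothetical fragment; all values are made up for illustration. */
  static final String EXAMPLE = String.join("\n",
      "SyncFolder = my-sync-folder",
      "SyncSources = main,https://collector.torproject.org"
          + " mirror,https://collector.example.org",
      "SyncDescriptorLists = main,exit-lists,relay-descriptors mirror,exit-lists",
      "KeepReplaceBackup = true",
      "BackupFolder = my-backup-folder");

  /** Parses the given text with the standard java.util.Properties reader. */
  static Properties load(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  /** Maps source name to URL; assumes space-separated "name,url" entries. */
  static Map<String, String> syncSources(Properties props) {
    Map<String, String> sources = new LinkedHashMap<>();
    String value = props.getProperty("SyncSources", "").trim();
    for (String entry : value.split("\\s+")) {
      if (entry.isEmpty()) {
        continue;
      }
      String[] parts = entry.split(",", 2);
      if (parts.length != 2) {
        throw new IllegalArgumentException("Expected name,url but got: " + entry);
      }
      sources.put(parts[0], parts[1]);
    }
    return sources;
  }

  /** Maps source name to the descriptor designations requested from it. */
  static Map<String, List<String>> descriptorLists(Properties props) {
    Map<String, List<String>> lists = new LinkedHashMap<>();
    String value = props.getProperty("SyncDescriptorLists", "").trim();
    for (String entry : value.split("\\s+")) {
      if (entry.isEmpty()) {
        continue;
      }
      String[] parts = entry.split(",");
      // First element is the source name, the rest are designations.
      lists.put(parts[0], Arrays.asList(parts).subList(1, parts.length));
    }
    return lists;
  }

  public static void main(String[] args) {
    Properties props = load(EXAMPLE);
    System.out.println(syncSources(props));
    System.out.println(descriptorLists(props));
    System.out.println(Boolean.parseBoolean(props.getProperty("KeepReplaceBackup")));
  }
}
```

A real implementation would presumably go through CollecTor's existing configuration handling rather than raw `Properties`; the sketch only shows how the source names tie `SyncSources` and `SyncDescriptorLists` together.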
=== SyncManager
The SyncManager module will be started by the Scheduler according to the
configuration defined above.
Each SyncManager run will perform the following steps:
a. Retrieve descriptors from the CollecTor instances defined in
`SyncSources`. These descriptors are stored in `SyncFolder` under the
host part of the instance's URL, e.g.
{{{my-sync-folder/collector.torproject.org/recent/exit-lists}}} for exit
lists from the main instance.
b. Following retrieval, the fetched descriptors are examined:
i. discard descriptor files that do not contain what they should (see
comment:11) and log a warning with sync-source info and reason (see
criteria).
i. move valid descriptors (see criteria) without a pre-existing local
copy to the localstore.
i. if there is a local copy already, decide which copy to keep (see
criteria).
I. local copy is kept, log debug message with source and reason and
delete fetched descriptor.
I. local and fetched are identical, log debug message with source and
reason and delete fetched descriptor.
I. fetched copy should replace local descriptor. If `KeepReplaceBackup`
is true, move the local copy to `BackupFolder` and move the fetched copy
to main storage. If `KeepReplaceBackup` is false, replace the local copy
with the fetched one. In all cases log debug message with source and reason.
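The examination in step b can be sketched as a pure decision function, kept separate from file handling and logging. This is only an illustration under assumptions: the names `SyncDecisionSketch`, `Action` and `decide` are hypothetical, descriptor contents are modelled as strings, and the two predicates stand in for the keep and replace criteria. The sketch also checks for identical copies before consulting the replace criterion, which is one possible ordering, not a decided one.

```java
import java.util.function.BiPredicate;
import java.util.function.Predicate;

class SyncDecisionSketch {

  /** Possible outcomes for one fetched descriptor file (names are hypothetical). */
  enum Action {
    DISCARD_FETCHED,          // invalid content: log a warning, drop the file
    STORE_FETCHED,            // no local copy yet: move to the local store
    DROP_IDENTICAL,           // same content: log debug, delete fetched
    KEEP_LOCAL,               // local copy wins: log debug, delete fetched
    REPLACE_LOCAL,            // fetched wins, no backup requested
    BACKUP_AND_REPLACE_LOCAL  // fetched wins, local goes to BackupFolder
  }

  /**
   * Decides what to do with a fetched descriptor.
   *
   * @param fetched content of the fetched descriptor
   * @param local content of the local copy, or null if there is none
   * @param isValid stand-in for the keep criteria
   * @param fetchedIsBetter stand-in for the replace criteria (local, fetched)
   * @param keepReplaceBackup value of the KeepReplaceBackup property
   */
  static Action decide(String fetched, String local, Predicate<String> isValid,
      BiPredicate<String, String> fetchedIsBetter, boolean keepReplaceBackup) {
    if (!isValid.test(fetched)) {
      return Action.DISCARD_FETCHED;
    }
    if (local == null) {
      return Action.STORE_FETCHED;
    }
    if (local.equals(fetched)) {
      return Action.DROP_IDENTICAL;
    }
    if (!fetchedIsBetter.test(local, fetched)) {
      return Action.KEEP_LOCAL;
    }
    return keepReplaceBackup ? Action.BACKUP_AND_REPLACE_LOCAL : Action.REPLACE_LOCAL;
  }

  public static void main(String[] args) {
    // Toy criteria: non-empty content is valid, longer content is "better".
    Predicate<String> valid = content -> !content.isEmpty();
    BiPredicate<String, String> longer =
        (local, fetched) -> fetched.length() > local.length();
    System.out.println(decide("abc", null, valid, longer, true));
    System.out.println(decide("abc", "ab", valid, longer, true));
    System.out.println(decide("ab", "abc", valid, longer, false));
  }
}
```

Keeping the decision free of side effects would make each branch easy to unit-test, with the SyncManager performing the actual moves, deletions and log messages based on the returned value.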
=== Replacement criteria
As the replacement criteria are not fully defined yet, and it is very
likely that there will be more criteria in future, a modular/pluggable
approach seems useful, i.e.:
1. define `KeepCriterium` and `ReplaceCriterium` interfaces
1. register implementing classes with the SyncManager, which will apply
these for the selection steps described above.
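A minimal sketch of what such a pluggable setup might look like follows. Only the interface names `KeepCriterium` and `ReplaceCriterium` come from this proposal. The method signatures, the class `CriteriaRegistrySketch`, modelling descriptors as strings, and the combination rules (all keep criteria must accept, any replace criterion may trigger replacement) are assumptions. The example replace criterion prefers the consensus with more signatures by counting `directory-signature` lines; it does not verify the signatures, which a real implementation would probably need to consider.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Decides whether a fetched descriptor is acceptable at all (signature assumed). */
interface KeepCriterium {
  boolean keep(String fetched);
}

/** Decides whether a fetched descriptor should replace the local copy (signature assumed). */
interface ReplaceCriterium {
  boolean replace(String local, String fetched);
}

class CriteriaRegistrySketch {

  private final List<KeepCriterium> keepCriteria = new ArrayList<>();
  private final List<ReplaceCriterium> replaceCriteria = new ArrayList<>();

  void registerKeep(KeepCriterium criterium) {
    keepCriteria.add(criterium);
  }

  void registerReplace(ReplaceCriterium criterium) {
    replaceCriteria.add(criterium);
  }

  /** Assumption: a descriptor is valid only if every keep criterion accepts it. */
  boolean isValid(String fetched) {
    return keepCriteria.stream().allMatch(c -> c.keep(fetched));
  }

  /** Assumption: one replace criterion voting for replacement is enough. */
  boolean shouldReplace(String local, String fetched) {
    return replaceCriteria.stream().anyMatch(c -> c.replace(local, fetched));
  }

  /** Counts "directory-signature" lines without verifying the signatures. */
  static long countSignatures(String consensus) {
    return Arrays.stream(consensus.split("\n"))
        .filter(line -> line.startsWith("directory-signature "))
        .count();
  }

  public static void main(String[] args) {
    CriteriaRegistrySketch registry = new CriteriaRegistrySketch();
    // Toy keep criterion: content must start like a network status document.
    registry.registerKeep(fetched -> fetched.startsWith("network-status-version "));
    registry.registerReplace(
        (local, fetched) -> countSignatures(fetched) > countSignatures(local));

    // Heavily shortened stand-ins for two consensus documents.
    String local = "network-status-version 3\ndirectory-signature A B\n";
    String fetched = local + "directory-signature C D\n";
    System.out.println(registry.isValid(fetched));
    System.out.println(registry.shouldReplace(local, fetched));
  }
}
```

With single-method interfaces, simple criteria can be registered as lambdas, while more involved ones can live in their own classes.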
== Open Questions
A. Which `KeepCriterium` and `ReplaceCriterium` classes should be
implemented initially?
Currently there are:
1. a `ReplaceCriterium`: keep the consensus with more signatures, and
1. a `KeepCriterium`: only keep descriptors that contain what they claim
to be.
1. Are there more criteria that should be implemented with release 1.1.0?
A. Should the applied criteria be configurable? E.g. this could be done
by listing the classes in collector.properties, but we already have more
than fifty config settings, which is a lot.
A. The data combination mentioned in comment:11 part two is not yet
considered, but the design will be open to add this later.
Anyway, some questions: What kind of data enhancement could there be? What
about descriptor signatures?
-----
Priority set to high in order to resolve the open questions quickly.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18910#comment:13>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online