[metrics-bugs] #21515 [Metrics/CollecTor]: Add auxiliary data on Tor relays and bridges to CollecTor
Tor Bug Tracker & Wiki
blackhole at torproject.org
Mon Feb 20 15:00:27 UTC 2017
#21515: Add auxiliary data on Tor relays and bridges to CollecTor
Reporter: karsten | Owner: metrics-team
Type: enhancement | Status: new
Priority: Medium | Milestone:
Component: Metrics/CollecTor | Version:
Severity: Normal | Keywords:
Actual Points: | Parent ID:
Points: | Reviewer:
This ticket is the result of a local TODO list review and combines a few
related ideas. Some of the ideas here are new, but some are really old
and have been sitting on my list forever.
The general idea here is that CollecTor could provide auxiliary data on
Tor relays and bridges. The main goal would be that other applications
such as Onionoo and Metrics, but also Nyx, could use this data to provide
richer information on relays and bridges to their users. A secondary goal would
be that CollecTor would serve as an archive for this data for future
applications that don't exist yet.
Auxiliary data might include:
1. GeoIP country database: This is the same data as the Tor daemon uses
internally to resolve relay IP addresses to country codes. We would be
able to produce historical data by extracting `src/config/geoip` files
from the Tor daemon Git repository. This data could be used by Metrics to
bring back the relays by country graph.
2. GeoIP city database: This data would be the same as Onionoo uses to
resolve relay IP addresses to city names. The main advantage of having
this file in CollecTor would be that Onionoo could automatically pull this
data instead of relying on the operator to update GeoIP files.
3. GeoIP ASN database: This is similar to 2 but for ASN information.
4. Bridge GeoIP country database: Here's an idea to provide country
information for bridges despite replacing IP addresses by hashes.
CollecTor could keep a list of all bridge IP addresses in a given month
and use the GeoIP country database from 1 to produce a custom database for
resolving bridge IP addresses to country codes. Basically, that database
would contain hashed fingerprints, 10.x.y.z IP addresses, and country
codes. CollecTor would add a new line to this file whenever it observes a
new bridge IP address, which could happen as often as once per hour,
particularly at the beginning of a month. This file would change once per
month, when the hashes for 10.x.y.z addresses change. However, this means that we'd have
to reprocess the entire bridge tarball archive to generate older database
files, because we have long deleted the inputs for generating those old
10.x.y.z IP addresses. Consumers of this data would be Onionoo but also
Metrics for a new bridge country graph.
5. Relay reverse DNS entries: Right now, Onionoo runs its own rDNS
resolver. But we could just as well run that as part of CollecTor and
provide the output data in a new data format to everyone who needs it.
There would also be other consumers of this data, including the relay
controller Nyx, which would be able to display rDNS entries without
risking leaking who is fetching that information.
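
To make ideas 1 and 4 a bit more concrete, here is a rough Python sketch
of how a bridge country database could be built on top of the GeoIP
country database. It assumes the country file uses lines of the form
`INTIPLOW,INTIPHIGH,CC`, as the `geoip` file shipped with the Tor daemon
does. Everything else is made up for illustration: the class names, the
`observe` method, the `hashed_fingerprint,10.x.y.z,cc` output line format,
and the assumption that the sanitized 10.x.y.z address is computed
elsewhere by the existing sanitizing code and simply passed in.

```python
import bisect
import ipaddress


class CountryDatabase:
    """Resolves IPv4 addresses to country codes using geoip-style lines."""

    def __init__(self, lines):
        ranges = []
        for line in lines:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            low, high, cc = line.split(",")
            ranges.append((int(low), int(high), cc))
        self._ranges = sorted(ranges)
        self._lows = [low for low, _, _ in self._ranges]

    def lookup(self, address):
        """Return the country code for address, or "??" if unknown."""
        ip = int(ipaddress.IPv4Address(address))
        # Find the last range whose lower bound is <= ip.
        idx = bisect.bisect_right(self._lows, ip) - 1
        if idx >= 0:
            _, high, cc = self._ranges[idx]
            if ip <= high:
                return cc
        return "??"


class BridgeCountryDatabase:
    """Hypothetical per-month database of sanitized bridge addresses.

    The real bridge IP address is only used for the country lookup and is
    never stored in the output lines.
    """

    def __init__(self, country_db):
        self._country_db = country_db
        self._seen = set()
        self.lines = []

    def observe(self, hashed_fingerprint, real_address, sanitized_address):
        """Add a line if this bridge address was not seen before.

        Returns True if a new line was added, False otherwise.
        """
        key = (hashed_fingerprint, sanitized_address)
        if key in self._seen:
            return False
        self._seen.add(key)
        cc = self._country_db.lookup(real_address)
        self.lines.append(f"{hashed_fingerprint},{sanitized_address},{cc}")
        return True


# Illustrative sample data only (documentation address ranges
# 192.0.2.0/24 and 198.51.100.0/24 with arbitrary country codes).
sample_geoip = [
    "# INTIPLOW,INTIPHIGH,CC",
    "3221225984,3221226239,DE",
    "3325256704,3325256959,US",
]
country_db = CountryDatabase(sample_geoip)
bridge_db = BridgeCountryDatabase(country_db)
bridge_db.observe("A" * 40, "192.0.2.17", "10.1.2.3")
```

Since the 10.x.y.z addresses change every month, a fresh
`BridgeCountryDatabase` would be started per month in this sketch. A
consumer would then only need the output lines, never the original
bridge IP addresses.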
This is a lot, but maybe there's even more. It's probably useful to
discuss these different new data sets together. Once we decide we want to
provide some or even all of them we should switch to child tickets. And
just to set expectations right, it's probably going to take months to find
enough time to implement these new data sets, if we think it's a good
idea.
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/21515>