[tor-dev] Local onionoo cache
karsten at torproject.org
Mon May 13 08:51:59 UTC 2013
On 5/13/13 9:38 AM, Roger Dingledine wrote:
> On Mon, May 13, 2013 at 08:58:27AM +0200, Karsten Loesing wrote:
>> The only downside I can see is that it takes about 30--45 minutes for
>> new exits to show up in your local cache. An alternative would be to
>> query the exit list yourself, download the most recent consensus, and
>> compile a list of exit addresses yourself.
> Speaking of delays: the place that knows about new relays first is each
> directory authority. Not only that, but they know also what IP address
> the relay is exiting from, since that's where the relay publishes its
> descriptor from. E.g.,
> @uploaded-at 2013-05-12 16:57:35
> @source "188.8.131.52"
> router nthdimension 184.108.40.206 443 0 0
> Seems like this info could provide an alternative, simpler way to generate
> the exit-addresses file:
> which if we're doing our modularity right, should be the input to the
> various other scripts.

Interesting. Haven't thought of using that information. metrics-db
even has this information available from gabelmoo, because it rsyncs
gabelmoo's cached-* files (and v3-status-votes) once per hour. But
metrics-db discards all descriptor annotations so far.

However, I don't think this information can replace the information we
learn from TorDNSEL or TorBEL. Some concerns:

- Relays may exit from more than just one IP address, but the directory
  authorities would only see at most one of these addresses. Here's an
  exit list entry with two exit IP addresses:

    Published 2013-05-12 20:59:32
    LastStatus 2013-05-12 22:02:59
    ExitAddress 220.127.116.11 2013-05-12 22:03:11
    ExitAddress 18.104.22.168 2013-05-12 22:03:11

- The directory authorities sometimes download descriptors they don't
  have from other directory authorities. In that case we don't learn the
  IP address that the relay exits from. Here's an example:

    @downloaded-at 2013-05-12 18:50:10

- The directory authorities are indeed the first to learn these source
  IP addresses. But we probably don't want arbitrary services to query
  the authorities frequently for their cached descriptors to learn their
  annotations. That means we'd have to aggregate and cache this
  information at another place, which introduces a delay.
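
To make the first two concerns concrete, here is a rough Python sketch
of how one might read both kinds of input. It assumes that each exit
list entry starts with an "ExitNode <fingerprint>" line (not shown in
the excerpt above) and that descriptor annotations precede the "router"
line. The function names are made up for illustration; this is not how
metrics-db or TorDNSEL do it.

```python
from collections import defaultdict


def exit_addresses_by_relay(exit_list_text):
    """Map each ExitNode fingerprint to the set of its ExitAddress IPs.

    Assumption: every entry begins with an "ExitNode <fingerprint>" line.
    """
    addresses = defaultdict(set)
    current = None
    for line in exit_list_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "ExitNode" and len(parts) >= 2:
            current = parts[1]
        elif parts[0] == "ExitAddress" and len(parts) >= 2 and current:
            # One relay can contribute several ExitAddress lines.
            addresses[current].add(parts[1])
    return dict(addresses)


def source_annotation(descriptor_text):
    """Return the @source address of a cached descriptor, or None.

    Descriptors fetched from another authority carry @downloaded-at
    instead of @uploaded-at/@source, so there is nothing to return.
    """
    for line in descriptor_text.splitlines():
        if line.startswith("@source "):
            return line.split(None, 1)[1].strip().strip('"')
        if line.startswith("router "):
            break  # annotations end where the descriptor begins
    return None
```

An exit list entry can yield a set with two or more addresses, whereas
source_annotation() returns at most one address per descriptor, and
None for the @downloaded-at case.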
> I guess I should make a trac ticket of this idea. But which component?
> We sure seem to have a lot of projects that overlap tordnsel / torbel in
> some way.

For now, I'd say it's an "Analysis" ticket, because we don't yet know
how to use this information. If you want to make a ticket, I'll paste
my concerns above there.

And you're right that Onionoo overlaps with TorDNSEL/TorBEL to a certain
extent. Or rather, it uses their data and presents them in a more
convenient way. This wasn't planned, and it would be better if
TorDNSEL/TorBEL had a more convenient interface that people could use
instead. Until that's the case, people can easily use Onionoo.
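
As a minimal sketch of what "using Onionoo" could look like on the
client side, the following Python reads an already-downloaded details
document and collects exit addresses per relay. It assumes the document
has a "relays" array whose entries carry "fingerprint" and an optional
"exit_addresses" list; check the Onionoo protocol description before
relying on these field names. The function name is hypothetical.

```python
import json


def exit_addresses_from_details(details_json):
    """Collect exit addresses per relay from an Onionoo details document.

    Assumption: relays without an "exit_addresses" field have no known
    exit addresses and are skipped.
    """
    doc = json.loads(details_json)
    result = {}
    for relay in doc.get("relays", []):
        addrs = relay.get("exit_addresses")
        if addrs:
            result[relay["fingerprint"]] = list(addrs)
    return result
```

Keeping the download separate from the parsing makes it easy to run
this against a local cache file rather than hitting the service on
every lookup.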