[tor-dev] Using MaxMind's GeoIP2 databases in tor, BridgeDB, metrics-*, Onionoo, etc.

Karsten Loesing karsten at torproject.org
Wed Feb 26 08:48:02 UTC 2014


On 03/02/14 16:23, Karsten Loesing wrote:
> On 30/01/14 19:31, Nick Mathewson wrote:
>> On Wed, Jan 29, 2014 at 1:53 PM, Nick Mathewson <nickm at alum.mit.edu>
>> wrote:
>>> On Thu, Jan 16, 2014 at 5:15 AM, Karsten Loesing
>>> <karsten at torproject.org> wrote:
>>   [...]
>>>> Another option is to write a new tool that parses their full databases
>>>> and converts them into file formats we already support
>>   [...]
>>>
>>> Writing our own parser seems goofy.
>>
>> Then again, I'm a goofy guy, and though the format's kinda ugly, I've
>> seen much worse.
>>
>> https://github.com/nmathewson/mmdb-convert
>>
>> If we use this thing, we should move it over into src/config in Tor.
>>
>> It needs more hacking and testing, but at least it's pure python.
> 
> This is great!
> 
> I started trying it out and reviewing it.  I'm planning to improve it
> this week until it writes geoip and geoip6 files that we can then place
> into src/config/.  (And I'm thinking about using the same file in
> Onionoo and dropping city information there, for simplicity.)
> 
> Thanks for starting this!

Keeping this list in the loop on how most of our IP-to-country-resolution
problems have been solved by now.  Quoting from my original mail sent to
this list on January 16, 2014:

>  - tor: We ship little-t-tor with a geoip and a geoip6 file for clients
> to support excluding relays by country code and for relays to generate
> by-country statistics.

We added a slightly revised version of Nick's mmdb-convert tool to tor
master [0].  We also produced a new geoip file [1] for IPv4 addresses
and a new geoip6 file [2] for IPv6 addresses based on the February 7,
2014 MaxMind GeoLite2 Country database.

[0]
https://gitweb.torproject.org/tor.git/blob/HEAD:/src/config/mmdb-convert.py

[1] https://gitweb.torproject.org/tor.git/blob/HEAD:/src/config/geoip

[2] https://gitweb.torproject.org/tor.git/blob/HEAD:/src/config/geoip6

>  - BridgeDB: I vaguely recall that the BridgeDB service uses GeoIP data
> to return only bridges that are not blocked in a user's country.  Or
> maybe that was a feature yet to be implemented.

Haven't heard from BridgeDB folks.  If they need IP-to-country data,
they can probably use tor's geoip and geoip6 files.
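
For illustration, here is a minimal Python sketch of how a service could
resolve addresses against these files.  It assumes the line format is
"LOW,HIGH,CC", with decimal integers for the bounds in geoip, textual IPv6
addresses in geoip6, and "#" starting a comment line.  The function names
parse_geoip_lines and lookup are hypothetical and not part of tor or
BridgeDB, and the sample lines in the usage note only show the shape of
the data.

```python
import bisect
import ipaddress


def _to_int(field):
    # Assumption: geoip stores bounds as decimal integers, geoip6 as
    # textual IPv6 addresses.  Both are normalized to plain integers.
    field = field.strip().strip('"')
    if field.isdigit():
        return int(field)
    return int(ipaddress.ip_address(field))


def parse_geoip_lines(lines):
    """Parse 'LOW,HIGH,CC' lines into (lows, ranges), sorted by LOW.

    Use one table per file; IPv4 and IPv6 ranges must not be mixed.
    """
    ranges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        low, high, cc = line.split(",")
        ranges.append((_to_int(low), _to_int(high), cc.strip().strip('"')))
    ranges.sort()
    lows = [r[0] for r in ranges]
    return lows, ranges


def lookup(table, address):
    """Return the country code for address, or None if no range matches."""
    lows, ranges = table
    value = int(ipaddress.ip_address(address))
    # Find the last range whose lower bound is <= value.
    idx = bisect.bisect_right(lows, value) - 1
    if idx < 0:
        return None
    low, high, cc = ranges[idx]
    return cc if low <= value <= high else None
```

Usage would look roughly like this:

```python
v4 = parse_geoip_lines(["# sample", "16777216,16777471,AU"])
print(lookup(v4, "1.0.0.1"))
```

Binary search over the sorted lower bounds keeps each lookup logarithmic
in the number of ranges, which matters if a service resolves many
addresses per request.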

>  - Onionoo: The Onionoo service uses MaxMind's city database to provide
> location information of relays.  (It also uses MaxMind's ASN database to
> provide information on AS number and name.)

Onionoo now uses the GeoLite2 City database to resolve relay IP
addresses to country codes, country names, region names, city names,
latitudes, and longitudes.  I decided against using tor's geoip and
geoip6 files, because MaxMind has in the meantime started publishing CSV
files on their website, and updating Onionoo's parser for those was not
too hard.

>  - metrics-db: I'm planning to use GeoIP data to resolve bridge IP
> addresses to country codes in the bridge descriptor sanitizing process.

I'd probably re-use the code from Onionoo for this task.

>  - metrics-web: We have been using GeoIP data to provide statistics on
> relays by country.  This is currently disabled because the
> implementation was eating too many resources, but I plan to put these
> statistics back.

Same as metrics-db, unless somebody wants to do this in Python.  But we
have two working solutions that can be adapted for metrics-web.

So, I guess that resolves this thread.  Thanks again, Nick, for being a
goofy guy!

All the best,
Karsten


