[tor-dev] Proposal: Check Maxmind GeoIP DB before distributing

Katharina Kohls katharina.kohls at rub.de
Tue Jul 3 09:34:33 UTC 2018


Hi,

On 30.06.2018 13:53, Jaskaran Singh wrote:
> 5. Dealing with false positives
> Maxmind calculates geolocation of an IP addr using WHOIS records,
> Reverse DNS etc. It claims to have precision rate of 99.5% on country
> level. The other 0.5% is more likely to be those IP addresses for which
> neither WHOIS record nor Reverse DNS are setup.
>
> A very large percentage of Tor Nodes are run from datacenters, which
> usually have all their records set up. It's highly unlikely for an IP
> address belonging to a datacenter to be mapped to a wrong location.
>
> Hence, false positives would be very few, and can be safely ignored
> after a simple manual/scripted investigation.
We measured Tor relay locations a while ago using ICMP RTT measurements 
from multiple server instances located in Europe, North America, Asia, 
and Oceania. Using the minimum RTT for each connection*, we applied 
multilateration for estimating the location of a relay. Even though this 
approach is noisy because of varying network conditions and routes, we 
still get a good estimate of the relay's actual position.
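
To illustrate the idea (a minimal sketch, not the code used for the
measurements): each minimum RTT is converted to a distance, and the
position that best fits all distances is searched for. The propagation
speed of 2/3 c, the coarse-to-fine grid search, and all function names
are assumptions made for this example. Real RTTs include queueing and
routing detours, so the converted distances tend to overestimate.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: signals travel at roughly 2/3 of the speed of light in fibre.
KM_PER_MS = 299.792458 * 2 / 3


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rtt_to_km(min_rtt_ms):
    """One-way distance implied by a round-trip time."""
    return (min_rtt_ms / 2) * KM_PER_MS


def multilaterate(servers, min_rtts_ms):
    """Return the (lat, lon) minimising the squared distance residuals.

    servers: list of (lat, lon) of the measurement hosts
    min_rtts_ms: minimum RTT to the relay from each host, same order
    """
    dists = [rtt_to_km(r) for r in min_rtts_ms]

    def cost(lat, lon):
        return sum((haversine_km(lat, lon, s_lat, s_lon) - d) ** 2
                   for (s_lat, s_lon), d in zip(servers, dists))

    lat, lon, span = 0.0, 0.0, 180.0
    # Coarse-to-fine grid search: 10 degree grid first, then refine
    # around the best point with a ten times smaller step each round.
    for step in (10.0, 1.0, 0.1, 0.01):
        n = round(span / step)
        candidates = [(max(-90.0, min(90.0, lat + i * step)), lon + j * step)
                      for i in range(-n, n + 1)
                      for j in range(-n, n + 1)]
        lat, lon = min(candidates, key=lambda p: cost(*p))
        span = step
    return lat, lon
```

With four or more well-spread hosts the residual has a single clear
minimum, which is why a plain grid search is enough for a sketch.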

We compared our estimated ICMP relay locations with the GeoIP information:
- our test set consisted of a full consensus
- we conducted the measurements within 5 days and repeated reference 
experiments a month later to test the stability of results
- we sent 500 pings per relay from 8 remote servers and repeated the 
measurements multiple times
- we used the minimum RTT as input for the multilateration

Results can be summarized as follows:
- the median location error is about 440km
- 287 outliers are more than 2654km away from the position that GeoIP 
suggested. This represents ~4.6% of the tested relays
- the 75th percentile of the location error is above 1000km
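
Statistics of this kind can be computed along the following lines. This
is a sketch under assumptions: the input is a list of great-circle
distances in km between each estimated position and its GeoIP position,
and the outlier rule (upper quartile plus 1.5 times the interquartile
range) is an assumed definition, since the rule behind the 2654km
cut-off is not stated here. The function name is hypothetical.

```python
import statistics


def summarize_errors(errors_km):
    """Median, quartiles, and outlier count for a list of location errors."""
    q1, median, q3 = statistics.quantiles(errors_km, n=4, method="inclusive")
    # Assumed outlier rule: anything above Q3 + 1.5 * IQR.
    upper_fence = q3 + 1.5 * (q3 - q1)
    outliers = [e for e in errors_km if e > upper_fence]
    return {
        "median_km": median,
        "q3_km": q3,
        "upper_fence_km": upper_fence,
        "outliers": len(outliers),
        "outlier_share": len(outliers) / len(errors_km),
    }
```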

We are currently repeating the experiments with 16 instead of 8 servers
and are working on the evaluation to improve the location estimate.

We cannot take these results as ground truth, and the majority of GeoIP
locations already match the actual country and continent a relay is in.
Nevertheless, this is a good way to add an independent verification
step. The location error of the outliers shows that there are nodes
that actually run on a different continent, and this is an important
security issue if users want to avoid a certain country. The same
applies to the 75th percentile, which also leads to updated country
information for a significant set of relays.

We can conclude that yes, a large percentage of Tor nodes have OK
records. But the number of false positives is not that low and, in my
opinion, cannot be ignored. Besides an independent verification step,
for which I suggest timing measurements and multilateration, location
errors that lead to a different country code should be treated as
updates (or the respective nodes should be flagged).

*The motivation is that no transmission can ever be faster than a
certain physical lower bound, so the minimum RTT is the closest we can
get to this bound.
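
The footnote's reasoning also gives a simple consistency check, sketched
here under assumptions: the speed of light in vacuum is used as the hard
limit, and the function names and the idea of testing a claimed GeoIP
distance against it are illustrative, not part of the measurements
described above.

```python
C_KM_PER_MS = 299.792458  # speed of light in vacuum, km per millisecond


def max_distance_km(rtt_samples_ms):
    """Upper bound on the one-way distance implied by the smallest RTT."""
    return min(rtt_samples_ms) / 2 * C_KM_PER_MS


def is_feasible(claimed_distance_km, rtt_samples_ms):
    """False if the claimed distance would require faster-than-light travel."""
    return claimed_distance_km <= max_distance_km(rtt_samples_ms)
```

For example, a minimum RTT of 12ms limits the relay to about 1800km
from the measuring host, so a GeoIP position 6000km away cannot be
right, no matter how noisy the other samples are.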


Cheers,
Katharina

