[metrics-bugs] #26585 [Metrics/Onionoo]: improve AS number and name coverage (switch maxmind to RIPE Stat)

Tor Bug Tracker & Wiki blackhole at torproject.org
Sun Jul 1 14:24:17 UTC 2018


#26585: improve AS number and name coverage (switch maxmind to RIPE Stat)
-----------------------------+------------------------------
 Reporter:  nusenu           |          Owner:  metrics-team
     Type:  enhancement      |         Status:  new
 Priority:  Medium           |      Milestone:
Component:  Metrics/Onionoo  |        Version:
 Severity:  Normal           |     Resolution:
 Keywords:                   |  Actual Points:
Parent ID:                   |         Points:
 Reviewer:                   |        Sponsor:
-----------------------------+------------------------------

Comment (by irl):

 For country codes, there are 321 relays in disagreement and
 7837 in agreement (κ = 0.959 excluding relays for which MaxMind had no
 country code). There were no relays for which RIPEstat did not return a
 country code, but there were 21 relays for which MaxMind was missing a
 country code. This leaves 300 relays for which both MaxMind and RIPEstat
 had a country code but there was disagreement.
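
 As a reference for how an agreement figure like the κ above can be
 computed, here is a minimal sketch. It assumes the statistic is Cohen's
 kappa over the per-relay labels from the two sources, and that relays with
 a missing label are dropped before computing it. The function name and the
 sample data are hypothetical and are not taken from the analysis itself.

```python
from collections import Counter


def cohens_kappa(pairs):
    """Cohen's kappa for two labelings of the same items.

    `pairs` is an iterable of (label_a, label_b) tuples, e.g. the MaxMind
    and RIPEstat country codes for each relay. Pairs where either label is
    None are excluded, mirroring the "excluding relays for which ... had no
    country code" condition.
    """
    rated = [(a, b) for a, b in pairs if a is not None and b is not None]
    n = len(rated)
    if n == 0:
        raise ValueError("no relays with labels from both sources")

    # Observed agreement: fraction of relays where both sources match.
    p_observed = sum(1 for a, b in rated if a == b) / n

    # Expected chance agreement from each source's label frequencies.
    count_a = Counter(a for a, _ in rated)
    count_b = Counter(b for _, b in rated)
    p_expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)

    if p_expected == 1.0:
        # Both sources use one identical label everywhere; agreement is total.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Illustrative data only (not from the ticket): four relays with labels
# from both sources, plus one relay that the first source could not label.
sample = [("de", "de"), ("de", "de"), ("us", "us"), ("us", "de"), (None, "fr")]
kappa = cohens_kappa(sample)
```

 Because κ discounts chance agreement, it can sit noticeably below the raw
 agreement rate when a few labels dominate the data.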

 RIPEstat does return 7 relays with the country code "eu" and 1 relay with
 the country code "ap" for Europe and Asia/Pacific respectively.
 [[https://dev.maxmind.com/geoip/legacy/codes/iso3166/|MaxMind have
 documentation]] indicating that they also use these codes, but did not
 return any results with these codes. In all of these cases, MaxMind did
 not have a country code.

 Without ground truth to compare to, it is not possible to say whether
 MaxMind or RIPEstat is correct in the cases where there was
 disagreement. It is also possible that MaxMind and RIPEstat agree on a
 country code that is incorrect.

 For AS numbers, there are 269 relays in disagreement and 7889
 in agreement (κ = 0.979 excluding relays for which either MaxMind or
 RIPEstat had no AS number). There were 101 relays for which MaxMind did
 not return an AS number and 2 relays for which RIPEstat did not return an
 AS number. Both of the relays for which RIPEstat did not return an AS
 number were in the 1.0.0.0/8 BGP prefix, which has the "cn" country code
 from RIPEstat but the "au" country code from MaxMind. MaxMind placed these
 relays in AS 4804.

 It is not clear to me what our threshold on agreement should be. As the
 MaxMind database is distributed to users and can be used, for example, to
 disable/prefer the use of exit relays in specific countries, it may be
 dangerous to users if they get mixed information about the country code
 assigned to relays. It may be equally dangerous to incorrectly assign
 country codes, but without ground truth to compare to it is not possible
 to say whether a switch would improve that situation or not.

 We should conduct an analysis of the different databases and feeds
 available to us, to determine which best fits our requirements. As for
 querying RIPEstat, I have [[https://github.com/britram/canid|a tool]]
 which I used in the above analysis and which would make it easier to
 integrate this data into Onionoo if we chose to use RIPEstat.

 I don't believe we should consider outright replacing MaxMind with
 RIPEstat, because we distribute this database to end clients and need a
 database that allows this. However, I can see that having additional
 information where MaxMind has none, and adding BGP prefix information
 (finer-grained topology information than just the AS), would be valuable
 to some users.

 What do you think about the addition of two new fields: 'country_source'
 and 'as_source' to indicate the source of country/as information? We could
 then supplement the MaxMind data with data from RIPEstat where MaxMind
 does not have the information while being able to make it clear to users
 where the information has come from if that is important to them.

 We could also add a 'bgp_prefix' field with prefix data from
 RIPEstat.
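
 The proposal above could be sketched roughly as follows. This is only an
 illustration under stated assumptions: the input dictionaries, the
 function name, and the 'country' and 'as_number' keys are hypothetical
 stand-ins for whatever the lookups would actually return, and the source
 labels "maxmind" and "ripestat" are placeholder values. Only the
 'country_source', 'as_source' and 'bgp_prefix' field names come from the
 proposal.

```python
def merge_lookups(maxmind, ripestat):
    """Prefer MaxMind data and fall back to RIPEstat where MaxMind has none.

    Both arguments are plain dicts for a single relay (hypothetical shape).
    The result records which source each value came from.
    """
    merged = {}
    for field, source_field in (
        ("country", "country_source"),
        ("as_number", "as_source"),
    ):
        if maxmind.get(field) is not None:
            merged[field] = maxmind[field]
            merged[source_field] = "maxmind"
        elif ripestat.get(field) is not None:
            merged[field] = ripestat[field]
            merged[source_field] = "ripestat"
        # If neither source has a value, both fields are simply omitted.

    # The BGP prefix would only ever come from RIPEstat in this proposal.
    if ripestat.get("bgp_prefix") is not None:
        merged["bgp_prefix"] = ripestat["bgp_prefix"]
    return merged


# Illustrative values only, using documentation address space and a
# private-use AS number: MaxMind knows the country but not the AS.
example = merge_lookups(
    {"country": "de", "as_number": None},
    {"country": "de", "as_number": "AS64496", "bgp_prefix": "192.0.2.0/24"},
)
```

 Keeping MaxMind as the primary source means the values agree with the
 database distributed to clients wherever MaxMind has data, and the
 '*_source' fields make any fallback visible.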

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/26585#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

