[metrics-team] Data Files Country Codes
karsten at torproject.org
Thu Jun 13 20:14:14 UTC 2019
On 2019-06-13 19:29, Daniel Herschel wrote:
> I was looking to do some data analysis and data visualization using your
> publicly available datasets (these
> ones: https://metrics.torproject.org/stats.html), and I had a question
> regarding the country column present in a number of the datasets.
> The column's documentation says that the country codes are based on GeoIP
> addresses. Using a list of GeoIP country codes (found
> here: https://dev.maxmind.com/geoip/legacy/codes/iso3166/), I was able
> to convert most of these codes to their corresponding country name for
> ease in reading on visualizations.
> However, I did find some countries that did not have a mapping. Do you
> know what these countries would be/what the codes correspond to? The
> image below shows the codes in question. (dd, xk, an, cs, du are the
> specific codes I am looking at. NaN means the entry was empty and ?? is
> your code for unknown.)
It looks like these codes come from individual relays that report
statistics using a GeoIP database other than the one shipped with the tor
software. I think it's safe to treat all of these users as coming from an
unknown country (??).
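
A minimal sketch of that approach in Python, in case it helps. The names
normalize_country, aggregate_users and known_codes are hypothetical, and
known_codes is assumed to be the set of two-letter codes you already loaded
from the MaxMind ISO 3166 list. Empty entries are passed through unchanged,
since the sketch makes no assumption about what they mean.

```python
UNKNOWN = "??"  # the code the datasets use for an unknown country


def normalize_country(code, known_codes):
    """Map a country code to itself if recognized, otherwise to '??'.

    known_codes is assumed to be a set of lowercase two-letter codes.
    Empty or missing entries are returned unchanged.
    """
    if code is None or code.strip() == "":
        return code
    code = code.strip().lower()
    return code if code in known_codes else UNKNOWN


def aggregate_users(rows, known_codes):
    """Sum user counts per country, folding unrecognized codes into '??'.

    rows is assumed to be an iterable of (country_code, users) pairs.
    """
    totals = {}
    for country, users in rows:
        key = normalize_country(country, known_codes)
        totals[key] = totals.get(key, 0) + users
    return totals
```

With this, rows carrying dd, xk, an, cs or du are counted together with the
rows that already carry ??, so they show up as one "unknown" category in a
visualization.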
Hope this works for you!
All the best,
> The last part of the image shows the counts for each appearance within
> the file (this being the relay_users file). As you can see, there are
> many data points for these codes, so it would be great to know what
> country they correspond to.
> I appreciate any answers you can provide.