[ooni-dev] Wrong measurements from beta Measurements API?

David Fifield david at bamsoftware.com
Tue Jul 25 00:45:39 UTC 2017


I was experimenting with adapting ooni-sync to the /api/v1/measurements
endpoint. A minimal proof of concept patch is attached. While trying it,
I found that the API was returning duplicate measurements and
measurements that don't seem to match the query. I'm using this command:
	./ooni-sync -xz -directory measurements.archive input=archive.org since=2017-01-01


Here is a query that at the moment happens to return two results with
the same measurement_id and measurement_url, but different input and
measurement_start_time. There are a few more examples of this phenomenon
(I found it 6 times in the first 1000 measurements I downloaded).

https://measurements-beta.ooni.io/api/v1/measurements?input=archive.org&limit=100&offset=500&order=asc&order_by=measurement_start_time&since=2017-01-01
    {
      "input": "http://archive.org", 
      "measurement_id": "51daa51b-07d2-491e-ba2b-9189e1a08146", 
      "measurement_start_time": "2017-01-04T01:55:06Z", 
      "measurement_url": "https://measurements.ooni.torproject.org/api/v1/measurement/51daa51b-07d2-491e-ba2b-9189e1a08146", 
      "probe_asn": "AS3243", 
      "probe_cc": "PT", 
      "report_id": "20170104T105911Z_AS3243_OadZCx9yRNvqKYsLQaQDa3c1swLofXEQNtcplXQ14QrXemKcCT", 
      "test_name": "web_connectivity"
    }, 
    {
      "input": "http://wayback.archive.org", 
      "measurement_id": "51daa51b-07d2-491e-ba2b-9189e1a08146", 
      "measurement_start_time": "2017-01-04T09:19:42Z", 
      "measurement_url": "https://measurements.ooni.torproject.org/api/v1/measurement/51daa51b-07d2-491e-ba2b-9189e1a08146", 
      "probe_asn": "AS3243", 
      "probe_cc": "PT", 
      "report_id": "20170104T105911Z_AS3243_OadZCx9yRNvqKYsLQaQDa3c1swLofXEQNtcplXQ14QrXemKcCT", 
      "test_name": "web_connectivity"
    }, 
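
A minimal sketch of how such duplicates could be spotted in a page of
results, assuming each entry is a dict with the fields shown above. The
function name find_duplicate_ids is hypothetical and is not part of
ooni-sync; the sample data is the two entries quoted above, trimmed to
the relevant fields.

```python
from collections import defaultdict

def find_duplicate_ids(results):
    """Return {measurement_id: [entries]} for ids that occur more than once."""
    by_id = defaultdict(list)
    for entry in results:
        by_id[entry["measurement_id"]].append(entry)
    return {mid: entries for mid, entries in by_id.items() if len(entries) > 1}

# The two entries quoted above, trimmed to the relevant fields.
results = [
    {
        "input": "http://archive.org",
        "measurement_id": "51daa51b-07d2-491e-ba2b-9189e1a08146",
        "measurement_start_time": "2017-01-04T01:55:06Z",
    },
    {
        "input": "http://wayback.archive.org",
        "measurement_id": "51daa51b-07d2-491e-ba2b-9189e1a08146",
        "measurement_start_time": "2017-01-04T09:19:42Z",
    },
]

duplicates = find_duplicate_ids(results)
for mid, entries in duplicates.items():
    # Print each repeated id with the differing inputs it was returned for.
    print(mid, [e["input"] for e in entries])
```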


I also found that the results included some entries whose "input" field
didn't seem to match the query. Here is a small sample of them. So far
I've found 57/5783 (about 1%) of downloads whose input doesn't contain
"archive.org".

https://measurements-beta.ooni.io/api/v1/measurement/00868be9-2441-42fb-9691-95501d6b93df "http://www.imdb.com"
https://measurements-beta.ooni.io/api/v1/measurement/0214f18c-058c-44ef-b291-9db88cc923dc "http://666games.net"
https://measurements-beta.ooni.io/api/v1/measurement/03b771b2-9f2c-4eee-8835-5128bf9e7832 "http://www.cesr.org"
https://measurements-beta.ooni.io/api/v1/measurement/0cc491bb-30a0-4dea-9271-1f6ba23c2b8a "http://adultfriendfinder.com"
https://measurements-beta.ooni.io/api/v1/measurement/0fdfa57f-836f-4a75-8543-c8dcade5455a "http://last.fm"
https://measurements-beta.ooni.io/api/v1/measurement/10f1f5ad-91c4-46f5-9d6a-38e455fd7158 "http://www.earthwatch.org"
https://measurements-beta.ooni.io/api/v1/measurement/14520322-b00d-437c-be29-22dc9b2cdc75 "http://abpr2.railfan.net"
https://measurements-beta.ooni.io/api/v1/measurement/1bb1aa36-f4fe-440c-be03-6029955c90ea "http://666games.net"
https://measurements-beta.ooni.io/api/v1/measurement/210703f1-e52c-4740-99b3-36c2db849cc1 "http://amphetamines.com"
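
A rough sketch of the check used to find such entries, assuming each
downloaded entry is a dict with "measurement_id" and "input" fields and
that a plain substring test is the intended meaning of the input= query
parameter (the API's actual matching rule is not known here). The helper
name mismatched_inputs is hypothetical.

```python
def mismatched_inputs(entries, needle):
    """Return the entries whose "input" field does not contain needle."""
    # A missing or null input is treated as an empty string, so it counts
    # as a mismatch.
    return [e for e in entries if needle not in (e.get("input") or "")]

# A few entries taken from the examples in this message.
entries = [
    {"measurement_id": "51daa51b-07d2-491e-ba2b-9189e1a08146",
     "input": "http://archive.org"},
    {"measurement_id": "00868be9-2441-42fb-9691-95501d6b93df",
     "input": "http://www.imdb.com"},
    {"measurement_id": "0fdfa57f-836f-4a75-8543-c8dcade5455a",
     "input": "http://last.fm"},
]

bad = mismatched_inputs(entries, "archive.org")
for e in bad:
    print(e["measurement_id"], e["input"])
print("%d/%d entries do not match" % (len(bad), len(entries)))
```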
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Minimal-hacks-to-use-the-api-v1-measurements-endpoin.patch
Type: text/x-diff
Size: 4257 bytes
Desc: not available
URL: <http://lists.torproject.org/pipermail/ooni-dev/attachments/20170724/fa90d57f/attachment.patch>

