[ooni-dev] Minor measurements API feature request: order_by=index

David Fifield david at bamsoftware.com
Tue Jun 27 00:45:24 UTC 2017


Currently you can do queries with order_by=test_start_time,
order_by=probe_cc, etc., but you cannot do order_by=index.

https://measurements.ooni.torproject.org/api/v1/files?limit=1&order_by=index
{
  "error_code": 400, 
  "error_message": "Invalid order_by"
}

As I understand it, the difference between index and test_start_time is
that index is always increasing over time (newly uploaded reports always
get a higher index than existing reports), while newly uploaded reports
can have a test_start_time that is in the past (if the probe was not
able to upload for a time, for example).

The ability to order_by=index would allow a slight robustness
enhancement in ooni-sync, in the case when a new report is uploaded
while ooni-sync is running. Currently ooni-sync always does
	order=asc&order_by=test_start_time&limit=1000
That is, starting with the oldest reports, get a page of 1000 reports at
a time. The issue is what happens when a report from the past is
uploaded while ooni-sync is downloading. In this case ooni-sync will not
notice the new report right away. Here is an example with made-up
indexes and dates:
	ooni-sync starts downloading page 0 from index=5000 (2016-01-01) to index=5999 (2016-03-31)
	new report with index=9999 (2016-02-01) appears, gets inserted into page 0
	ooni-sync finishes downloading page 0
	ooni-sync starts downloading page 1 from index=5999 (2016-03-31) to index=6998 (2016-04-05)
	ooni-sync finishes downloading page 1
In this example, ooni-sync never downloads the report with index=9999.
Also, it sees index=5999 twice, because index=9999 pushed index=5999
from page 0 to page 1.

An order_by=index option would prevent newly uploaded reports from
misaligning the pages like that (at least when order=asc is used).
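
The scenario above can be reproduced with a small simulation. This is
only a sketch: it assumes the API pages results by offset into the
sorted list, and the helper names (fetch_page, simulate), the page size
of 3, and the indexes and dates are all made up for illustration. It is
not how ooni-sync or the measurements API is actually implemented.

```python
def fetch_page(server, page, limit, key):
    """Offset pagination (assumed): sort ascending by `key`, return one slice."""
    ordered = sorted(server, key=lambda r: r[key])
    return ordered[page * limit:(page + 1) * limit]


def simulate(key, limit=3):
    """Download two pages while a backdated report is uploaded in between.

    Returns the indexes the client saw, in the order it saw them.
    """
    # Made-up server state: index and test_start_time both increase.
    server = [
        {"index": 5000, "test_start_time": "2016-01-01"},
        {"index": 5001, "test_start_time": "2016-02-15"},
        {"index": 5002, "test_start_time": "2016-03-31"},
        {"index": 5003, "test_start_time": "2016-04-02"},
        {"index": 5004, "test_start_time": "2016-04-05"},
    ]
    seen = [r["index"] for r in fetch_page(server, 0, limit, key)]

    # A probe uploads an old report: highest index, start time in the past.
    server.append({"index": 9999, "test_start_time": "2016-02-01"})

    seen += [r["index"] for r in fetch_page(server, 1, limit, key)]
    return seen


# Ordering by test_start_time: 9999 is missed and 5002 is seen twice.
print(simulate("test_start_time"))  # [5000, 5001, 5002, 5002, 5003, 5004]

# Ordering by index: the new report lands at the end, pages stay aligned.
print(simulate("index"))            # [5000, 5001, 5002, 5003, 5004, 9999]
```

With key="index" the late upload can only ever be appended after the
pages already fetched, which is why ascending order by index keeps the
page boundaries stable in this model.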

The reasons why this is minor minor minor and hardly worth mentioning:
 * index=9999 will get downloaded the next time you run ooni-sync
 * it can't cause ooni-sync to skip any already uploaded reports (it
   would, with order=desc, but that's why ooni-sync uses order=asc)
 * ooni-sync will see but won't actually download index=5999 twice
 * newly uploaded reports are likely to be on the last page anyway

