[ooni-dev] Bridge reachability results

Ruben Bloemgarten ruben at chokepointproject.net
Thu Jul 31 17:09:42 UTC 2014

On 07/31/2014 06:53 PM, Arturo Filastò wrote:
> On 7/23/14, 9:26 PM, Ruben Bloemgarten wrote:
>> Arturo,
>> As we discussed last week, it would be ideal if we could retrieve a file
>> containing URLs of file locations. That way drop-off locations can vary
>> and scale horizontally without the parser requiring prior knowledge.
>> Something like this:
>>
>> ooni section:
>> probe -> test
>>       -> test report -> collector
>>            -> generate received report list (for retransmission in case
>>               of intended or unintended failure to push to publisher)
>>            -> do data scrubbing if required for data publication
>>            -> compress (scrubbed) report
>>            -> push (scrubbed) compressed report -> publisher (an HTTP server)
>>            -> generate/update URL list of reports existing on the publisher
>>
>> chokepoint section:
>> parser -> retrieve URL list from publisher
>>        -> process URL list
>>        -> retrieve file in URL
>>        -> decompress file
>>        -> process report file
> This seems like a reasonable thing to do. I think the ideal way to do
> this would be to integrate it into the publishing step and every time we
> update the HTTP published data we also regenerate this file containing
> the list of existing reports.
That would be perfect.
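A minimal sketch of the chokepoint-side steps listed above (retrieve the URL list, walk it, fetch each file, decompress it, hand over the report), in Python. Everything named here is an assumption for illustration: the list is taken to be plain text with one URL per line, the reports are taken to be gzip-compressed, and `http_get`, `parse_url_list` and `iter_reports` are hypothetical helper names. None of this has been agreed on yet.

```python
import gzip
from typing import Callable, Iterator, List, Tuple
from urllib.request import urlopen


def http_get(url: str) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urlopen(url) as response:
        return response.read()


def parse_url_list(text: str) -> List[str]:
    """Return the report URLs in the list, skipping blank lines and '#' comments."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def iter_reports(
    index_url: str, fetch: Callable[[str], bytes] = http_get
) -> Iterator[Tuple[str, str]]:
    """Yield (url, report_text) for every report named in the URL list."""
    index = fetch(index_url).decode("utf-8")    # retrieve URL list from publisher
    for url in parse_url_list(index):           # process URL list
        compressed = fetch(url)                 # retrieve file in URL
        report = gzip.decompress(compressed)    # decompress file (gzip is assumed)
        yield url, report.decode("utf-8")       # caller processes the report text
```

The `fetch` argument is injectable so the parser can be exercised against local data without touching the publisher.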
>> The report itself seems fairly clear, but some comments and questions
>> nevertheless. 
>> Report meta data:
>> "options: [-f, /home/uwaterloo_geossl/bridge_reachability/bridges.txt,
>> -t, '300']"
>> This should preferably contain only the filename, not the path; there
>> seems to be potential data leakage there.
> Yes you are correct. There is a ticket open about this issue and I think
> it's something we should do:
> https://trac.torproject.org/projects/tor/ticket/12706
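To illustrate the filename-only idea, here is a rough Python sketch that reduces any path-like value in the options list to its last component. The function name `scrub_options` is hypothetical, and treating every value containing a `/` as a POSIX path is an assumption; this is not how the ticket above is or will be resolved.

```python
import posixpath


def scrub_options(options: list) -> list:
    """Replace any path-like option value with its final path component."""
    scrubbed = []
    for value in options:
        text = str(value)
        if "/" in text:
            # Keep only the filename; drop the directory part.
            text = posixpath.basename(text.rstrip("/"))
        scrubbed.append(text)
    return scrubbed


print(scrub_options(
    ["-f", "/home/uwaterloo_geossl/bridge_reachability/bridges.txt", "-t", "300"]
))
# prints ['-f', 'bridges.txt', '-t', '300']
```

Note that a URL passed as an option would be shortened too, so a real scrubber would need to know which options take file paths.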
>> probe_cc: RU
>> Does the cc refer to the IP's cc?
>> How is the cc generated, via maxmind or something else?
> Yes it is the country code of the IP address and the data is taken from
> maxmind.
>> If the cc is generated based on an external geolocation service, this
>> service and the date of generation should preferably be known.
> I need to look into whether there is a way to determine the version of
> the database installed on a user's machine, because the calculation of
> the CC and ASN is done locally by them, not upstream.
I am assuming that the maxmind dat file is used for this and that it is
packaged with the ooni client.
It would probably be sufficient to supply the date the maxmind dat file
was taken from maxmind, plus a checksum of the file for verification. (We
keep a collection of the dat files, so we could start generating checksums
for comparison.)
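A small Python sketch of what that could look like on either side. The choice of SHA-256, the function name `describe_geoip_db` and the three key names are assumptions made for the example, not existing report fields.

```python
import hashlib
from pathlib import Path


def describe_geoip_db(path: str, obtained_on: str) -> dict:
    """Return the metadata a report could carry about its GeoIP database file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a large database file need not fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "geoip_file": Path(path).name,       # filename only, no directory
        "geoip_obtained_on": obtained_on,    # date the file was taken from maxmind
        "geoip_sha256": digest.hexdigest(),  # checksum for verification
    }
```

Running the same function over our collection of dat files would give the table of checksums to compare reports against.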
>> probe_ip:
>> This should preferably be removed entirely before publication. Maybe it
>> should not be there at all; the ASN seems sufficient.
> We want to keep that there for consistency and there are cases when we
> are ok with also publishing the probe IP address.
How prone to user error is this? Can we think of a way to determine
whether the IP was revealed on purpose, and block publication or destroy
the file based on that?
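One possible shape for such a check, sketched in Python. It assumes a hypothetical `probe_ip_consent` field that the probe would set only when the user explicitly asked for the address to be published, and it assumes that a loopback, unspecified or non-address value in `probe_ip` means the address was withheld. Neither assumption reflects the current report format.

```python
import ipaddress


def has_real_probe_ip(header: dict) -> bool:
    """True if probe_ip parses as an address that is not an obvious placeholder."""
    value = header.get("probe_ip")
    if value is None:
        return False
    try:
        address = ipaddress.ip_address(str(value))
    except ValueError:
        return False  # not an address at all, e.g. a textual placeholder
    return not (address.is_loopback or address.is_unspecified)


def may_publish(header: dict) -> bool:
    """Allow publication unless a real probe_ip is present without explicit consent."""
    if not has_real_probe_ip(header):
        return True
    # Hypothetical flag: only an explicit True counts as consent.
    return header.get("probe_ip_consent") is True
```

With this, a report carrying a real address but no consent flag would be held back rather than published.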
>> Report content:
>> Can you supply a list of possible values or value ranges that can be
>> expected for the following report entries:
>> input: ....
>> success: is this true/false?
>> tor_progress: is this 0-100?
>> tor_progress_summary: this refers to the stage the previous finishes?
>> tor_progress_tag: are there values other than 'null' and 'done'?
> I just realized that the bridge_reachability test is not specified. I
> created a ticket for doing that:
> https://trac.torproject.org/projects/tor/ticket/12757#ticket
>> As discussed previously, let's start with processing tests for the
>> publicly known bridges only. We should tackle how to manage the secret
>> bridges at a later stage, with the understanding that we should under no
>> circumstances have access to the actual addresses of those bridges.
> Ok, I think we should also start collecting data for the private bridges
> anyway, so that we have it in stock for when we decide how to do that.
As long as I don't get to see them, that sounds fine :)
>> - Ruben
>> On 07/12/2014 04:18 PM, Arturo Filastò wrote:
>>> As promised I published the bridge reachability measurements on the
>>> public ooni report hosting.
>>> You can find them here:
>>> https://ooni.torproject.org/reports/0.1/CN/
>>> https://ooni.torproject.org/reports/0.1/RU/
>>> https://ooni.torproject.org/reports/0.1/US/
>>> Keep in mind that, as I was telling you, during some of the runs there
>>> were some issues with the measurements due to incompatibilities of
>>> ooniprobe with the old Fedora version running on PlanetLab, so not all
>>> the measurements may be 100% accurate. They should still, at least, give
>>> you an idea of what the data format looks like and whether it contains
>>> enough information for doing your parsing work.
>>> I would suggest we keep this discussion public and maintain the ooni-dev
>>> list in cc.
>>> ~ Art.
