[ooni-dev] Bridge reachability results
art at torproject.org
Thu Jul 31 16:53:49 UTC 2014
On 7/23/14, 9:26 PM, Ruben Bloemgarten wrote:
> As we discussed last week, it would be ideal if we could retrieve a file
> containing urls of file locations. That way drop-off locations can vary
> and scale horizontally without the parser requiring prior knowledge.
> Something like this:
>
> ooni section:
>   probe -> test
>         -> test report -> collector
>   collector -> generate received report list (for retransmission in
>                case of intended or unintended failure to push to
>                publisher)
>             -> do data scrubbing if required for data publication
>             -> compress (scrubbed) report
>             -> push (scrubbed) compressed report -> publisher (a http
>                server)
>   publisher -> generate/update url list of reports existing on the
>                publisher
>
> chokepoint section:
>   parser -> retrieve url list from publisher
>          -> process url list
>          -> retrieve file in url
>          -> decompress file
>          -> process report file
This seems like a reasonable thing to do. I think the ideal way to do
this would be to integrate it into the publishing step and every time we
update the HTTP published data we also regenerate this file containing
the list of existing reports.
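
A rough sketch of how the two ends could fit together, to make the idea
concrete. This is only an illustration under assumptions, not existing
ooni code: the list file name "reports.txt", the one-URL-per-line format,
gzip as the compression, and the function names write_url_list and
fetch_reports are all made up for the example.

```python
import gzip
import os
import urllib.request


def write_url_list(report_dir, base_url, list_name="reports.txt"):
    """Publisher side: regenerate the list of report URLs, one per line.

    Assumes every published report is a *.gz file in report_dir and is
    served under base_url (both are assumptions for this sketch).
    """
    names = sorted(n for n in os.listdir(report_dir) if n.endswith(".gz"))
    urls = [base_url.rstrip("/") + "/" + n for n in names]
    tmp_path = os.path.join(report_dir, list_name + ".tmp")
    with open(tmp_path, "w") as f:
        f.write("\n".join(urls) + "\n")
    # Rename into place so a parser never fetches a half-written list.
    os.replace(tmp_path, os.path.join(report_dir, list_name))
    return urls


def fetch_reports(list_url):
    """Parser side: yield (url, decompressed report bytes) per listed URL."""
    with urllib.request.urlopen(list_url) as resp:
        urls = resp.read().decode("utf-8").split()
    for url in urls:
        with urllib.request.urlopen(url) as resp:
            yield url, gzip.decompress(resp.read())
```

The publisher would call write_url_list at the end of every publishing
run, and the parser only needs to know the URL of the list itself, so the
drop-off locations can change without the parser being updated.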
> The report itself seems fairly clear, but some comments and questions
> Report meta data:
> "options: [-f, /home/uwaterloo_geossl/bridge_reachability/bridges.txt,
> -t, '300']"
> This should preferably contain only the filename, not the path; it
> seems like there is potential data leakage there.
Yes you are correct. There is a ticket open about this issue and I think
it's something we should do:
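
Separately from the ticket, the filename-only suggestion could be sketched
roughly like this. It is a minimal illustration, not the ooniprobe
implementation: scrub_options is a hypothetical helper, and it assumes
that any option value that is an absolute POSIX path should be reduced to
its base name.

```python
import posixpath


def scrub_options(options):
    """Return a copy of the options list with absolute paths cut down
    to their base name, leaving every other value untouched."""
    return [
        posixpath.basename(opt) if posixpath.isabs(opt) else opt
        for opt in options
    ]


# The options from the report header quoted above.
options = ["-f", "/home/uwaterloo_geossl/bridge_reachability/bridges.txt",
           "-t", "300"]
scrubbed = scrub_options(options)
# scrubbed is now ['-f', 'bridges.txt', '-t', '300']
```

Relative paths are left alone in this sketch, so it would need extending
if those can also leak information.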
> probe_cc: RU
> Does the cc refer to the IP's cc?
> How is the cc generated: MaxMind, or something else?
Yes it is the country code of the IP address and the data is taken from
> If the cc is generated based on an external geolocation service, this
> service and the date of generation should preferably be known.
I need to look into whether there is a way to determine the version of
the database installed on a user's machine, because the calculation of
the CC and ASN is done locally by the probe, not upstream.
> probe_ip: 127.0.0.1
> This should preferably be removed entirely before publication. Maybe it
> should not be there at all, the ASN seems sufficient.
We want to keep that there for consistency and there are cases when we
are ok with also publishing the probe IP address.
> Report content:
> Can you supply a list of possible values or value ranges that can be
> expected for the following report entries:
> input: ....
> success: is this true/false ?
> tor_progress: is this 0-100 ?
> tor_progress_summary: this refers to the stage the previous finishes ?
> tor_progress_tag: are there values other than 'null' and 'done' ?
I just realized that the bridge_reachability test is not specified. I
created a ticket for doing that:
> As discussed previously, let's start with processing tests for the
> publicly known bridges only. How to manage the secret bridges we should
> tackle at a later stage, with the understanding that we should under no
> circumstances have access to the actual addresses of those bridges.
OK, I think we should start collecting data for the private bridges
anyway, so that we have it in stock for when we decide how to handle them.
> - Ruben
> On 07/12/2014 04:18 PM, Arturo Filastò wrote:
>> As promised I published the bridge reachability measurements on the
>> public ooni report hosting.
>> You can find them here:
>> Keep in mind that, as I was telling you, during some of the runs there
>> were some issues with the measurements due to incompatibilities of
>> ooniprobe with the old Fedora version running on PlanetLab, so not all
>> the measurements may be 100% accurate. They should still, at least, give
>> you an idea of what the data format looks like and whether it contains
>> enough information for doing your parsing work.
>> I would suggest we keep this discussion public and maintain the ooni-dev
>> list in cc.
>> ~ Art.