[tor-dev] CollecTor data: mapping bridge-network-status to bridge-server-descriptor to bridge-extra-info

Karsten Loesing karsten at torproject.org
Thu Jul 9 08:26:55 UTC 2015


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 09/07/15 05:39, Roger Dingledine wrote:
> On Wed, Jul 08, 2015 at 07:45:04PM -0700, David Fifield wrote:
>> I'm trying to use CollecTor data to find out how much bandwidth
>> is offered by different pluggable transports over time. I.e., I
>> want to be able to say something like, "On July 1, bridges with
>> obfs3 offered X MB/s, bridges with obfs4 offered Y MB/s," etc.
> 
> Great!
> 
>> I'm having trouble because sometimes, a router digest listed in
>> a bridge-network-status document is not found in the same
>> tarball.
> [snip]
>> Here's an example of where it goes wrong. 
>> bridge-descriptors-2015-07/statuses/01/20150701-060138-4A0CCD2DDC7995083D73F5D667100C8A5831F16D
>
> Yeah, I'm not surprised it goes wrong, since the descriptor from 
> 0701-06:01 was likely published in the previous month.
> 
>> However, I did find it in the previous month's tarball,
> 
> Yep.

I think you picked the wrong example for something going wrong,
because that descriptor is actually included in the 2015-07 tarball.

But there are indeed cases where a status published in 2015-07
references a server descriptor that was published in 2015-06, and that
server descriptor is contained in the 2015-06 tarball.  Example
from the same status:

bridge-descriptors-2015-07/statuses/01/20150701-060138-4A0CCD2DDC7995083D73F5D667100C8A5831F16D

contains a line:

r Unnamed ABQ4ZADwj8WkfgApkhVTFalGweU GqjwHG/sFpFzY4sx9SWuzVTcHag 2015-06-30 12:59:03 10.135.171.161 443 0

which references the following server descriptor:

bridge-descriptors-2015-06/server-descriptors/1/a/1aa8f01c6fec169173638b31f525aecd54dc1da8
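
To make the mapping concrete, here is a minimal sketch in Python of how
the path above can be derived from the "r" line.  It assumes that the
fourth field of the "r" line is the base64-encoded descriptor digest
(with padding stripped), that the fifth field is the publication date,
and that the tarball layout follows the single example path shown
above.  The function name is made up for illustration.

```python
import base64


def server_descriptor_path(r_line):
    """Guess the archive path of the server descriptor an "r" line references.

    Assumed field order (taken from the example in this thread):
    r <nickname> <identity> <digest> <date> <time> <address> <orport> <dirport>
    """
    fields = r_line.split()
    digest_b64 = fields[3]
    published_date = fields[4]  # e.g. "2015-06-30"

    # The digest is base64 without trailing "=" padding; restore it first.
    padded = digest_b64 + "=" * (-len(digest_b64) % 4)
    digest_hex = base64.b64decode(padded).hex()

    # Assumption: the descriptor lives in the tarball for the month in
    # which it was published, under <first hex char>/<second hex char>/.
    month = published_date[:7]
    return "bridge-descriptors-{}/server-descriptors/{}/{}/{}".format(
        month, digest_hex[0], digest_hex[1], digest_hex)


r_line = ("r Unnamed ABQ4ZADwj8WkfgApkhVTFalGweU GqjwHG/sFpFzY4sx9SWuzVTcHag "
          "2015-06-30 12:59:03 10.135.171.161 443 0")
print(server_descriptor_path(r_line))
# bridge-descriptors-2015-06/server-descriptors/1/a/1aa8f01c6fec169173638b31f525aecd54dc1da8
```

Note that the month comes from the publication time in the "r" line,
not from the name of the status file, which is why this lookup lands
in the 2015-06 tarball.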

>> It seems rare that the bridge-server-descriptor is missing. In
>> the 2015-07 tarball, it happened for 5891/477496 relays (1.2%).
> [snip]
>> How do you handle cases like this? I had a browse through the
>> Onionoo source code, but did not quickly understand it.

Onionoo typically reads descriptors published in the past 72 hours
from CollecTor's recent/ directory, not from the tarballs in the
archive/ directory, which are organized by publication month.

>> Should I just always include the month preceding the earliest
>> month I want to process?

Yes, you should do that.
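
As a rough sketch of that rule in Python: given the first and last
month you want to process, the list of tarball months to load starts
one month earlier.  The function name and the "YYYY-MM" string format
are assumptions chosen to match the tarball names; this is not taken
from any existing tool.

```python
def months_to_load(first_month, last_month):
    """Return "YYYY-MM" strings from the month before first_month
    through last_month, inclusive."""
    year, month = map(int, first_month.split("-"))
    end_year, end_month = map(int, last_month.split("-"))

    # Step back one month so that descriptors published just before
    # the first status of the range can still be found.
    month -= 1
    if month == 0:
        year, month = year - 1, 12

    months = []
    while (year, month) <= (end_year, end_month):
        months.append("{:04d}-{:02d}".format(year, month))
        month += 1
        if month == 13:
            year, month = year + 1, 1
    return months


print(months_to_load("2015-07", "2015-07"))
# ['2015-06', '2015-07']
```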

> How many of the 5891 cases does that resolve?

If you happen to find cases which are not explained by that, please
let me know.

All the best,
Karsten

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: GPGTools - http://gpgtools.org

iQEcBAEBAgAGBQJVnjBPAAoJEJD5dJfVqbCrfjYH/1kYG9hl10sekKpfhV7y3nAq
wjm/hhyz7bqz9uPJmXs9d8+rkgJBIhUGC+LWqdmmgU8VNRb4NpCq7vBO6MIRJQQG
a7C3XNYRw10+Bs+jfBiE5D6z4i2rLXGDqaFkmKCEbrh6To5pqo2ziJkWUP6Y/8gH
EHjsEINFB4doV2EAccAAAjN6L1cLQPLBEVVAPtN7Pm78hcNuZ9D+n8TA+XWfmOvV
JG26kerEMkA2XPj3nbPvBLTYM5AMvMr/lDQpAuaSZYHb0E8DiLcVlUcaX4Y/IpY8
SqwLmheZdrFItxCH3Fd8c3hxiZ/Qs6iVZ6EPFRuqbBSOu7VLvyo7N4aXrk2bt6c=
=OKle
-----END PGP SIGNATURE-----

