[metrics-team] DirAuth vote names inconsistent

Karsten Loesing karsten at torproject.org
Mon Sep 11 18:46:08 UTC 2017


On 2017-09-11 20:37, Tom Ritter wrote:
> On 11 September 2017 at 13:31, Karsten Loesing <karsten at torproject.org> wrote:
>> Hi Tom,
>>
>> On 2017-09-11 17:21, Tom Ritter wrote:
>>> I'm looking at https://collector.torproject.org/recent/relay-descriptors/votes/
>>> and it seems the format of a vote file name is
>>>
>>> %Y-%m-%d-%H-%M-%S-vote-[fingerprint]-[random-fingerprint]
>>>
>>> Where random-fingerprint is the fingerprint of the dirauth that
>>> downloaded the vote, but
>>> a) it changes every hour, and
>>> b) only one view is available.
>>>
>>> The result of this is that if I want to download a vote I need to try
>>> all the dirauth fingerprints until I find the one that is available.
>>>
>>> Is my understanding correct? Is there any way this could be
>>> deterministic? Perhaps (since we expose only a single vote) we can
>>> just strip the last fingerprint?
>>>
>>> I understand we do this because we want to archive different DirAuth
>>> views in case they differ but if they do we could expose the different
>>> file under the viewer fingerprint?
>>
>> The file name format is actually [datetime]-vote-[fingerprint]-[digest],
>> where [fingerprint] is the v3 identity and [digest] is the digest of the
>> vote document (not the identity of the downloading authority).
>>
>> For example, you'll find all of gabelmoo's votes under v3 identity
>> ED03BB616EB2F60BEC80151114BB25CEF515B226 (first hex string).
>>
>> There's also a specification for file names that we're currently hiding
>> here:
>>
>> https://gitweb.torproject.org/collector.git/tree/src/main/resources/docs/PROTOCOL#n202
> 
> I think that document is wrong.
> 
>    year DASH month DASH day DASH hour DASH minute DASH second
>    DASH VOTE DASH fingerprint DASH digest
> 
>    Where VOTE is the string "vote" and all time related
>    values are derived from the valid-after dates. 'fingerprint'
>    is the fingerprint of the authority and 'digest' is the SHA1
>    digest of the authority's medium term signing key.
> 
> Specifically: 'digest' is the SHA1 digest of the authority's medium
> term signing key.
> 
> It's not the digest of the medium term signing key, it's the digest of
> the vote document.

Ah, quite possible. Let's fix the document then.

>> Does that answer your question?
> 
> Kinda. The filename is still unpredictable, which means I'll have to
> parse the /recent/ page and find it to download it. That's not ideal,
> but if it's the way things are, I'll manage. What I'm doing is already
> kind of a pile of hacks =)

I think we'll have to keep the digest of the vote document in the file
name, because we want to archive different vote documents (as you
guessed above).
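
To make the format concrete, here is a minimal sketch of splitting a
vote file name into its three parts. It is an illustration only: the
helper name parse_vote_name is made up, and the assumption that both hex
strings are exactly 40 characters long (SHA-1 sized) is mine, not
something the PROTOCOL document guarantees.

```python
import re
from datetime import datetime

# [datetime]-vote-[fingerprint]-[digest]
# Assumption: fingerprint and digest are both 40 hex characters.
VOTE_NAME = re.compile(
    r"^(?P<valid_after>\d{4}(?:-\d{2}){5})-vote-"
    r"(?P<fingerprint>[0-9A-Fa-f]{40})-"
    r"(?P<digest>[0-9A-Fa-f]{40})$"
)


def parse_vote_name(name):
    """Return the parts of a vote file name, or None if it does not match."""
    match = VOTE_NAME.match(name)
    if match is None:
        return None
    return {
        # Derived from the vote's valid-after time; the result is a naive
        # datetime and is meant to be read as UTC.
        "valid_after": datetime.strptime(
            match.group("valid_after"), "%Y-%m-%d-%H-%M-%S"
        ),
        # v3 identity of the authority that produced the vote.
        "fingerprint": match.group("fingerprint").upper(),
        # Digest of the vote document itself.
        "digest": match.group("digest").upper(),
    }
```

Only the first two parts can be known in advance; the digest has to be
looked up, which is why the full name cannot be predicted.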

If parsing the web page feels too much like a hack, you could also parse
the index.json file:

https://collector.torproject.org/index/index.json
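
A rough sketch of how that lookup could go, using only the Python
standard library. Please treat the JSON layout as an assumption: I'm
assuming a top-level "path" plus nested "directories" and "files" lists
whose entries each carry a "path" field, so check it against the actual
file. The function names load_index and find_votes are made up for this
example.

```python
import json
import urllib.request


def load_index(url="https://collector.torproject.org/index/index.json"):
    """Download and decode index.json (performs a network request)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def find_votes(index, fingerprint, valid_after):
    """Yield URLs of votes by one authority for one valid-after time.

    valid_after is a string in file-name form, e.g. "2017-09-11-18-00-00".
    Assumed layout: nested "directories"/"files" lists with "path" keys.
    """
    prefix = f"{valid_after}-vote-{fingerprint.upper()}-"

    def walk(node, parent):
        # Files in this directory whose names start with the known part.
        for entry in node.get("files", []):
            if entry["path"].startswith(prefix):
                yield f"{parent}/{entry['path']}"
        # Recurse into subdirectories, extending the URL as we go.
        for sub in node.get("directories", []):
            yield from walk(sub, f"{parent}/{sub['path']}")

    yield from walk(index, index.get("path", "").rstrip("/"))
```

Calling list(find_votes(load_index(), fingerprint, valid_after)) would
then return every matching vote, including the case where more than one
vote document was archived for the same authority and hour.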

All the best,
Karsten
