[metrics-team] OnionStats - roadmap?

Anathema anathema at anche.no
Mon Aug 8 11:02:46 UTC 2016


On 08/08/2016 10:44, Karsten Loesing wrote:
> 
>> - results are returned for both bridges and relays when using
>> 'limit' and 'offset'
> 
> The reason for the current behavior is that clients can easily
> implement paging of results by setting limit to the number of results
> they want to display and offset to the number of results on earlier
> pages that they want to skip.  That won't work anymore with the
> suggested change.  I'd consider this a bad backward-incompatible
> change, unless there's a reason for making this change that I'm
> overlooking.

With my implementation, 'offset' and 'limit' still work as expected, so I
don't understand "That won't work anymore with the suggested change".

The reason I output both relays and bridges (instead of only relays) is
for simplicity: let's assume I want 10 bridges and 10 relays. With the
current protocol I have to perform 2 queries. With the proposed
implementation, it's just 1 query.
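
To make the two behaviors concrete, here is a minimal sketch. It assumes
that the current protocol applies 'limit' and 'offset' to one combined
list (relays first, then bridges) and that the proposed implementation
applies them to relays and bridges separately. The function names are
hypothetical and are not taken from either code base.

```python
def page_combined(relays, bridges, limit, offset):
    """Assumed current behavior: one combined list, relays first,
    then bridges; offset and limit slice that single list."""
    combined = [("relay", r) for r in relays] + [("bridge", b) for b in bridges]
    return combined[offset:offset + limit]


def page_per_type(relays, bridges, limit, offset):
    """Assumed proposed behavior: offset and limit are applied to
    relays and bridges independently, so one query can return up to
    `limit` entries of each type."""
    return {
        "relays": relays[offset:offset + limit],
        "bridges": bridges[offset:offset + limit],
    }


if __name__ == "__main__":
    relays = ["r%d" % i for i in range(15)]
    bridges = ["b%d" % i for i in range(15)]
    # Second page of 10 in the combined list: the last relays, then bridges.
    print(page_combined(relays, bridges, limit=10, offset=10))
    # One query, up to 10 of each type.
    print(page_per_type(relays, bridges, limit=10, offset=0))
```

Under these assumptions a client that pages through results has to know
which of the two semantics the server uses, because the same 'limit' can
return either `limit` or up to `2 * limit` entries.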

Of course, performance changes, since returning both node types is more
intensive than returning just one, but that's a matter of hardware and
software resources that we can deal with.

I'm open to suggestions, and I have no problem switching back
to the backward-compatible behavior if the list prefers it.

> 
>> - 'order' parameter's value can be any field, so it's not limited
>> to the 'consensus_weight'
> 
> Oh yes, this one is a good change, and it should be
> backward-compatible.  We might have to specify how sorting works for
> some fields.  For example, are or_addresses sorted alphanumerically?
> Do we sort by first address in or_addresses or by the
> (alphanumerically) smallest?  How do we handle missing values?
> 

At the moment, IPv4 addresses are treated as strings, since I didn't map
the field as 'ip'. You can read more about it here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/ip.html

I need to change the mapping and reindex; it shouldn't take long. As you
can read there, IPv6 is not supported yet.
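
To illustrate why the string mapping matters for sorting, here is a small
sketch using Python's standard `ipaddress` module. It is independent of
Elasticsearch, the sample addresses are made up, and the helper names are
hypothetical.

```python
import ipaddress


def sort_as_strings(addresses):
    """Plain string sort: compares character by character."""
    return sorted(addresses)


def sort_as_ips(addresses):
    """Numeric sort: compares the addresses by their integer value."""
    return sorted(addresses, key=ipaddress.ip_address)


if __name__ == "__main__":
    sample = ["9.0.0.1", "100.0.0.3", "10.0.0.2"]
    print(sort_as_strings(sample))  # ['10.0.0.2', '100.0.0.3', '9.0.0.1']
    print(sort_as_ips(sample))      # ['9.0.0.1', '10.0.0.2', '100.0.0.3']
```

The sketch only covers IPv4; mixing IPv4 and IPv6 objects in one sort
raises a TypeError in Python, so a combined ordering would need its own
rule.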

> So, now that I'm listing these possible issues, would it be easier to
> start with the fields that are easy and add more complicated fields later?

Since I'm leveraging the full Elasticsearch capabilities, sorting on
these fields is already built in, so it would be worse to "cripple"
Elasticsearch and then "improve" it.

> 
>> The (negative) differences are: - 'lookup' is not implemented: I
>> was not able to find a difference between 'lookup' and
>> 'fingerprint': can you provide some real examples?
> 
> The difference between those two parameters is specified on the
> protocol page:
> 
> https://onionoo.torproject.org/protocol.html
> 
> Here's where we added the fingerprint parameter:
> 
> https://gitweb.torproject.org/onionoo.git/commit/?id=8f63e74709cd05cd812e33f95ffe51b05d6d537c
> 
> If it turns out to be difficult to implement that parameter, let's
> talk more.  Maybe we don't need it anymore.  Removing that would be a
> backward-incompatible change, but let's see.

The commit you linked seems to relate to the 'fingerprint' parameter.
However, it doesn't show me any real-world scenario. In my testing (on
the current protocol), 'lookup' works exactly like 'fingerprint'. I can
"merge" the two and maybe remove one later.

> 
>> - 'search' does not implement: "any 4 hex characters of a
>> space-separated fingerprint" and "beginning of a base64-encoded
>> fingerprint without trailing equal signs": I was not able to find
>> any relevant case for those
> 
> Also mentioned at the meeting:
> 
> Sometimes people paste fingerprints from other sources into Atlas or
> other Onionoo clients, and we should return matching relays to them.
> We're only returning relays and bridges matching all search terms, so
> we'll have to store all 4 hex character blocks of a fingerprint and
> make them searchable.  Let me know if you have more questions about this.
> 

Implemented!
https://github.com/davinerd/onionoo-ng/commit/1fb3b1ec56d57d697587222fe70fd62133b2b6b4

I have just one doubt about the "4 hex character blocks": with my
implementation you can search for a single hex block, no matter whether
it is the first, the second, the third or the fourth. To be honest, I
don't even enforce that a block is 4 characters long; it can be as short
as 1 (of course, the amount of data returned will then be huge).
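
The difference between strict and lenient block matching can be sketched
as follows. This is not the code from the commit above: the function
names, the substring rule in the lenient variant and the sample
fingerprint are all assumptions made for illustration.

```python
def fingerprint_blocks(fingerprint):
    """Split a hex fingerprint into 4-character blocks, ignoring spaces."""
    compact = fingerprint.replace(" ", "").upper()
    return [compact[i:i + 4] for i in range(0, len(compact), 4)]


def matches_strict(fingerprint, terms):
    """Every search term must equal one complete 4-character block."""
    blocks = set(fingerprint_blocks(fingerprint))
    return all(term.upper() in blocks for term in terms)


def matches_lenient(fingerprint, terms):
    """Every search term only has to occur inside some block, so a
    single character is enough (assumed rule, for illustration)."""
    blocks = fingerprint_blocks(fingerprint)
    return all(any(term.upper() in block for block in blocks) for term in terms)


if __name__ == "__main__":
    # Made-up fingerprint, not a real relay.
    fp = "0123 4567 89AB CDEF 0123 4567 89AB CDEF 0123 4567"
    print(matches_strict(fp, ["89ab", "CDEF"]))  # both are whole blocks
    print(matches_strict(fp, ["9ABC"]))          # spans two blocks
    print(matches_lenient(fp, ["9"]))            # one character suffices
```

With the lenient rule a one-character term matches almost every
fingerprint, which is why the result set grows so quickly.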

> 
> Hope this helps to get your code closer to the current Onionoo
> protocol.  Ideally, you'd be able to deploy an Atlas version that
> points to your Onionoo server and offer that to users.  Let me know if
> you need help with that.

I'll try that after we work out the last issues, thanks!

As a side note, I'd like you to look at
https://github.com/davinerd/onionoo-ng/issues/1 (it's the explanation
and fix for the 'offset' issue iwakeh found during the meeting) and let
me know your thoughts.

Another thing: I implemented the 'summary' document in a separate
branch: https://github.com/davinerd/onionoo-ng/tree/summary_doc
Feel free to take a look and comment.

I have just one question about the summary document, but I'll start a
new thread for it.

-- 
Anathema

+------------------------------------------------------------------+
   GPG/PGP KeyID: CFF94F0A available on http://pgpkeys.mit.edu:11371/
   Fingerprint: 80CE EC23 2D16 143F 6B25  6776 1960 F6B4 CFF9 4F0A

   https://keybase.io/davbarbato
+------------------------------------------------------------------+

