[metrics-team] OnionStats - roadmap?

David Fifield david at bamsoftware.com
Mon Aug 8 13:01:10 UTC 2016


On Mon, Aug 08, 2016 at 01:02:46PM +0200, Anathema wrote:
> On 08/08/2016 10:44, Karsten Loesing wrote:
> > 
> >> - results are returned for both bridges and relays when using
> >> 'limit' and 'offset'
> > 
> > The reason for the current behavior is that clients can easily
> > implement paging of results by setting limit to the number of results
> > they want to display and offset to the number of results on earlier
> > pages that they want to skip.  That won't work anymore with the
> > suggested change.  I'd consider this a bad backward-incompatible
> > change, unless there's a reason for making this change that I'm
> > overlooking.
> 
> Even with my implementation, 'offset' and 'limit' work as expected, so I
> don't understand "That won't work anymore with the suggested change".
> 
> The reason I output both relays and bridges (instead of only relays) is
> simplicity: let's assume I want 10 bridges and 10 relays. With the
> current protocol I have to perform 2 queries. With the proposed
> implementation, it's just 1 query.

I agree with Karsten that this change is not worth breaking
compatibility. It doesn't make sense to me. I don't envision a use case
for returning separate lists of relays and bridges--is it some kind of
double-paned search results page you are envisioning? Also, the choice
of relays/bridges seems arbitrary--why not return the first 10 results
from each country, or the first 10 from each AS, or the first 10 from
each distinct Tor version? Those don't seem any more or less useful to
me than relays/bridges.
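
For reference, the paging behavior Karsten describes could be sketched
roughly as below. This is only an illustration, not Onionoo's actual
code: the function name `page` and the sample data are hypothetical, and
it assumes that relays are ordered before bridges when 'offset' and
'limit' are applied.

```python
def page(relays, bridges, offset=0, limit=None):
    """Apply offset and limit to one combined list: relays first, then bridges.

    Hypothetical helper for illustration; the ordering is an assumption.
    """
    # Tag each entry so the window can be split back into two lists.
    combined = [("relay", r) for r in relays] + [("bridge", b) for b in bridges]
    end = None if limit is None else offset + limit
    window = combined[offset:end]
    return {
        "relays": [item for kind, item in window if kind == "relay"],
        "bridges": [item for kind, item in window if kind == "bridge"],
    }


# A client showing 2 results per page asks for offset 0, 2, 4, ...
relays = ["r1", "r2", "r3"]
bridges = ["b1", "b2"]
for page_number in range(3):
    print(page(relays, bridges, offset=2 * page_number, limit=2))
```

Each page holds exactly 'limit' results until the combined list runs
out, which is what lets a client page by simple arithmetic on 'offset'.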

> > So, now that I'm listing these possible issues, would it be easier to
> > start with the fields that are easy and add more complicated fields later?
> 
> Since I'm leveraging the full ElasticSearch capabilities, the sorting
> fields are already built in, so it would be worse to "cripple"
> ElasticSearch and then "improve" it.

So what does ElasticSearch do, when comparing two arrays, say?

> >> - 'search' does not implement: "any 4 hex characters of a
> >> space-separated fingerprint" and "beginning of a base64-encoded
> >> fingerprint without trailing equal signs": I was not able to find
> >> any relevant case for those
> > 
> > Also mentioned at the meeting:
> > 
> > Sometimes people paste fingerprints from other sources into Atlas or
> > other Onionoo clients, and we should return matching relays to them.
> > We're only returning relays and bridges matching all search terms, so
> > we'll have to store all 4 hex character blocks of a fingerprint and
> > make them searchable.  Let me know if you have more questions about this.
> > 
> 
> Implemented!
> https://github.com/davinerd/onionoo-ng/commit/1fb3b1ec56d57d697587222fe70fd62133b2b6b4
> 
> I have just one doubt about the "4 hex character blocks": with my
> implementation you can search for a single hex block, no matter whether
> it is the first, the second, the third or the fourth. To be honest, I
> don't even require the block to be 4 chars; it can be as short as 1 (of
> course, the amount of data returned will then be huge).

I think that this search feature is much less useful if it matches
anywhere other than at the beginning of the fingerprint. It's just going
to be a 2^−16 random sample of the search space, on top of the result
you want.
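
To put rough numbers on this, here is a small sketch. It assumes
fingerprints behave like uniformly random 40-hex-character strings (10
blocks of 4), and the helper names and the sample fingerprint are
made up for illustration.

```python
BLOCKS = 10          # a 40-hex-character fingerprint is 10 blocks of 4
P_BLOCK = 16 ** -4   # chance that one random block equals a 4-hex term: 2^-16


def matches_at_start(fingerprint, term):
    """True if the fingerprint begins with the search term."""
    return fingerprint.upper().startswith(term.upper())


def matches_any_block(fingerprint, term):
    """True if any aligned 4-character block equals the search term."""
    fp = fingerprint.upper()
    blocks = [fp[i:i + 4] for i in range(0, len(fp), 4)]
    return term.upper() in blocks


def p_spurious(anywhere):
    """Chance that an unrelated random fingerprint matches a 4-hex term."""
    if not anywhere:
        return P_BLOCK
    # At least one of the 10 blocks matches.
    return 1 - (1 - P_BLOCK) ** BLOCKS


def expected_spurious(n_fingerprints, anywhere):
    """Expected number of unrelated fingerprints among the results."""
    return n_fingerprints * p_spurious(anywhere)


# Hypothetical fingerprint: one block "ABCD" followed by nine "0123" blocks.
sample = "ABCD" + "0123" * 9
print(matches_at_start(sample, "0123"), matches_any_block(sample, "0123"))
print(expected_spurious(65536, anywhere=False))
print(expected_spurious(65536, anywhere=True))
```

Under these assumptions, each block position that is allowed to match
contributes about another 2^-16 of the search space as unrelated hits.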

