[metrics-bugs] #25274 [Metrics/Onionoo]: Consolidate Onionoo's API

Tor Bug Tracker & Wiki blackhole at torproject.org
Fri Feb 16 10:05:38 UTC 2018


#25274: Consolidate Onionoo's API
-----------------------------+------------------------------
 Reporter:  karsten          |          Owner:  metrics-team
     Type:  enhancement      |         Status:  new
 Priority:  Low              |      Milestone:
Component:  Metrics/Onionoo  |        Version:
 Severity:  Normal           |     Resolution:
 Keywords:                   |  Actual Points:
Parent ID:                   |         Points:
 Reviewer:                   |        Sponsor:
-----------------------------+------------------------------
Description changed by karsten:

Old description:

> The following ideas have been on my mind for quite some time. Therefore
> low priority.
>
> How about we simplify Onionoo's API? Two ideas:
>
> == Consolidate document types ==
>
> We have 6 different document types right now:
>  - Summary documents with just a handful fields to support searches and
> to enable subsequent requests for other documents. I hear they're not
> used by Relay Search (anymore).
>  - Details documents with 80% of relevant content we serve.
>  - Bandwidth, weights, clients, and uptime documents all containing
> history objects.
>
> The idea is to consolidate these 6 document types into one. Basically,
> this would be the details document plus all history objects.
>
> Of course, this would increase the size of responses a lot, and possibly
> include data that the clients are not interested in. And we can't expect
> clients to list a dozen or two dozen fields they're interested in by
> using the `fields` parameter.
>
> How about we add the history objects as optional fields and extend the
> `fields` parameter to allow adding optional fields. Example:
>  - `fields=fingerprint` returns ''just'' the fingerprint field. This is
> what we do right now, though only with details documents.
>  - `fields=+write_history,+read_history` returns all fields that are
> currently in details documents plus the two history objects that are
> currently in bandwidth documents.
>  - `fields=-effective_family` returns fields in details documents except
> for the effective family. We don't need this syntax for this specific
> feature, but it might make sense to add it while we're at it.
>
> Benefits are a somewhat cleaner API and a reduced number of requests. I
> think that requests would still be easy to cache, because clients like
> Relay Search would always ask for the same combination of fields.
>
> == Consolidate parameters ==
>
> We have 19 different parameters right now, and I won't list them all
> here. But our main client, Relay Search, only uses one of them: `search`.
> This is possible, because we provide most parameters as qualified search
> terms.
>
> The current situation of supporting a parameter both as HTTP parameter
> and as qualified search term has led to confusion in the past. Sometimes
> they're not exactly the same. In most cases supporting both requires more
> development effort.
>
> We could provide just the `search` parameter and make sure that all other
> parameters are supported as qualified search terms. Maybe we don't even
> have to use a parameter in the HTTP sense but use the entire resource
> string as (qualified) search terms.
>
> == Example ==
>
> Relay Search currently sends this query for the top 10 relays by
> consensus weight (line breaks added for readability):
>
> {{{
> https://onionoo.torproject.org/details
>   ?type=relay
>   &order=-consensus_weight
>   &limit=250
>   &running=true
> }}}
>
> This query would then look as follows:
>
> {{{
> https://onionoo.torproject.org
>   /type:relay
>   %20order:-consensus_weight
>   %20limit:250
>   %20running:true
> }}}
>
> Subsequent queries for details pages look like this:
>
> {{{
> https://onionoo.torproject.org/details
>   ?lookup=D4125249A474408F0FBA4DB15AC207E31E4CF6B3
> https://onionoo.torproject.org/bandwidth
>   ?lookup=D4125249A474408F0FBA4DB15AC207E31E4CF6B3
> https://onionoo.torproject.org/weights
>   ?lookup=D4125249A474408F0FBA4DB15AC207E31E4CF6B3
> }}}
>
> With the suggested changes, these queries would be turned into a single
> query:
>
> {{{
> https://onionoo.torproject.org
>   /lookup:D4125249A474408F0FBA4DB15AC207E31E4CF6B3%20
>   %20fields=
>     +write_history,
>     +read_history,
>     +consensus_weight_fraction,
>     +guard_probability,
>     +middle_probability,
>     +exit_probability
> }}}
>
> == Implementation ==
>
> I haven't looked at the code yet, but I believe we can make this change
> by editing just the web server parts of Onionoo. We can even keep the
> different document types on disk, as written by the updater. We just need
> to tell the server to grab different documents and combine them into the
> response.
>
> This doesn't mean it's trivial to implement. Still, I could imagine that
> it pays off in the longer term, by making Onionoo's API a bit easier to
> maintain.

New description:

 The following ideas have been on my mind for quite some time, hence the
 low priority.

 How about we simplify Onionoo's API? Two ideas:

 == Consolidate document types ==

 We have 6 different document types right now:
  - Summary documents with just a handful of fields to support searches and
 to enable subsequent requests for other documents. I hear they're not used
 by Relay Search (anymore).
  - Details documents with 80% of the relevant content we serve.
  - Bandwidth, weights, clients, and uptime documents all containing
 history objects.

 The idea is to consolidate these 6 document types into one. Basically,
 this would be the details document plus all history objects.

 Of course, this would increase the size of responses a lot, and possibly
 include data that the clients are not interested in. And we can't expect
 clients to list a dozen or two dozen fields they're interested in by using
 the `fields` parameter.

 How about we add the history objects as optional fields and extend the
 `fields` parameter to allow adding optional fields? Examples:
  - `fields=fingerprint` returns ''just'' the fingerprint field. This is
 what we do right now, though only with details documents.
  - `fields=+write_history,+read_history` returns all fields that are
 currently in details documents plus the two history objects that are
 currently in bandwidth documents.
  - `fields=-effective_family` returns fields in details documents except
 for the effective family. We don't need this syntax for this specific
 feature, but it might make sense to add it while we're at it.
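
 The three `fields` forms above could be resolved roughly as follows. This
 is only a sketch of the proposed semantics, not existing Onionoo code: the
 field lists are shortened stand-ins, `select_fields` is a hypothetical
 name, and rejecting a mix of plain and `+`/`-` names is an assumption the
 ticket does not settle.

```python
# Shortened, illustrative field lists; the real details document has many more.
DEFAULT_FIELDS = ["nickname", "fingerprint", "effective_family"]
OPTIONAL_FIELDS = ["write_history", "read_history"]


def select_fields(fields_param):
    """Return the field names to include for a given `fields` value."""
    if not fields_param:
        # No parameter: serve the default (details) fields.
        return list(DEFAULT_FIELDS)
    terms = [t.strip() for t in fields_param.split(",") if t.strip()]
    modifiers = [t for t in terms if t[0] in "+-"]
    if not modifiers:
        # Plain names: return just these fields (today's behaviour).
        known = DEFAULT_FIELDS + OPTIONAL_FIELDS
        return [name for name in known if name in terms]
    if len(modifiers) != len(terms):
        # Assumption: mixing plain and +/- names is treated as an error.
        raise ValueError("cannot mix plain and +/- field names")
    selected = list(DEFAULT_FIELDS)
    for term in terms:
        name = term[1:]
        if term[0] == "+" and name in OPTIONAL_FIELDS and name not in selected:
            selected.append(name)
        elif term[0] == "-" and name in selected:
            selected.remove(name)
    return selected


print(select_fields("fingerprint"))
print(select_fields("+write_history,+read_history"))
print(select_fields("-effective_family"))
```

 One thing to keep in mind: in an HTTP query string a literal `+` is
 commonly decoded as a space, so clients would probably have to send `%2B`
 if `fields` stays a regular parameter.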

 Benefits are a somewhat cleaner API and a reduced number of requests. I
 think that requests would still be easy to cache, because clients like
 Relay Search would always ask for the same combination of fields.

 == Consolidate parameters ==

 We have 19 different parameters right now, and I won't list them all here.
 But our main client, Relay Search, only uses one of them: `search`. This
 is possible, because we provide most parameters as qualified search terms.

 Supporting a parameter both as an HTTP parameter and as a qualified search
 term has led to confusion in the past: the two forms sometimes don't
 behave exactly the same, and in most cases supporting both requires more
 development effort.

 We could provide just the `search` parameter and make sure that all other
 parameters are supported as qualified search terms. Maybe we don't even
 have to use a parameter in the HTTP sense but use the entire resource
 string as (qualified) search terms.
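
 To illustrate the last idea, here is a minimal sketch of how a server
 might split an entire resource string into qualified and unqualified
 search terms. The function name `parse_search_terms` and the rule "a term
 with a non-empty key and value around the first colon is qualified" are
 assumptions for illustration, not how Onionoo parses searches today.

```python
from urllib.parse import unquote


def parse_search_terms(resource):
    """Split a resource string into qualified and unqualified search terms."""
    qualified = {}
    unqualified = []
    # Drop the leading slash, decode %20 etc., then split on whitespace.
    for term in unquote(resource.lstrip("/")).split():
        key, sep, value = term.partition(":")
        if sep and key and value:
            qualified[key] = value
        else:
            unqualified.append(term)
    return qualified, unqualified


print(parse_search_terms("/type:relay%20running:true"))
print(parse_search_terms("/example%20running:true"))
```

 A real parser would need extra care for search terms that contain colons
 themselves, such as IPv6 addresses.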

 == Example ==

 Relay Search currently sends this query for the top 250 relays by
 consensus weight (line breaks added for readability):

 {{{
 https://onionoo.torproject.org/details
   ?type=relay
   &order=-consensus_weight
   &limit=250
   &running=true
 }}}

 This query would then look as follows:

 {{{
 https://onionoo.torproject.org
   /type:relay
   %20order:-consensus_weight
   %20limit:250
   %20running:true
 }}}
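
 For clients, the translation from today's HTTP parameters to the proposed
 form is mechanical. A rough sketch, assuming the proposed syntax stays
 exactly as written above (`to_search_url` is a made-up helper name):

```python
from urllib.parse import quote

BASE_URL = "https://onionoo.torproject.org"


def to_search_url(params):
    """Build the proposed URL from a dict of current HTTP parameters."""
    terms = " ".join(f"{key}:{value}" for key, value in params.items())
    # Keep ':' readable; spaces are percent-encoded as %20.
    return f"{BASE_URL}/{quote(terms, safe=':')}"


print(to_search_url({
    "type": "relay",
    "order": "-consensus_weight",
    "limit": 250,
    "running": "true",
}))
```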

 Subsequent queries for details pages look like this:

 {{{
 https://onionoo.torproject.org/details
   ?lookup=D4125249A474408F0FBA4DB15AC207E31E4CF6B3
 https://onionoo.torproject.org/bandwidth
   ?lookup=D4125249A474408F0FBA4DB15AC207E31E4CF6B3
 https://onionoo.torproject.org/weights
   ?lookup=D4125249A474408F0FBA4DB15AC207E31E4CF6B3
 }}}

 With the suggested changes, these queries would be turned into a single
 query:

 {{{
 https://onionoo.torproject.org
   /lookup:D4125249A474408F0FBA4DB15AC207E31E4CF6B3
   %20fields:
     +write_history,
     +read_history,
     +consensus_weight_fraction,
     +guard_probability,
     +middle_probability,
     +exit_probability
 }}}

 == Implementation ==

 I haven't looked at the code yet, but I believe we can make this change by
 editing just the web server parts of Onionoo. We can even keep the
 different document types on disk, as written by the updater. We just need
 to tell the server to grab different documents and combine them into the
 response.
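
 Conceptually, the combining step might look like the sketch below. It is
 written in Python purely for illustration and is not taken from Onionoo's
 code base; the function name, the dict-based document model, and the rule
 that a requested optional field replaces a same-named field already in
 the details document are all assumptions.

```python
def combine_documents(details, others, extra_fields):
    """Merge selected fields from other document types into a details copy.

    `details` is one relay's details document as a dict, `others` maps a
    document type name to that relay's document of that type, and
    `extra_fields` lists the optional fields the client asked for.
    """
    combined = dict(details)  # leave the stored document untouched
    for name in extra_fields:
        for document in others.values():
            if name in document:
                # Assumption: the requested field wins over an existing one.
                combined[name] = document[name]
                break
    return combined


# Made-up sample data standing in for documents read from disk.
details = {"fingerprint": "D4125249A474408F0FBA4DB15AC207E31E4CF6B3"}
others = {"bandwidth": {"write_history": {}, "read_history": {}}}
print(combine_documents(details, others, ["write_history"]))
```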

 This doesn't mean it's trivial to implement. Still, I could imagine that
 it pays off in the longer term, by making Onionoo's API a bit easier to
 maintain.

 (Edit: Fixed a typo in one of the examples.)

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25274#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

