[metrics-bugs] #29315 [Metrics/Website]: Write down guidelines for adding new stats

Tor Bug Tracker & Wiki blackhole at torproject.org
Thu Apr 25 10:40:31 UTC 2019


#29315: Write down guidelines for adding new stats
-------------------------------------+--------------------------
 Reporter:  karsten                  |          Owner:  irl
     Type:  enhancement              |         Status:  accepted
 Priority:  Very High                |      Milestone:
Component:  Metrics/Website          |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:  metrics-roadmap-2019-q2  |  Actual Points:
Parent ID:                           |         Points:  3
 Reviewer:                           |        Sponsor:
-------------------------------------+--------------------------
Changes (by irl):

 * owner:  karsten => irl
 * status:  needs_revision => accepted
 * reviewer:  irl =>


Comment:

 Replying to [comment:15 karsten]:
 > Replying to [comment:14 irl]:
 > > I would like for these systems to be as open/transparent as is
 possible. The demarcation between a system that collects metrics and Tor
 Metrics should not just be for Tor Metrics. Anyone should be able to do
 what Tor Metrics does. This means that services publish data, and we pull
 from the service.
 >
 > This sounds like a fine recommendation where this is possible. If a
 system can sanitize its data by itself before making it available to us
 and others, great! Let's just be clear that we're shifting complexity and
 maintenance work from Tor Metrics to services run by others. If they have
 the resources to do this, okay.
 >
 > But let's consider whether we want to make this a hard requirement.
 There may be services where we're glad that somebody runs them and where
 we cannot expect them to also run sanitizing code. The options in such a
 case are that we either don't get the data, or we sanitize it somewhere.
 And if we can choose where to sanitize it, we can either do it as part of
 a CollecTor module or in a separate tool run on the host that also runs
 the service. In either case we're providing the sanitized data to others
 who can then do everything that Tor Metrics does.

 If there is no other way to make it work, it is probably better to do the
 sanitizing in CollecTor than to run it on another machine, as that might
 split our focus and lead to us making mistakes. We could make it a "very
 strong" recommendation, but then fall back to doing the sanitizing in
 CollecTor as a last resort.

 > > It does not need to be a web server. If there is not already a web
 server, then a Gopher server or a TCP port that dumps out the document is
 also fine as far as I'm concerned; maybe karsten has other opinions.
 >
 > Gopher? My initial reaction is that we shouldn't fall into the same
 esoterism trap where we also lost Haskell-written TorDNSEL.

 Good point. However, what do we mean when we say "web server"? Would we
 accept a server that only speaks SPDY/3, for example? We should pick some
 client libraries and require that a service supports at least one protocol
 that those libraries can speak.
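
 As a rough illustration of that idea, a check like the following could
 reject data sources whose URL scheme our chosen clients cannot speak. The
 allowlist and the function name are assumptions made up for this sketch,
 not an agreed policy, and a scheme check does not capture protocol
 versions negotiated below the URL level (such as SPDY/3 versus HTTP/1.1).

```python
from urllib.parse import urlsplit

# Assumption: an illustrative allowlist, not an agreed Tor Metrics policy.
SUPPORTED_SCHEMES = {"http", "https"}


def is_supported_source(url: str) -> bool:
    """Return True if the URL's scheme is one our chosen clients can speak."""
    return urlsplit(url).scheme.lower() in SUPPORTED_SCHEMES
```

 With this allowlist, an https:// source would be accepted and a gopher://
 source would be rejected.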

 > > Increasingly I'm thinking that the Tor directory protocol meta format
 is a good format to have metrics in. We already have parsers for these
 that are fast and efficient, and it's easier to detect errors due to the
 strict format (even if #30105 and similar things sometimes slip through).
 The document format also provides for signing of documents, which I'd like
 to see more of our data sources doing. #29624 is looking at defining a new
 format for exit lists, and is using the meta format with Ed25519
 signature.
 >
 > Sounds good to me, as a recommendation that likely works for most new
 formats. For example, having sanitized web server logs in the Apache
 format made sense, because then it was possible to use existing tools to
 process them. But yes, for most formats this is a fine recommendation.

 Right, if there are already well-defined formats for certain structured
 data we should reuse those.
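
 To illustrate why the strict meta format makes errors easy to detect, here
 is a minimal parser sketch. It assumes the item grammar from the directory
 protocol (a keyword line with space-separated arguments, optionally
 followed by "-----BEGIN ...-----" objects) and ignores details such as the
 "opt" prefix. The names Item and parse_document are hypothetical, and
 signature verification is not shown.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple

# Keywords may contain only letters, digits, and hyphens.
KEYWORD_RE = re.compile(r"^[A-Za-z0-9-]+$")
BEGIN_RE = re.compile(r"^-----BEGIN ([A-Za-z0-9 -]+)-----$")


@dataclass
class Item:
    keyword: str
    arguments: List[str]
    # Each object is stored as (label, concatenated base64 body).
    objects: List[Tuple[str, str]] = field(default_factory=list)


def parse_document(text: str) -> List[Item]:
    """Parse a meta-format document into items, raising on malformed input."""
    items: List[Item] = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        line = lines[i]
        begin = BEGIN_RE.match(line)
        if begin:
            if not items:
                raise ValueError("object before first keyword line")
            label = begin.group(1)
            end_marker = f"-----END {label}-----"
            body = []
            i += 1
            # Collect body lines until the matching END marker.
            while i < len(lines) and lines[i] != end_marker:
                body.append(lines[i])
                i += 1
            if i == len(lines):
                raise ValueError(f"unterminated object {label!r}")
            items[-1].objects.append((label, "".join(body)))
            i += 1
            continue
        parts = line.split()
        if not parts or not KEYWORD_RE.match(parts[0]):
            raise ValueError(f"line {i + 1}: malformed keyword line {line!r}")
        items.append(Item(parts[0], parts[1:]))
        i += 1
    return items
```

 A line whose keyword contains an unexpected character, or an object
 without its END marker, raises ValueError instead of being silently
 accepted.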

 > Would you mind taking the draft and the comments above and writing an
 updated draft? I feel like if I continue owning this task, we'll need more
 review rounds. Let me know!

 Ok, I'll pick this up and then develop the next version of the draft.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/29315#comment:16>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

