[tor-bugs] #31422 [Circumvention/BridgeDB]: Make BridgeDB report internal metrics

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue Jun 9 18:23:15 UTC 2020


#31422: Make BridgeDB report internal metrics
-------------------------------------------------+-------------------------
 Reporter:  phw                                  |          Owner:  phw
     Type:  enhancement                          |         Status:
                                                 |  needs_information
 Priority:  Medium                               |      Milestone:
Component:  Circumvention/BridgeDB               |        Version:
 Severity:  Normal                               |     Resolution:
 Keywords:  s30-o21a1, anti-censorship-          |  Actual Points:
  roadmap-2020                                   |
Parent ID:  #31274                               |         Points:  2
 Reviewer:  agix                                 |        Sponsor:
                                                 |  Sponsor30-can
-------------------------------------------------+-------------------------
Changes (by phw):

 * status:  merge_ready => needs_information


Comment:

 Replying to [comment:13 karsten]:
 > I'm less sure about how useful they will be. The median will likely be
 > the most interesting statistic here, but the min and max will only tell
 > you about the smallest and largest outliers and not much about what the
 > distribution looks like. Not sure how useful the standard deviation
 > will be.
 >
 > Would it be an option to add quantiles? Your comment suggests that you'd
 > have to require Python 3.8 in order to use the quantiles() function of the
 > built-in statistics module. But did you consider using SciPy/NumPy to
 > compute these? However, if neither of those is an option, I'd recommend
 > against computing quantiles yourself, because there are just too many ways
 > to screw up.
 >
 > If you have quantiles, you might want to include first and third
 > quartile as well as smallest and largest non-outliers within 1.5
 > inter-quartile ranges from the median. That's the five values you'd also
 > find in a boxplot. We're computing these five values in our
 > [https://metrics.torproject.org/onionperf-latencies.html OnionPerf latency statistics].
 > [https://gitweb.torproject.org/metrics-web.git/tree/src/main/sql/onionperf/init-onionperf.sql#n187 Here]'s
 > the SQL code that we use. (I don't think we have Python code around for
 > computing the high and low values.)
 [[br]]
 Thanks for the feedback! I removed the standard deviation and added the
 four metrics you suggested: the 1st and 3rd quartiles, and the upper and
 lower whiskers.
 [https://github.com/NullHypothesis/bridgedb/commit/0beed8953e7a72a69b72045b2623d81b926012f1 Here's the patch].
 I used numpy to determine the quartiles. I originally hesitated to add yet
 another dependency, especially a bulky one like numpy, but we can remove
 it again once Python 3.8 (which has built-in support for quantiles) is
 available in Debian stable.
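
 For reference, the five boxplot values can be sketched with the standard
 library alone on Python 3.8 or later. This is not the code from the patch
 (which uses numpy); the function name `boxplot_stats` and the returned keys
 are made up for illustration. It assumes the common boxplot convention in
 which the whiskers are the most extreme observations within 1.5
 inter-quartile ranges of the first and third quartiles, and it uses
 `method="inclusive"`, which should match numpy's default linear
 interpolation.

```python
from statistics import quantiles  # quantiles() exists since Python 3.8


def boxplot_stats(values):
    """Return the five boxplot values for a sequence of at least two numbers.

    Hypothetical helper for illustration; not taken from the BridgeDB patch.
    """
    data = sorted(values)
    # n=4 yields the three cut points between quartiles: Q1, median, Q3.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: whiskers are the smallest and largest observations that
    # still lie within 1.5 * IQR of the first and third quartile.
    lower_whisker = min(x for x in data if x >= q1 - 1.5 * iqr)
    upper_whisker = max(x for x in data if x <= q3 + 1.5 * iqr)
    return {
        "lower_whisker": lower_whisker,
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_whisker": upper_whisker,
    }


if __name__ == "__main__":
    # One large outlier (100) that the upper whisker should ignore.
    print(boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

 With the sample input, the outlier 100 falls outside the upper fence, so
 the upper whisker is the largest remaining observation rather than the
 maximum.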

 On an unrelated note: Karsten, do we need to coordinate on when we deploy
 this patch? Note that the patch bumps the key `bridgedb-metrics-version`
 to 2 and adds several new fields for our internal metrics. Does this break
 anything on the metrics side of things?
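
 One way a consumer could guard against a format bump is to check the
 version key before parsing anything else. The sketch below assumes the
 metrics file consists of lines of the form "key value"; that layout, the
 set of supported versions, and the function names are assumptions for
 illustration, not the actual code on the metrics side.

```python
# Assumption: versions this hypothetical consumer knows how to parse.
SUPPORTED_VERSIONS = {1, 2}


def metrics_version(lines):
    """Return the integer after 'bridgedb-metrics-version', or None if absent."""
    for line in lines:
        key, _, value = line.strip().partition(" ")
        if key == "bridgedb-metrics-version":
            return int(value.strip())
    return None


def is_supported(lines):
    """True if the file declares a version this consumer can handle."""
    return metrics_version(lines) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    # Hypothetical sample; only the version key is taken from the ticket.
    sample = ["bridgedb-metrics-version 2"]
    print(metrics_version(sample), is_supported(sample))
```

 A consumer written this way would skip or flag files with an unknown
 version instead of misreading the new fields.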

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/31422#comment:18>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

