[tor-bugs] #31422 [Circumvention/BridgeDB]: Make BridgeDB report internal metrics

Tor Bug Tracker & Wiki blackhole at torproject.org
Thu Jun 4 09:14:11 UTC 2020


#31422: Make BridgeDB report internal metrics
-------------------------------------------------+-------------------------
 Reporter:  phw                                  |          Owner:  phw
     Type:  enhancement                          |         Status:
                                                 |  needs_review
 Priority:  Medium                               |      Milestone:
Component:  Circumvention/BridgeDB               |        Version:
 Severity:  Normal                               |     Resolution:
 Keywords:  s30-o21a1, anti-censorship-          |  Actual Points:
  roadmap-2020                                   |
Parent ID:  #31274                               |         Points:  2
 Reviewer:                                       |        Sponsor:
                                                 |  Sponsor30-can
-------------------------------------------------+-------------------------

Comment (by karsten):

 Replying to [comment:12 phw]:
 > I think it's time for a review of what I've done so far:
 > https://github.com/NullHypothesis/bridgedb/compare/enhancement/31422

 I took a brief look at the new metrics captured by your patch:

 > Here are the internal metrics that the patch is currently capturing:
 > * Number of IPv4/IPv6 requests.

 You're already counting lots of requests and reporting binned numbers, so
 this should be fine.
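
 For readers unfamiliar with binning, here is a minimal sketch of the idea
 in Python. The bin size of 10, the round-up rule, and the counter names
 are assumptions made for illustration; they are not taken from the patch.

```python
BIN_SIZE = 10  # assumed bin width, not taken from the patch


def bin_count(count, bin_size=BIN_SIZE):
    """Round a non-negative count up to the next multiple of bin_size."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-count // bin_size) * bin_size


# Hypothetical raw request counters keyed by IP version.
raw_requests = {"ipv4": 1234, "ipv6": 57}
binned_requests = {k: bin_count(v) for k, v in raw_requests.items()}
print(binned_requests)
```

 Reporting only the binned value hides the exact count, which is why the
 per-version request numbers should not add new privacy concerns.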

 > * Min, max, median, and stdev of the number of users that bridges were
 handed out to.

 I don't see any privacy issues with computing and reporting these four
 statistics.

 I'm less sure about how useful they will be. The median will likely be the
 most interesting statistic here. The min and max only tell you about the
 smallest and largest outliers and say little about what the distribution
 looks like. I'm not sure how useful the standard deviation will be.
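
 To illustrate the concern, here is a small sketch with two made-up
 samples (not BridgeDB data) that share the same min, max, and median even
 though they are shaped quite differently. The helper name summary() is
 hypothetical.

```python
import statistics


def summary(values):
    """Return (min, max, median, stdev) for a sequence of numbers."""
    return (min(values), max(values),
            statistics.median(values), statistics.stdev(values))


# Two made-up samples of "users per bridge"; neither comes from BridgeDB.
clustered = [1, 10, 10, 10, 10, 10, 100]
spread = [1, 2, 3, 10, 40, 70, 100]

for name, sample in (("clustered", clustered), ("spread", spread)):
    print(name, summary(sample))
```

 Only the standard deviation differs between the two, and a single number
 like that is hard to interpret without knowing more about the shape.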

 Would it be an option to add quantiles? Your comment suggests that you'd
 have to require Python 3.8 in order to use the quantiles() function of the
 built-in statistics module. But did you consider using SciPy/NumPy to
 compute these? However, if neither of those is an option, I'd recommend
 against computing quantiles yourself, because there are just too many ways
 to screw up.
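
 As a rough sketch of what the library route could look like, the
 following uses statistics.quantiles(), which needs Python 3.8 or later.
 The sample data is made up. As far as I know, method="inclusive"
 corresponds to the default linear interpolation of numpy.percentile(), so
 numpy.percentile(handouts, [25, 50, 75]) should be a drop-in alternative
 on older Python versions, but please verify that before relying on it.

```python
import statistics

# Hypothetical per-bridge counts of users a bridge was handed out to.
handouts = [1, 2, 2, 3, 4, 5, 8, 13, 40]

# n=4 yields three cut points: first quartile, median, third quartile.
# method="inclusive" treats the data as the full population rather than
# as a sample drawn from a larger one.
q1, median, q3 = statistics.quantiles(handouts, n=4, method="inclusive")

print("q1 =", q1, "median =", median, "q3 =", q3)
```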

 If you have quantiles, you might want to include the first and third
 quartile as well as the smallest and largest non-outliers, that is, the
 most extreme values within 1.5 interquartile ranges of the first and third
 quartile, respectively. Together with the median, those are the five
 values you'd also find in a boxplot. We're computing these five values in
 our [https://metrics.torproject.org/onionperf-latencies.html OnionPerf
 latency statistics].
 [https://gitweb.torproject.org/metrics-web.git/tree/src/main/sql/onionperf/init-onionperf.sql#n187 Here]'s
 the SQL code that we use. (I don't think we have Python code around for
 computing the high and low values.)
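
 In case it helps, here is a minimal Python sketch of those five values.
 It is not a port of the SQL code linked above. It assumes the common
 boxplot convention of measuring 1.5 interquartile ranges from the first
 and third quartile, it relies on statistics.quantiles() (Python 3.8 or
 later), and the function name boxplot_stats() is hypothetical.

```python
import statistics


def boxplot_stats(values):
    """Return (low, q1, median, q3, high) for a sequence of numbers.

    low and high are the smallest and largest observations that lie
    within 1.5 interquartile ranges of q1 and q3, respectively.
    Expects at least two data points.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Each fence always admits at least one observation, because the
    # inclusive method keeps q1 and q3 within the range of the data.
    low = min(v for v in data if v >= low_fence)
    high = max(v for v in data if v <= high_fence)
    return low, q1, median, q3, high


# Made-up sample; 40 lies beyond the upper fence and counts as an outlier.
print(boxplot_stats([1, 2, 2, 3, 4, 5, 8, 13, 40]))
```

 Swapping the quantiles() call for another quantile implementation should
 leave the rest of the function unchanged.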

 If you want to start with somewhat simpler statistics, be sure to include
 first and third quartile together with the median. You could always add
 the high and low values later if you need them.

 > * The number of empty responses per distributor.
 > * The number of bridges per (sub)hashring.

 As with the first metric, I don't see an issue with reporting these binned
 numbers.

 > In the meanwhile, I'll spend some more time thinking about the other
 metrics suggestions in this ticket.

 Let me know if you want me to take another look!

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/31422#comment:13>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

