[tor-bugs] #14453 [BridgeDB]: Implement statistics gathering for number of Bridges-per-Transport in BridgeDB

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Jan 28 23:35:54 UTC 2015


#14453: Implement statistics gathering for number of Bridges-per-Transport in
BridgeDB
---------------------------------------------+----------------------
 Reporter:  isis                             |          Owner:  isis
     Type:  task                             |         Status:  new
 Priority:  normal                           |      Milestone:
Component:  BridgeDB                         |        Version:
 Keywords:  tor-bridge,bridgedb,SponsorS-pt  |  Actual Points:
Parent ID:                                   |         Points:
---------------------------------------------+----------------------
 As part of the
 [https://trac.torproject.org/projects/tor/wiki/org/sponsors/SponsorS/PluggableTransports
 SponsorS PT work], we promised a way to gather statistics on the number of
 bridges per transport.

 The proposal states this is a task for Metrics. However, it's possible to
 do this on the BridgeDB side. In fact, it would help BridgeDB in the
 future to determine how to better allocate bridges to its Distributors
 (and help the Distributors hand them out to users in smarter ways).

 Technically, BridgeDB already has rough data on the number of
 Bridges-per-Transport. When a client requests a certain type of bridge
 from a certain Distributor (e.g. "give me an IPv4 obfs3 bridge from the
 HTTPS Distributor"), BridgeDB creates (or retrieves from a cache) a
 "filtered" subhashring containing only the Bridges which fit the client's
 request. BridgeDB even logs the number of Bridges in these subhashrings in
 its DEBUG and INFO logs:

 {{{
 22:19:16 INFO    L1361:Bridges.addRing()        Bridges inserted into HTTPS-Transpo subring: 235
 22:19:16 DEBUG     L75:Dist.getNumBridgesPerA() Returning 3 bridges from ring of len: 235
 }}}

 The problem with using those numbers for statistics is that each of
 BridgeDB's Distributors may have multiple adjacent subhashrings, usually
 about 5. So, in the above case, there are roughly 5 * 235 = 1175 obfs3
 bridges in the HTTPS Distributor. (These numbers aren't from the real
 deployed BridgeDB, by the way.)
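
 The back-of-the-envelope estimate above can be sketched as follows. This
 is only an illustration: the regular expression is guessed from the single
 INFO line quoted above, `estimate_from_log` is a hypothetical helper (not
 part of BridgeDB), and the default of 5 subhashrings is the "about 5"
 assumption from the text.

```python
import re

# Pattern guessed from the one INFO line quoted in this ticket; the real
# BridgeDB log format may differ.
SUBRING_LINE = re.compile(r"Bridges inserted into\s+(\S+)\s+subring:\s+(\d+)")


def estimate_from_log(line, subrings_per_distributor=5):
    """Return (subring_name, estimated_total), or None if the line doesn't match.

    subrings_per_distributor is an assumption, not a value read from BridgeDB.
    """
    match = SUBRING_LINE.search(line)
    if match is None:
        return None
    name, count = match.group(1), int(match.group(2))
    return name, count * subrings_per_distributor


# The INFO line from above, with its wrapped halves joined.
line = ("22:19:16 INFO    L1361:Bridges.addRing()        "
        "Bridges inserted into HTTPS-Transpo subring: 235")
print(estimate_from_log(line))  # ('HTTPS-Transpo', 1175)
```

 The result is only as good as the assumed subhashring count, which is why
 it makes a poor basis for published statistics.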

 ---------

 A better way to do this would be to provide a database query (as part of
 #12031) which counts the number of Bridges which claim to offer a PT. An
 example mechanism for doing this in Redis would be to keep one hash per
 Bridge which has any PTs (i.e. using [http://redis.io/commands/hset HSET]
 or `HINCRBY`), where the key is the Bridge fingerprint and there is a
 field for each type of PT. Each field would then (if not using `HINCRBY`)
 store `IP:PORT[,IP:PORT[,IP:PORT[…]]]`, for example:

 {{{
 redis> HSET 26F6A7570E0F655DFDD054E79ACBB127112C2D7B obfs4 "4.4.4.4:4444,5.5.5.5:5555"
 }}}

 With that scheme, a new `HSET` would be necessary each time the `@type
 bridge-extrainfo` descriptors are parsed, but each single-field `HSET`
 only has time complexity O(1).
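
 To make the layout concrete, here is a minimal sketch of the counting
 query, using a plain Python dict as a stand-in for the Redis hashes (so it
 runs without a Redis server). The second fingerprint and its addresses are
 made up, and `count_per_transport` is a hypothetical name; nothing here is
 existing BridgeDB code.

```python
from collections import Counter

# In-memory stand-in for the Redis hashes: key = Bridge fingerprint,
# field = transport name, value = "IP:PORT[,IP:PORT...]".
bridges = {
    "26F6A7570E0F655DFDD054E79ACBB127112C2D7B": {
        "obfs4": "4.4.4.4:4444,5.5.5.5:5555",
    },
    # Made-up fingerprint and addresses, for illustration only.
    "0000000000000000000000000000000000000001": {
        "obfs3": "6.6.6.6:6666",
        "obfs4": "6.6.6.6:7777",
    },
}


def count_per_transport(bridges):
    """Return (bridges_per_transport, instances_per_transport) as Counters."""
    by_bridge = Counter()
    by_instance = Counter()
    for fields in bridges.values():
        for transport, addresses in fields.items():
            # One Bridge offering this transport, however many addresses.
            by_bridge[transport] += 1
            # One instance per non-empty IP:PORT entry.
            by_instance[transport] += len([a for a in addresses.split(",") if a])
    return by_bridge, by_instance


by_bridge, by_instance = count_per_transport(bridges)
print(by_bridge["obfs4"], by_instance["obfs4"])  # 2 3
```

 Note that counting this way walks every hash, so the query is linear in
 the number of Bridges even though each write is O(1). Keeping a separate
 per-transport counter with `HINCRBY` would be one way to avoid the scan.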

 Some considerations / additional query parameters:

   * For these statistics, should we only count Bridges with the Running
 flag? Or only if the OONI machine says the PT is reachable?

   * What sanitisations should be done on these numbers? Should we round
 them? Or provide a range, i.e. "between 1000 and 5000 obfs4 bridges"?

   * Do we want only the ''Bridges'' with a given PT? Or do we want the
 ''number of instances'' of a given PT (e.g. if a Bridge has multiple obfs3
 instances)?
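
 As a rough illustration of the two sanitisation options raised above
 (rounding versus reporting a range), here is a small sketch. The step size
 of 10 and the bucket edges are arbitrary assumptions chosen for the
 example, and both function names are hypothetical.

```python
def round_to_multiple(count, multiple=10):
    """Round a non-negative count up to the next multiple.

    The step size of 10 is an assumption for this example.
    """
    return -(-count // multiple) * multiple


# Hypothetical bucket edges; the ticket only mentions a 1000-5000 range
# as an example.
BUCKET_EDGES = [0, 100, 500, 1000, 5000, 10000]


def bucket_label(count, edges=BUCKET_EDGES):
    """Return a coarse range label for a non-negative count."""
    for low, high in zip(edges, edges[1:]):
        if low <= count < high:
            return f"{low}-{high}"
    # Anything at or above the last edge falls into an open-ended bucket.
    return f"{edges[-1]}+"


print(round_to_multiple(1175))  # 1180
print(bucket_label(1175))       # 1000-5000
```

 Rounding keeps more precision, while ranges reveal less about small
 changes in the number of bridges; which trade-off is appropriate is the
 open question.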

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/14453>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

