[tor-bugs] #1839 [BridgeDB]: Rotate available bridges over time

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue Mar 31 09:08:04 UTC 2015


#1839: Rotate available bridges over time
-----------------------------+-------------------------------------------
     Reporter:  arma         |      Owner:  isis
         Type:  enhancement  |     Status:  needs_review
     Priority:  blocker      |  Milestone:
    Component:  BridgeDB     |    Version:
   Resolution:               |   Keywords:  bridgedb-dist, bridgedb-0.3.2
Actual Points:               |  Parent ID:
       Points:               |
-----------------------------+-------------------------------------------

Comment (by isis):

 Some IRC logs, because they started off with arma reviewing the design of
 this ticket's implementation and the recent changes in #4771, and then
 drifted to other future BridgeDB/Metrics-related tasks.

 {{{
 05:50          armadev  | i was looking at #1839
 05:50 -zwiebelbot:#tor-dev- tor#1839: Rotate available bridges over time -
 https://bugs.torproject.org/1839
 05:50             isis@ | oh great, thanks
 05:50          armadev  | i need to look more at the plan there, but i
 continue to think that the strategy of "don't let an attacker learn very
 many bridges in a given time period no matter how much effort they
                           put in" is a good one
 05:50          armadev  | but that made me remember another thing i was
 wanting us to look at
 05:51          armadev  | which is #10 on
 https://blog.torproject.org/blog/research-problems-ten-ways-discover-tor-bridges
 05:51          armadev  | right now, we have a hash ring design, where the
 "address" of the requestor maps to a point on the hash ring, and we give
 them the next k bridges?
 05:51            *      coderman_ wants more redteam arma blog
 05:52          armadev  | so that naturally will lead somebody who can
 attack some points in this ring to learn all the rest of the bridges if
 they do this attack
 05:52          armadev  | whereas we could imagine something other than a
 hash ring, or rather, using the ring differently, to make it so all
 bridges map to a small closed cycle
 05:52          armadev  | i haven't thought through the details and maybe
 it cannot be made to work easily, but i wanted to raise the topic again.
 05:55             isis@ | so the alternative that i could do would be to
 have "consistent" hashrings, which is something used usually in backend
 systems for data replication, where if the number of duplicates
                           N=3, then you end up with the resource placed
 into three buckets as-evenly-as-possible placed around the main hashring
 05:56             isis@ | with a replication level of N=1, this would
 result in each bridge being in its own little subgroup, and no others
 05:57             isis@ | BridgeDB kind of has like four classes which try
 to implement this concept, and IMO they all do a really bad job, with code
 duplication, unused code, and half-implemented stuff all over
                           the place
 05:57             isis@ | i am finishing up my branch which cleans up all
 the hashring code today, it is for #12505
 05:57 -zwiebelbot:#tor-dev- tor#12505: Refactor Bridges.py and Dist.py in
 BridgeDB - https://bugs.torproject.org/12505
 05:59             isis@ | anyway, if we did this, then we could easily say
 "every distributor gets one main consistent hashring which is split into X
 subhashrings. depending on what week it is, only one of those
                           subhashrings is available." then, in the
 subhashring, rotate the clients around the ring with a different frequency
 06:00             isis@ | does that sound like it would solve #10?
 06:00          armadev  | but it's still a ring. this is good for the
 "change what you're giving out over time" feature, but no, i think it
 doesn't address #10.
 06:00          armadev  | the issue is that my address maps to bridges 5,
 6, and 7
 06:00          armadev  | and your address maps to bridges 7, 8, and 9
 06:01          armadev  | so if the adversary sees your behavior, it
 learns about 7, and from 7 it learns about me, and then it learns about 5
 and 6
 06:01          armadev  | it would be better, for #10, if every address
 maps to a trio of bridges that are the same trio that other people get
 when they're mapped there
 06:01          armadev  | rather than these partially overlapping sets
 that we do now.
 06:02             isis@ | ah, i see
 06:02             isis@ | yes, the overlap has also bothered me, but i
 wasn't thinking of the zig-zag problem
 06:02          armadev  | not sure this one needs to be solved now
 06:02             isis@ | hmm. the overlap is a much more difficult one to
 solve
 06:02          armadev  | and i think "different bridges at different
 times" is a more important topic to do
 06:03          armadev  | the blog post describes a potential solution.
 06:04          armadev  | but 'more work remains' before that solution
 will actually do what we want.
 06:04          armadev  | it's the sort of thing we should write up as a
 math problem for somebody's grad class, and then sit back and wait
 06:04             isis@ | the overlap is also even harder to solve
 because, when BridgeDB parses new descriptors, it rebuilds all the
 hashrings entirely, causing rings to add and lose bridges. however, for the
                           bridges which remain, their place in the
 hashring remains the same.
 06:04          armadev  | rather than get caught up in ourselves
 06:05          armadev  | huh. yeah.
 06:05          armadev  | which leads me to another topic that we should
 be pondering:
 06:05          armadev  | all of these steps we take to make it less
 likely for an attacker to Get All The Bridges lead to more bridges going
 unused for some time periods
 06:05          armadev  | we should think about ways to tell the bridge
 operator when they're in action, and when they're in reserve
 06:05          armadev  | so we can reassure them that being in reserve is
 a great and valuable role.
 06:06          armadev  | (some people run a bridge for a day, then stop.
 if they were in reserve the whole time, technically speaking, that wasn't
 a great and valuable role after all.)
 06:06             isis@ | so, e.g. if you ask for bridges right now
 (without #1839 deployed) and you get bridges A, B, and C, and then
 BridgeDB reparses and rebuilds, and B goes offline, then three hours later
                           you ask for more bridges, you'll likely get
 bridges A, B, and D
 06:07          armadev  | perhaps you mean C goes offline?
 06:07          armadev  | otherwise, this sounds bad :)
 06:08             isis@ | oh yeah. that. :)
 06:08          armadev  | hey, it's bridgedb, you never know
 06:09             isis@ | haha, the thing is becoming a tiny bit more
 well-behaved now
 06:09             isis@ | just a tiny bit
 06:09             isis@ | i think you once had a ticket for designing some
 bridge statistics interface for BridgeDB…
 06:09            *           isis  is looking for it
 06:09             isis@ | #7877
 06:09 -zwiebelbot:#tor-dev- tor#7877: Web interface for looking up bridge
 status? - https://bugs.torproject.org/7877
 06:09             isis@ | why did that never happen?
 06:10             isis@ | do we still want that to happen?
 06:10          Yawning  | hm
 06:10             isis@ | or do we consider Globe to solve that problem?
 06:10          armadev  | didn't we do something related to #7877 in
 globe?
 06:11          armadev  | except, i vaguely remember hearing from karsten
 that he decided to drop that data point from the globe interface, because
 i-don't-remember-why
 06:11          armadev  | it does seem a bit silly for bridgedb to grow a
 new interface for users,
 06:12          armadev  | when it's already exporting stuff to globe and
 globe is already an interface for users
 06:12          armadev  | but it might be wise for us to export a bit more
 stuff from bridgedb to globe, so it can give that stuff to users
 06:12             isis@ | once the database stuff for prop#226 is merged,
 we get a pretty neat structure to build statistics gathering and analysis
 tools on top of
 06:12 -zwiebelbot:#tor-dev- Prop#226: "Scalability and Stability
 Improvements to BridgeDB: Switching to a Distributed Database System and
 RDBMS" [OPEN]
 06:12          armadev  | and i guess, step zero is for globe to resume
 giving out that info at all
 06:13             isis@ | yeah, i suppose i could also more easily support
 giving the metrics server access to certain queries, so that it benefits
 from BridgeDB keeping state and all
 06:14             isis@ | plus then metrics wouldn't have to do a bunch of
 crazy reparsing and recalculation of any things which bridgedb already
 does
 06:16          armadev  | yeah, hm, the 'pool assignment' entry on globe
 appears empty
 06:16          armadev  | for e.g.
 https://globe.torproject.org/#/bridge/1513028CD43BD34798D829719D76E6EC3F5391CA
 06:17          armadev  | #13921
 06:17             isis@ | yeah, see #13921
 06:17 -zwiebelbot:#tor-dev- tor#13921: Remove "bridge pool assignment" UI
 element from Atlas/Globe - https://bugs.torproject.org/13921
 06:17             isis@ | which replaces it in Globe with the `transport`
 field instead
 06:17             isis@ | showing which transports a bridge currently
 supports
 06:17          armadev  | well, great, but that removes the thing i was
 just talking about where we give feedback to the user about whether her
 bridge is in action or what
 06:18          armadev  | which i think will become even more important
 with #1839
 06:22             isis@ | armadev: well, right, but then we should
 probably do either #2755 or…
 06:22 -zwiebelbot:#tor-dev- tor#2755: Reconsider BridgeDB's pool
 assignment file implementation and deployment -
 https://bugs.torproject.org/2755
 06:22            *           isis  can't find the other ticket
 06:23             isis@ | i had a ticket that was for adding somewhere in
 the bridge-extrainfo descriptor a line like `BridgeDistribution 0` or
 `BridgeDistribution https`
 06:23          armadev  | isis: to me #2755 is more about documenting how
 bridges were given out over the past, so we can match up load and blocking
 measurements with distribution to find patterns.
 06:33            *           isis  found the torrc `BridgeDistribution
 https` tickets, they are #13727 and #13504
 06:33 -zwiebelbot:#tor-dev- tor#13727: BridgeDB should not distribute Tor
 Browser's default bridges - https://bugs.torproject.org/13727
 06:33 -zwiebelbot:#tor-dev- tor#13504: Bridges in Tor Browser Bundles
 should be public so that we have metrics on them -
 https://bugs.torproject.org/13504
 07:03          armadev  | isis: so in summary (there are a lot of
 tickets), where are we at with the goals of remembering how we gave out
 bridges at which time, so we can use that to study the effectiveness of
                           bridge distribution strategies in the past? and
 where are we at communicating to the operator what strategies we've used
 recently to give out her bridge?
 07:08             isis  | currently, there is a pile of assignments.log
 files which continued to be produced and never got synced to Metrics
 07:09             isis  | i could do #2755 soon, and ask karsten to allow
 BridgeDB to start syncing to Metrics again
 07:09 -zwiebelbot:#tor-dev- tor#2755: Reconsider BridgeDB's pool
 assignment file implementation and deployment -
 https://bugs.torproject.org/2755
 07:09            *        karsten looks at #2755
 07:10             isis  | or, if karsten likes, i can provide an interface
 to BridgeDB's newer databases, so that the Metrics server can obtain data
 without additional processing/storage
 07:10          karsten  | isis: or should we think about better usage
 statistics here?
 07:11          karsten  | well, Metrics has only data that is archived by
 CollecTor.
 07:11             isis  | sure, that sounds better than a string that
 likely has no meaning to most operators
 07:11          karsten  | we could come up with better stats that are
 collected by CollecTor and then displayed/processed by Metrics and/or
 Onionoo.
 07:11          armadev  | if there is historical how-we-distributed-it-
 when data that we have but we're not keeping, that's a bit sad
 07:12             isis  | i have #14453 and #10218 which are along those
 lines
 07:12          karsten  | you mean past assignment.log files?
 07:12 -zwiebelbot:#tor-dev- tor#14453: Implement statistics gathering for
 number of Bridges-per-Transport in BridgeDB -
 https://bugs.torproject.org/14453
 07:12 -zwiebelbot:#tor-dev- tor#10218: Provide "users-per-transport-per-
 country" statistics for obfsbridges - https://bugs.torproject.org/10218
 07:12          armadev  | (though ideally there is how-it-got-blocked-when
 data somewhere out there, that we are not collecting and not keeping, and
 we'd ideally like to have both.)
 07:12             isis  | karsten: yes, i have some past assignments.log
 files
 07:13          armadev  | karsten: i think i don't mean statistics
 summaries, but rather, the underlying data.
 07:13          armadev  | the sort of thing that researchers are going to
 want, a year from now, when they ask how that blocking event happened and
 which bridges it affected.
 07:13          karsten  | we could convert existing logs into the new
 format.
 07:14          karsten  | the yet-to-be-designed format.
 07:15             isis  | the assignments.log files, did Metrics used to
 sanitise them by replacing the fingerprints with hashed fingerprints?
 07:15          karsten  | isis: it seems #10218 is for little-t-tor, not
 bridgedb.
 07:15          karsten  | yes, that's what it did. and it sorted them by
 hashed fingerprint, so that the order didn't reveal anything.
 07:16          karsten  | maybe more.
 07:16             isis  | steps like those are something that BridgeDB
 could easily do to begin with, if it would make the processing less
 intense
 07:16          karsten  | that's right.
 07:17          karsten  | it totally should do those steps.
 07:18             isis  | and BridgeDB is parsing all the bridges into
 stem classes anyway, and is going to store them as json in couchDB, if
 that json is something more accessible
 07:19          karsten  | json is easier than inventing our own data
 format, yes.
 07:19          karsten  | we still need to think what to put into the json
 though.
 07:19          Yawning  | xmllllllll
 07:19          karsten  | xml in json, ok.
 07:19          Yawning  | :D
 07:19             isis  | one idea i had earlier was to allow collecTor to
 have certain queries on the new database (or the output of the query and
 some processing) for whatever statistics we wish to extract
 07:21          karsten  | ideally, collector would fetch a thing every
 hour or so, verify it, and store it.
 07:21          Yawning  | armadev: would #15515 count as what you want out
 of the defense at the intro point?
 07:21 -zwiebelbot:#tor-dev- tor#15515: Don't allow multiple INTRODUCE1s on
 the same circuit - https://bugs.torproject.org/15515
 07:21          Yawning  | or do you want something more sophisticated?
 07:21             isis  | karsten: i was just going to put everything in
 the json, that way BridgeDB could do cooler stuff with detecting when
 certain fields have changed
 07:22             isis  | karsten: verify means verify the descriptor
 signatures?
 07:22          karsten  | ah, mostly that it's valid json and contains
 certain required fields.
 07:22          karsten  | I think.
 07:23          karsten  | not sure about putting in everything, including
 things that are already contained elsewhere,
 07:23          karsten  | but it might be possible to remove certain
 fields while exporting to collector.
 07:23             isis  | i planned on writing protobufs to define what
 data was valid for BridgeDB to be exporting
 07:24          karsten  | okay, happy to learn what exactly that means. :)
 07:24          Yawning  | it's google's serialization format
 07:24             isis  |
 https://developers.google.com/protocol-buffers/docs/overview
 07:24          Yawning  | you feed a definition file into a code generator
 and it outputs code that marshals/demarshalls stuffs
 07:25          karsten  | nice, ok.
 07:25             isis  | basically, i write a .proto file and it
 generates python, java, c, and/or go
 07:25          Yawning  | https://capnproto.org/
 07:25          Yawning  | see also
 07:25          Yawning  | which is protobufs redesigned by the author
 after he left google
 07:25          Yawning  | haven't used it, claims to be better
 07:26             isis  | lol, i can't tell if they are joking
 07:26             isis  | "∞% faster!!"
 07:27          Yawning  | heh
 07:27          Yawning  | if you read on they clarify what they mean
 07:28          Yawning  | ymmv, protobufs is a fine format to use
 07:28          Yawning  | and this thing may eat all ur dataz
 07:28          karsten  | isis: okay, want to start a list of things to
 put into that json that are safe to be collected and published by
 collector?
 07:28            *           isis  totally thinks they are joking at
 "Time-traveling RPC"
 07:29          karsten  | isis: and is this something for a tor proposal?
 07:29             isis  | but it is interesting and if they are not
 totally lying their pants off, then SUBSCRIBE
 07:29          Yawning  | isis: the idea they're doing is actually p clever
 07:29          karsten  | isis: or as addition to bridgedb-spec (if that
 exists)?
 07:30          karsten  | oh, yes, it still exists, I helped write it..
 07:30             isis  | karsten: currently, there is no proposal for
 better bridge statistics
 07:30             isis  | although we could start one
 07:30             isis  | and bridgedb-spec.txt lives in the top-level of
 tor-spec.git now
 07:30          karsten  | oh, nice.
 07:30          karsten  | you mean bridgedb has changed since  Date:   Fri
 Jul 5 01:40:49 2013 +0000
 07:31          karsten  | I should git pull..
 07:31             isis  | hah, it's almost entirely rewritten
 07:31             isis  | i am finishing the final refactorings now
 07:32             isis  | which is why i will have time and ability to do
 cool stuff, like the social distributor
 07:32             isis  | (and better bridge metrics, if we want that)
 07:34          karsten  | isis: just let me know if I can help with the
 stats side of things. it would be useful for bridge operators
 (onionoo/atlas/globe) and for sponsors (metrics).
 07:35             isis  | karsten: is that sponsor S, or sponsors in
 general
 07:35          karsten  | sponsors in general. I don't know what S wants.
 07:35             isis  | karsten: ok, i will start making a proposal, and
 ask you to review it
 07:36          karsten  | sounds great!
 }}}
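
 To make the overlap ("zig-zag") problem from the log concrete, here is a
 minimal Python sketch. It is not BridgeDB's code: the function names, the
 use of SHA-256 for ring positions, and the bridge labels are all assumptions
 made for illustration. `window_at` models the current "next k bridges"
 behaviour, where neighbouring answers partially overlap; `group_at` models
 arma's suggestion that every address landing in a region gets the same
 closed trio.

```python
import bisect
import hashlib


def ring_position(value):
    """Map a string to a point on the ring (assumption: SHA-256 as an integer)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest(), "big")


def build_ring(bridges):
    """Return (position, bridge) pairs sorted by their position on the ring."""
    return sorted((ring_position(b), b) for b in bridges)


def start_index(ring, client_address):
    """Index of the first bridge clockwise from the client's point."""
    positions = [pos for pos, _ in ring]
    return bisect.bisect_left(positions, ring_position(client_address)) % len(ring)


def window_at(ring, start, k=3):
    """Sliding window: the k bridges starting at ring index `start`."""
    return [ring[(start + i) % len(ring)][1] for i in range(k)]


def group_at(ring, start, k=3):
    """Closed group: the whole k-sized block that ring index `start` falls in.

    If len(ring) is not a multiple of k, the last group is smaller.
    """
    group_start = (start // k) * k
    return [bridge for _, bridge in ring[group_start:group_start + k]]


def next_k(ring, client_address, k=3):
    """What a client gets under the overlapping 'next k bridges' scheme."""
    return window_at(ring, start_index(ring, client_address), k)


def fixed_group(ring, client_address, k=3):
    """What a client gets if every address maps to one closed group."""
    return group_at(ring, start_index(ring, client_address), k)
```

 With sliding windows, the answers starting at ring indices 0 and 2 share
 exactly one bridge, which is the link an adversary follows from one client's
 bridges to the next client's. With closed groups, two answers are either
 identical or disjoint, so learning one trio reveals nothing about its
 neighbours. This sketch ignores the harder part raised in the log: keeping
 the groups stable when descriptors are reparsed and bridges come and go.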

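 The sanitising steps karsten describes for assignments.log (replace each
 fingerprint with a hashed fingerprint, then sort by the hashed value so the
 order reveals nothing) could be done on the BridgeDB side roughly as
 follows. This is a sketch under stated assumptions: the line layout
 ("<fingerprint> <rest of line>"), the choice of SHA-1 over the decoded
 fingerprint bytes, and both function names are illustrative and are not
 taken from BridgeDB or Metrics code.

```python
import binascii
import hashlib


def hash_fingerprint(fingerprint):
    """Hash a hex-encoded fingerprint.

    Assumption: SHA-1 over the decoded bytes, returned as uppercase hex.
    """
    raw = binascii.unhexlify(fingerprint.strip())
    return hashlib.sha1(raw).hexdigest().upper()


def sanitise_assignments(lines):
    """Replace fingerprints with hashed fingerprints and sort by the hash.

    Assumption: each non-empty line is '<fingerprint> <rest>', and <rest>
    is passed through unchanged.
    """
    sanitised = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fingerprint, _, rest = line.partition(" ")
        sanitised.append((hash_fingerprint(fingerprint), rest))
    # Sorting by the hashed value removes any ordering information
    # carried by the original file.
    sanitised.sort()
    return [f"{hashed} {rest}".rstrip() for hashed, rest in sanitised]
```

 Because the output is sorted by the hashed value, the same set of input
 lines yields the same output regardless of the order in which BridgeDB
 wrote them.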
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/1839#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

