[anti-censorship-team] Merging bridgedb and gettor?

Cecylia Bocovich cohosh at torproject.org
Thu May 14 14:06:03 UTC 2020

On 2020-05-14 5:14 a.m., Roger Dingledine wrote:
> Hi folks!
> I've been pondering our two tools, gettor and bridgedb, and realizing
> that conceptually they overlap in a bunch of ways. Does that mean,
> while Cecylia is exploring getting gettor back on its feet, we should
> make them overlap more in a technical sense too?

I'm a fan of this idea. There were a lot of times recently, while
working on GetTor tickets, that I realized we were doing the same thing
for BridgeDB. Some of this work (redoing the text in replies) wasn't
redundant, but some was. It's getting difficult to remember the lessons
learned from working on one tool so we can apply them to the other. Some
upcoming work that is definitely redundant is string localization and
usage metrics.

In fact, it looks like there's already an old ticket for this [0].

> Some initial thoughts:
> + Having fewer tools to maintain is an obvious win, in terms of fewer
> moving pieces, fewer things to explain to people, less mental energy.
> + Broadly speaking, they are both tools that want to support a variety
> of low bandwidth "access methods" for reaching them, and they use these
> access methods to tell users about "resources".

This is a big one. Moat was a huge improvement to the usability of
BridgeDB, but we've been talking about distributing bridges with GetTor
as well [1] since if you need GetTor, you probably need bridges anyway.

Additionally, a lot of people have complained that the Gmail/Riseup
restriction for BridgeDB is difficult to deal with. Google requires
personal information to set up an account, and for Riseup you have to
know someone who can invite you.

> - They have different threat models in terms of how they want to protect
> the resources they give out, which makes them have different preferences
> about which access methods will work best.

Following from the above point, it's true that at first glance GetTor
and BridgeDB seem to have conflicting threat models. GetTor would like
to send out as many copies of Tor Browser as possible, while
BridgeDB wants to be careful because users only need one working bridge
but censors want them all.

We can accommodate this when composing the systems, though. We can have
the bridgedb email responder do whatever extra checks it needs to do
(e.g., filtering based on email provider) as an additional step. We
should be able to set this up so that emails get routed to different
services based on the destination address.
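
As a rough sketch of that routing (the addresses, handler names, and
provider allow-list below are illustrative assumptions, not the actual
BridgeDB or GetTor code), a shared email front end could dispatch on the
destination address and leave the provider check to the bridge handler:

```python
from email.utils import parseaddr

# Hypothetical allow-list, standing in for BridgeDB's provider restriction.
ALLOWED_BRIDGE_PROVIDERS = {"gmail.com", "riseup.net"}


def handle_gettor(sender):
    # GetTor has no provider restriction: answer everyone.
    return "gettor: sending download links to " + sender


def handle_bridges(sender):
    # Extra step that only the bridge responder performs.
    domain = sender.rsplit("@", 1)[-1].lower()
    if domain not in ALLOWED_BRIDGE_PROVIDERS:
        return "bridges: rejected " + sender
    return "bridges: sending bridge lines to " + sender


# Hypothetical destination addresses mapped to their handlers.
ROUTES = {
    "gettor@example.org": handle_gettor,
    "bridges@example.org": handle_bridges,
}


def route(to_header, from_header):
    # Pick a handler based on the address the mail was sent to.
    _, recipient = parseaddr(to_header)
    _, sender = parseaddr(from_header)
    handler = ROUTES.get(recipient.lower())
    if handler is None:
        return "unknown destination: " + recipient
    return handler(sender)
```

The point of the split is that the stricter checks live in one handler,
so the shared front end does not need to know about either threat model.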

Ultimately, we want to change how we do rate limiting with BridgeDB
anyway because it doesn't work [2]. This is what our discussions on
rBridge/Hyphae/Salmon are about and why we'd like to integrate them.
This should allow us to open up what kinds of access methods we use for
BridgeDB so we can start using Telegram, Twitter, or email from any
provider for BridgeDB as well.

So, while we're thinking about how users interact with these reputation
systems, we should also think about whether these interactions could
happen over the kinds of distribution methods we're thinking of
implementing for GetTor. Or maybe we have different buckets for sending
bridges over GetTor [1].

> + But they both have a notion of "these resources aren't suitable for
> giving to people in this location", that is, they both want to learn
> something about *where* the user will be trying to apply the resource,
> and they both want to pull in some external "constraints" data set.
> + And being able to reuse (share) access methods could still be a
> win. For example, if Gettor learns how to answer people over Telegram,
> then it becomes much easier for BridgeDB to experiment with answering
> people over Telegram.
> + The monitoring tools ("is it working, am I getting answers, are the
> resources it gives me appropriate") overlap a lot between the two.
> - They have different text they want to use when interacting with users,
> and maybe different ways of handling errors. So those differences won't
> go away.

There's still functionality here that can be reused, like string
localization. It's something we haven't done for GetTor yet but want to.
And BridgeDB has already done it. Sure, we use different strings, but we
want to do the same kinds of things with them.
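
For example, a shared helper could look up translations per tool while
the string catalogs stay separate. This is a minimal sketch using
Python's standard gettext module; the function name, the "locale"
directory, and the per-tool domain names are assumptions, not what
BridgeDB does today:

```python
import gettext


def get_translator(domain, lang, localedir="locale"):
    # domain selects the tool's own catalog (e.g. "gettor" or "bridgedb").
    # fallback=True returns the untranslated string when no catalog exists.
    translation = gettext.translation(
        domain, localedir=localedir, languages=[lang], fallback=True
    )
    return translation.gettext
```

Each tool would call get_translator with its own domain, so the lookup
code is shared even though the strings differ.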

> = There are external tasks that gettor wants to do, such as pushing Tor
> Browser to the various websites that make up the resources we return. But
> just as BridgeDB is growing external modules like Wolpertinger, it's
> reasonable that GetTor could have external modules that publish/mirror
> our packages to various sites.

Yeah, and I'd like to automate these tasks for GetTor anyway.

> ? Is there a third thing that Tor wants to do, of the form "use this
> access method to give people this resource"? Generalizing from three
> would be even better than generalizing from two.
>
> Please use this initial brainstorming as a starting point and take this
> thread wherever it should best go. For example, maybe they should be
> two different front-ends and as much functionality as possible would be
> merged into a library that they both use. Or maybe gettor is essentially a
> smaller "lift" so it should become a module of what bridgedb does. Several
> options here.

Some more thoughts:

- There's a bit of a single-point-of-failure problem here. Once merged,
when one thing goes down (or is attacked or whatever), both services
stop working. But I think there's a better way to solve this than
keeping the services separate.

- If we're going to do this, we should make sure we're paying attention
to writing good, modular, readable code. My understanding is that this
would (and should) involve a large refactoring effort. The worst case
scenario would be for us to end up with a bunch of spaghetti code that's
hard to understand and maintain or two separate services just running on
the same machine. Will our current workload and sponsor work give us
space to spend time on actually improving the source code?

? This might make it easier to get funding to work on both of these
projects. I'm not a huge fan of letting funding concerns affect our
design decisions. This could be a plus if it gives us more surface
area for funding, or a minus if funding pushes us into a place we
don't like.

I am overall for this change, though much more familiar with GetTor than
BridgeDB. My guess is that we'd use the BridgeDB code base and integrate
GetTor functionality into it, since my impression is that BridgeDB is
more complicated than GetTor as it currently exists.

[0] https://bugs.torproject.org/3780

[1] https://bugs.torproject.org/3862

[2] https://bugs.torproject.org/31701
