Hi folks!
I've been pondering our two tools, gettor and bridgedb, and realizing that conceptually they overlap in a bunch of ways. Does that mean, while Cecylia is exploring getting gettor back on its feet, we should make them overlap more in a technical sense too?
Some initial thoughts:
+ Having fewer tools to maintain is an obvious win, in terms of fewer moving pieces, fewer things to explain to people, less mental energy.
+ Broadly speaking, they are both tools that want to support a variety of low bandwidth "access methods" for reaching them, and they use these access methods to tell users about "resources".
- They have different threat models in terms of how they want to protect the resources they give out, which makes them have different preferences about which access methods will work best.
+ But they both have a notion of "these resources aren't suitable for giving to people in this location", that is, they both want to learn something about *where* the user will be trying to apply the resource, and they both want to pull in some external "constraints" data set.
+ And being able to reuse (share) access methods could still be a win. For example, if Gettor learns how to answer people over Telegram, then it becomes much easier for BridgeDB to experiment with answering people over Telegram.
+ The monitoring tools ("is it working, am I getting answers, are the resources it gives me appropriate") overlap a lot between the two.
- They have different text they want to use when interacting with users, and maybe different ways of handling errors. So those differences won't go away.
= There are external tasks that gettor wants to do, such as pushing Tor Browser to the various websites that make up the resources we return. But just as BridgeDB is growing external modules like Wolpertinger, it's reasonable that GetTor could have external modules that publish/mirror our packages to various sites.
? Is there a third thing that Tor wants to do, of the form "use this access method to give people this resource"? Generalizing from three would be even better than generalizing from two.
Please use this initial brainstorming as a starting point and take this thread wherever it should best go. For example, maybe they should be two different front-ends and as much functionality as possible would be merged into a library that they both use. Or maybe gettor is essentially a smaller "lift" so it should become a module of what bridgedb does. Several options here.
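To make the "shared library, two front-ends" option a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Request`, `Distributor`, `gettor_provider` and `bridgedb_provider` names, the mirror URLs, the bridge lines, and the `BLOCKED_MIRRORS` constraints table are all hypothetical and not taken from either code base.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Request:
    """One incoming request, independent of how it reached us."""
    access_method: str        # e.g. "email" or "telegram"
    sender: str
    country: Optional[str]    # best guess at where the resource will be used


# A provider decides which resources to hand out for a given request.
Provider = Callable[[Request], List[str]]


class Distributor:
    """Hypothetical shared core: tools register providers, access methods call answer()."""

    def __init__(self) -> None:
        self._providers: Dict[str, Provider] = {}

    def register(self, name: str, provider: Provider) -> None:
        self._providers[name] = provider

    def answer(self, name: str, request: Request) -> List[str]:
        return self._providers[name](request)


# Hypothetical "constraints" data set: resources unsuitable for a location.
BLOCKED_MIRRORS = {"xx": {"https://mirror-a.example/tor-browser"}}


def gettor_provider(request: Request) -> List[str]:
    """Hand out every mirror that is not known to be unsuitable for the location."""
    mirrors = [
        "https://mirror-a.example/tor-browser",
        "https://mirror-b.example/tor-browser",
    ]
    unsuitable = BLOCKED_MIRRORS.get(request.country or "", set())
    return [m for m in mirrors if m not in unsuitable]


def bridgedb_provider(request: Request) -> List[str]:
    """Hand out a single bridge line; a real provider would rate-limit per sender."""
    bridges = ["obfs4 192.0.2.1:443", "obfs4 192.0.2.2:443"]
    return bridges[:1]
```

With this shape, a new access method such as Telegram only needs to build a `Request` and call `answer()`, so both tools would get it at once, while each provider keeps its own policy about how much to give out.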
--Roger
On 2020-05-14 5:14 a.m., Roger Dingledine wrote:
Hi folks!
I've been pondering our two tools, gettor and bridgedb, and realizing that conceptually they overlap in a bunch of ways. Does that mean, while Cecylia is exploring getting gettor back on its feet, we should make them overlap more in a technical sense too?
I'm a fan of this idea. There were a lot of times recently, while working on GetTor tickets, that I realized we were doing the same thing for BridgeDB. Some of this work (redoing the text in replies) wasn't redundant, but some was. It's getting difficult to remember the lessons learned from working on one tool so we can apply them to the other. Some upcoming work that is definitely redundant is string localization and usage metrics.
In fact, it looks like there's already an old ticket for this [0].
Some initial thoughts:
+ Having fewer tools to maintain is an obvious win, in terms of fewer moving pieces, fewer things to explain to people, less mental energy.
+ Broadly speaking, they are both tools that want to support a variety of low bandwidth "access methods" for reaching them, and they use these access methods to tell users about "resources".
This is a big one. Moat was a huge improvement to the usability of BridgeDB, but we've been talking about distributing bridges with GetTor as well [1] since if you need GetTor, you probably need bridges anyway.
Additionally, a lot of people have complained that the gmail/riseup restriction for BridgeDB is difficult to deal with. Google requires personal information to set up an account and for riseup you have to know someone who can invite you.
- They have different threat models in terms of how they want to protect the resources they give out, which makes them have different preferences about which access methods will work best.
Following from the above point, it's true that at first glance GetTor and BridgeDB seem to have conflicting threat models. GetTor would like to send out as many copies of Tor Browser as possible, while BridgeDB wants to be careful because users only need one working bridge but censors want them all.
We can accommodate this while composing the systems, though. We can have the BridgeDB email responder do whatever extra checks it needs to do (e.g., filtering based on email provider) as an additional step. We should be able to set this up so that emails get routed to different services based on the destination address.
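As a rough sketch of that routing: the addresses, handler names, and reply strings below are hypothetical, and the allowed-provider set just mirrors the gmail/riseup restriction mentioned earlier in this thread. Only the bridge handler performs the extra sender check.

```python
from email.message import EmailMessage
from email.utils import parseaddr

# Hypothetical allow-list, standing in for BridgeDB's email provider filter.
ALLOWED_BRIDGE_PROVIDERS = {"gmail.com", "riseup.net"}


def handle_gettor(msg: EmailMessage) -> str:
    """GetTor side: no sender restriction, answer everyone."""
    return "gettor: download links"


def handle_bridges(msg: EmailMessage) -> str:
    """BridgeDB side: extra check on the sender's provider before answering."""
    _, sender = parseaddr(str(msg.get("From", "")))
    domain = sender.rpartition("@")[2].lower()
    if domain not in ALLOWED_BRIDGE_PROVIDERS:
        return "bridges: refused, unsupported email provider"
    return "bridges: one bridge line"


# Hypothetical destination addresses mapped to their handlers.
ROUTES = {
    "gettor@example.org": handle_gettor,
    "bridges@example.org": handle_bridges,
}


def route(msg: EmailMessage) -> str:
    """Pick a handler from the destination address of the incoming mail."""
    _, to_addr = parseaddr(str(msg.get("To", "")))
    handler = ROUTES.get(to_addr.lower())
    if handler is None:
        return "unknown destination"
    return handler(msg)
```

The point of the design is that the shared mail front-end knows nothing about threat models; each service keeps its own policy inside its handler.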
Ultimately, we want to change how we do rate limiting with BridgeDB anyway because it doesn't work [2]. This is what our discussions on rBridge/Hyphae/Salmon are about and why we'd like to integrate them. This should allow us to open up what kinds of access methods we use for BridgeDB so we can start using telegram, twitter, or email from any provider for BridgeDB as well.
So, while we're thinking about how users interact with these reputation systems, we should also think about whether these interactions could happen over the kinds of distribution methods we're thinking of implementing for GetTor. Or maybe we have different buckets for sending bridges over GetTor [1].
+ But they both have a notion of "these resources aren't suitable for giving to people in this location", that is, they both want to learn something about *where* the user will be trying to apply the resource, and they both want to pull in some external "constraints" data set.
+ And being able to reuse (share) access methods could still be a win. For example, if Gettor learns how to answer people over Telegram, then it becomes much easier for BridgeDB to experiment with answering people over Telegram.
+ The monitoring tools ("is it working, am I getting answers, are the resources it gives me appropriate") overlap a lot between the two.
- They have different text they want to use when interacting with users, and maybe different ways of handling errors. So those differences won't go away.
There's still functionality here that can be re-used. Like string localization. It's something we haven't done for GetTor yet but want to. And BridgeDB has already done it. Sure, we use different strings, but we want to do the same kinds of things with them.
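To illustrate "different strings, same machinery", here is a small sketch of a shared lookup helper. It is not how either tool does localization today: the `CATALOGS` dictionary, the `translate` function, and the sample strings are made up, and a plain dictionary stands in for whatever catalog format would really be used.

```python
from typing import Dict

# Hypothetical per-tool catalogs: tool -> language -> message key -> text.
CATALOGS: Dict[str, Dict[str, Dict[str, str]]] = {
    "gettor": {
        "en": {"greeting": "Here are your download links:"},
        "es": {"greeting": "Aquí están tus enlaces de descarga:"},
    },
    "bridgedb": {
        "en": {"greeting": "Here are your bridges:"},
    },
}


def translate(tool: str, key: str, lang: str, default_lang: str = "en") -> str:
    """Look up a string, falling back from "es_MX" to "es" to the default language."""
    catalog = CATALOGS[tool]
    for candidate in (lang, lang.split("_")[0], default_lang):
        text = catalog.get(candidate, {}).get(key)
        if text is not None:
            return text
    raise KeyError(f"no translation for {tool}/{key}")
```

Both tools would call the same `translate` function with their own tool name, so the fallback logic is written once and only the catalogs differ.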
= There are external tasks that gettor wants to do, such as pushing Tor Browser to the various websites that make up the resources we return. But just as BridgeDB is growing external modules like Wolpertinger, it's reasonable that GetTor could have external modules that publish/mirror our packages to various sites.
Yeah, and I'd like to automate these tasks for GetTor anyway.
? Is there a third thing that Tor wants to do, of the form "use this access method to give people this resource"? Generalizing from three would be even better than generalizing from two.
Please use this initial brainstorming as a starting point and take this thread wherever it should best go. For example, maybe they should be two different front-ends and as much functionality as possible would be merged into a library that they both use. Or maybe gettor is essentially a smaller "lift" so it should become a module of what bridgedb does. Several options here.
Some more thoughts:
- There's a bit of a single-point-of-failure problem here. Once the tools are merged, when one thing goes down (or is attacked or whatever), both services stop working. But I think there's a better way to solve this than keeping the services separate.
- If we're going to do this, we should make sure we're paying attention to writing good, modular, readable code. My understanding is that this would (and should) involve a large refactoring effort. The worst case scenario would be for us to end up with a bunch of spaghetti code that's hard to understand and maintain or two separate services just running on the same machine. Will our current workload and sponsor work give us space to spend time on actually improving the source code?
? This might make it easier to get funding to work on both of these projects. I'm not a huge fan of letting funding concerns affect our design decisions. This can be a plus if it gives us more surface area for funding, or a minus if funding pushes us into a place we don't like.
I am overall for this change, though much more familiar with GetTor than BridgeDB. My guess is that we'd use the BridgeDB code base and integrate GetTor functionality into it since my impression is that it's more complicated than GetTor as it currently exists.
[0] https://bugs.torproject.org/3780
anti-censorship-team@lists.torproject.org