There are 4 parts.
-- /1./ --
For those who are new, let me recap my discussion with Philipp Winter.
I suggest bridge distribution based on browser fingerprinting.
Each fingerprint gets only one bridge, so an automated client that keeps asking just gets the same address over and over.
I can tell you that it is almost impossible to exhaustively spoof a fingerprint, and we can use that to our advantage. We can group similar fingerprints together so that it doesn't matter if someone spoofs a few attributes. And we should ignore the user-agent, since it is trivial to change.
>> I think I'm missing something here. Doesn't it suffice for an adversary
to tamper with a single feature to create a new browser fingerprint, and
thus obtain different bridges? I suppose it would depend on how the
server derives its fingerprints.<<
As I said: once we've given out all the bridges, one per unique fingerprint, we start grouping new fingerprints with their nearest match.
In other words, we assume the first users are more likely to be legitimate and that the GFW comes along later. So in the beginning we give a new bridge to any fingerprint that differs in any way from the ones we've seen before. Once we've given out every bridge we are willing to distribute through fingerprinting, a new fingerprint gets the same bridge as the known fingerprint with which it shares the most attributes. This defeats partial spoofing and, as I said, full spoofing is nearly impossible.
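To make the two phases concrete, here is a minimal Python sketch of how I read the scheme. Everything in it is an assumption for illustration: the attribute names, the `FingerprintDistributor` class, and the similarity measure (number of identical attribute/value pairs) are hypothetical, and a real deployment would need a more careful notion of "nearest".

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modelled as a set of (attribute, value) pairs (an assumption).
Fingerprint = FrozenSet[Tuple[str, str]]


def normalize(attrs: Dict[str, str]) -> Fingerprint:
    """Ignore the user-agent and freeze the remaining attributes."""
    return frozenset((k, v) for k, v in attrs.items() if k != "user_agent")


class FingerprintDistributor:
    """Hypothetical distributor: one bridge per fingerprint, then nearest match."""

    def __init__(self, bridges: List[str]) -> None:
        if not bridges:
            raise ValueError("need at least one bridge")
        self.unassigned = list(bridges)              # bridges not yet handed out
        self.assigned: Dict[Fingerprint, str] = {}   # fingerprint -> bridge

    def request(self, attrs: Dict[str, str]) -> str:
        fp = normalize(attrs)
        if fp in self.assigned:
            # Repeat requests always return the same bridge.
            return self.assigned[fp]
        if self.unassigned:
            # Phase 1: any previously unseen fingerprint gets a fresh bridge.
            bridge = self.unassigned.pop(0)
        else:
            # Phase 2: reuse the bridge of the known fingerprint that shares
            # the most attribute/value pairs with the new one.
            nearest = max(self.assigned, key=lambda seen: len(seen & fp))
            bridge = self.assigned[nearest]
        self.assigned[fp] = bridge
        return bridge
```

With two bridges and two early users, a later request that copies the first user's fingerprint but changes only the screen size lands on the first user's bridge instead of a new one.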
-- /2./ --
>> I know lots of people in China using Linode and Google Cloud, which also
require a credit card. They only support credit cards like VISA or MasterCard,
not China UnionPay cards. So it isn't too hard if someone wants to
get such a card, especially for the GFW, which is described as having "unlimited funds" by the
GFW technology review blog.
They don't block the IP just because a few people use it. Yes, the
credit card requirement did block a lot of people :(<<
Thanks Tom for clearing up my misunderstanding.
-- /3./ --
Why not switch obfs4 bridges to dynamic IPs and remove UpdateBridgesFromAuthority? This would make them far more blocking-resistant.
-- /4./ --
I take back what I said. I reckon the Salmon algorithm is currently the best proxy distribution scheme we have. I'm planning to read the paper, but just from what I've seen in the talk... https://www.invidio.us/watch?v=RO3wXRn8BfY
… I see one major vulnerability. What if the GFW gets just one bridge and then creates thousands of accounts (for which there is no barrier)? It sends an invite to each of those accounts, waits for their trust levels to rise so that each account receives a new bridge, and repeats until it knows all the bridges. Only then does it block them all in one go.
The solution is to limit how many new addresses are given out to the accounts that know a given bridge. Basically, this extends the recommendation grouping: every bridge that a set of accounts knows allows only a finite number of new bridges to be issued to those accounts. So an adversary that has one address can't crawl the whole network, only a fraction of it.
Another, though minor, weakness: I don't think social-network-based distribution would scale as well as fingerprinting-based distribution.
I'm not claiming to know it all, but hopefully my thoughts are helpful.
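As a rough illustration of that cap, here is a small Python sketch. It is not Salmon's actual algorithm: the `CappedDistributor` class, the invite model (an invited account inherits the inviter's known bridges), and the rule that a request is refused once any bridge the account knows has used up its budget are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Optional, Set


class CappedDistributor:
    """Hypothetical distributor with a per-bridge budget of new bridges."""

    def __init__(self, bridges: List[str], cap: int) -> None:
        self.pool = list(bridges)
        self.cap = cap                         # new bridges allowed per known bridge
        self.cursor = 0                        # rotates through the pool
        self.known: Dict[str, Set[str]] = {}   # account -> bridges it knows
        self.issued_via: Counter = Counter()   # bridge -> new bridges issued to its knowers

    def seed(self, account: str, bridge: str) -> None:
        """Give an initial account its first bridge."""
        self.known[account] = {bridge}

    def invite(self, inviter: str, new_account: str) -> None:
        """An invited account starts with the inviter's known bridges."""
        self.known[new_account] = set(self.known[inviter])

    def request_new(self, account: str) -> Optional[str]:
        known = self.known[account]
        # Refuse if any bridge this account knows has exhausted its budget.
        if any(self.issued_via[b] >= self.cap for b in known):
            return None
        n = len(self.pool)
        for i in range(n):
            candidate = self.pool[(self.cursor + i) % n]
            if candidate not in known:
                self.cursor = (self.cursor + i + 1) % n
                for b in known:
                    self.issued_via[b] += 1    # charge every known bridge
                known.add(candidate)
                return candidate
        return None
```

In this model an adversary who starts with one bridge and invites a thousand sock-puppet accounts learns at most `cap` further bridges, because every one of those accounts knows the original bridge and draws on the same budget.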
It seems like we are trying to solve two distinct problems with some large sub-problems, but we haven't adequately unraveled them:
1. How to distribute bridges from a database to minimize bridge enumeration by adversaries
   a. How to identify users
   b. How to distribute bridges to identified users
2. How to add more bridges and/or refresh burnt bridges to the database
   a. Adding more bridges
   b. Refreshing burnt bridges
------------------------------
*1a.* Seems to be a topic of discussion. Before addressing a problem, I prefer to understand what I want out of a solution. In the case of 1a for this project, I'd like:
- Consistent mapping
- Cross-platform
- Maximizes user privacy/anonymity
- Understandable to users
- Usable by all users
- No/minimal maintenance required
Three solutions have been proposed: browser fingerprinting (via this thread), social networks (via the Salmon paper), and SMS. I'd argue that all three fail to be useful in this case. (At least w/r/t the criteria I proposed. I'd be curious to hear other criteria sets.)
*Browser fingerprinting* fails because if I try to get bridges from my desktop, my laptop, my cell, a library, or an internet cafe, I will not be mapped to the same user. Additionally, if I make a change to my setup (switch from Firefox to Chrome, new monitor resolution, etc.), even on the same machine, I likely would not be mapped to the same profile (or if I am, it seems that a censor would be able to generate enough profiles that they end up in several "trusted" groups). I'd also argue that collecting the browser fingerprints of some of the most vulnerable Tor users (those that need bridges) is a risk. While I trust the Tor Project to maintain good data and security practices at present, am I confident that a new vulnerability won't arise, or a new 0-day that could be used to acquire the information Tor maintains? No, especially not given the value that a DB of fingerprints of those circumventing censorship would have to a censor.
*Social networks* fail because they are not usable by all users (whether by choice or by censorship). This approach will also require maintenance.
*SMS* - Tom states "Chinese +86 phone numbers must be real name verification to using (due to against like telephone crimes). Using these number received the SMS will break the anonymity." Philipp also states that this would likely be resource intensive.
I'll propose one more option, PGP signing, which I think works better than the above three but is far from perfect.
*PGP signing* requests for bridges would enable a consistent, cross-platform mapping. There are plenty of resources available to understand how PGP works, and minimal maintenance and implementation work would be needed. There are lots of open source implementations too. We also get the added bonus of years of people looking at trust and PGP keys, so we don't need to start from scratch when approaching this difficult problem. Additionally, users must deliberately sign their request with their preferred key.
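To illustrate what "consistent mapping" could mean here, below is a minimal Python sketch that maps a key fingerprint to a bridge with a keyed hash. It assumes the signature on the request has already been verified by an OpenPGP implementation (not shown), and the function name `bridge_for_key`, the server-side `secret`, and the modulo selection are all my own assumptions rather than an existing design.

```python
import hashlib
import hmac
from typing import List


def bridge_for_key(key_fingerprint: str, bridges: List[str], secret: bytes) -> str:
    """Map a (verified) PGP key fingerprint to one bridge, deterministically.

    Hypothetical helper: the caller must already have checked the signature.
    """
    if not bridges:
        raise ValueError("need at least one bridge")
    # Canonicalize so spacing and letter case do not change the result.
    canonical = key_fingerprint.replace(" ", "").upper().encode("ascii")
    # Keyed hash, so users cannot predict or choose their position.
    digest = hmac.new(secret, canonical, hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(bridges)
    return bridges[index]
```

The same key gives the same bridge from any device, which is the property I care about. One caveat of this simple version is that the mapping shifts whenever the bridge list changes, so a real design would want something more stable than a plain modulo.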
------------------------------
*1b.* Seems well solved by Salmon, at least on my first read.
------------------------------
*2a.* I do not have any thoughts on this.
------------------------------
*2b.* I have been thinking about this a lot lately. It seems that there are two parts to solving this problem too, but either could be useful on its own:
1. Reachability probing from adversarial regions (aka identifying burnt bridges)
2. Ability for bridges to easily and quickly change IP address (refreshing a bridge)
1. Is a really interesting problem which could help inform our approaches to these other problems. It would provide more accurate data as to when bridges get blocked (which seems useful w/ Salmon). Of course, this would require either having a vantage point in an adversarial region, or taking advantage of existing protocols to cause something in the adversarial region to ping a server. We also wouldn't want an adversary to be able to simply follow our "pings" to enumerate our servers. I do not think this is an easy problem, but I think a successful solution would be immensely useful.
2. Is dependent on the bridge hosting solution; however, we may be able to provide scripts for a few environments which would do the job. For example, you could implement 2 in gcloud pretty simply:
- Set up gcloud compute https://cloud.google.com/compute/docs/gcloud-compute on your Google Cloud OBFS server.
- Then set up a script which does the following when triggered:
  - Initializes a new instance https://cloud.google.com/compute/docs/gcloud-compute#creating
  - Gives the old instance ssh access to the new instance https://cloud.google.com/compute/docs/gcloud-compute#connecting-other
  - SCPs over the existing torrc, bridge state files, etc.
  - Once complete, uses ssh to tell the new bridge that all files have been copied over and that it can finish initialization and start up
  - New bridge deletes the old bridge https://cloud.google.com/compute/docs/gcloud-compute#deleting after ensuring it is fully functional
There may be a step or two missing above, but this should give the bridge a new IP address. The script could be triggered manually by a bridge operator, or, if we figure out a bridge reachability solution, the bridge could use it and automatically restart after failed reachability tests. Regarding the scripts, my understanding is that AWS also provides enough power in its CLI to do the same, and this should add minimal (if any) expense on top of what bridge operators are already incurring in the cloud.
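The steps above could be sketched roughly as follows, in Python driving the gcloud CLI. This is an untested outline under several assumptions: the file locations in `FILES_TO_COPY`, the `finish-bridge-setup.sh` script on the new instance, and the instance and zone names are all hypothetical, and for simplicity it runs every step from one place (for example the operator's machine) rather than having the new bridge delete the old one itself.

```python
import shlex
import subprocess
from typing import List

# Hypothetical locations of the torrc and bridge state; adjust per setup.
FILES_TO_COPY = ["/etc/tor/torrc", "/var/lib/tor"]
# Hypothetical script on the new instance that installs the files and starts tor.
FINISH_COMMAND = "sudo ./finish-bridge-setup.sh"


def rotation_commands(old: str, new: str, zone: str) -> List[List[str]]:
    """Build the gcloud invocations for one bridge rotation, in order."""
    return [
        # 1. Create the replacement instance.
        ["gcloud", "compute", "instances", "create", new, "--zone", zone],
        # 2. Copy the torrc and state files to the new instance's home directory.
        ["gcloud", "compute", "scp", "--recurse", "--zone", zone,
         *FILES_TO_COPY, f"{new}:~/"],
        # 3. Tell the new instance to finish initialization and start up.
        ["gcloud", "compute", "ssh", new, "--zone", zone,
         "--command", FINISH_COMMAND],
        # 4. Delete the old instance once the new one is up.
        ["gcloud", "compute", "instances", "delete", old, "--zone", zone, "--quiet"],
    ]


def rotate(old: str, new: str, zone: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    for cmd in rotation_commands(old, new, zone):
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run only: prints the four commands without touching any instance.
    rotate("bridge-old", "bridge-new", "us-central1-a")
```

Because `check=True` aborts on the first failing command, the old instance is only deleted if the earlier steps succeeded. A real script would still want an explicit health check of the new bridge before step 4.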
Best, Sam
On Fri, May 1, 2020 at 9:57 PM soncyq47 soncyq47@protonmail.com wrote:
anti-censorship-team mailing list anti-censorship-team@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/anti-censorship-team