Hello,
Inspired by https://trac.torproject.org/projects/tor/ticket/18361 I've been working on a way to improve the situation.
My "proof of concept" tech demo is what I consider good enough for use by brave people that aren't me, so I have put up an XPI package at: https://people.torproject.org/~yawning/volatile/cfc-20160323/
The source: https://git.schwanenlied.me/yawning/cfc (Requires the Firefox SDK aka Jetpack to package).
By default the addon will:
* Rewrite the CloudFlare captcha error page with messages that reflect my perception of reality[0].
* Rewrite imgur ".gifv" links to ".gif".
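For readers who want to see what the gifv rewrite amounts to, here is a minimal sketch in Python rather than the addon's own JavaScript. The function name `rewrite_gifv` and the host check are assumptions made for illustration; the addon's actual matching rules may differ.

```python
from urllib.parse import urlsplit, urlunsplit


# Hypothetical helper; the name and the imgur host check are assumptions.
def rewrite_gifv(url):
    """Return url with a trailing ".gifv" path changed to ".gif" on imgur hosts."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    is_imgur = host == "imgur.com" or host.endswith(".imgur.com")
    if is_imgur and parts.path.lower().endswith(".gifv"):
        # Drop the final "v" so the plain GIF is requested instead.
        parts = parts._replace(path=parts.path[:-1])
    return urlunsplit(parts)
```

URLs on other hosts, and imgur URLs that do not end in ".gifv", are returned unchanged.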
Under "Tools->Addons->Extensions" you can configure the addon to:
* Automatically fetch a cached copy of pages hosted on CloudFlare infrastructure from archive.is.
* Automatically fetch a cached copy of pages that present a CloudFlare captcha from archive.is.
* Pop up a UI widget asking if you want to fetch a cached copy of the page from archive.is each time you encounter a captcha.
* Disable the snarky error message replacement (Requires a restart to take effect, because I'm lazy).
* Disable the gifv URL rewrite.
TODO:
* Support a user definable blacklist (eg: If you want to always use archive.is to access gawker.com or other clickbait bullshit, you should be able to easily do so).
* Add more general quality of life things.
* Think about making it work on Fennec (It currently will not because PopUpNotifications are handled differently, among other things).
* Rewrite the internals to prepare for e10s.
WARNING:
* If archive.is is evil, they can track you across page fetches trivially, because this sort of use case is outside of Tor Browser's current threat model (Yes, CloudFlare/Google can also do the same thing currently, who do you trust more?).
* PEOPLE THAT HAVE BIG SCARY ADVERSARIES IN THEIR THREAT MODEL SHOULD NOT USE THIS.
If you don't know how to install addons given as XPI files, you shouldn't be using this. This is only tested on 6.0a4 (Linux/64 bit). It *should* work on everything that isn't Orfox that's relatively modern, YMMV.
Regards,
[I hate replying to myself.]
On Wed, 23 Mar 2016 09:15:36 +0000 Yawning Angel yawning@schwanenlied.me wrote:
My "proof of concept" tech demo is what I consider good enough for use by brave people that aren't me, so I have put up an XPI package at: https://people.torproject.org/~yawning/volatile/cfc-20160323/
I noticed some dumb bugs and UI issues in the version I pushed so I changed a lot of things and uploaded a new version that should be better behaved. In particular:
* It is now Content Script based, and does IPC so it may survive the transition to sandboxed/multiprocess firefox better.
* It will always inject a button into the DOM instead of trying to display browser UI stuff (content scripts are supposed to have isolation...).
* The UI selection pref is removed.
* The ask on captcha option for behavior is removed, since a button always will be there to bypass it.
* Loading lots of pages that end up displaying street signs *should* now behave correctly.
The old release is under `./old` for posterity.
Sorry for the inconvenience,
During the OONI survey to find instances of server-side Tor blocking, we found a few variations on CloudFlare captcha pages. They don't all say "Attention Required!". Apparently there is an option to customize the page, but few sites make use of it. Here are the regexes we used (excerpted from https://www.bamsoftware.com/git/ooni-tor-blocks.git):

    if status == 403:
        if server == "cloudflare-nginx" and re.search(r"<title>Attention Required! \| CloudFlare</title>|One more step to access", body):
            return True, "403-CLOUDFLARE"
        if server == "cloudflare-nginx" and re.search('<noscript id="cf-captcha-bookmark" class="cf-captcha-info">|<button type="submit" class="cf-captcha-submit">', body):
            # A customized captcha page.
            return True, "403-CLOUDFLARE"
        if server == "cloudflare-nginx" and re.search(r"<title>Access denied \| [^ ]* used CloudFlare to restrict access</title>", body):
            # With this one you don't get a captcha. May be controlled by the
            # site operator.
            return True, "403-CLOUDFLARE"
    if status == 503:
        if re.search('<div class="cf-browser-verification cf-im-under-attack">', body):
            return True, "503-CLOUDFLARE"

I now think the 'server == "cloudflare-nginx"' tests are unnecessary. The last two patterns above don't even give you a captcha to solve, just deny access. You might want to limit your detection to 403 and 503 responses (or maybe exempt 200-series and 300-series responses).
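As a rough sketch of how the suggestions above could fit together, the regexes quoted in this message can be wrapped in one function that drops the Server header test and only inspects 403 and 503 responses. The function name `classify_cloudflare`, the constant names, and the `(False, None)` fallback are assumptions made for this example, not part of the OONI code.

```python
import re

# Patterns copied from the excerpt above. Grouping them into constants and
# the names used here are assumptions of this sketch.
CAPTCHA_PATTERNS = [
    r"<title>Attention Required! \| CloudFlare</title>|One more step to access",
    r'<noscript id="cf-captcha-bookmark" class="cf-captcha-info">'
    r'|<button type="submit" class="cf-captcha-submit">',
]
DENY_PATTERN = r"<title>Access denied \| [^ ]* used CloudFlare to restrict access</title>"
UNDER_ATTACK_PATTERN = r'<div class="cf-browser-verification cf-im-under-attack">'


def classify_cloudflare(status, body):
    """Return (blocked, label). Only 403 and 503 responses are inspected."""
    if status == 403:
        for pattern in CAPTCHA_PATTERNS + [DENY_PATTERN]:
            if re.search(pattern, body):
                return True, "403-CLOUDFLARE"
    elif status == 503:
        if re.search(UNDER_ATTACK_PATTERN, body):
            return True, "503-CLOUDFLARE"
    # Any other status code, or no matching pattern: not treated as a block.
    return False, None
```

Because 200-series and 300-series responses never reach the regexes, an ordinary page that merely quotes one of these strings is not flagged.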
These are a couple of sites that used customized CloudFlare:
    https://4chan.org/ ("Verification Required")
    https://yelp.com/ ("You're not barking up the wrong tree...")
yelp.com only started using CloudFlare a little while ago. It's a funny case, because they *also* implement a hard Tor blacklist. Once you get through the CloudFlare captcha 403, you get a 503 from a different system.
Thank you, Yawning! This looks great. :)
I think Kate was planning on writing up an official position of the Tor project on the CloudFlare situation. Amongst other things, it's expected to contain several strong arguments for convincing sites that the CAPTCHA does them no good and to make their CloudFlare configuration more Tor friendly. Or simply use another CDN like Akamai.
After that appears, one could add a mailto: link alongside the cfc button, so that users could easily start a dialog with the site where they encounter a CloudFlare CAPTCHA.
A mailto: link can have email header and body information like
    mailto:..@..?subject=Unreachable from Tor due to CloudFlare CAPTCHA&body=..
And the body could contain some text derived from whatever Kate writes.
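A small sketch of building such a link, assuming Python's standard `urllib.parse.quote` for the percent-encoding. The function name `build_mailto` and the address `webmaster@example.com` are placeholders invented for this example; the real body text would come from whatever Kate writes.

```python
from urllib.parse import quote


# Hypothetical helper; name and argument order are assumptions.
def build_mailto(address, subject, body):
    """Build a mailto: URI with percent-encoded subject and body fields."""
    # quote() encodes spaces as %20 rather than "+", which is what mailto: needs.
    return "mailto:{}?subject={}&body={}".format(
        quote(address, safe="@"),
        quote(subject, safe=""),
        quote(body, safe=""),
    )


link = build_mailto(
    "webmaster@example.com",  # placeholder address
    "Unreachable from Tor due to CloudFlare CAPTCHA",
    "See FAQ",  # placeholder body
)
```

Encoding with `safe=""` keeps characters such as "&" or "?" inside the subject or body from being read as field separators.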
In principle, the mailto: link's destination could determine the site's contact information from whois : https://stackoverflow.com/questions/8435678/whois-with-javascript If that's annoying, then simply placing a unix command like "whois [site] | grep Email" into the body along with some explanation should suffice.
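The "whois [site] | grep Email" idea can be sketched as a small text filter. This version works on whois output that has already been fetched, so it makes no network requests itself; the function name `contact_emails`, the address regex, and the sample record below are all assumptions made for illustration.

```python
import re

# Deliberately loose address pattern; an assumption, not a full RFC 5322 parser.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def contact_emails(whois_text):
    """Collect addresses from whois lines mentioning "email", in order, without duplicates."""
    found = []
    for line in whois_text.splitlines():
        # `grep Email` is case-sensitive; this sketch matches any case.
        if "email" not in line.lower():
            continue
        for address in EMAIL_RE.findall(line):
            if address not in found:
                found.append(address)
    return found


# Made-up whois record used only to exercise the filter.
sample = (
    "Domain Name: EXAMPLE.COM\n"
    "Registrar Abuse Contact Email: abuse@example.net\n"
    "Registrant Email: owner@example.com\n"
    "Tech Email: owner@example.com\n"
)
emails = contact_emails(sample)
```

On a machine with a whois client installed, the input text could come from something like `subprocess.run(["whois", domain], capture_output=True, text=True).stdout`. Doing that lookup from inside a browser extension is a separate problem.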
It's easy enough to do all this with a shell script of course, but if cfc moves towards many people using it then maybe encouraging people to email sites will help.
Jeff
On Wed, 2016-03-23 at 11:00 +0000, Yawning Angel wrote:
I noticed some dumb bugs and UI issues in the version I pushed so I changed a lot of things and uploaded a new version that should be better behaved.
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
Nice!
Random thought: rather than "unreachable from Tor", "unreachable when using the internet safely." This is really about people wanting security, and these companies not wanting to grapple with what their customers want.
On Wed, Mar 23, 2016 at 05:31:50PM +0100, Jeff Burdges wrote:
A mailto: link can have email header and body information like mailto:..@..?subject=Unreachable from Tor due to CloudFlare CAPTCHA&body=..
On Wed, Mar 23, 2016 at 12:33:15PM -0400, Adam Shostack wrote:
Nice!
Random thought: rather than "unreachable from Tor", "unreachable when using the internet safely." This is really about people wanting security, and these companies not wanting to grapple with what their customers want.
Yes! Not random at all. When trying to succinctly contrast current means to access and use registered-domain sites vs. onionsites I not infrequently slip into calling them the insecure web and the secure web respectively.
aloha, Paul
On Wed, 2016-03-23 at 14:09 -0400, Paul Syverson wrote:
On Wed, Mar 23, 2016 at 12:33:15PM -0400, Adam Shostack wrote:
Random thought: rather than "unreachable from Tor", "unreachable when using the internet safely." This is really about people wanting security, and these companies not wanting to grapple with what their customers want.
Yes! Not random at all. When trying to succinctly contrast current means to access and use registered-domain sites vs. onionsites I not infrequently slip into calling them the insecure web and the secure web respectively.
Yes, that sounds reasonable. There would be a bunch of linguistic decisions like that. I suggested waiting until Kate finishes her CloudFlare FAQ specifically because she would already be making many such decisions.
I think the main technical question is: How hard is it to safely use whois from JavaScript?
Jeff
Yawning Angel wrote:
Inspired by https://trac.torproject.org/projects/tor/ticket/18361 I've been working on way to improve the situation.
Neat. In the thread someone mentions that it's possible to derive the answer for the old-style street number captchas using tesseract [1]. Interestingly, there is a version of tesseract in JavaScript [2]. This is probably not especially useful for the current "select all boxes that contain one pixel of street sign" reCAPTCHA system, but if there were a way to trigger the old behavior, these techniques could be used together.
~Griffin
[1] https://trac.torproject.org/projects/tor/ticket/18361#comment:173 [2] http://tesseract.projectnaptha.com/
On Wed, Mar 23, 2016 at 2:15 AM, Yawning Angel yawning@schwanenlied.me wrote:
My "proof of concept" tech demo is what I consider good enough for use by brave people that aren't me, so I have put up an XPI package at: https://people.torproject.org/~yawning/volatile/cfc-20160323/
Very cool!
- If archive.is is evil, they can track you across page fetches trivially, because this sort of use case is outside of Tor Browser's current threat model (Yes, CloudFlare/Google can also do the same thing currently, who do you trust more?).
Because CloudFlare presents its captcha page under the target site's domain name, and the Google ReCAPTCHA iframe is embedded inside that, Tor Browser is designed to prevent tracking across visits to different CloudFlared sites. So in that sense the archive.is option allows more tracking.
One possible solution could be for the extension to replace the HTML content inside a desired content page (say, https://imgur.com/some-page.html) with an iframe containing the archive.is version. The iframe would then be embedded under the desired first-party domain (e.g., imgur.com instead of archive.is) so that the page requests and caching are isolated to imgur.com.