[metrics-bugs] #33010 [Metrics/Ideas]: Monitor cloudflare captcha rate: do a periodic onionperf-like query to a cloudflare-hosted static site

Tue Mar 10 03:15:32 UTC 2020

#33010: Monitor cloudflare captcha rate: do a periodic onionperf-like query to a
cloudflare-hosted static site
---------------------------------------+------------------------------
 Reporter:  arma                       |          Owner:  metrics-team
     Type:  task                       |         Status:  new
 Priority:  Medium                     |      Milestone:
Component:  Metrics/Ideas              |        Version:
 Severity:  Normal                     |     Resolution:
 Keywords:  network-health gsoc-ideas  |  Actual Points:
Parent ID:                             |         Points:
 Reviewer:                             |        Sponsor:
---------------------------------------+------------------------------

Comment (by woswos):

 I wanted to conduct a few simple experiments on this issue. I will start
 by explaining my setup and continue with the experiments themselves.

 '''Domain Setup'''
 I registered two domains ([https://captcha.wtf/ captcha.wtf] and
 [https://exit11.online/ exit11.online]) with IPv4 records on Cloudflare.
 After experimenting with the Cloudflare settings, I realized that domain
 owners play an important role in how Cloudflare blocks Tor users.

 A new free Cloudflare account comes with a default security level (similar
 to the security levels in Tor Browser, as comment:5 mentioned), and the
 default security level doesn't explicitly block Tor users. I am not
 saying Cloudflare is innocent, but they don't mention any blocking of Tor
 users at this security level. However, Tor shows up as a country in the
 Cloudflare firewall settings, and it is possible to block Tor users with a
 firewall rule that matches it. I think they have a list of Tor exit node
 IPs, and they use this list to perform the filtering. They "offer" JS and
 captcha challenges in addition to simple blocking, as shown in the image
 below:

 [[Image(https://bottomless-pit.barkin.io/tor-firewall-rules.png,
 width=100%)]]

 I think that's why some Tor users face more captcha challenges at higher
 Tor Browser security levels: JavaScript is blocked at those levels, so
 they can't pass the Cloudflare JS challenges.
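
 For a measurement script, the two challenge types would need to be told
 apart from a normal page. The following is only a rough sketch: the
 function name `classify_response`, the status codes (403 for a captcha
 page, 503 for a JS challenge), and the body markers are assumptions of
 mine, not something confirmed in this ticket, and would have to be checked
 against real responses.

```python
def classify_response(status, headers, body):
    """Label a fetched page as 'ok', 'captcha', 'js_challenge', or 'other'.

    All markers below are heuristic assumptions, not a documented
    Cloudflare interface.
    """
    # Header names are case-insensitive, so normalize them before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    from_cloudflare = "cloudflare" in lowered.get("server", "").lower()
    text = body.lower()

    if status == 200:
        return "ok"
    # Assumption: a captcha page comes back as 403 and mentions "captcha".
    if from_cloudflare and status == 403 and "captcha" in text:
        return "captcha"
    # Assumption: a JS challenge comes back as 503 with a "Just a moment" page.
    if from_cloudflare and status == 503 and "just a moment" in text:
        return "js_challenge"
    return "other"
```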
 \\
 Also, if a firewall rule related to Tor is set, Cloudflare applies that
 rule (for example, the never-ending captcha challenge) every time, even
 if the user managed to pass the challenge 5 seconds ago. I think that is
 the part all of us hate: it just creates an endless loop. The sample
 Cloudflare firewall record below shows the same IP address being
 challenged over and over again, even after successfully passing the
 captcha challenge.

 [[Image(https://bottomless-pit.barkin.io/tor-firewall-1.png, width=100%)]]
 \\
 exit11.online has the default Cloudflare configuration without any
 additional firewall or protection. I am guessing that this is the case
 for most Cloudflare users. I also registered the
 [https://bypass.exit11.online/ bypass.exit11.online] subdomain, which
 bypasses the Cloudflare proxy and uses Cloudflare only as a DNS hosting
 service.

 [[Image(https://bottomless-pit.barkin.io/tor-cloudflare-exit11.png,
 width=100%)]]
 \\
 captcha.wtf has the default Cloudflare configuration ''with the additional
 firewall configuration'' for blocking Tor users described above. I
 registered this second domain to see the difference between using the
 default Cloudflare settings and adding additional firewall rules. I also
 registered the [https://bypass.captcha.wtf/ bypass.captcha.wtf]
 subdomain, which bypasses the Cloudflare proxy and uses Cloudflare only
 as a DNS hosting service.

 [[Image(https://bottomless-pit.barkin.io/tor-cloudflare-wtf.png,
 width=100%)]]

 [[Image(https://bottomless-pit.barkin.io/tor-cloudflare-wtf-firewall.png,
 width=100%)]]
 \\
 Both of these domains serve a very simple static "Hello world!" page at
 `/index.html` and a more complicated page at `/complex.html` that loads
 resources from different locations. Additionally, captcha.wtf and
 exit11.online have SSL certificates issued by Cloudflare, while
 bypass.captcha.wtf and bypass.exit11.online have SSL certificates issued
 by Let's Encrypt. I thought that these differences might have an effect on
 the way Cloudflare behaves.

 '''Experimenting'''
 Next, I used the Python script mentioned in comment:7 (it uses httplib)
 and the tor-browser-selenium mentioned in comment:13 to conduct a few
 simple experiments. I wrote another script to fetch different domain
 combinations via tor-browser-selenium and Python's httplib. For example,
 it fetches bypass.exit11.online, exit11.online,
 exit11.online/complex.html, and bypass.exit11.online/complex.html via
 both tor-browser-selenium and Python's httplib.
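
 The set of combinations can be sketched roughly as below. This is not my
 actual script, only an illustration: the names `build_targets` and
 `run_round` are made up here, the fetchers are passed in as plain
 callables so that no particular tor-browser-selenium or httplib call is
 assumed, and the bare domain is written as `/index.html`.

```python
from itertools import product

HOSTS = [
    "exit11.online",
    "bypass.exit11.online",
    "captcha.wtf",
    "bypass.captcha.wtf",
]
PATHS = ["/index.html", "/complex.html"]
CLIENTS = ["tor-browser-selenium", "httplib"]


def build_targets(hosts=HOSTS, paths=PATHS, clients=CLIENTS):
    """Return every (client, url) pair that one measurement round covers."""
    return [
        (client, "https://" + host + path)
        for client, host, path in product(clients, hosts, paths)
    ]


def run_round(targets, fetchers):
    """Fetch each target once.

    `fetchers` maps a client name to a callable taking a URL; what that
    callable returns (status, page source, ...) is up to the caller.
    """
    return [(client, url, fetchers[client](url)) for client, url in targets]
```

 Calling `run_round(build_targets(), fetchers)` once per minute would
 reproduce the schedule described in the results.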

 '''Results'''
 After fetching each combination about 100 times at one-minute intervals,
 the domain with the default configuration (exit11.online) was not blocked
 a single time, via either Tor or httplib. However, the domain with the
 additional firewall configuration (captcha.wtf) was blocked every single
 time it was fetched via Tor. As expected, both of the `bypass` subdomains
 were fine since the Cloudflare proxy was disabled, but I wanted to test
 them anyway.
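
 A per-domain block rate, which is closer to the metric the original
 ticket asks for, could be computed from the recorded fetches along these
 lines. This is a minimal sketch that assumes each fetch was logged as a
 `(client, host, blocked)` tuple; that record layout and the name
 `block_rates` are hypothetical.

```python
from collections import defaultdict


def block_rates(results):
    """Map each (client, host) pair to the fraction of its fetches blocked.

    `results` is an iterable of (client, host, blocked) tuples, where
    `blocked` is True when the fetch hit a block or challenge page.
    """
    totals = defaultdict(int)
    blocked_counts = defaultdict(int)
    for client, host, blocked in results:
        key = (client, host)
        totals[key] += 1
        if blocked:
            blocked_counts[key] += 1
    # Only keys with at least one fetch exist, so the division is safe.
    return {key: blocked_counts[key] / totals[key] for key in totals}
```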

 '''Possible Conclusions'''
 I'm sure my simple tests are nowhere near enough to draw a meaningful
 conclusion, but these results make me question the role of domain owners
 in this endless captcha problem. The domain with the default Cloudflare
 configuration didn't block Tor users, but the domain with the extra
 firewall configuration set by the domain owner blocked Tor users every
 time. However, again, this is an observation based on my very limited
 experiments.

 I want to conduct more advanced experiments based on your feedback to
 address the metrics mentioned in the original ticket and find possible
 patterns in the recorded data.

 I will organize my code a little bit more and put a link to the repository
 here. Meanwhile, please feel free to use both of these domains for further
 testing.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33010#comment:14>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online