[tor-bugs] #25985 [Obfuscation/Snowflake]: Add AMP cache as another domain fronting option with Google

Tor Bug Tracker & Wiki blackhole at torproject.org
Thu May 3 17:25:39 UTC 2018


#25985: Add AMP cache as another domain fronting option with Google
-----------------------------------+------------------------
 Reporter:  twim                   |          Owner:  (none)
     Type:  project                |         Status:  new
 Priority:  Medium                 |      Milestone:
Component:  Obfuscation/Snowflake  |        Version:
 Severity:  Normal                 |     Resolution:
 Keywords:                         |  Actual Points:
Parent ID:                         |         Points:
 Reviewer:                         |        Sponsor:
-----------------------------------+------------------------

Comment (by dcf):

 Replying to [comment:5 twim]:
 > > I presume you at least need a Google account; is it something you set
 up in the Google Cloud Platform? Is there a fee?
 >
 > Curiously enough you don't need a Google account for that because the
 AMP project itself isn't solely a Google thing. It is just a special HTML
 markup that can be accelerated by any party incl. Google. You just set up
 an AMP version of your pages at your host and it just works. No GCP
 involved. There is no fee at the moment for page loading, there will only
 be on API calls (not our case). As IANAL, I am not aware whether this
 usage violates ToS. I couldn't find any.

 Thanks for this great info. It's a lot easier than I imagined. The Google
 AMP cache will issue GET requests to arbitrary URLs on your behalf (going
 back to the [https://www.bamsoftware.com/papers/oss.pdf OSS] idea). I
 tried it with my web server, which I haven't done anything to set up for
 AMP:
   https://www-bamsoftware-com.cdn.ampproject.org/c/s/www.bamsoftware.com/amptest
 This resulted in an HTTP request to my server:
 {{{
 64.233.172.149 - - [03/May/2018:10:59:20 -0600] "GET /amptest HTTP/1.1" 404 3726 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Google-AMPHTML)"
 }}}
 Probably because the page doesn't pass AMP validation (i.e., doesn't
 exist), the AMP cache's response was a status-200 meta/JavaScript redirect
 to the original URL:
 {{{
 HTTP/1.1 200 OK
 Location: https://www.bamsoftware.com/amptest
 Cache-Control: private
 X-Content-Type-Options: nosniff
 Date: Thu, 03 May 2018 17:05:10 GMT
 Content-Type: text/html; charset=UTF-8
 Server: sffe
 Content-Length: 361
 X-XSS-Protection: 1; mode=block
 Alt-Svc: hq=":443"; ma=2592000; quic=51303433; quic=51303432; quic=51303431; quic=51303339; quic=51303335,quic=":443"; ma=2592000; v="43,42,41,39,35"

 <HTML><HEAD>
 <meta http-equiv="content-type" content="text/html;charset=utf-8">
 <TITLE>Redirecting</TITLE>
 <META HTTP-EQUIV="refresh" content="1;
 url=https://www.bamsoftware.com/amptest">
 </HEAD>
 <BODY
 onLoad="location.replace('https://www.bamsoftware.com/amptest'+document.location.hash)">
 Redirecting you to https://www.bamsoftware.com/amptest</BODY></HTML>
 }}}
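
 A client that talks to the AMP cache would need to tell this redirect stub
 apart from real cached content. The following is a minimal sketch, assuming
 the stub always carries a meta refresh tag like the one above; the names
 `MetaRefreshFinder` and `redirect_target` are made up for illustration.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collects the target URL of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "meta" or self.target is not None:
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "refresh":
            return
        # The content value looks like "1; url=https://example.com/page",
        # possibly with a line break after the semicolon.
        _, _, rest = (attr.get("content") or "").partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url":
            self.target = value.strip()


def redirect_target(body):
    """Return the meta-refresh target found in body, or None."""
    finder = MetaRefreshFinder()
    finder.feed(body)
    return finder.target
```

 A return value of None would suggest the cache served actual content rather
 than bouncing the client to the origin.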

 > > I've seen different kinds of AMP URLs...
 > > Do you know what the difference between all these URL styles is? Are
 they basically interchangeable? The first one looks like the best, if we
 can use it.
 >
 > I haven't managed to make URLs like
 https://www.google.com/amp/s/amp.reddit.com/blablabla to not redirect to
 the full article. I probably just do not understand how this kind of
 link differs from the others.

 The trick with these is that you have to use a mobile User-Agent. Press
 Ctrl+Shift+I to open the browser developer tools, click the "Responsive
 Design Mode" button, and choose a phone from the menu.
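
 The same thing can be tried outside a browser by sending a mobile
 User-Agent header. This is only a sketch: the User-Agent string is the one
 from the access log above with the "(compatible; Google-AMPHTML)" suffix
 removed, the helper name `mobile_request` is hypothetical, and it is an
 untested assumption that this header alone is enough to avoid the redirect.

```python
import urllib.request

# Mobile Chrome User-Agent copied from the access log in this comment,
# without the "(compatible; Google-AMPHTML)" suffix.
MOBILE_UA = (
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 "
    "Mobile Safari/537.36"
)


def mobile_request(url):
    """Build (but do not send) a GET request that claims to be a phone."""
    return urllib.request.Request(url, headers={"User-Agent": MOBILE_UA})
```

 Passing the result to `urllib.request.urlopen` would send the request.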

 > > https://amp-reddit-com.cdn.ampproject.org/
 >
 > This is the kind of links I am using in amper. I guess that in theory
 *.cdn.ampproject.org can resolve to non-Google IPs as well. These hosts
 can be fronted by typical Google server names.

 Okay yeah, I found these guides to the URL format. The `c` means content
 (can also be `r` for resource or `i` for image) and the `s` means use TLS.
   https://developers.google.com/amp/cache/overview#amp-cache-url-format
   https://www.ampbyexample.com/advanced/using_the_google_amp_cache/#amp-cache-url-format
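
 To make the path format concrete, here is a small sketch that splits an AMP
 Cache URL back into its parts. It assumes only the `c`/`r`/`i` and `s`
 markers described above, ignores query strings, and uses a hypothetical
 function name, `parse_amp_cache_url`.

```python
from urllib.parse import urlsplit

# First path segment of an AMP Cache URL, as described in the guides above.
CONTENT_TYPES = {"c": "content", "r": "resource", "i": "image"}


def parse_amp_cache_url(url):
    """Return (kind, uses_tls, origin_url) for an AMP Cache URL."""
    segments = urlsplit(url).path.lstrip("/").split("/")
    kind = CONTENT_TYPES[segments[0]]
    rest = segments[1:]
    # An "s" segment right after the type marker means the origin uses TLS.
    uses_tls = bool(rest) and rest[0] == "s"
    if uses_tls:
        rest = rest[1:]
    scheme = "https" if uses_tls else "http"
    return kind, uses_tls, scheme + "://" + "/".join(rest)
```

 Applied to the test URL from earlier in this comment, it should give back
 the type "content", TLS enabled, and the original bamsoftware.com URL.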

 > > https://amp.reddit.com/
 >
 > This is the host from which one is serving their AMP pages.

 I see; so "amp" in the name here is just a convention, not a requirement.

 I found this description of the three kinds of URLs:
   https://www.ampproject.org/latest/blog/whats-in-an-amp-url/
 The "*.cdn.ampproject.org" ones they call "AMP Cache" URLs and the
 "google.com/amp" ones they call "AMP Viewer" URLs. It seems like the "AMP
 Viewer" URLs are only produced automatically by Google in search result
 pages. But yeah, in any case, you can domain-front the "AMP Cache" URLs.
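
 A rough sketch of what domain fronting an "AMP Cache" URL could look like:
 the TLS connection (and therefore the SNI) goes to a front domain, while
 the Host header names the cdn.ampproject.org host. The choice of
 `www.google.com` as the front and the function names are assumptions for
 illustration, not something tested in this ticket.

```python
import http.client
from urllib.parse import urlsplit


def fronted_request_parts(amp_cache_url, front="www.google.com"):
    """Return (connect_host, path, headers) for a fronted GET request."""
    parts = urlsplit(amp_cache_url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Connect to the front, but ask for the AMP cache host inside HTTP.
    return front, path, {"Host": parts.netloc}


def fetch_fronted(amp_cache_url, front="www.google.com"):
    """Send the fronted request and return (status, body)."""
    connect_host, path, headers = fronted_request_parts(amp_cache_url, front)
    conn = http.client.HTTPSConnection(connect_host, 443, timeout=30)
    try:
        # http.client does not add its own Host header when one is supplied.
        conn.request("GET", path, headers=headers)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
```

 An observer of the connection would see only the front domain; the
 cdn.ampproject.org host name appears only inside the encrypted request.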

 Here is how these all link together. You start with the original non-AMP
 page:
 https://www.reddit.com/r/OutOfTheLoop/comments/56euau/whats_with_google_amp_quite_annoyingly_being_used/
 In the source code, there is a `<link rel="amphtml" />` that points to the
 AMP version:
 https://amp.reddit.com/r/OutOfTheLoop/comments/56euau/whats_with_google_amp_quite_annoyingly_being_used/
 The AMP URL (or any URL) can be mechanically converted to an AMP Cache
 URL:
   https://amp-reddit-com.cdn.ampproject.org/c/s/amp.reddit.com/r/OutOfTheLoop/comments/56euau/whats_with_google_amp_quite_annoyingly_being_used/
 In some contexts such a page may instead appear under an AMP Viewer URL
 (which adds a header to the page), for example:
 https://www.google.com/amp/s/amp.reddit.com/r/funny/comments/8gpwtd/shut_up_and_take_my_money/
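
 The mechanical conversion to an AMP Cache URL can be sketched as follows.
 This is a simplified version: it only handles plain ASCII host names, and
 the hyphen-doubling step is an assumption taken from the cache URL format
 guide, not something shown in the examples here. The function name
 `amp_cache_url` is hypothetical.

```python
from urllib.parse import urlsplit


def amp_cache_url(url):
    """Convert an http(s) URL into a cdn.ampproject.org "content" URL."""
    parts = urlsplit(url)
    host = parts.hostname
    # Subdomain rule (simplified): double existing hyphens, then turn dots
    # into hyphens, so amp.reddit.com becomes amp-reddit-com.
    subdomain = host.replace("-", "--").replace(".", "-")
    # "s/" marks an origin that is fetched over TLS.
    tls = "s/" if parts.scheme == "https" else ""
    result = "https://{}.cdn.ampproject.org/c/{}{}{}".format(
        subdomain, tls, host, parts.path
    )
    if parts.query:
        result += "?" + parts.query
    return result
```

 For the amp.reddit.com URL above this reproduces the AMP Cache URL given in
 this comment.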

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25985#comment:7>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

