[tor-bugs] #25985 [Obfuscation/Snowflake]: Snowflake rendezvous using AMP cache

Tor Bug Tracker & Wiki blackhole at torproject.org
Mon Jun 11 01:23:56 UTC 2018


#25985: Snowflake rendezvous using AMP cache
-----------------------------------+--------------------------------
 Reporter:  twim                   |          Owner:  (none)
     Type:  project                |         Status:  needs_revision
 Priority:  Medium                 |      Milestone:
Component:  Obfuscation/Snowflake  |        Version:
 Severity:  Normal                 |     Resolution:
 Keywords:                         |  Actual Points:
Parent ID:                         |         Points:
 Reviewer:                         |        Sponsor:
-----------------------------------+--------------------------------
Changes (by dcf):

 * status:  needs_review => needs_revision


Comment:

 This is some review of commit
 [https://github.com/nogoegst/snowflake/commit/b75253ba55bde140be733c4cf5425fc1aab161c4
 b75253ba55bde140be733c4cf5425fc1aab161c4].

 I had trouble testing this change locally because `amper.Client` was
 changing my http into https. This is what I did:
 {{{
 broker$ ./broker -disable-tls -addr 127.0.0.1:8080
 proxy-go$ ./proxy-go -broker http://127.0.0.1:8080/
 client$ TOR_PT_MANAGED_TRANSPORT_VER=1 TOR_PT_CLIENT_TRANSPORTS=snowflake
 ./client -codec amp -url http://127.0.0.1:8080/
 }}}
 The error I get from the client is a result of attempting to parse HTTP as
 TLS:
 {{{
 BrokerChannel Error: tls: oversized record received with length 20527
 }}}
 (On further reflection, what I was trying to do wouldn't have worked
 anyway, because the AMP client code would have sent a URL path of
 `/c/s/.../amp/client/...`, not `/amp/client/...`, but I could have worked
 around that for testing.) It looks like the cause is
 [https://github.com/nogoegst/amper/blob/6dbbc42eb44c1f389591c83a268524d5eceed1d1/client.go#L55
 here in amper/client.go], with a hardcoded `"https"`. Looking at that, I
 also noticed a
 [https://github.com/nogoegst/amper/blob/6dbbc42eb44c1f389591c83a268524d5eceed1d1/client.go#L23
 CDNDomain = "cdn.ampproject.org"]. I have to say, it skeeves me a bit to
 see hardcoded constants like that in a library; it seems like the kind of
 thing that belongs in a configuration file.

 ----

 I like the idea behind
 [https://github.com/nogoegst/snowflake/commit/f358a7696045336ef5e15ff6f8ee2467d8d5f53a
 the commit that adds the -codec flag]: generalize a currently hardcoded
 interface, then provide alternate implementations in a later commit. I
 must say, though, that I don't like the layering of
 [https://github.com/nogoegst/snowflake/blob/f358a7696045336ef5e15ff6f8ee2467d8d5f53a/broker/broker.go#L222
 clientOffersPOST]→[https://github.com/nogoegst/snowflake/blob/f358a7696045336ef5e15ff6f8ee2467d8d5f53a/broker/broker.go#L186
 ctx.handleClientOffers]→[https://github.com/nogoegst/snowflake/blob/f358a7696045336ef5e15ff6f8ee2467d8d5f53a/broker/broker.go#L169
 DecodeError]. Instead of returning an `error` directly,
 `ctx.handleClientOffers` writes a JSON representation of the error into a
 buffer provided by the caller, which the caller must then convert back
 into an `error` by calling `DecodeError`. It seems unnecessarily
 convoluted just to enable the side effect of writing directly to the
 response body in the AMP case. This is a case where I would prefer to see
 duplicated code first, because I suspect we can find a better factoring.

 In my testing, the `DecodeError` error recovery didn't work. It decodes to
 an `Error` that has the same `E` string value but is unequal to
 `ErrNoSnowflakes`, for example. When I run a
 [https://gitweb.torproject.org/pluggable-
 transports/snowflake.git/tree/client?id=25b304a9a856f8c791882ad523df26ffc8fa629c
 master branch client] (with POST) against an
 [https://github.com/nogoegst/snowflake/tree/b75253ba55bde140be733c4cf5425fc1aab161c4/broker
 amp branch broker], it fails because the broker returns status 200 when it
 should return status 503. The culprit seems to be
 [https://github.com/nogoegst/snowflake/blob/b75253ba55bde140be733c4cf5425fc1aab161c4/broker/broker.go#L229-L234
 this error-matching switch]. If I add a `default` case with a panic, I see
 that the `ErrNoSnowflakes` case isn't being matched, even though that is
 the intention:
 {{{
 http: panic serving 127.0.0.1:56174: {"error":"No snowflakes proxies
 available"}
 }}}

 ----

 Here's an example of what the AMP responses look like:
 {{{
 $ wget -q -O- http://127.0.0.1:8080/client/amp
 <!doctype html>
 <html amp>
   <head>
     <meta charset="utf-8">
     <title>amp</title>
     <link rel="canonical" href="https://ampproject.org" />
     <meta name="viewport" content="width=device-width,minimum-scale=1
 ,initial-scale=1">
     <style>body {opacity: 0}</style><noscript><style>body {opacity:
 1}</style></noscript>
     <script async src="https://cdn.ampproject.org/v0.js"></script>
   </head>
   <body>
     <p>In varietate concordia</p>
     <pre id="data">eyJlcnJvciI6Ik5vIHNub3dmbGFrZXMg
 cHJveGllcyBhdmFpbGFibGUifQo</pre>
   </body>
 </html>
 [The base64 decodes to '{"error":"No snowflakes proxies available"}'.]
 }}}
 The overall AMP container encoding looks pretty reasonable. A few tiny
 questions:
  * [https://www.ampproject.org/docs/getting_started/create/basic_markup
 basic_markup] says the `<script async>` has to be the second child of
 `<head>`, right after `<meta charset="utf-8">`.
  * The `<style>body {opacity: 0}</style><noscript><style>body {opacity:
 1}</style></noscript>` doesn't match the documented
 [https://www.ampproject.org/docs/reference/spec/amp-boilerplate.html AMP
 boilerplate code], which is much longer. Is there a reason?
  * `<link rel="canonical" href="https://ampproject.org" />`: it would be
 better not to refer to any third parties. The AMP guide says this can be a
 relative URL, so I would prefer one of these if they work:
    * `<link rel="canonical" href="">`
    * `<link rel="canonical" href="#">`
    * `<link rel="canonical" href="?">`

 ----

 About the encoded data inside the AMP HTML container: I'm wondering if we
 should always put the data payload ''inside'' the same JSON object that
 can signal an error, rather than making the payload be sent ''instead of''
 the error object when there is no error. This is how it currently works:
  1. Try to parse the response body as JSON with an `"error"` field. If the
 JSON parsing is successful, handle the error and stop.
  2. Otherwise (JSON parsing failed), return the entire response body as a
 blob of bytes.
 That is, under the base64 encoding, a message can look either like this:
 {{{
 {"error": "No snowflake proxies available"}
 }}}
 or like this:
 {{{
 {"type": "answer", "sdp": "..."}
 }}}
 One problem with this design, as far as amper is concerned, is that it's
 impossible for the server to send a payload that happens to parse as JSON
 with an `"error"` field—it will be interpreted as an error, not a payload.
 This consideration doesn't affect Snowflake, because Snowflake's payloads
 don't have an `"error"` field even though they are JSON. But another
 consideration is that you can't tell the difference between a non-error
 response and an improperly encoded error response (if the body is
 truncated, for example).

 I propose to remove this ambiguity by including the data payload inside
 the JSON object that represents an error. Just like in an HTTP response,
 we have two things—a status code and a message—inside one package. A
 message would either look like this:
 {{{
 {"error": "No snowflake proxies available", "data": null}
 }}}
 or like this:
 {{{
 {"error": null, "data": "eyJ0eXBlIjogImFuc3dlciIsICJzZ..."}
 }}}
 (The `null` fields are included for clarity and could be omitted in
 practice.) To handle this, a client needs to `json.Unmarshal` on a `struct
 { Error, Data: string }`. Any JSON decoding error signifies an error in
 the AMP transport, not a Snowflake-level error. Assuming JSON decoding is
 successful, the `"error"` field indicates whether there was a Snowflake-
 level error.

 About signaling errors, I would prefer to use machine-readable strings
 (e.g. `"ErrNoSnowflakes"`) rather than human-readable strings (e.g. `"No
 snowflake proxies available"`). The latter seems more likely to get broken
 accidentally by someone adding punctuation, for example. The variable
 names you've chosen are good, but I would just them as the actual values,
 not just the identifiers. We can set up a canonical mapping, e.g.
 200⇒`""`, 503⇒`"ErrNoSnowflakes"`, 504⇒`"ErrAnswerTimeout"`, to map the
 status codes from POST into the error strings for common handling.

 ----

 Does
 [https://github.com/nogoegst/snowflake/blob/b75253ba55bde140be733c4cf5425fc1aab161c4/client/rendezvous.go#L69
 NewBrokerChannel] need to instantiate an `amper.Client` when it's not
 using `-codec amp`?

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25985#comment:24>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list