[tor-bugs] #30704 [Circumvention/Snowflake]: Plan for snowflake update versioning and backwards compatibility

Wed Jun 19 19:11:04 UTC 2019

#30704: Plan for snowflake update versioning and backwards compatibility
-------------------------------------+------------------------
 Reporter:  cohosh                   |          Owner:  (none)
     Type:  task                     |         Status:  new
 Priority:  Medium                   |      Milestone:
Component:  Circumvention/Snowflake  |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:                           |  Actual Points:
Parent ID:                           |         Points:
 Reviewer:                           |        Sponsor:
-------------------------------------+------------------------

Old description:

> We have some upcoming changes that will change the way snowflake
> components talk to each other. We should decide (and possibly on a case-
> by-case basis) how to handle these updates.
> - Do we make sure changes are backwards compatible with clients/proxies
> that haven't updated yet?
> - Should we think about introducing some concept of versioning?
> - If we support older versions, how long until we no longer support them?
>
> Some examples of tickets that we'll need to think about this for:
> - 25429
> - 29206
> - #25985

New description:

 We have some upcoming changes that will change the way snowflake
 components talk to each other. We should decide (and possibly on a case-
 by-case basis) how to handle these updates.
 - Do we make sure changes are backwards compatible with clients/proxies
 that haven't updated yet?
 - Should we think about introducing some concept of versioning?
 - If we support older versions, how long until we no longer support them?

 Some examples of tickets that we'll need to think about this for:
 - #25429
 - #29206
 - #25985

--

Comment (by dcf):

 Ideally, we can keep the proxies as protocol-ignorant as possible, so that
 they don't impede changes at the endpoints (end-to-end principle). #29206,
 for example, likely won't require any changes in the proxy, which can
 continue blindly forwarding anything it receives. Similarly, #25985
 ideally only requires changes in the client and broker, not the proxy and
 server. The exception to this principle is if we need to change something
 about the WebRTC tunnel rather than the data tunnelled within it, for
 example if we want to try an unreliable channel.

 If we do need to upgrade proxies, my mental model is that proxies are easy
 to upgrade (at least the web-based ones) because they
 [https://gitweb.torproject.org/pluggable-
 transports/snowflake.git/tree/proxy/static/embed.html?id=91255463c68c3ada6adc8718bf380cbd654fe9ef#n4
 reboot themselves] once a day. Flash proxy had the same feature, and in
 [https://www.bamsoftware.com/talks/ee380-flashproxy/index.html#s14 this
 graph] you can see how quickly proxies upgraded when we had them
 [[#7063|start sending a cookie]].
 ([[span(style=background:blanchedalmond;padding:0 0.5ex,--)]] is cookie-
 naive; [[span(style=background:darkkhaki;padding:0 0.5ex,unset)]] and
 [[span(style=color:white;background:#222222;padding:0 0.5ex,1)]] are
 cookie-aware.)

 Let's take #29206 as an example. It changes the format of the stream
 between client and server to add framing. Here are a few potential ways to
 handle it:
  1. Flag day. Ignore backward compatibility. Push out an upgraded client
 in Tor Browser and try to coordinate an upgrade of the server at roughly
 the same time. This of course breaks all clients that do not upgrade, but
 we can perhaps get away with that at this stage.
  2. Backward-compatible protocol versioning. I think this is what cohosh
 is suggesting in comment:13:ticket:29206. We know that the old protocol is
 a raw TLS stream, so we can add a header of 00000000 or something to the
 new protocol, anything that enables easy distinguishing from the old
 protocol. The server peeks at the first few bytes to know what protocol is
 in use, and then switches to raw-stream mode or framing mode as
 appropriate. 00000000 could change to different numbers to represent
 future upgrades. The downside here is code complexity, maintaining two or
 more code paths.
  3. Parallel deployment. We make a branch for the old protocol and merge
 the new protocol into master. We deploy two instances of the server, one
 speaking the old protocol and one speaking the new. These could be on
 separate IP addresses/domain names, or could even be on the same host, say
 !wss://snowflake.bamsoftware.com/ and !wss://snowflake.bamsoftware.com/v2,
 with a reverse proxy diverting requests as appropriate. When a client
 registers with the broker, it includes a signal that indicates which
 protocol version the client supports. The proxy will need to know which
 server to connect to, so either we have the broker ''tell'' the
 proxy which server to connect to instead of having that information
 hardcoded in the proxy (something like #25598), or else we maintain two
 pools of proxies, one that uses the old server and one that uses the new.
 Eventually we deactivate the old server. This way would put the code
 complexity in the broker rather than the server, so it depends on the
 nature of the code change whether the trade is worth it.
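
 As a rough illustration of option 2, here is a minimal sketch of a server
 peeking at the start of a stream to pick a code path. Everything named here
 is hypothetical: `MAGIC_LEN`, `KNOWN_HEADERS`, `sniff_protocol`, and the
 choice of eight zero bytes as the header (the text above only says
 "00000000 or something"). It relies on the fact that a TLS stream begins
 with a record whose first byte is nonzero, so the legacy protocol cannot be
 mistaken for the new header.

```python
import io

# Hypothetical 8-byte marker; the ticket only says "00000000 or something".
# A legacy stream starts with a TLS record, whose first byte is nonzero
# (0x16 for a handshake record), so it can never match this prefix.
MAGIC_LEN = 8
KNOWN_HEADERS = {
    b"\x00" * MAGIC_LEN: "framed-v1",
    # Future protocol versions would register other header values here.
}


def sniff_protocol(stream: io.BufferedReader) -> str:
    """Return "raw" for the legacy protocol, else the name of the new mode.

    The header is consumed only when it is recognized, so a legacy stream
    is handed to the raw-stream code path with no bytes missing.
    """
    # peek() does not advance the stream. On a real socket it may return
    # fewer than MAGIC_LEN bytes; a production server would have to wait
    # for more data before deciding. This sketch ignores that case.
    head = stream.peek(MAGIC_LEN)[:MAGIC_LEN]
    mode = KNOWN_HEADERS.get(head)
    if mode is None:
        return "raw"
    stream.read(MAGIC_LEN)  # strip the header before framing begins
    return mode


if __name__ == "__main__":
    legacy = io.BufferedReader(io.BytesIO(b"\x16\x03\x01\x02\x00..."))
    framed = io.BufferedReader(io.BytesIO(b"\x00" * 8 + b"payload"))
    print(sniff_protocol(legacy), sniff_protocol(framed))
```

 The dictionary lookup is where the "two or more code paths" cost shows up:
 every entry is a protocol variant the server has to keep supporting.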

 As for sending additional information (such as a version flag) from the
 client to the broker, I would ideally like to see that bundled into the
 registration blob. comment:16:ticket:29206 suggests using parallel
 metadata such as a URL path or HTTP header, but that only works with HTTP-
 based rendezvous, not for others that are proposed in #25594. Currently
 the rendezvous blob is just the raw text of the [https://w3c.github.io
 /webrtc-pc/#dom-rtcsessiondescription RTCSessionDescription] JSON:
 {{{
 {
   "type": "offer",
   "sdp": "v=0\r\no=...\r\n"
 }
 }}}
 It would be better if there were another layer so that we could put other
 metadata into the blob. Like:
 {{{
 {
   "version": 1,
   "foo": "bar",
   "sessiondescription": {
     "type": "offer",
     "sdp": "v=0\r\no=...\r\n"
   }
 }
 }}}
 The benefit is that we can bundle all of the latter into e.g. a DNS request,
 and in that way we make the rendezvous method independent of the contents
 of the rendezvous message. We could adapt to the nested format in a
 backward-compatible way by having the broker check whether there is a
 `"sessiondescription"` key at the top level, and if not, synthesize a new
 message that has the entire former message nested under that key. The
 additional code complexity is not bad: just check for the old format and
 convert it to the new format if needed before doing any other processing.
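
 The conversion step described above could look roughly like this. It is
 only a sketch: `normalize_registration` is a made-up name, and the real
 broker would do this in its own language and with its own error handling.

```python
import json


def normalize_registration(raw: str) -> dict:
    """Return a registration message in the nested format.

    Hypothetical helper: if the top level has no "sessiondescription" key,
    the message is assumed to be an old-style bare RTCSessionDescription
    and is nested under that key. New-style messages pass through as-is.
    """
    msg = json.loads(raw)
    if "sessiondescription" not in msg:
        msg = {"sessiondescription": msg}
    return msg


if __name__ == "__main__":
    old_style = '{"type": "offer", "sdp": "v=0\\r\\no=...\\r\\n"}'
    print(normalize_registration(old_style))
```

 After this step the rest of the broker only ever sees the nested format,
 whichever kind of client sent the registration.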

 Something similar applies to the broker's response messages toward the
 client and proxies. Currently the messages depend on HTTP metadata, namely
 the status code (comment:2:ticket:29293). They look like this:
  *
    {{{
 HTTP/2.0 200 OK
 Content-Length: 742

 {
   "type":"answer",
   "sdp":"v=0\r\no=...\r\n"
 }
    }}}
  *
    {{{
 HTTP/2.0 504 Gateway Timeout
 Content-Length: 0

    }}}
 It would be better if ''all'' the necessary information were in the HTTP
 body, because that's something that can be easily bundled up into other
 channels like DNS or AMP cache. Something like this:
  *
    {{{
 HTTP/2.0 200 OK
 Content-Length: 780

 {
   "status": 200,
   "sessiondescription": {
     "type":"answer",
     "sdp":"v=0\r\no=...\r\n"
   }
 }
    }}}
  *
    {{{
 HTTP/2.0 504 Gateway Timeout
 Content-Length: 21

 {
   "status": 504
 }
    }}}
 Then we upgrade clients and proxies to only look at the HTTP body and
 ignore the status code. We keep sending the old status codes for the
 benefit of older clients. Now that I think of it, this is backward-
 compatible for error responses, because old clients/proxies will only look
 at the status code and ignore the body, but not backward compatible for
 status 200, because the format of the body message will change. Maybe we
 could keep the top-level `"type"`/`"sdp"` keys as they are, to signify an
 implicit status 200. Also, `"status": 504` is just an example; we may
 prefer to represent that as a meaningful token like `"status": "no-
 proxies"`.
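
 To make the compatibility rules concrete, here is a sketch of how an
 upgraded client or proxy might interpret a broker response under the
 variant where a successful answer keeps its `"type"` and `"sdp"` keys at
 the top level. `interpret_response` is a hypothetical name, and falling
 back to the HTTP status code when the body is empty or unparseable is my
 assumption about how to cope with a broker that has not been upgraded.

```python
import json


def interpret_response(http_status, body: bytes):
    """Return (status, answer) for a broker response.

    Hypothetical helper. The body is authoritative whenever it can be
    parsed; http_status is only a fallback and may be None on channels
    that have no status code (e.g. DNS).
    """
    msg = None
    if body:
        try:
            msg = json.loads(body)
        except ValueError:  # covers malformed JSON and bad encodings
            msg = None
    if isinstance(msg, dict):
        if "type" in msg and "sdp" in msg:
            # A bare session description means an implicit status 200.
            return 200, msg
        if "status" in msg:
            # Either a number (504) or a token such as "no-proxies".
            return msg["status"], None
    # Assumption: an old broker sends an empty body on errors.
    return http_status, None


if __name__ == "__main__":
    print(interpret_response(200, b'{"type":"answer","sdp":"v=0"}'))
    print(interpret_response(504, b'{"status": "no-proxies"}'))
    print(interpret_response(504, b""))
```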

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/30704#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online