[tor-bugs] #30716 [Circumvention/Obfs4]: Improve the obfs4 obfuscation protocol

Wed Sep 4 02:29:32 UTC 2019

#30716: Improve the obfs4 obfuscation protocol
-------------------------------------------------+-------------------------
 Reporter:  phw                                  |          Owner:  phw
     Type:  task                                 |         Status:
                                                 |  assigned
 Priority:  High                                 |      Milestone:
Component:  Circumvention/Obfs4                  |        Version:
 Severity:  Normal                               |     Resolution:
 Keywords:  sponsor28, anti-censorship-roadmap-  |  Actual Points:
  august                                         |
Parent ID:                                       |         Points:  20
 Reviewer:                                       |        Sponsor:
                                                 |  Sponsor28-must
-------------------------------------------------+-------------------------
Changes (by phw):

 * cc: dcf (added)

Comment:

 Website fingerprinting attacks typically operate on traffic traces, which
 are often encoded as sequences of the form:
 {{{
 <time>,+/-<packet length>
 }}}
 `+<packet length>` refers to packets going from the client to the server
 and `-<packet length>` refers to packets going from the server to the
 client. For example:
 {{{
 1567548098,+1500
 1567548098,+800
 1567548099,-1500
 1567548099,-1500
 1567548100,-700
 }}}
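
 As a rough illustration of this encoding, the following Python sketch parses
 such lines into records. The names `Packet` and `parse_trace` are made up
 for this example and are not part of obfs4 or of any fingerprinting
 toolkit; the timestamps are assumed to be whole Unix seconds, as in the
 example above.

```python
from typing import List, NamedTuple


class Packet(NamedTuple):
    time: int       # timestamp as given in the trace (assumed Unix seconds)
    direction: int  # +1 = client to server, -1 = server to client
    length: int     # packet length in bytes


def parse_trace(text: str) -> List[Packet]:
    """Parse lines of the form '<time>,+/-<packet length>'."""
    packets = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        time_str, signed_len = line.split(",", 1)
        signed = int(signed_len)  # int() accepts a leading '+' or '-'
        direction = 1 if signed > 0 else -1
        packets.append(Packet(int(time_str), direction, abs(signed)))
    return packets


EXAMPLE = """\
1567548098,+1500
1567548098,+800
1567548099,-1500
1567548099,-1500
1567548100,-700
"""

trace = parse_trace(EXAMPLE)
```
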
 Interestingly, packet lengths may not even be necessary. In their
 [https://arxiv.org/pdf/1801.02265.pdf CCS'18 paper], Sirinam et al. write
 in Section 5.1.1:
 > However, we performed preliminary evaluations to compare the WF attack
 > performance between using packet lengths and without packet lengths, i.e.,
 > only packet direction, as feature representations. Our result showed that
 > using packet lengths does not provide a noticeable improvement in the
 > accuracy of the attack. Therefore, we follow Wang et al.’s methodology and
 > consider only the direction of the packets.
 The traffic trace above can therefore be reduced to:
 {{{
 +1
 +1
 -1
 -1
 -1
 }}}
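
 The reduction to directions is mechanical. A minimal sketch, assuming the
 `<time>,+/-<packet length>` encoding shown earlier (the function name
 `to_directions` is invented for this example):

```python
def to_directions(trace_lines):
    """Keep only the sign of each '<time>,+/-<packet length>' record."""
    directions = []
    for line in trace_lines:
        signed_len = line.strip().split(",", 1)[1]
        # '-' marks server-to-client; anything else is client-to-server.
        directions.append(-1 if signed_len.startswith("-") else 1)
    return directions


lines = [
    "1567548098,+1500",
    "1567548098,+800",
    "1567548099,-1500",
    "1567548099,-1500",
    "1567548100,-700",
]
directions = to_directions(lines)
```
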
 Note that obfs4 makes no attempt to defend against website fingerprinting
 attacks. Its goal is to evade protocol classification, but the two
 problems (and their respective attacks) overlap to some extent, which is
 why obfs4 would be better off with defences against such attacks.

 [https://lists.torproject.org/pipermail/tor-dev/2017-June/012310.html As
 dcf already pointed out], obfs4 only sends data when the application
 (e.g., Tor) has data to send. Then, depending on what iatMode is used,
 obfs4 may append padding to the application's data and add inter-arrival
 delays. Coming back to the example above, obfs4 can only **extend** a
 packet burst but not **break** a burst. That is, obfs4 can turn the packet
 sequence
 {{{
 +1
 +1
 -1
 -1
 -1
 }}}
 into the sequence
 {{{
 +1
 +1
 +1 (padding packet, which extends a burst)
 -1
 -1
 -1
 }}}
 but not into the sequence
 {{{
 +1
 -1 (padding packet, which breaks a burst)
 +1
 -1
 +1 (padding packet, which breaks a burst)
 -1
 -1
 }}}
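
 One way to make the difference between extending and breaking concrete is
 to run-length encode the direction sequence, treating each run of
 same-direction packets as one burst. The sketch below does that for the
 three sequences above; `bursts` is a name chosen for this example only.

```python
from itertools import groupby


def bursts(directions):
    """Run-length encode a direction sequence into (direction, count) pairs."""
    return [(d, len(list(group))) for d, group in groupby(directions)]


original = [+1, +1, -1, -1, -1]
extended = [+1, +1, +1, -1, -1, -1]        # padding appended to the first burst
broken = [+1, -1, +1, -1, +1, -1, -1]      # padding inserted against the flow

# Extending changes burst sizes but not the number of bursts;
# breaking changes the number of bursts.
original_bursts = bursts(original)
extended_bursts = bursts(extended)
broken_bursts = bursts(broken)
```

 Extending leaves the number of bursts at two, whereas the burst-breaking
 sequence has six.
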
 I spent some time looking into ways to fix this issue. It turns out that
 we can add the ability to break packet bursts to obfs4 without losing
 backwards compatibility, allowing a brand-new, burst-breaking obfs4 client
 to talk to an old obfs4 server (however, see below for a caveat). I
 implemented a simple proof-of-concept, for now called
 [https://trac.torproject.org/projects/tor/wiki/doc/PluggableTransports/BabyNameBook
 sharknado], in my
 [https://dip.torproject.org/phw/obfs4/commit/8da050f29866444b9af685d277c20b7ab142593a
 feature/30716 branch]. The idea is simple: instead of having obfs4 write
 directly to its socket, it now writes to the `SharknadoConn` struct, which
 implements the `net.Conn` interface. After each call to `Read`, there's a
 1 in 10 chance of sending padding, regardless of whether the application
 has data waiting.
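
 To illustrate the wrapping idea only, here is a small Python sketch. It is
 not the code in the branch, which is written in Go and implements
 `net.Conn`. The class `PaddingConn`, its parameters, and the all-zero
 padding payload are assumptions made for this example; real padding would
 have to be framed so that the peer can recognise and discard it.

```python
import random


class PaddingConn:
    """Hypothetical wrapper that may send padding after every read."""

    def __init__(self, conn, pad_probability=0.1, pad_len=64, rng=None):
        self._conn = conn                        # object with recv() and sendall()
        self._pad_probability = pad_probability  # 0.1 = "1 in 10 chance"
        self._pad_len = pad_len                  # placeholder padding size in bytes
        self._rng = rng if rng is not None else random.Random()

    def read(self, n):
        data = self._conn.recv(n)
        # Decide independently of whether the application has data to send.
        if self._rng.random() < self._pad_probability:
            self._conn.sendall(self._padding())
        return data

    def write(self, data):
        self._conn.sendall(data)

    def _padding(self):
        # Placeholder payload; not how obfs4 actually frames padding.
        return b"\x00" * self._pad_len
```

 Because the padding decision is made on the read path, a padding packet
 can be sent while the peer is still in the middle of its burst, which is
 what allows a burst to be broken rather than merely extended.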

 There are several remaining challenges:
 * Effectively breaking bursts may require the client and the server to
 cooperate. For example, when the client receives the beginning of a burst,
 the adversary (who's somewhere between the client and the server) may
 already have seen the entire packet sequence, so we cannot break it
 anymore. We may be able to address this by having the server send only a
 few packets of its burst and then wait until it has received the client's
 burst-breaking packets.
 * We should find a way to make obfs4's packet sequences server-specific by
 incorporating the server's shared secret into the sequence generation
 process, just as is done for packet lengths and inter-arrival times.
 * We need to build an evaluation framework to understand what works and
 what doesn't.
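
 For the second point, one conceivable approach is to derive the padding
 decisions from a generator seeded with the shared secret, so that each
 server produces its own characteristic sequence. The sketch below is only
 an illustration under that assumption: `padding_schedule`, the label
 string, and the use of HMAC-SHA256 with Python's `random.Random` are all
 choices made for this example, and `random.Random` is not
 cryptographically secure. It does not reflect how obfs4 currently seeds
 its length and timing distributions.

```python
import hashlib
import hmac
import random


def padding_schedule(server_secret: bytes, n: int, pad_probability: float = 0.1):
    """Return n pad/no-pad decisions determined by the server's secret."""
    # Derive a seed from the secret; the label separates this use from others.
    digest = hmac.new(server_secret, b"burst-breaking-schedule",
                      hashlib.sha256).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return [rng.random() < pad_probability for _ in range(n)]


# Both endpoints know the secret, so both can compute the same schedule.
client_view = padding_schedule(b"example-secret", 100)
server_view = padding_schedule(b"example-secret", 100)
```
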

 Any thoughts?

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/30716#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online