[tor-bugs] #27112 [Core Tor/Stem]: Decouple payload processing from pop/unpack + tune abstraction layers

Thu Aug 30 19:30:28 UTC 2018

#27112: Decouple payload processing from pop/unpack + tune abstraction layers
---------------------------+--------------------------------
 Reporter:  dmr            |          Owner:  dmr
     Type:  enhancement    |         Status:  needs_revision
 Priority:  Medium         |      Milestone:
Component:  Core Tor/Stem  |        Version:
 Severity:  Normal         |     Resolution:
 Keywords:  client         |  Actual Points:
Parent ID:                 |         Points:
 Reviewer:  atagar         |        Sponsor:
---------------------------+--------------------------------

Comment (by dmr):

 Sorry again for the delay in my response, and for the verbosity...

 I think it's probably best that I step back a bit to explain a bit more of
 the broader picture as I see it.
 The changes in that branch were indeed messy, but I think they did push
 towards the vision I have. While I don't have the "end design" quite
 figured out (hence most of the messiness), I think it would help if I
 clarify at least an intermediate part of my vision. Thereafter, I can
 respond a bit more specifically to your comments.

 I see the following current limitations in stem.client's design:
 1. dropping multiplexed cells or (worse) throwing exceptions when they
 occur
 2. having `unpack()`/`pop()` methods that effectively won't work on the
 stream of data received from a guard, if there's //any// RELAY/RELAY_EARLY
 cells in there
 3. allowing only a single layer of decryption, instead of an arbitrary
 number of layers
 4. being limited to only a single "actor" that may send/process Cells
 (related to `1.`)

 There are more, but the above are the main ones that I was trying to
 address with this ticket.
 I felt my changes took a pretty big step forward towards these (not
 completed), but I feel the current master has not - mostly taking only
 some abstraction-layer tuning and small fixes.

 I'm not dead-set on any given design, but I hope the following helps
 explain why I was trending the way I was.

 ==== Limitation 1: dropping multiplexed cells or (worse) throwing
 exceptions when they occur
 By this I mean: Tor is multiplexed by nature. Most cell sequences on the
 wire can be interleaved with others (the spec calls out just a few
 exceptions).
 This should be fairly self-evident by the ability to have multiple
 circuits and streams open concurrently, but I wanted to call it out
 explicitly.

 This is related to `4.` but goes beyond it - any hop on any of our
 circuits may send a Cell to us - it does not have to be a response in part
 of a sequence.

 The client needs some way to handle this - i.e. to demultiplex.

 **My proposed solution** to this for stem is to centralize ORPort reads -
 separating that from more specific handlers. A central point (probably a
 dedicated thread) should be the only thing reading from the ORport socket.
 Sorting and handling of incoming cells probably will be facilitated by
 queues (not there yet).
 (This ticket doesn't aim to centralize that, but rather take a step
 towards making that possible.)

 ==== Limitation 2: having `unpack()`/`pop()` methods that effectively
 won't work on the stream of data received from a guard, if there's //any//
 RELAY/RELAY_EARLY cells in there

 Because our `Cell.unpack()`/`Cell.pop()` methods will directly create an
 interpreted RelayCell, from the payload sent over the wire (which, per the
 spec, is always encrypted), these methods effectively can't be used on
 incoming bytes that may have a RELAY/RELAY_EARLY cell.
 (Said a different way, interpreting an encrypted payload as if it were
 unencrypted doesn't get us the data we want, and has plenty of potential
 to throw an exception.)

 Since most of a tor client's useful operation involves relayed Cells,
 these methods in their current form don't quite make sense to me.

 The `Cell.by_value()` lookup method seemed like a natural place to adjust
 the returned class, but I'm not tied to changing that if you'd suggest
 another method.

 To make these methods more universally useful (e.g. to use `Cell.unpack()`
 in whatever solution fixes `1.`), **my proposed solution** for this is to
 unpack RELAY cells into some intermediate form that allows parsing the
 payload but not interpreting it. This was the basis for the
 `RawRelayCell`.

 I'm not tied to `RawRelayCell`, but I did see it as an interim way to
 solve this. (I'm still trying to figure out the best hierarchy for
 RELAY/RELAY_EARLY Cell classes.)

 ==== Limitation 3: allowing only a single layer of decryption, instead of
 an arbitrary number of layers
 Right now, the `Cell.decrypt()` method creates a `RelayCell`, meaning that
 it suffers from the same problems as `unpack()`/`pop()` does, for use in
 multi-hop circuits.

 We need a way to arbitrarily decrypt the payload without creating a
 "Exception hazard" (here: RelayCell) automatically. When decrypting, we at
 least need to parse and look at `'recognized'`, and per the spec
 [[https://gitweb.torproject.org/torspec.git/tree/tor-
 spec.txt?id=43c2a65a28ea671cd627df9ea5e89f371825a6cd#n1548|only consider a
 cell decrypted if we properly also confirm the digest]]:
 > If the digest is correct, the cell is considered "recognized" for the
 purposes of decryption (see section 5.5 above).

 Maybe something unclear here is this: //any// relay in our circuit may
 send us a Cell, so we can't just assume an encrypted cell is from the
 final hop - we need to iteratively decrypt until we can tell it is
 decrypted. Only at that point do we know //which// relay sent it to us.

 `RawRelayCell` also served that purpose - it was just a `Cell` container
 for the payload.
 Using this intermediary allowed a fairly nice API for multi-hop
 decryption, without the consumer needing to know specifics. I just want to
 point out the
 [[https://gitweb.torproject.org/stem.git/commit/?id=5baa2e37728ab50a5132d81225070e097e6bd058|example
 code that I had in a commit message]]. Here's the main snippet:
 {{{
       # relay_1 := nearest relay in circuit
       decryption_states = [
         (relay_1, digest_1, decryptor_1),
         (relay_2, digest_2, decryptor_2),
         (relay_3, digest_3, decryptor_3),
       ]
       new_decryption_states = copy.deepcopy(decryption_states)

       from_relay = None
       for i in range(len(new_decryption_states)):
         relay, digest, decryptor = new_decryption_states[i]

         cell, fully_decrypted, digest, decryptor = cell.decrypt(digest,
 decryptor)
         new_decryption_states[i] = (relay, digest, decryptor)

         if fully_decrypted:
           from_relay = relay
           break

       if from_relay:
         decryption_states = new_decryption_states
       else:
         # handle this, since the cell couldn't be decrypted
         # probably raise Something()
         pass
 }}}

 I don't think we want the `Cell.decrypt()` method to handle anything more
 than a //layer// of decryption. I believe the Circuit should maintain the
 crypto state, be told whether the cell is fully decrypted, and know what
 to do if fully decrypted or not.

 So if `decrypt()` doesn't return `cell, fully_decrypted, digest,
 decryptor` (where `cell` is typically a raw cell), I think it should at
 least return `payload, fully_decrypted, digest, decryptor`. These are
 practically the same except the latter is lower-level (arguably leaky
 abstraction) with an added need to create a `Cell` from that payload - a
 step that could be reduced - and likely a static method instead of an
 instance method. (An instance method, to me, seems more natural.)

 In other words, **my proposed solution** for this is to have `decrypt()`:
 * callable for an abitrary number of encryption layers
 * //not// cause Exceptions
 * //not// return an interpreted Cell by default

 ==== Limitation 4: being limited to only a single "actor" that may
 send/process Cells (related to `1.`)
 All of the stem.client code currently locks the ORPort, sends cells, and
 receives cells, and releases the lock.
 This is related to `1.`, but different (hopefully I can explain this ok).

 Given limitation `1.`, it's not //that// bad if only a single actor makes
 use of stem.client.

 But this limitation means that stem.client only allows a single
 synchronous message stream, and doesn't have good ways to support longer-
 term streams.

 Again, **my proposed solution** for this is for stem is to centralize
 ORPort reads, as well as writes.
 (This ticket doesn't aim to centralize that, but rather take a step
 towards making that possible.)

 ==== Summary
 I'm not tied to any specific solutions, and definitely open to suggestion.
 I hope the above at least explains what I'm trying to solve in this
 ticket.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/27112#comment:9>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online