[tor-bugs] #27112 [Core Tor/Stem]: Decouple payload processing from pop/unpack + tune abstraction layers

Thu Aug 30 19:34:49 UTC 2018

#27112: Decouple payload processing from pop/unpack + tune abstraction layers
---------------------------+--------------------------------
 Reporter:  dmr            |          Owner:  dmr
     Type:  enhancement    |         Status:  needs_revision
 Priority:  Medium         |      Milestone:
Component:  Core Tor/Stem  |        Version:
 Severity:  Normal         |     Resolution:
 Keywords:  client         |  Actual Points:
Parent ID:                 |         Points:
 Reviewer:  atagar         |        Sponsor:
---------------------------+--------------------------------

Comment (by dmr):

 Ok, now onto specific points in your comment.

 Replying to [comment:7 atagar]:
 > I think we might be misunderstanding each other, and it boils down to
 one simple question: **When you say 'for multi-hop circuits' which of the
 following do you mean?**
 >
 > a. So we can **relay**. That is to say, be in the **middle** of the
 circuit.
 > b. So we can **make circuits**. That is to say, we're the **originator**
 of a multi-hop circuit.
 >
 > If you mean 'a' then yes, much of this is necessary, but that is **not**
 our task at present. If the later then I'm unsure why much of this is
 necessary but quite possible I'm missing something.
 I mean 'b', but I do think that the `stem.client.cell` shouldn't have any
 limitations in it that prevents 'a'. Any logic that differentiates relay
 from client should be at another code layer. So I think with the proposed
 solutions, there isn't a difference here.

 > > (1) to confirm that we didn't get unlucky with 'recognized' == 0;
 >
 > If we're a client then all cells we receive are decryptable. The
 'recognized' field is nothing more than a poor man's 'is this cell still
 encrypted?' check. This is only relevant if we implement relaying.
 >
 > > (2) to work against various corruption / attacks that would affect the
 integrity of our payload
 >
 > As for this part, yup. I'd actually be **delighted** to merge a
 targeted, simple patch that implements decryption digests (with tests). I
 stared at your branch for hours this weekend hoping to integrate that, but
 honestly I couldn't make heads or tails of this code. The myriad of
 helpers (especially with names like 'coerce' and 'interpret') made my head
 spin.
 Agreed, it was messy and had a lot of helpers (and naming is hard). I'll
 try to take a look at adjusting `decrypt()`. Right now it also don't check
 `'recognized'`, which I believe it should do before attempting to create a
 RelayCell (and per my proposed solutions, //not// attempt to create a
 RelayCell).

 > > but I was pretty solid with the encrypt() and decrypt() interface /
 methods
 >
 > I kept the encrypt/decrypt interfaces the same. The **only** thing I
 dropped was...
 I guess I meant the interface holistically.

 Hopefully this illustrates some of the differences.

 `decrypt()`
 ||= Where =||= Returns =||= Raises =||= Params =||
 ||branch||(RawRelayCell, fully_decrypted bool, CipherContext,
 HASH)||nothing, I believe||self (BaseRelayCell), CipherContext, HASH||
 ||master||(RelayCell, CipherContext, HASH)||stem.ProtocolError and
 anything the `RelayCell` unpacking or constructor may
 raise||//(staticmethod)// link_protocol, bytes from ORPort, CipherContext,
 HASH||

 For brevity, I won't do the same for encrypt(), but it's similar. (It also
 only has a single layer of encryption implemented in the branch, although
 it's fairly easily refactored to `BaseRelayCell`)

 Again, I'm not tied to the specifics here. I do especially think returning
 `fully_decrypted` is necessary, and it makes sense to me that this method
 operate on the same types that it returns a superset of.

 > b. Rearchitecture of our RelayCell class, splitting this up into dozens
 of helpers. I wasn't a fan of this.
 Sorry, that was messy and overdone. Some of it I believe helped and was
 necessary, but other parts arguably didn't add much value, or were placed
 at the wrong level. (I'm still rethinking how to have a hierarchy for
 RELAY/RELAY_EARLY `Cell`s.)

 > Sorry! I know it sucks to get this kind of pushback on a branch you've
 worked so hard on.
 You raise fair points. Mostly I'm just trying to communicate where I'm
 coming from, and it looks like neither the code nor the commit messages
 did a good job of this.

 > 3. I **want** to merge your decryption digest implementation (there's
 now a TODO comment in decrypt() about it) but I couldn't make sense of
 this code.
 Good TODO comment - it helps keep track of things.

 For posterity's sake: it's a bit more complicated than the example in the
 comment.
 * the digest is computed using the //payload// (not the whole packed cell)
 with 0 in the digest field (hence the `pack_payload` helper method)
 * `self.digest` is an int per unpacking (and otherwise would've been
 coerced by the constructor), but `new_digest` is still a HASH, so we need
 to coerce the latter to `int` to compare (hence the `coerce_digest` helper
 method)

 > 4. I **do not** want to split RelayCell or excessively complicate this
 class unless there's a very good reason.
 Yep, this was excessive. I hope that my comments help explain at least a
 bit of what I consider necessary complexity.

 > Hope that helps!
 Likewise, I hope these help!

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/27112#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online