[tor-bugs] #21693 [Core Tor/Tor]: prop224 HS descriptors do wasteful double-base64 encoding

Thu Mar 9 16:09:30 UTC 2017

#21693: prop224 HS descriptors do wasteful double-base64 encoding
------------------------------+--------------------------------
     Reporter:  asn           |      Owner:
         Type:  task          |     Status:  new
     Priority:  Medium        |  Milestone:  Tor: 0.3.1.x-final
    Component:  Core Tor/Tor  |    Version:
     Severity:  Normal        |   Keywords:  tor-hs prop224
Actual Points:                |  Parent ID:  #21334
       Points:  4             |   Reviewer:
      Sponsor:  SponsorR-can  |
------------------------------+--------------------------------
 In #21334 we implement the new prop224 HS descriptor format, which uses
 double-layer encryption to support the client auth functionality.

 As part of that design, we first base64 the ciphertext of the inner layer
 of the descriptor. Then when we create the outer descriptor layer, we
 base64 the ciphertext of the middle layer which also includes the inner
 layer.

 This results in a construction as follows:

 `middle_layer = base64(encrypt(client_auth_data + base64(encrypt(inner_layer))))`
 `outer_layer = header + middle_layer`

 Notice that the above construction base64-encodes the inner layer twice,
 which is wasteful. During design and development we glossed over this,
 thinking that the waste was small and that it did not matter because we
 already pad the whole `middle_layer` to 10k bytes (a typical default
 `middle_layer` is about 4k bytes).
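
 To make the overhead concrete, here is a rough Python sketch of the
 construction above next to a variant that skips the inner base64. It is
 only an illustration: `fake_encrypt` is a hypothetical stand-in that
 models ciphertext length and nothing else (the real encryption adds its
 own overhead), and `middle_layer_binary` is an assumed shape for the
 binary alternative, not a specified format.

```python
import base64
import os


def fake_encrypt(plaintext: bytes) -> bytes:
    # Hypothetical stand-in for the real encryption: returns random bytes
    # of the same length, so only sizes are modeled (no real crypto here).
    return os.urandom(len(plaintext))


def middle_layer_current(client_auth_data: bytes, inner_layer: bytes) -> bytes:
    # Current design: the inner ciphertext is base64-encoded, then the
    # whole middle-layer ciphertext is base64-encoded again.
    inner_b64 = base64.b64encode(fake_encrypt(inner_layer))
    return base64.b64encode(fake_encrypt(client_auth_data + inner_b64))


def middle_layer_binary(client_auth_data: bytes, inner_layer: bytes) -> bytes:
    # Assumed alternative: embed the raw inner ciphertext and base64 only
    # once. A real binary format would also need length framing.
    inner_ct = fake_encrypt(inner_layer)
    return base64.b64encode(fake_encrypt(client_auth_data + inner_ct))


inner = bytes(900)  # arbitrary example size, divisible by 3 to avoid padding
print(len(middle_layer_current(b"", inner)))  # 1600
print(len(middle_layer_binary(b"", inner)))   # 1200
```

 Each base64 pass expands its input by a factor of 4/3, so the inner
 ciphertext grows by about 16/9 under the current design, compared with
 4/3 when it is encoded only once.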

 However, Nick brought this topic up again during review and we decided to
 open a ticket to discuss it, since in theory we could define some sort
 of binary format for the middle layer and avoid the wasteful double
 base64.

 The upside is that we could fit more data in the middle layer.
 [https://gitlab.com/asn/tor/merge_requests/12#note_24371597 I estimated]
 that we could fit an extra 1k bytes of data by addressing this.
 [https://lists.torproject.org/pipermail/tor-dev/2016-November/011658.html
 Based on some initial calculations] this means that we could fit an extra
 intro point in the default 10k-byte descriptor, or maybe another block of
 16 authed clients (if we are lucky, since such a block is about 1.2k bytes).

 The downside is that we would have to go back to the design stage to spec
 the binary format, and then write the code to implement it, whereas now
 we use the same decoding function for both layers. That's basically a
 simple matter of programming and time, and I'm definitely willing to do
 it if we decide it's the right thing to do.

 We discussed this with David on IRC and decided that, given the current
 state of development, we should perhaps roll with the current design.
 That's because only a small number of hidden services would benefit from
 this change: default descriptors (with 3 intro points and no client
 auth) are just 4k bytes in size and get padded to 10k bytes anyway.
 Only descriptors with many intro points and client auth data, at about
 11k bytes, would benefit from this change, since they would no longer
 need to get padded to 20k bytes and could actually fit in 10k.
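
 The padding argument can be checked with a little arithmetic. The sketch
 below assumes that "10k bytes" means 10,000 bytes and that sizes are
 rounded up to the next multiple of that block. `padded_size` is a
 hypothetical helper for illustration, not a function from the Tor
 codebase.

```python
import math

PAD_BLOCK = 10_000  # assumption: "10k bytes" means 10,000 bytes


def padded_size(size: int, block: int = PAD_BLOCK) -> int:
    # Round size up to the next multiple of block (at least one block).
    return max(1, math.ceil(size / block)) * block


for size in (4_000, 10_000, 11_000):
    print(size, "->", padded_size(size))
# 4000 -> 10000
# 10000 -> 10000
# 11000 -> 20000
```

 Under these assumptions, saving about 1k bytes changes the padded size
 only for descriptors sitting just above a block boundary, such as the
 11k case above.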

 Opening this ticket seems like a good idea, since 0.3.1 is the time to
 make this change if we ever want to, so that we don't feel silly in the
 future.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/21693>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online