<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div dir="ltr">Hi Nick,</div><div dir="ltr"><br><blockquote type="cite">On 21 Mar 2020, at 05:38, Nick Mathewson <nickm@torproject.org> wrote:<br><br></blockquote></div><blockquote type="cite"><div dir="ltr"><span>Walking Onions: week 3 update</span><br><span></span><br><span>As you might recall, the SNIPs need to be nice and small, but they</span><br><span>need to be completely self-contained.  The ENDIVE, on the other</span><br><span>hand, needs to be diff-friendly, and compressible.  Relays need to</span><br><span>derive all the SNIPs from the ENDIVE.  The ENDIVE and SNIP formats</span><br><span>that I worked on during last week [SNIPFMT] was designed to meet these</span><br><span>criteria.</span><br><span></span><br><span>This week, I wrote up the algorithms to extract signed SNIPs from an</span><br><span>ENDIVE.  This is in section 4 [RECONST] of the specs in my</span><br><span>work-in-progress repository [GITREPO] This was mainly straightforward</span><br><span>consequence from the work I did last week, but there were a few</span><br><span>things that turned up.</span><br><span></span><br><span>One trickier aspect of this work is that we want to move most of the</span><br><span>decision-making to the authorities, and have the relays simply do an</span><br><span>expansion algorithm.  Updating all the relays would be slow, so relays</span><br><span>need to be able to handle new index types and data types without</span><br><span>fully understanding them.  That means that we need to specify an</span><br><span>algorithm that can work flexibly with many future kinds of data.</span><br><span></span><br><span>Another tricky aspect is that the data which the relays need to put</span><br><span>into SNIPs all needs to be signed, so all of that data needs to come</span><br><span>out the same when relays calculate it as the when the authorities</span><br><span>calculate it. Thus, all the algorithms for building Merkle trees and</span><br><span>indices and SNIPRouterData bodies need to be deterministic.</span><br><span></span><br><span>I think that I got all of this relatively right in [RECONST], but</span><br><span>there are probably some mistakes hiding in there.</span><br><span></span><br><span>I'm hoping that there is a cleaner alternative to the weighted-index</span><br><span>reconstruction algorithm -- the one that's there right now puts a</span><br><span>lot of requirements on the input data, in order to avoid</span><br><span>floating-point operations.</span><br></div></blockquote><div><br></div><div>There's a shorter encoding for Raw:</div><div><br></div><div>for each <span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;">[i, pos1, pos2] in </span><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0);">index_ranges:</span></div><div><font color="#000000"><span style="caret-color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;">  w = pos2 - pos1</span></font></div><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;">  j  = the index of pos1 among all sorted pos1s</span></div><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;">  new_encoding = [i, j, w]</span></div><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;"><br></span></div><div><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;">[i, j, w] is an efficient encoding if index_ranges is sparse </span><span style="-webkit-text-size-adjust: auto;">compared</span></div><div><span style="-webkit-text-size-adjust: auto;">with </span>ENDIVERouterData, because:</div></div><div>* j has the same cardinality as I</div><div>* w is smaller than pos1 and pos2</div><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;"><br></span></div><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;">If index_ranges is dense, there may be a more efficient </span><span style="-webkit-text-size-adjust: auto;">encoding:</span></div><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;">  add missing i with weight 0</span></div><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;">  drop j</span></div><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;"><br></span></div><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;">With this encoding, you can drop a few of the constraints.</span></div><div><br></div><div>There's a faster calculation and more efficient encoding for</div><div>Weighted, because there's a common factor in each POS()</div><div>calculation:</div><div><span style="-webkit-text-size-adjust: auto;">(1 << 32) / total_weight</span></div><div><span style="-webkit-text-size-adjust: auto;"><br></span></div><div><span style="-webkit-text-size-adjust: auto;">If we remove that factor, we end up with a much simpler </span><span style="-webkit-text-size-adjust: auto;">algorithm,</span></div><div><span style="-webkit-text-size-adjust: auto;">that's also more flexible:</span></div><div><span style="-webkit-text-size-adjust: auto;"><br></span></div><div><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">Algorithm: Expanding a "Weighted" indexspec.</span></p><p class="p2" style="margin: 0px; font-stretch: normal; line-height: normal; min-height: 20.3px; -webkit-text-size-adjust: auto;"><span class="s1"></span><br></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">Each weighted indexspec also has a multiplier, </span>which may vary</p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;">between indexspecs.</p><p class="p2" style="margin: 0px; font-stretch: normal; line-height: normal; min-height: 20.3px; -webkit-text-size-adjust: auto;"><span class="s1"></span><br></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">Let total_weight = SUM(indexspec.index_weights)</span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">Verify total_weight * multiplier <= UINT64_MAX.</span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1"><br></span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">Let total_so_far = 0.</span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">Let result_idx = {} (an empty mapping).</span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">Define POS(b) = b * multiplier</span></p><p class="p2" style="margin: 0px; font-stretch: normal; line-height: normal; min-height: 20.3px; -webkit-text-size-adjust: auto;"><span class="s1"></span><br></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">For 0 <= i < LEN(indexspec.indexweights):</span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">  <span class="Apple-converted-space"> </span>Let w = indexspec.indexweights[i].</span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">  <span class="Apple-converted-space"> </span>Let lo = POS(total_so_far).</span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">  <span class="Apple-converted-space"> </span>Let total_so_far = total_so_far + w.</span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">  <span class="Apple-converted-space"> </span>Let hi = POS(total_so_far) - 1.</span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">  <span class="Apple-converted-space"> </span>Append (lo, hi) => i to result_idx.</span></p><p class="p2" style="margin: 0px; font-stretch: normal; line-height: normal; min-height: 20.3px; -webkit-text-size-adjust: auto;"><span class="s1"></span><br></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1">Return result_idx.</span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><span class="s1"><br></span></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;">If multiplier is large, then the weights can be quite small.</p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;">(Small weights sacrifice some accuracy.) And if the multiplier</p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;">is 1, you effectively have a "Raw" index.</p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;"><br></p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;">If you make those changes, you should be able to use a similar</p></div><div><span style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); -webkit-text-size-adjust: auto;">process to </span>expand all the different index types. (After multiplying,</div><div>truncating, or hashing, you either end up with a delta, or an</div><div>absolute position. You can turn deltas into absolute positions,</div><div>and then feed them all into the same base algorithm.)</div><div><br></div><div>There are also a few things I think might be bugs:</div><div><br></div><div><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;">Was there meant to be a constraint that the Weighted total is</p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;">UINT64_MAX? Or close to UINT64_MAX?</p><p class="p1" style="margin: 0px; font-stretch: normal; line-height: normal; -webkit-text-size-adjust: auto;">The fixed parameters don't make much sense otherwise.</p></div><div><br></div><div>I think v2 and v3 onion services assign descriptors to the next</div><div>highest HSDir RSA id (or ed25519 hash). But</div><div>INDEX_FROM_RING_KEYS() uses the relay input as the lowest value.</div><div><br></div><div>There is no next member for the last member in</div><div>INDEX_FROM_RING_KEYS(), but the code asks for one.</div><div>(Perhaps there are some typos here that make the code hard to</div><div>understand.)</div><div><br></div><div>We'll need special treatment for ring wrapping. (That is, 0xFF… is not</div><div>a real upper limit, it actually means, "use the lowest relay".)</div><div><br></div><div>It's weird to call a middle value a "suffix", but "infix" is also a bit of an</div><div>unusual word.</div><div><br></div><div>There are also a bunch of typos, let me know when you're ready for</div><div>copy-editing.</div><div><br></div><div>T</div></body></html>