<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><br><div>On 11 Jan 2018, at 00:43, nusenu <<a href="mailto:nusenu-lists@riseup.net">nusenu-lists@riseup.net</a>> wrote:<br><br></div><blockquote type="cite"><div><span>Hi,</span><br><span></span><br><span>the goal of this email is to avoid a false positive warning for relay operators</span><br><span>on atlas but the root cause might be in core tor.</span><br><span></span><br><span>background:</span><br><span>I really liked when irl added the big red warning to atlas when a tor relay</span><br><span>runs an outdated (aka not running a "recommended") tor version</span><br><span>because it actually triggered operators to upgrade, an important step toward a more healthy network.</span><br><span>The problem is: This big red banner on atlas has false-positives which confuses operators [0].</span><br><span></span><br><span>Originally this has been an onionoo bug which</span><br><span>has been fixed in v1.8.0, but it happens again and Karsten had the feeling</span><br><span>that tor dir auths do not update  the version information of a relay after it</span><br><span>upgraded (and uploaded a new descriptor). I looked into one example and can confirm what Karsten suggested [1].</span><br></div></blockquote><div><br></div><div>I have opened a feature request for consensus-health to show per-relay</div><div>versions in the details and overlap tables:</div><div><br></div><div><a href="https://trac.torproject.org/projects/tor/ticket/24862#ticket">https://trac.torproject.org/projects/tor/ticket/24862</a></div><div><br></div><div>Unfortunately, consensus-health does not parse descriptors, so we will</div><div>have to rely on at least one authority picking up the new version. 
But</div><div>it's a start, and it will help us monitor the fix and any regressions.</div><br><blockquote type="cite"><div><span>Let me show you that example of FP 1283EBDEEC2B9D745F1E7FBE83407655B984FD66.</span><br><span>Data has been provided by Karsten and is available here: [2].</span><br><span></span><br><span>That relay was running 0.3.0.10 and upgraded to 0.3.0.13 and uploaded his first </span><br><span>descriptor with 0.3.0.13 on:</span><br><span></span><br><span>2018-01-09 10:14:00,server,,0.3.0.13</span><br><span></span><br><span>except for bastet dir auths did not care and still said this relay runs</span><br><span>0.3.0.10:</span><br><span></span><br><span>2018-01-09 11:00:00,consensus,,0.3.0.10</span><br><span>2018-01-09 11:00:00,vote,bastet,0.3.0.13  <<<< note </span><br><span>2018-01-09 11:00:00,vote,dannenberg,0.3.0.10</span><br><span>2018-01-09 11:00:00,vote,dizum,0.3.0.10</span><br><span>2018-01-09 11:00:00,vote,Faravahar,0.3.0.10</span><br><span>2018-01-09 11:00:00,vote,gabelmoo,0.3.0.10</span><br><span>2018-01-09 11:00:00,vote,longclaw,0.3.0.10</span><br><span>2018-01-09 11:00:00,vote,maatuska,0.3.0.10</span><br><span>2018-01-09 11:00:00,vote,moria1,0.3.0.10</span><br><span>2018-01-09 11:00:00,vote,tor26,0.3.0.10</span><br><span>2018-01-09 12:00:00,consensus,,0.3.0.10</span><br><span>2018-01-09 12:00:00,vote,bastet,0.3.0.13 <<<<<<</span><br><span>2018-01-09 12:00:00,vote,dannenberg,0.3.0.10</span><br><span>2018-01-09 12:00:00,vote,dizum,0.3.0.10</span><br><span>2018-01-09 12:00:00,vote,Faravahar,0.3.0.10</span><br><span>2018-01-09 12:00:00,vote,gabelmoo,0.3.0.10</span><br><span>2018-01-09 12:00:00,vote,longclaw,0.3.0.10</span><br><span>2018-01-09 12:00:00,vote,maatuska,0.3.0.10</span><br><span>2018-01-09 12:00:00,vote,moria1,0.3.0.10</span><br><span>2018-01-09 12:00:00,vote,tor26,0.3.0.10</span><br><span></span><br><span>even 6 hours later this is unchanged.</span><br><span></span><br><span>Then the operator upgraded from 0.3.0.13 to 
0.3.1.9</span><br><span>and uploaded his first descriptor:</span><br><span></span><br><span>2018-01-09 16:39:01,server,,0.3.1.9</span><br><span></span><br><span>this remained "unnoticed" by all dir auths until</span><br><span>longclaw voted for the new version:</span><br><span></span><br><span>2018-01-09 23:00:00,consensus,,0.3.0.10</span><br><span>2018-01-09 23:00:00,vote,bastet,0.3.0.10</span><br><span>2018-01-09 23:00:00,vote,dannenberg,0.3.0.10</span><br><span>2018-01-09 23:00:00,vote,dizum,0.3.0.10</span><br><span>2018-01-09 23:00:00,vote,Faravahar,0.3.0.10</span><br><span>2018-01-09 23:00:00,vote,gabelmoo,0.3.0.10</span><br><span>2018-01-09 23:00:00,vote,longclaw,0.3.1.9 <<<<<</span><br><span>2018-01-09 23:00:00,vote,maatuska,0.3.0.10</span><br><span>2018-01-09 23:00:00,vote,moria1,0.3.0.10</span><br><span>2018-01-09 23:00:00,vote,tor26,0.3.0.10 </span><br><span></span><br><span>On 2018-01-10 02:38:07 the relay uploaded a second descriptor with</span><br><span>v0.3.1.9 and almost all dir auths agreed immediately:</span><br><span></span><br><span>2018-01-10 02:38:07,server,,0.3.1.9</span><br><span>2018-01-10 03:00:00,consensus,,0.3.1.9</span><br><span>2018-01-10 03:00:00,vote,bastet,0.3.0.10</span><br><span>2018-01-10 03:00:00,vote,dannenberg,0.3.1.9</span><br><span>2018-01-10 03:00:00,vote,dizum,0.3.1.9</span><br><span>2018-01-10 03:00:00,vote,Faravahar,0.3.1.9</span><br><span>2018-01-10 03:00:00,vote,gabelmoo,0.3.1.9</span><br><span>2018-01-10 03:00:00,vote,longclaw,0.3.1.9</span><br><span>2018-01-10 03:00:00,vote,maatuska,0.3.1.9</span><br><span>2018-01-10 03:00:00,vote,moria1,0.3.1.9</span><br><span>2018-01-10 03:00:00,vote,tor26,0.3.1.9</span><br><span></span><br><span></span><br><span>So it took the operator 17 hours to convince enough </span><br><span>dir auths that he upgraded.</span><br><span>I can see multiple reasons why this can make sense (as the tor version</span><br><span>is actually not that relevant consensus data) but maybe it 
was</span><br><span>not clear what the side effects of not updating that field are.</span><br><span></span><br><span>While I believe there is still another onionoo issue,</span><br><span>this should also be improved.</span><br><span></span><br><span>Thoughts?</span><br></div></blockquote><div><br></div><div>I've looked at the Tor source code that handles versions. Version</div><div>parsing and voting seem to happen unconditionally.</div><div><br></div><div>I also checked router_differences_are_cosmetic(), and it seems to</div><div>handle platform string changes correctly.</div><div><br></div><div>So maybe the issue is in the descriptor fetching and updating logic?</div><div>How many authorities received the new descriptor?</div><div>Did any of the other fields in the vote change when the new descriptor</div><div>was updated?</div><div><br></div><div>Can we get logs from the relays that are affected by this issue, so we</div><div>can see how many authorities they uploaded to?</div><div>Can we get logs from some authorities so we can see how they handled the</div><div>new descriptor?</div><div><br></div><div>It might also help to open a core tor ticket to track this.</div><br><blockquote type="cite"><div><span>[0] <a href="http://lists.nycbug.org/pipermail/tor-bsd/2018-January/000620.html">http://lists.nycbug.org/pipermail/tor-bsd/2018-January/000620.html</a></span><br><span>[1] <a href="https://trac.torproject.org/projects/tor/ticket/22488#comment:11">https://trac.torproject.org/projects/tor/ticket/22488#comment:11</a></span><br><span>[2] <a href="https://trac.torproject.org/projects/tor/attachment/ticket/22488/task-22488-relay-versions.csv.gz">https://trac.torproject.org/projects/tor/attachment/ticket/22488/task-22488-relay-versions.csv.gz</a></span><br></div></blockquote></body></html>