[metrics-team] Hello from blackbird

Su Yu graylatern at gmail.com
Sun Mar 24 19:01:08 UTC 2019


Hi Karsten,

Sorry for the late reply! I needed to learn how to read the consensuses
(and I still have a lot to learn).

- How does the shorter release cycle affect update behavior?

I have been thinking about this question. With the data you shared, I
plotted the age of servers vs. the age of their versions:

[image: server_age_version_age.png]
The "Age of server" is the number of days between today and the date the
server started running. The "Age of version" is the
"days_since_first_recommended" field from the data table. The left plot
shows relays and the right plot shows bridges.
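
As a minimal sketch of the "Age of server" computation, assuming each
server's start date is already available as a calendar date (the function
name and the example dates are hypothetical, not taken from the shared
data table):

```python
from datetime import date


def age_in_days(start: date, today: date) -> int:
    """Return the number of whole days between `start` and `today`."""
    return (today - start).days


# Hypothetical example values, not taken from the shared dataset.
first_seen = date(2018, 3, 24)
today = date(2019, 3, 24)
server_age = age_in_days(first_seen, today)
print(server_age)
```

The "Age of version" needs no such computation here, since it is read
directly from the "days_since_first_recommended" field.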

Overall, it seems that older servers update more quickly, which is a
little counter-intuitive. I guess this could be an artifact of how the
dataset was constructed?

However, to really address this question, we will need longitudinal data on
update behavior rather than a single snapshot. I think this requires
parsing consensus files from multiple time points and compiling the results.
Did you create the above dataset by parsing the consensuses? If so, do you
think this is viable?
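
A rough sketch of what parsing one consensus could look like, assuming the
file has already been downloaded and read into a string. It only handles
the "valid-after", "server-versions", "r", "s" and "v" lines quoted further
down in this thread; the function name and the dictionary layout are
hypothetical, and in an actual consensus each of these entries sits on a
single line:

```python
from datetime import datetime


def parse_consensus(text):
    """Extract valid-after, recommended server versions, and relay entries."""
    result = {"valid_after": None, "server_versions": [], "relays": []}
    current = None  # relay entry that following "s" and "v" lines belong to
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        keyword = parts[0]
        if keyword == "valid-after" and len(parts) >= 3:
            result["valid_after"] = datetime.strptime(
                parts[1] + " " + parts[2], "%Y-%m-%d %H:%M:%S")
        elif keyword == "server-versions" and len(parts) > 1:
            result["server_versions"] = parts[1].split(",")
        elif keyword == "r" and len(parts) >= 9:
            # r nickname identity digest date time address orport dirport
            current = {"nickname": parts[1], "identity": parts[2],
                       "flags": [], "version": None}
            result["relays"].append(current)
        elif keyword == "s" and current is not None:
            current["flags"] = parts[1:]
        elif keyword == "v" and current is not None:
            current["version"] = " ".join(parts[1:])
    return result


# Sample lines copied from the message quoted below, joined onto single lines.
sample = """valid-after 2019-03-11 14:00:00
server-versions 0.2.9.15,0.2.9.16,0.2.9.17,0.3.4.10,0.3.4.11,0.3.5.7,0.3.5.8,0.4.0.1-alpha,0.4.0.2-alpha
r seele AAoQ1DAR6kkoo19hBAX5K0QztNw qbFrFVLkIeCAYnciYZP5lRs4P1s 2019-03-11 11:17:29 67.174.243.193 9001 0
s Running Stable V2Dir Valid
v Tor 0.3.5.7
"""

parsed = parse_consensus(sample)
relay = parsed["relays"][0]
# Check whether the relay's version is in the recommended list.
is_recommended = relay["version"].split()[-1] in parsed["server_versions"]
print(parsed["valid_after"], relay["nickname"], relay["version"], is_recommended)
```

Running the same function over consensuses from several dates and keying
the results by "valid_after" would give one snapshot per time point, which
is the longitudinal view described above. A dedicated descriptor parser
would likely be more robust than this hand-written one.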

On a separate note: I saw this
<https://metrics.torproject.org/uncharted-data-flow.html> on the Metrics
website and found it very interesting. Is there more work or data
relevant to this network?

Thanks,
blackbird



On Mon, Mar 11, 2019 at 11:00 AM Karsten Loesing <karsten at torproject.org>
wrote:

> Hi,
>
> sorry for the late reply. I guess I was hoping to find time to produce
> some more data for you, but that time has not materialized yet. Let me
> respond to your questions anyway, and maybe you'll be able to produce
> your own data from the original data.
>
> On 2019-03-01 22:27, Su Yu wrote:
> > Sorry! I realized I should've added some caption to the figures.
> >
> > - The two histograms show the distribution of the age of Tor versions.
> > Most people run a Tor version between 0 and 250 days old, and the
> > distribution has a long tail.
> > - In the box plot, the boxes span the first quartile to the third
> > quartile of the data points, and the line in the center is the median.
> > The upper and lower "whiskers" show the maximum and minimum of the data
> > (excluding outliers), and the points above the top whisker are outliers.
> > It appears that there are fewer bridges with very old versions, but the
> > bridges and relays are similar in keeping up with new versions.
> >
> > You're definitely right that the temporal changes are important. I'll
> > focus on this in some follow-up analysis. I have a couple of questions
> > regarding this:
> >
> > 1. What is the "date" column in the csv file you shared, specifically?
>
> It's the date when relays were listed as running in a consensus.
>
> > 2. What's a good way to see the unattended updates data? I can look
> > into it if you could point me in a general direction.
>
> There's no explicit data about unattended updates. My idea was that, if
> a relay updates really soon after a new version comes out, it's likely
> using unattended updates. But we do not know which relays have
> unattended updates configured on their system.
>
> > 3. It seems many of the potential questions will require new data. I'm
> > happy to work on data generation/cleaning; but is there a good way to
> > share the datasets or figures? They may also be too large for the
> > mailing list.
>
> Figures should be fine on the mailing list. If you have larger datasets,
> can you upload them somewhere and link to them? Otherwise figures and
> descriptions on the mailing list will have to do for now.
>
> So, if you want to work with the original data, you should take a look
> at consensuses here:
>
>
> https://metrics.torproject.org/collector.html#type-network-status-consensus-3
>
> In particular, here are some lines contained in consensuses that might
> be relevant:
>
> valid-after 2019-03-11 14:00:00
>
> server-versions 0.2.9.15,0.2.9.16,0.2.9.17,0.3.4.10,0.3.4.11,0.3.5.7,0.3.5.8,0.4.0.1-alpha,0.4.0.2-alpha
>
> r seele AAoQ1DAR6kkoo19hBAX5K0QztNw qbFrFVLkIeCAYnciYZP5lRs4P1s 2019-03-11 11:17:29 67.174.243.193 9001 0
>
> s Running Stable V2Dir Valid
>
> v Tor 0.3.5.7
>
> Consensuses are specified here:
>
> https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt
>
> Thanks!
>
> All the best,
> Karsten
>
>
> >
> > Hi Teor - thank you for chiming in! The release schedule page looks
> > awesome. I notice there is not any mention of dev versions - is that
> > information also available somewhere?
> >
> > Thanks!
> >
> > blackbird
> >
> >
> >
> >
> >
> > On Thu, Feb 28, 2019 at 6:43 PM teor <teor at riseup.net
> > <mailto:teor at riseup.net>> wrote:
> >
> >     Hi,
> >
> >     On 28 Feb 2019, at 19:00, Karsten Loesing <karsten at torproject.org
> >     <mailto:karsten at torproject.org>> wrote:
> >>
> >>     On 2019-02-22 23:05, Su Yu wrote:
> >>>
> >>>     I did some quick plotting in Jupyter notebook (see below; the
> >>>     figures are also attached separately). Regarding the relays vs.
> >>>     bridges question that you mentioned, it seems the bridges are
> >>>     better at keeping themselves not /too/ outdated, but they're
> >>>     actually not that different in keeping up-to-date?
> >>
> >>     Thanks for making these graphs. Though it's hard (for me) to
> >>     interpret these results.
> >>
> >>     One reason might be that these graphs are considering a time
> >>     frame of over 1 decade. A lot of things have changed over that
> >>     time frame:
> >>
> >>     - The network has grown a lot over the years, which means that
> >>     recent years have a greater weight in those graphs than distant
> >>     years. This doesn't have to be a bad thing, it's just probably
> >>     not intended and possibly surprising when interpreting the
> >>     results.
> >>
> >>     - Release cycles have changed, with a much shorter cycle in the last
> >>     year or two as compared to earlier years. This may skew results
> >>     even more.
> >>
> >>     If I were to continue this analysis I'd try to look more at
> >>     changes over time. Things I'd look at:
> >>
> >>     - How does the shorter release cycle affect update behavior? It's
> >>     probably useful to look at Tor's change log to get an idea when
> >>     versions have been updated, when versions have been sunset, and
> >>     which versions have long-term support.
> >
> >     We have a summary page for past releases and our release schedule:
> >
> >     https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam/CoreTorReleases#Calendar
> >
> >     T
> >
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: server_age_version_age.png
Type: image/png
Size: 37893 bytes
Desc: not available
URL: <http://lists.torproject.org/pipermail/metrics-team/attachments/20190324/9d01acdc/attachment-0001.png>

