[metrics-team] Hello from blackbird

Karsten Loesing karsten at torproject.org
Thu Apr 4 15:59:24 UTC 2019


On 2019-03-24 20:01, Su Yu wrote:
> Hi Karsten,

Hi blackbird,

> Sorry for the late reply! I needed to learn how to read the consensuses
> (and I still have a lot to learn).
> 
> - How does the shorter release cycle affect update behavior?
> 
> I have been thinking about this question. With the data you shared, I
> plotted the age of servers vs. the age of their versions:
> 
> [Attached figure: server_age_version_age.png]
> The "Age of server" is computed as the number of days between today and
> the date that the server started running.

Wait, the data I gave you doesn't say anything about when a server
started running. The "date" column is the date on which relays were
listed as running in a consensus.

> The "Age of version" is the
> field "days_since_first_recommended" from the data table. The left plot
> uses the data of relays and the right is of the bridges.
> 
> Overall, it seems that the older servers update more quickly, which is a
> little counter-intuitive. I guess this can be a result of how the
> dataset was constructed?
> 
> However, to really address this question, we will need longitudinal data
> of the update behavior, instead of only one snapshot. I think this
> requires parsing the consensus files from multiple time points and
> compiling them. Did you create the above dataset by parsing the
> consensuses? If so, do you think this is viable?

The data I gave you is not a snapshot. It's based on parsing all
consensuses back to 2007.

I wonder if this analysis makes more sense if you're parsing consensus
files yourself. Sorry that my data apparently confused you, that was not
my intention.
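
A minimal sketch of what parsing consensus files yourself could look
like, in Python. It only handles the handful of line types quoted
further down in this thread (valid-after, server-versions, r, s, v).
The names Consensus, RelayEntry, and parse_consensus are made up for
illustration, and the sample text is a fragment, not a complete
consensus. For real work an existing descriptor-parsing library is
probably the better choice; this is just to show the shape of the data.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


@dataclass
class RelayEntry:
    """One relay, built from its "r", "s" and "v" lines (hypothetical type)."""
    nickname: str
    published: datetime
    flags: List[str] = field(default_factory=list)
    version: Optional[str] = None


@dataclass
class Consensus:
    """The few consensus fields this sketch extracts (hypothetical type)."""
    valid_after: Optional[datetime] = None
    server_versions: List[str] = field(default_factory=list)
    relays: List[RelayEntry] = field(default_factory=list)


def parse_consensus(text: str) -> Consensus:
    consensus = Consensus()
    current = None  # relay whose "s"/"v" lines we are currently reading
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        keyword = parts[0]
        if keyword == "valid-after" and len(parts) >= 3:
            consensus.valid_after = datetime.strptime(
                parts[1] + " " + parts[2], TIME_FORMAT)
        elif keyword == "server-versions" and len(parts) >= 2:
            # Recommended relay versions, comma-separated.
            consensus.server_versions = parts[1].split(",")
        elif keyword == "r" and len(parts) >= 9:
            # r nickname identity digest date time address orport dirport
            current = RelayEntry(
                nickname=parts[1],
                published=datetime.strptime(
                    parts[4] + " " + parts[5], TIME_FORMAT))
            consensus.relays.append(current)
        elif keyword == "s" and current is not None:
            current.flags = parts[1:]
        elif (keyword == "v" and current is not None
              and len(parts) >= 3 and parts[1] == "Tor"):
            current.version = parts[2]
    return consensus


# Fragment assembled from the example lines quoted in this thread.
SAMPLE = """
valid-after 2019-03-11 14:00:00
server-versions 0.2.9.15,0.2.9.16,0.2.9.17,0.3.4.10,0.3.4.11,0.3.5.7,0.3.5.8,0.4.0.1-alpha,0.4.0.2-alpha
r seele AAoQ1DAR6kkoo19hBAX5K0QztNw qbFrFVLkIeCAYnciYZP5lRs4P1s 2019-03-11 11:17:29 67.174.243.193 9001 0
s Running Stable V2Dir Valid
v Tor 0.3.5.7
"""

if __name__ == "__main__":
    consensus = parse_consensus(SAMPLE)
    for relay in consensus.relays:
        if "Running" in relay.flags:
            recommended = relay.version in consensus.server_versions
            print(consensus.valid_after, relay.nickname,
                  relay.version, recommended)
```

Running this over one consensus per day (or per hour) and collecting
the (valid-after, version) pairs of relays with the Running flag would
give the kind of longitudinal view discussed above.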
 
> On a separate note: I saw this
> <https://metrics.torproject.org/uncharted-data-flow.html> on the Metrics
> website and found it very interesting. I wonder if there is more
> work/data relevant to this network?

The data is all available on the Tor Metrics website, though it requires
parsing raw data to produce a visualization like that. Note that while
it's certainly a nice visualization, it's not really a priority for us
at the moment to have more of those.

If you'd still like to help out with Tor Metrics tasks, maybe take a
look at Trac for open tickets in the Metrics/* components and at the
relevant Git repositories. If there's something you'd like to work on,
just comment on the ticket.

Thanks!

> Thanks,
> blackbird

All the best,
Karsten



> On Mon, Mar 11, 2019 at 11:00 AM Karsten Loesing <karsten at torproject.org> wrote:
> 
>     Hi,
> 
>     sorry for the late reply. I guess I was hoping to find time to produce
>     some more data for you, but that time has not materialized yet. Let me
>     respond to your questions anyway, and maybe you'll be able to produce
>     your own data from the original data.
> 
>     On 2019-03-01 22:27, Su Yu wrote:
>     > Sorry! I realized I should've added captions to the figures.
>     >
>     > - The two histograms show the distribution of the age of Tor
>     > versions. Most people have their Tor between 0 and 250 days old,
>     > and the distribution has a long tail.
>     > - In the box plot, the boxes span the first quartile to the third
>     > quartile of the data points, and the line in the center is the
>     > median. The upper and lower "whiskers" show the maximum and minimum
>     > of the data excluding outliers, and the points above the top
>     > whisker are outliers. It appears that there are fewer bridges with
>     > very old versions, but the bridges and relays are similar in
>     > keeping up with new versions.
>     >
>     > You're definitely right that the temporal changes are important. I'll
>     > focus on this in some follow-up analysis. I have a couple of questions
>     > regarding this:
>     >
>     > 1. What is the "date" column in the csv file you shared, specifically?
> 
>     It's the date when relays were listed as running in a consensus.
> 
>     > 2. What's a good way to see the unattended updates data? I can look
>     > into it if you could point me in a general direction.
> 
>     There's no explicit data about unattended updates. My idea was that, if
>     a relay updates really soon after a new version comes out, it's likely
>     using unattended updates. But we do not know which relays have
>     unattended updates configured on their system.
> 
>     > 3. It seems many of the potential questions will require new data. I'm
>     > happy to work on data generation/cleaning, but is there a good way to
>     > share the datasets or figures? They may also be too large for the
>     > mailing list.
> 
>     Figures should be fine on the mailing list. If you have larger datasets,
>     can you upload them somewhere and link to them? Otherwise figures and
>     descriptions on the mailing list will have to do for now.
> 
>     So, if you want to work with the original data, you should take a look
>     at consensuses here:
> 
>     https://metrics.torproject.org/collector.html#type-network-status-consensus-3
> 
>     In particular, here are some lines contained in consensuses that might
>     be relevant:
> 
>     valid-after 2019-03-11 14:00:00
> 
>     server-versions 0.2.9.15,0.2.9.16,0.2.9.17,0.3.4.10,0.3.4.11,0.3.5.7,0.3.5.8,0.4.0.1-alpha,0.4.0.2-alpha
> 
>     r seele AAoQ1DAR6kkoo19hBAX5K0QztNw qbFrFVLkIeCAYnciYZP5lRs4P1s 2019-03-11 11:17:29 67.174.243.193 9001 0
> 
>     s Running Stable V2Dir Valid
> 
>     v Tor 0.3.5.7
> 
>     Consensuses are specified here:
> 
>     https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt
> 
>     Thanks!
> 
>     All the best,
>     Karsten
> 
> 
>     >
>     > Hi Teor - thank you for chiming in! The release schedule page looks
>     > awesome. I notice there is not any mention of dev versions - is that
>     > information also available somewhere?
>     >
>     > Thanks!
>     >
>     > blackbird
>     >
>     >
>     >
>     >
>     >
>     > On Thu, Feb 28, 2019 at 6:43 PM teor <teor at riseup.net> wrote:
>     >
>     >     Hi,
>     >
>     >     On 28 Feb 2019, at 19:00, Karsten Loesing <karsten at torproject.org> wrote:
>     >>
>     >>     On 2019-02-22 23:05, Su Yu wrote:
>     >>>
>     >>>     I did some quick plotting in Jupyter notebook (see below; the
>     >>>     figures are also attached separately). Regarding the relays vs.
>     >>>     bridges question that you mentioned, it seems the bridges are
>     >>>     better at keeping themselves not /too/ outdated, but they're
>     >>>     actually not that different in keeping up-to-date?
>     >>
>     >>     Thanks for making these graphs. Though it's hard (for me) to
>     >>     interpret these results.
>     >>
>     >>     One reason might be that these graphs are considering a time
>     >>     frame of over 1 decade. A lot of things have changed over that
>     >>     time frame:
>     >>
>     >>     - The network has grown a lot over the years, which means that
>     >>     recent years have a greater weight in those graphs than distant
>     >>     years. This doesn't have to be a bad thing, it's just probably
>     >>     not intended and possibly surprising when interpreting the
>     >>     results.
>     >>
>     >>     - Release cycles have changed, with a much shorter cycle in the
>     >>     last year or two as compared to earlier years. This may skew
>     >>     results even more.
>     >>
>     >>     If I were to continue this analysis I'd try to look more at
>     >>     changes over time. Things I'd look at:
>     >>
>     >>     - How does the shorter release cycle affect update behavior? It's
>     >>     probably useful to look at Tor's change log to get an idea when
>     >>     versions have been updated, when versions have been sunset, and
>     >>     which versions have long-term support.
>     >
>     >     We have a summary page for past releases and our release schedule:
>     >     https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam/CoreTorReleases#Calendar:
>     >
>     >     T
>     >
> 
> 

