[tor-bugs] #33258 [Metrics]: Add CSV file export of graphed data

Mon Mar 30 09:08:30 UTC 2020

#33258: Add CSV file export of graphed data
-----------------------------------------+------------------------------
 Reporter:  karsten                      |          Owner:  metrics-team
     Type:  enhancement                  |         Status:  needs_review
 Priority:  Medium                       |      Milestone:
Component:  Metrics                      |        Version:
 Severity:  Normal                       |     Resolution:
 Keywords:  metrics-team-roadmap-2020Q1  |  Actual Points:
Parent ID:  #33327                       |         Points:  1
 Reviewer:  anarcat                      |        Sponsor:  Sponsor59
-----------------------------------------+------------------------------

Comment (by karsten):

 Replying to [comment:7 robgjansen]:
 > Karsten, I think all of the changes you have made here are improvements
 over the original graphs. I copied the original code from scripts I was
 using for Shadow experiments, which is why some of the plots probably
 don't make sense anymore outside of Shadow.
 >
 > > we should expect developers wanting to play with the underlying data
 and feeding it into their own tools
 >
 > This was exactly my sentiment too when I added the OnionPerf plotting; I
 thought it would be more useful to start with the code I had been using
 for Shadow than starting with nothing. :)

 Sounds good! Thanks for taking a look at my patch.

 > Also, I ended up copying a lot of the TGen parsing/plotting stuff into a
 python package in [https://github.com/shadow/tgen/tree/master/tools the
 tgen repository]. I thought it made more sense for the code that plots
 tgen logs to coexist and be synchronized with tgen itself. This means we
 now have two different tools to parse/plot tgen results, which I think is
 fine, but let me know if you don't want "us" to be maintaining two
 separate code-bases for this. I don't mind if you completely change the
 way OnionPerf is plotting things so that the plots are most useful for
 OnionPerf, and we'll make the tgen plots useful for Shadow. Does that
 approach make sense?

 Good to know. I certainly wouldn't want us to maintain code doing the same
 thing in two separate code bases, if we can avoid it. Let's talk about how
 much of this code overlaps and how much is different before deciding where
 and how to maintain this code in the future. Here are some random
 thoughts:

  - Having the log-parsing code close to the log-writing code sounds like a
 very good idea to me. The alternative is to update the log-parsing code in
 OnionPerf every time we update the log-writing code. I'm not exactly sure
 how easy it is to share this code, but I do see the value in doing so.

  - Does Shadow produce any data that is worth visualizing in addition to
 tgen logs? If so, do you have plotting code in tgen and in Shadow?
 Assuming we keep plotting code in OnionPerf, would it make sense to move
 your tgen plotting code to Shadow? Of course, it doesn't hurt us or anyone
 to keep it in tgen; I'm just thinking out loud here.

  - The patch I attached to this ticket only changes the tgen
 visualizations. But this was just the first step. In the next steps I'd
 like to add more visualizations that use more data than what we can learn
 from tgen logs. For example, I'd like to add filters (#33328) to remove
 measurements using certain relays and compare performance characteristics
 to the baseline. Basic filters would remove relays by fingerprint (#33260)
 using path information obtained from Tor control logs. More sophisticated
 filters would incorporate Tor descriptors like consensuses or server
 descriptors and allow filtering by relay flag, Tor version, or platform.
 My current plan is to extend OnionPerf's analyze mode for these filters
 and plot the result using newly added visualization code. I assume you
 wouldn't need to do something like this in Shadow, because you'd rather
 change the simulation parameters and re-run it, right?

  - If we combine parsing/plotting code, we'll have to discuss and agree on
 common definitions like Time To First Byte. The current OnionPerf
 visualizations plot time from 'command' to 'first_byte', but Tor Metrics
 uses 'start' to 'first_byte'. There's certainly value in talking about
 these definitions and streamlining them. But there are also costs involved
 in doing so.
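
 To make the last two points concrete, here is a minimal sketch of a
 fingerprint filter and of the two Time To First Byte definitions. It
 assumes each measurement is a dict holding 'start', 'command', and
 'first_byte' timestamps in seconds plus a 'path' list of relay
 fingerprints. That record layout, the helper names, and the placeholder
 fingerprints are hypothetical and not taken from OnionPerf or tgen.

```python
# Hypothetical measurement records. Timestamps are in seconds; the key
# names follow the terms used above, but the record layout is assumed.
measurements = [
    {"start": 100.0, "command": 100.5, "first_byte": 101.0,
     "path": ["AAAA", "BBBB", "CCCC"]},   # placeholder fingerprints
    {"start": 200.0, "command": 200.25, "first_byte": 201.5,
     "path": ["DDDD", "BBBB", "EEEE"]},
]


def ttfb(measurement, origin):
    """Time to first byte, counted from the given origin timestamp."""
    return measurement["first_byte"] - measurement[origin]


def exclude_relays(measurements, fingerprints):
    """Drop measurements whose path contains any of the given relays."""
    excluded = set(fingerprints)
    return [m for m in measurements if excluded.isdisjoint(m["path"])]


# Compare both definitions for the measurements that avoid relay "CCCC".
for m in exclude_relays(measurements, ["CCCC"]):
    print(ttfb(m, "start"), ttfb(m, "command"))
```

 The same measurement yields a different value depending on the origin
 timestamp, which is why shared code would need one agreed definition or
 an explicit parameter like the one above.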

 I'm open to collaborating more closely on this visualization code if it
 makes sense. What do you think?

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33258#comment:8>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online