[tor-bugs] #7359 [Tor]: Design/implement method for collecting/reporting statistics

Tue May 28 20:50:35 UTC 2013

#7359: Design/implement method for collecting/reporting statistics
------------------------------------------------------------------------+---
 Reporter:  robgjansen                                                  |          Owner:                    
     Type:  enhancement                                                 |         Status:  needs_review      
 Priority:  normal                                                      |      Milestone:  Tor: 0.2.5.x-final
Component:  Tor                                                         |        Version:                    
 Keywords:  performance, simulation, statistics, tor-relay, tor-client  |         Parent:  #7357             
   Points:                                                              |   Actualpoints:                    
------------------------------------------------------------------------+---

Comment(by nickm):

 Replying to [comment:13 karsten]:
 > Replying to [comment:9 nickm]:
 > > quick comments:
 >
 > I finally made it through the list!  Please see my updated morestats4
 branch.  It compiles, makes check-spaces happy, passes `--enable-gcc-
 warnings`, and runs in Shadow, but still I expect we'll need another
 iteration or two.  Below I'm replying to all comments that require more
 than just: "done."

 Okay.  It's looking better.  Thanks there!

 > >  * append_cell_stats_by_command doesn't seem to document how long the
 arrays are.  I might be a bit more comfortable if it took a structure
 instead.  This
 >
 > Changed to a structure.  How does the last sentence end?

 Probably explaining why a structure would be a good idea, I guess.

 > >  * Is this really slow in practice? It seems like it's likely to be
 really slow.
 >
 > Do you mean slow or fast in your second sentence?
 >
 > If you refer to the comment explaining why we're re-using arrays for all
 circuits, maybe I didn't just mean slow, but also bad for memory
 fragmentation.

 I'm asking here (and again below) what kind of a performance hit gets
 taken for turning this feature on.

 > >  * Prefer TOR_SIMPLEQ to manual list management for new queues.
 > >  * This data structure ... how does it work out in practice? It looks
 like it's going to be hideously inefficient.
 >
 > I can't really think of a more efficient data structure.  What do you
 have in mind?  (I'll hold off changing the code to TOR_SIMPLEQ until I'm
 sure we want to use a queue there.)

 Hm. Assuming a 64-bit architecture, this is going to eat an 8 byte pointer
 for the "next" element, plus 8 bytes for chunk element overhead, for every
 5-byte entry we need to hold.  We could do better by using some kind of
 dynamic array structure, which wouldn't have a per-element overhead...

 but before we go and do that, we should know whether that actually
 matters.  Hence my asking about how it actually works out in practice. If
 the memory consumption on a busy relay isn't much comparatively, then
 there's not much need to re-do this.

 > >  * This code needs unit tests and functional abstraction badly.
 >
 > I think functional abstraction is improved and code is now easier to
 test.  I didn't write tests yet, because I'm not quite sure where to
 start.  Would you mind writing one example test that I could use as a
 start for these functions?
 >  - connection_buckets_note_empty_ts in connection.c
 >  - bucket_millis_empty in connection.c
 >  - sum_up_cell_stats_by_command in control.c
 >  - append_cell_stats_by_command in control.c
 >  - format_cell_stats in control.c

 Well, format_cell_stats is pretty easy: you would make one or more
 circuit_t* object with the cell statistics you want set on it, and then
 pass them to format_cell_stats, and then check whether the output is as
 you expect and whether the circuit is munged.  You could stick this in an
 existing test_*.c module in src/tests/ , or add a new module as
 appropriate.  For information on writing actual tinytest tests, see
 src/ext/tinytest*.

 Is there more you need to know there?  I'm not clear on what aspect of
 writing unit tests isn't making sense.

 > Are there any other new functions that need testing badly?

 Our default should not be "let's test the functions that need it badly".
 Our default should be "test whatever new and changed functions we can."

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/7359#comment:14>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online