[tor-bugs] #8081 [Tor]: 7802 tweaks per code review

Tue Jan 29 21:59:07 UTC 2013

#8081: 7802 tweaks per code review
--------------------+-------------------------------------------------------
 Reporter:  andrea  |          Owner:                    
     Type:  defect  |         Status:  new               
 Priority:  normal  |      Milestone:  Tor: 0.2.4.x-final
Component:  Tor     |        Version:                    
 Keywords:          |         Parent:                    
   Points:          |   Actualpoints:                    
--------------------+-------------------------------------------------------

Comment(by mikeperry):

 I'm working on the above. Replying only to segments where I have
 questions/comments/disagreements.

 Replying to [ticket:8081 andrea]:
 > 18:54 < athena> did you mean to say pathbias_should_count?  i'm not
 seeing a pathbias_should_close
 > 18:54 < nickm> sorry, should_count
 > 18:55 < athena> okay
 > 18:55 < athena> looks like it ignores circuits that have some purposes,
 that are one-hop or checks some config options
 > 18:55 < athena> do circuit purposes ever change during the lifetime of a
 circuit
 > 18:55 < athena> ?
 > 18:56 < nickm> I believe circuit purposes can change under some
 circumstances, though the rules are nontrivial
 > 18:56 < athena> hmm
 > 18:56 < nickm> maybe we should cache the result of this function if
 we've ever called it before?
 > 18:56 < nickm> maybe we should warn if it changes?
 > 18:57 < nickm> this isn't a new problem, if it's a problem
 > 18:57 < athena> yeah, we need a detection for that if we can't be sure
 it doesn't happen
 > 18:57 < nickm> NOTE. Let's ask mike?

 Hrmm. We could either create a new path_state or re-use
 PATH_STATE_ALREADY_COUNTED (and maybe rename it?), then log if
 pathbias_should_count() ever gets called with a path_state of
 PATH_STATE_ALREADY_COUNTED.

 This should future proof us from additional purpose transitions, but I'm
 pretty sure the hidserv purposes we're really concerned about can't be
 transitioned out of currently.

 > 19:26 < nickm> on to pathbias_check_close_rate.  That should probably
 get renamed too.
 > 19:27 < athena> pathbias_act_on_close_rate?

 I'm not sure I like act_on, but I agree it's better than check. I will
 refactor out the scaling code and see if I can think of a better name that
 captures the logging and guard-dropping idea. Perhaps evaluate? Measure?

 > 19:27 < nickm> yeah.  That one _does_ have its return value checked.
 > 19:27 < nickm> (lots of duplicate code here too I think)
 > 19:27 < nickm> Does this look like duplicate code to you? Does a later
 commit fix that?
 > 19:28 < athena> hold on, lemme look
 > 19:29 < athena> yeah, it's kinda duplicated
 > 19:29 < athena> but it's calling different functions in the log messages
 (get_use_success_count() vs.
 >                 get_close_success_count())
 > 19:30 < athena> getting rid of the duplication would mean passing some
 function pointers around, probably
 > 19:30 < athena> and i'm not sure where it gets scale_factor from
 > 19:31 < nickm> Or having one block at the top of the function that says
 if (counting_closed) {
 >                success_count=get_close_success_count()  } else
 {success_count=get_use_success_count();} and avoid the
 >                function pointers
 > 19:31 < athena> yeah, okay
 > 19:31 < athena> then should refactor to have less duplicate code i think
 > 19:32 < nickm> The scaling block is pretty different
 > 19:32 < nickm> isn't scale_factor coming from pathbias_get_scale_factor
 ?
 > 19:32 < athena> hmm
 > 19:32 < nickm> I hope everything scaled in that block is also a double,
 or our idea of having one function that returns
 >                a double is going to break.
 > 19:33 < athena> yeah
 > 19:34 < athena> well, we could split the scaling out into separate
 rescale_<use,close> functions, and handle them like
 >                 the success_count thing?
 > 19:35 < nickm> Seems good.  Also Given that there are two sets of values
 that get scaled independently, we should make
 >                sure that their documentation says which block is which,
 clearly.
 > 19:35 < nickm> NOTE on the refactor and NOTE on the documentation

 The log messages and thresholds are actually all different between the two
 rate check functions here. The use bias stuff only has two thresholds
 (notice and warn+drop), because use failures proved exceedingly rare and
 nowhere near as sensitive to network conditions as build failures.. I
 think we want to keep the two logging functions separate for clarity for
 that reason, but I agree that refactoring out the scaling is a good plan.
 I also think that should also be two different functions rather than one
 blob with a conditional check (and probably another enum/define) though.

 > 19:45 < nickm> Hm. I wonder what initializes node->use_attempts and
 node->use_successes and the other doubles if this
 >                part isn't called.
 > 19:46 < nickm> If they all come from a malloc_zero() or a memset, that's
 a little tricky.  Does memset(&dbl, 0,
 >                sizeof(double)) get you a good zero?
 > 19:46 < athena> not on things that aren't IEEE-754 in general
 > 19:47 < nickm> hm. we proably have other places where we do this.
 > 19:47 < athena> yeah
 > 19:47 < athena> it is theoretically a portability issue, but it's gotta
 be pretty rare to run into a machine like that
 > 19:48 < athena> the only examples of non-754 fp i can think of off-hand
 are VAX and some older crays
 > 19:48 < nickm> We could treat it like systems where memset(&ptr, 0,
 sizeof(void*)) doesn't set ptr to NULL.
 > 19:48 < athena> yeah
 > 19:48 < nickm> NOTE. let's do that.

 What does 'that' mean here? Hopefully "pretend they no longer exist"? :)

 > 20:08 < athena> a2db17a1aab77c4183f589815800a779a5f24524 ?
 > 20:08 < nickm> hang on.  So, if even one stream is detached, we roll
 back to USE_ATTEMPTED?
 > 20:09 < nickm> What if there are multiple streams and only one is
 detached?
 > 20:09 < nickm> It doesn't seem too harmful, but we should ask mike IMO.
 > 20:09 < athena> hmm
 > 20:09 < athena> yeah
 > 20:09 < nickm> NOTE let's ask him

 Yes, the idea is that we should probe these instances to ensure that the
 circuit is still actually viable after the detach. Rolling back the state
 to USE_ATTEMPTED will induce such a probe upon close. I mention this in
 the commit message, but I will also add that explanation to the refactored
 function.

 > 20:21 < nickm> so,  the next commit does the common code refactoring you
 mentioned before when it adds
 >                pathbias_count_circs_in_states
 > 20:22 < nickm> also NOTE that the pairs of arguments to
 pathbias_count_circs_in_states are good things to document in
 >                the documentation for path_state_t
 > 20:22 < athena> yeah, agree
 > 20:22 < nickm> As for the part of this change that changes how scaling
 happens... how does that look to you?
 > 20:24 < athena> it looks correct to me
 > 20:25 < nickm> I personally would kind of prefer it if we didn't add
 things to {circ,use}_{attempts,successes} until we
 >                had the corresponding changes in the other scaled
 variables.  then this commit wouldn't be needed.
 >                that's a potentially bigger refactoring though.  NOTE it
 for later?

 Yeah, that might be a better approach. I will put an XXX in the scaling
 code mentioning that we may want to have separate temporary "pending"
 counters in the entry_guard_t that don't get added to the attempt counts
 until circuit close, but I'm not sure we want to do this change at this
 point as it may have unexpected side effects... Or maybe we should create
 a separate ticket for this even... Hrmm..

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/8081#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online