[tor-bugs] #8081 [Tor]: 7802 tweaks per code review

Tue Jan 29 22:31:55 UTC 2013

#8081: 7802 tweaks per code review
--------------------+-------------------------------------------------------
 Reporter:  andrea  |          Owner:                    
     Type:  defect  |         Status:  new               
 Priority:  normal  |      Milestone:  Tor: 0.2.4.x-final
Component:  Tor     |        Version:                    
 Keywords:          |         Parent:                    
   Points:          |   Actualpoints:                    
--------------------+-------------------------------------------------------

Comment(by nickm):

 Replying to [comment:1 mikeperry]:
 > I'm working on the above. Replying only to segments where I have
 questions/comments/disagreements.

 Thanks!

 By the way, as you address these, PLEASE do separate commits for separate
 issues.  It's hard to review the stuff that some people do where there's
 just one commit called "respond to nick's review."

 (I think you've gotten good about this, but easier to remind now than to
 split up later.)

 > Replying to [ticket:8081 andrea]:
 > > 18:54 < athena> did you mean to say pathbias_should_count?  i'm not
 seeing a pathbias_should_close
 > > 18:54 < nickm> sorry, should_count
 > > 18:55 < athena> okay
 > > 18:55 < athena> looks like it ignores circuits that have some
 purposes, that are one-hop or checks some config options
 > > 18:55 < athena> do circuit purposes ever change during the lifetime of
 a circuit
 > > 18:55 < athena> ?
 > > 18:56 < nickm> I believe circuit purposes can change under some
 circumstances, though the rules are nontrivial
 > > 18:56 < athena> hmm
 > > 18:56 < nickm> maybe we should cache the result of this function if
 we've ever called it before?
 > > 18:56 < nickm> maybe we should warn if it changes?
 > > 18:57 < nickm> this isn't a new problem, if it's a problem
 > > 18:57 < athena> yeah, we need a detection for that if we can't be sure
 it doesn't happen
 > > 18:57 < nickm> NOTE. Let's ask mike?
 >
 > Hrmm. We could either create a new path_state or re-use
 PATH_STATE_ALREADY_COUNTED (and maybe rename it?), then log if
 pathbias_should_count() ever gets called with a path_state of
 PATH_STATE_ALREADY_COUNTED.
 >
 > This should future proof us from additional purpose transitions, but I'm
 pretty sure the hidserv purposes we're really concerned about can't be
 transitioned out of currently.

 I'd prefer a simple adjunct variable to a new state.  For example, a
 tristate that says whether pathbias_should_count() has returned true,
 false, or has neer been called.  That way we can check our "it never
 changes" assumption without risking any more breakage.

 > > 19:26 < nickm> on to pathbias_check_close_rate.  That should probably
 get renamed too.
 > > 19:27 < athena> pathbias_act_on_close_rate?
 >
 > I'm not sure I like act_on, but I agree it's better than check. I will
 refactor out the scaling code and see if I can think of a better name that
 captures the logging and guard-dropping idea. Perhaps evaluate? Measure?

 evaluate and measure seem okay to me, but they don't capture that the
 guard is going to get dropped potentially.  Still, let's not waste our
 time brainstorming unless we come up with something awesome in the next N
 minutes of cogitation here.

 > > 19:27 < nickm> yeah.  That one _does_ have its return value checked.
 > > 19:27 < nickm> (lots of duplicate code here too I think)
 > > 19:27 < nickm> Does this look like duplicate code to you? Does a later
 commit fix that?
 > > 19:28 < athena> hold on, lemme look
 > > 19:29 < athena> yeah, it's kinda duplicated
 > > 19:29 < athena> but it's calling different functions in the log
 messages (get_use_success_count() vs.
 > >                 get_close_success_count())
 > > 19:30 < athena> getting rid of the duplication would mean passing some
 function pointers around, probably
 > > 19:30 < athena> and i'm not sure where it gets scale_factor from
 > > 19:31 < nickm> Or having one block at the top of the function that
 says if (counting_closed) {
 > >                success_count=get_close_success_count()  } else
 {success_count=get_use_success_count();} and avoid the
 > >                function pointers
 > > 19:31 < athena> yeah, okay
 > > 19:31 < athena> then should refactor to have less duplicate code i
 think
 > > 19:32 < nickm> The scaling block is pretty different
 > > 19:32 < nickm> isn't scale_factor coming from
 pathbias_get_scale_factor ?
 > > 19:32 < athena> hmm
 > > 19:32 < nickm> I hope everything scaled in that block is also a
 double, or our idea of having one function that returns
 > >                a double is going to break.
 > > 19:33 < athena> yeah
 > > 19:34 < athena> well, we could split the scaling out into separate
 rescale_<use,close> functions, and handle them like
 > >                 the success_count thing?
 > > 19:35 < nickm> Seems good.  Also Given that there are two sets of
 values that get scaled independently, we should make
 > >                sure that their documentation says which block is
 which, clearly.
 > > 19:35 < nickm> NOTE on the refactor and NOTE on the documentation
 >
 > The log messages and thresholds are actually all different between the
 two rate check functions here. The use bias stuff only has two thresholds
 (notice and warn+drop), because use failures proved exceedingly rare and
 nowhere near as sensitive to network conditions as build failures.. I
 think we want to keep the two logging functions separate for clarity for
 that reason, but I agree that refactoring out the scaling is a good plan.
 I also think that should also be two different functions rather than one
 blob with a conditional check (and probably another enum/define) though.

 I think any code-refactoring you can do to eliminate the duplicate parts
 here would be welcome. If there's more apparent duplicate code that you
 think won't actually work to refactor, just leave an "XXXX can we
 eliminate any more duplicate code here?" comment and leave it at that.

 > > 19:45 < nickm> Hm. I wonder what initializes node->use_attempts and
 node->use_successes and the other doubles if this
 > >                part isn't called.
 > > 19:46 < nickm> If they all come from a malloc_zero() or a memset,
 that's a little tricky.  Does memset(&dbl, 0,
 > >                sizeof(double)) get you a good zero?
 > > 19:46 < athena> not on things that aren't IEEE-754 in general
 > > 19:47 < nickm> hm. we proably have other places where we do this.
 > > 19:47 < athena> yeah
 > > 19:47 < athena> it is theoretically a portability issue, but it's
 gotta be pretty rare to run into a machine like that
 > > 19:48 < athena> the only examples of non-754 fp i can think of off-
 hand are VAX and some older crays
 > > 19:48 < nickm> We could treat it like systems where memset(&ptr, 0,
 sizeof(void*)) doesn't set ptr to NULL.
 > > 19:48 < athena> yeah
 > > 19:48 < nickm> NOTE. let's do that.
 >
 > What does 'that' mean here? Hopefully "pretend they no longer exist"? :)

 Treat it like we do systems where memset doesn't make pointers null:
 detect it and soil ourselves.  I'll write this one as designated autoconf
 monkey.

 > > 20:08 < athena> a2db17a1aab77c4183f589815800a779a5f24524 ?
 > > 20:08 < nickm> hang on.  So, if even one stream is detached, we roll
 back to USE_ATTEMPTED?
 > > 20:09 < nickm> What if there are multiple streams and only one is
 detached?
 > > 20:09 < nickm> It doesn't seem too harmful, but we should ask mike
 IMO.
 > > 20:09 < athena> hmm
 > > 20:09 < athena> yeah
 > > 20:09 < nickm> NOTE let's ask him
 >
 > Yes, the idea is that we should probe these instances to ensure that the
 circuit is still actually viable after the detach. Rolling back the state
 to USE_ATTEMPTED will induce such a probe upon close. I mention this in
 the commit message, but I will also add that explanation to the refactored
 function.

 Thanks!

 > > 20:21 < nickm> so,  the next commit does the common code refactoring
 you mentioned before when it adds
 > >                pathbias_count_circs_in_states
 > > 20:22 < nickm> also NOTE that the pairs of arguments to
 pathbias_count_circs_in_states are good things to document in
 > >                the documentation for path_state_t
 > > 20:22 < athena> yeah, agree
 > > 20:22 < nickm> As for the part of this change that changes how scaling
 happens... how does that look to you?
 > > 20:24 < athena> it looks correct to me
 > > 20:25 < nickm> I personally would kind of prefer it if we didn't add
 things to {circ,use}_{attempts,successes} until we
 > >                had the corresponding changes in the other scaled
 variables.  then this commit wouldn't be needed.
 > >                that's a potentially bigger refactoring though.  NOTE
 it for later?
 >
 > Yeah, that might be a better approach. I will put an XXX in the scaling
 code mentioning that we may want to have separate temporary "pending"
 counters in the entry_guard_t that don't get added to the attempt counts
 until circuit close, but I'm not sure we want to do this change at this
 point as it may have unexpected side effects... Or maybe we should create
 a separate ticket for this even... Hrmm..

 Right; that's a big-ish change. I say add a ticket and an XXX, but doing
 it in 0.2.4 is probably not wise at this point.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/8081#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online