[tor-bugs] #8614 [BridgeDB]: BridgeDB should be able to return multiple transport types at the same time

Tor Bug Tracker & Wiki blackhole at torproject.org
Thu Oct 10 04:36:14 UTC 2013


#8614: BridgeDB should be able to return multiple transport types at the same time
--------------------------+--------------------------
     Reporter:  asn       |      Owner:  sysrqb
         Type:  defect    |     Status:  needs_review
     Priority:  normal    |  Milestone:
    Component:  BridgeDB  |    Version:
   Resolution:            |   Keywords:  important
Actual Points:            |  Parent ID:  #8615
       Points:            |
--------------------------+--------------------------

Comment (by sysrqb):

 Replying to [comment:17 isis]:

 Thanks for the review!

 > In `lib/bridgedb/Bridges.py`, in function
 `reorderBridgesByTransportRequirement` from commit
 [https://gitweb.torproject.org/user/sysrqb/bridgedb.git/commitdiff/8641da059e2e925d6f5a45bd1ecf23f7e44b3d36
 8641da059e2e925d6f5a45bd1ecf23f7e44b3d36], the minimum required transport
 number is decremented, and then the maximum (if not `None`) is also
 decremented:
 > {{{
 > +    for tp,minnum,maxnum in transports:
 > +        trans[tp] = [minnum, maxnum]
 > [...]
 > +    for bridge in bridges:
 > +        for tpt in bridge.transports:
 > +            tp = tpt.methodname
 > +            if tp in trans:
 > +                if trans[tp][0] > 0:
 > [...]
 > +                    bridges2.insert(0, bridge2)
 > +                    trans[tp][0] -= 1
 > +                    if trans[tp][1]:
 > +                        trans[tp][1] -= 1
 > +                        assert trans[tp][0] <= trans[tp][1]
 > +                    break
 > }}}
 > It seems like `minnum` is not needed...for example, if the config file
 specified `FORCE_TRANS = [('obfs2, 2, 2)]` then after the first obfs2
 bridge is found, both counters are decremented to `('obfs2, 1, 1)` and
 then the assertion fails.

 Maybe I'm only seeing what I want to see, but why does the assertion fail?

 > If `FORCE_TRANS = [('obfs2', 1, 3)]` then we grab the first obfs2
 bridge,  put it in the `bridges2` list, and jump to this block:
 > {{{
 > +                elif trans[tp][1] and trans[tp][1] > 0:
 > +                    if len(bridge.transports) > 1:
 > +                        bridge2 = deepcopy(bridge)
 > +                        bridge2.transports = [
 > +                            PluggableTransport(bridge2, tp,
 tpt.address, tpt.port)
 > +                        ]
 > +                    else:
 > +                        bridge2 = bridge
 > +                    bridges2.insert(0, bridge2)
 > }}}
 > I think I'm not understanding why `minnum` is needed...wouldn't it just
 be easier to only have a maximum, and then try to add
 `PluggableTransport`s until either `maxnum` is reached or we can't find
 any more?
 >

 Yes, this makes more sense, done. Originally I only had a min and we
 stopped searching when we hit it, but then it seemed like a good idea to
 return more if we have them, so I added in a max.

 > -------
 >
 > There is a still a bit of code duplication which could be functionalised
 (which might also improve readability). For example, this block in
 `reorderBridgesByTransportRequirement` happens twice:
 > {{{
 > +                    if len(bridge.transports) > 1:
 > +                        bridge2 = deepcopy(bridge)
 > +                        bridge2.transports = [
 > +                            PluggableTransport(bridge2, tp,
 tpt.address, tpt.port)
 > +                        ]
 > +                    else:
 > +                        bridge2 = bridge
 > +                    bridges2.insert(0, bridge2)
 > }}}
 > Perhaps this could go into a new function like
 `copyBridgeIntoBridgesList` or something similar? (Doing this might depend
 on how the previously mentioned `minnum` thing is dealt with.)
 >

 Yup, gone.

 > --------
 >
 > This check, in `reorderBridgesByTransportRequirement`, to make sure that
 the lists of bridge fingerprints are the same:
 > {{{
 > +    copied = False
 > +    num_not_copied = 0
 > +    for bridge in bridges:
 > +        for bridge2 in bridges2:
 > +            if bridge.fingerprint == bridge2.fingerprint:
 > +                copied = True
 > +                break
 > +        if not copied:
 > +            num_not_copied += 1
 > +            bridges2.append(bridge)
 > +        copied = False
 > +
 > +    assert len(bridges2) == len(bridges), "Should be %d, but is %d.
 Appended %d" % (len(bridges), len(bridges2), num_not_copied)
 > }}}
 > seems a bit expensive, especially since you're already iterated over the
 bridges right before this block. Rather then indexing again to add the
 fingerprint here, you could do it during the previous loop. Or, in the
 previous loop, you could remove a `bridge` from `bridges` once done
 processing it's `bridge.transports`, and then just keep going until there
 aren't any `bridges left.
 >
 > The other thing which is less expensive would be to check like this:
 > {{{
 > bridgeSet1 = set(bridges)
 > bridgeSet2 = set(bridges2)
 > try:
 >      # All the bridges which are in one set or the other, but not in
 both
 >      assert bridgeSet1.symmetric_difference(bridgeSet2) == 0
 > except AssertionError:
 >      # All the bridges in `bridges` which didn't end up in `bridges2`:
 >      not_copied = bridgeSet1.difference(bridgeSet2)
 > }}}
 > and then just iterate over / deal with the `not_copied` ones. (Set
 operations are way faster.)
 >

 Hm, sets, that could work. I admit it is expensive, but in reality we're
 talking about a dozen bridges, so there performance penalty is minimal.
 Regardless, I agree we shouldn't be needlessly wasteful.

 > -------
 >
 > The documentation looks amazing! :D
 >
 You had to find something positive to comment on, huh. :)

 > -------
 >
 > Also, I am now noticing, again in
 `reorderBridgesByTransportRequirement`, that `tpt` is (or, at least, it
 ''should'' be) an instance of the `PluggableTransport` class:
 > {{{
 > +    for bridge in bridges:
 > +        for tpt in bridge.transports:
 > +            tp = tpt.methodname
 > +            if tp in trans:
 > +                if trans[tp][0] > 0:
 > +                    if len(bridge.transports) > 1:
 > +                        bridge2 = deepcopy(bridge)
 > +                        bridge2.transports = [
 > +                            PluggableTransport(bridge2, tp,
 tpt.address, tpt.port)
 > +                        ]
 > +                    else:
 > +                        bridge2 = bridge
 > +                    bridges2.insert(0, bridge2)
 > }}}
 > If that is the case, then I don't understand why the whole bridge must
 be copied, since `:attr:PluggableTransport.bridge` already is the bridge.
 In other words, why not just do:
 > {{{
 > +    for bridge in bridges:
 > +        for tpt in bridge.transports:
 > +            tp = tpt.methodname
 > +            if tp in trans:
 > +                if trans[tp][0] > 0:
 > -                    if len(bridge.transports) > 1:
 > -                        bridge2 = deepcopy(bridge)
 > -                        bridge2.transports = [
 > -                            PluggableTransport(bridge2, tp,
 tpt.address, tpt.port)
 > -                        ]
 > +                    bridge2 = deepcopy(tpt.bridge)
 > +
 bridge2.transports.append(PluggableTransport(bridge2, tp, tpt.address,
 tpt.port))
 > +                    bridges2.insert(0, bridge2)
 > }}}

 I don't quite understand this. What's the different between these?
 {{{
                     bridge2 = deepcopy(bridge)
 }}}

 {{{
                     bridge2 = deepcopy(tpt.bridge)
 }}}
 they should be identical, afaict, yes? The reasoning for
 {{{
                         bridge2.transports = [
                             PluggableTransport(bridge2, tp, tpt.address,
 tpt.port)
                         ]
 }}}
 was to overwrite the reference to the deepcopied list (because that's not
 what we want), and this is what you hint at below. The reasoning for doing
 this probably isn't clear, though. If we select a bridge specifically
 because it supports a PT, but we don't modify the bridge, then when we
 return it to the distributor it does not know which PT to tell the user.
 By making a copy of the bridge that only contains the needed PT we
 increase the chance that the distributor will tell the user the correct PT
 (verses the OR Port, this is where #9652 came from).

 > Appending the `PluggableTransport` to `bridge2.transports` might not
 even be necessary...since it's a deepcopy, and the `Bridge` and
 `PluggableTransport` classes circularly reference each other, the deepcopy
 might already include the `PluggableTransport`. However, since the
 `PluggableTransport` class references the ''instance'' of the `Bridge`
 class in its constructor, the deepcopied one would reference the `bridge`
 instance of `Bridge`, not the `bridge2` instance.
 >
 > Jesus fatherfucking shitpile son of a politician FUCK. ''grumble
 grumble''...should just rewrite the whole damned program...''grumble''...
 >
 > Either way, this one is a small problem, and probably not worth wasting
 your time with the effort to figure out what's going on with the circular
 references, especially if your version works.
 >
 > Ugh...note to self: XXX FIXME for the `Bridge` ⟷ `PluggableTransport`
 references.
 >

 aaaaand deep breaths!

 > ------
 >
 > More later.
 Thanks for this, so far!

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/8614#comment:18>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list