[tor-bugs] #5232 [BridgeDB]: Import bridges into BridgeDB in a separate thread and database transaction

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Apr 2 10:19:48 UTC 2014


#5232: Import bridges into BridgeDB in a separate thread and database transaction
-------------------------+-------------------------------------------------
     Reporter:  karsten  |      Owner:  sysrqb
         Type:  defect   |     Status:  closed
     Priority:  major    |  Milestone:
    Component:           |    Version:
  BridgeDB               |   Keywords:  bridgedb-email, bridgedb-db,
   Resolution:  fixed    |  bridgedb-https, bridgedb-0.1.7
Actual Points:           |  Parent ID:
       Points:           |
-------------------------+-------------------------------------------------
Changes (by isis):

 * status:  needs_review => closed
 * resolution:   => fixed


Comment:

 Replying to [comment:19 sysrqb]:
 > Replying to [comment:18 isis]:
 > > Replying to [comment:17 sysrqb]:
 > >   1) For `43c6f8778b244ca4b19628677a293a83ae1d1e09`
 [...]
 > >      In the same function:
 > >      {{{
 > >      +            with bridgedb.Storage.getDB() as db:
 > >      [...]
 > >      +                        for timestamp in ts:
 > >      +                            logging.debug(
 > >      +                                "Updating BridgeHistory
 timestamps for %s: %s"
 > >      +                                % (bridge.fingerprint,
 timestamp))
 > >      +                            reactor.callFromThread(
 > >      +
 bridgedb.Stability.addOrUpdateBridgeHistory,
 > >      +                                bridge, timestamp)
 > >      +            logging.debug("Stability calculations complete")
 > >      +
 > >      +    reactor.callInThread(updateBridgeHistory, bridges,
 timestamps)
 > >      }}}
 > >      Calling `reactor.callFromThread` will run the call inside the
 main reactor loop... unless the `with bridgedb.Storage.getDB() as db` is
 doing some context magic to put it back into a separate thread again.
 Unless I'm misunderstanding something, which I probably am. Or, does it
 call it in the parent thread, i.e. the `reactor.callInThread(_reloadFn)`
 thread? Because the `t.i.interfaces.IReactorThreads` docs make it seem
 like `reactor.callFromThread()` will ''always'' call the main reactor
 thread.
 >
 > Nope, you're completey correct. I *think* that was a typo and I wanted
 to use callInThread(), but I don't actually know why I wanted to do that
 either, it's already in a different thread than reactor loop. I can't
 justify spawning a couple thousand threads for this so I deleted that
 invocation and now it directly calls addOrUpdateBridgeHistory().
 >

 Okay, my server startup and restart timings came out to the same as with
 the `callFromThread()`s in there, this makes more sense now. It seems to
 be roughly as fast, and adding thousands of threads sounds a bit
 frightening with how little memory ponticum has.

 > >   4) In `b67862d317a282d81e2f53726e4dbff8bb8cb1f1`
 > >      `More unit tests for Email Server`:
 > >       Why `from StringIO import StringIO`? Isn't that the old,
 deprecated one?
 > >
 >
 > 100% correct. That was my mistake. But then fixing that caused some
 breakage because we used StringIO.StringIO in EmailServer.py. When I
 switched that to io.StringIO it broke because io.StringIO only accepts
 unicode and we used MimeWriter.MimeWriter which used string.
 > As a result, #11370. Which should be resolved now.

 Cool, I've reviewed #11370 and am merging it too (they are both part of
 the same branch of yours anyway).

 > >   6) In `3c9e04ac9006ff21f30aeb033f6e2560f09541f6`
 > >      `Make sure leekspin created our descriptors`:
 > >      Sorry for the `/usr/share/dict` problem in leekspin, it's a good
 idea to add these checks.
 > >
 >
 > I debated deleting this commit but I decided to leave it because it
 caused no harm to check for the files.
 >

 Yeah, at some point it took me almost an hour to figure out where the
 TravisCI boxes were putting the files. Definitely doesn't hurt to have
 some extra checks.

 > >   7) Super! CHANGELOG entries! Awesome! :D
 > >
 >
 > And now there's another one!
 >

 Thanks!

 > > Overall, the context manager wrapping for the semaphores on db
 accesses is a bit of madness, but doing threading in Twisted Python is a
 bit of madness, and I'm amazed this is working. Threading in Twisted is
 seriously no small feat. You did a really brilliant job on this!
 > >
 >
 > This comment made it all worth it. Thanks! :)

 Still seriously impressed.

 If anyone ever doubts your Python skills, tell them to call me. :)

 > 5092364f7e3ce42b61b0336a95519639d0e59308 needs review for #11370

 Review is on the #11370 ticket.

 > bd05227c5eefc528b72d57d53ac3f139fc2f0d5e (was 91dc3a6e) contains the re-
 added letter (5)
 > 0556e7daa8504756b34384a0f9b5dfa753556c09 (was b67862d3) now uses
 io.StringIO (4)
 > bd652aa560cfaf15b9b37979aba73e9f8f548b4e (was 43c6f877) fixes
 callFromThread() and iterator (1)

 All good.

 > And I added 8dfcd5d7f77950828d3c95f93417baf7ea70c6b7 for a doc and to
 fix long lines.

 Good too.

 > Pushed
 [https://gitweb.torproject.org/user/sysrqb/bridgedb.git/shortlog/refs/heads/bug5232_adding_concurrent_processing_squashed_r1
 bug5232_adding_concurrent_processing_squashed_r1] for rereview!

 Merged! Thanks sysrqb! You are awesome!

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/5232#comment:22>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list