[tor-bugs] #32576 [Circumvention/Snowflake]: Fix race condition in snowflake broker

Tor Bug Tracker & Wiki blackhole at torproject.org
Fri Nov 22 22:45:59 UTC 2019


#32576: Fix race condition in snowflake broker
-------------------------------------+------------------------------
 Reporter:  cohosh                   |          Owner:  cohosh
     Type:  defect                   |         Status:  needs_review
 Priority:  Very High                |      Milestone:
Component:  Circumvention/Snowflake  |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:  metrics                  |  Actual Points:
Parent ID:                           |         Points:
 Reviewer:                           |        Sponsor:  Sponsor28
-------------------------------------+------------------------------
Changes (by cohosh):

 * status:  assigned => needs_review


Comment:

 This race condition occurs because both the proxy process and the client
 process try to `Pop`/`Remove` the same snowflake from the heap.

 When a proxy polls the broker, it starts a go routine that waits for an
 offer through the snowflake's offer channel, or waits for a timeout:
 {{{
         go func(request *ProxyPoll) {
             select {
             case offer := <-snowflake.offerChannel:
                 request.offerChannel <- offer
             case <-time.After(time.Second * ProxyTimeout):
                 // This snowflake is no longer available to serve clients.
                 // TODO: Fix race using a delete channel
                 heap.Remove(ctx.snowflakes, snowflake.index)
                 delete(ctx.idToSnowflake, snowflake.id)
                 request.offerChannel <- nil
             }
         }(request)
 }}}

 Snowflakes are removed from the heap when they time out or if they are
 popped off the heap by a client:
 {{{
 func clientOffers(ctx *BrokerContext, w http.ResponseWriter, r
 *http.Request) {

 ...

     // Delete must be deferred in order to correctly process answer
 request later.
     snowflake := heap.Pop(ctx.snowflakes).(*Snowflake)
     defer delete(ctx.idToSnowflake, snowflake.id)
     snowflake.offerChannel <- offer
 }}}

 A race can always occur where the timeout happens after the `Pop` removes
 the snowflake but before the offer is sent through `offerChannel`. This
 can be fixed through the use of locks and a check to see if
 `snowflake.index` has been set to `-1` (which happens after it has been
 popped off the heap). Here's a patch that adds synchronization to the
 broker to prevent simultaneous access to the heap as well as the
 `idToSnowflake` map: https://github.com/cohosh/snowflake/pull/14

 I'd like to do some tests before merging this to make sure that the
 synchronization doesn't slow things down too much, but a code review would
 be good at this point.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/32576#comment:8>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list