[tor-dev] Torsocks locking design

Sun Jun 16 00:07:29 UTC 2013

Matthew Finkel:
> On Sun, Jun 09, 2013 at 04:52:45PM -0400, David Goulet wrote:
>> Hi everyone,
>>
> 
> Hi David!
> 
>> I'm posting here the small design document I've made for the Torsocks locking
>> mechanism.
>>
>> I'm looking for reviews, comments, improvements, ... well anything useful! :).
>> (Even English mistakes since it's not my primary language).
>>
>> Future changes will be pushed here if any are needed after this thread is done with.
>>
>> https://github.com/dgoulet/torsocks/blob/rewrite/doc/proposals/01-Thread-safe-design.txt
>>
> 
> Great! Thanks for putting in the time/thought for this!
> 
>> Thanks people!
>> David
> 
>> Date: 09/06/2013
>> Author: David Goulet <dgoulet at ev0ke.net>
>> Contributors:
>> 	* Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
>>
>> This document details the design of torsocks locking for thread safety.
>>
>> The basis is that at *any* time any thread can interact with Torsocks
>> (connect(2), DNS resolution, ...). Considering that, some sort of
>> synchronization is needed since torsocks needs to keep track of per socket
>> connection. They are kept in a central registry that every thread needs to
>> access.
> 
> This sounds reasonable.
> 
>>
>> The reason why it needs to be shared accross threads is because of fd passing.
> 
> s/accross/across/ :)
> 
>> It is possible and even not uncommon that threads exchange file descriptor(s) so
>> we can't keep a registry of connections using TLS (Thread Local Storage).
>> Furthermore, a process could easily, for instance, close(2) a socket within a
>> signal handler making Torsocks access thread storage inside a signal handler and
>> this SHOULD NOT be done, just never.
>>
>> Considering the above, a locking mechanism is needed to protect the registry.
>> Insertion, deletion and lookup in that registry CAN occur at the same time so a
>> mutex has to be used to protect any action *on* it. This protection is provided
>> by a mutex named "registry lock".
>>
> 
> Sure, this sounds like a good idea.
> 
>> Now, the objects inside that registry have to be protected during their life
>> time (as long as a reference is held) after a lookup. To protect the free() from
>> concurrent access, a refcount is used. Basically, each possible action on a
>> connection object checks if the connection refcount is down to 0. If so, this
>> means that no one is holding a reference to the object and it is ready to be
>> destroyed. For that, the refcount needs to start with a value of 1.
>>
> 
> The beginning of this paragraph seems a bit unclear to me (maybe it's
> just me). As of right now, as I understand it, we have an opaque object
> that, at a minimum, holds references to the set of (open socket, ref
> count) pairs. (Please correct me if I'm wrong.) In addition to this,
> whenever this pair is retrieved, the refcount is incremented by 1 within
> the registry thus allowing for use of this reference outside of any
> critical sections. Then, when the function no longer needs the
> reference, it decrements the refcount. If refcount reaches 0, then the
> conn is not needed.

Exactly that!

I'll clarify the "object" contained in this registry in the document. So,
basically the registry holds connection object identified by each socket we've
intercept. There is also some other things but the important part is how it can
be used and accessed outside the critical section.

> 
>> For this scheme to work, the connection object MUST be immutable once created.
>> Any part that can change during the lifetime of the connection object MUST be
>> protected with a mutex.
>>
>> This mechanism is used to avoid heavy contention on the registry lock. Without
>> the refcount, a connection would have to be protected within the registry lock
>> to avoid race between an access to the object and freeing it. As long as the
>> lookups in the registry are not that frequent, this should scale since the
>> critical section is pretty short.
>>
> 
> I think this should be okay. As it is currently implemented, Torsocks
> only "remembers" socket that *it* needs. By this I mean that it only
> maintains a reference to a socket if it is currently using the
> connection in a SOCKS handshake or DNS resolution. Once Torsocks has
> finished with its necessary operations, it returns complete control to
> the application and "forgets" about the socket.
> 
> If you plan to keep a similar design, then registry entries will mostly
> be short-lived and this should scale well. If your plan is to have the
> registry maintain a complete list of all open connections, then this
> still may scale and have little, or no, adverse affect on performance
> but I'm unsure if this provides much benefit. Have you decided which of
> these designs you want to use?

The registry contains every connection tracked by Torsocks that goes through the
Tor network. For instance, as you said, a simple DNS resolution would create a
short live connection that does not need to go into the registry since once we
return the DNS answer, there is no point of keeping it.

> 
>> Here is the algorithm for a read/write and destroy operation.
>>
>> Prerequisites:
>> ----
>> Refcount of a connection object starts at 1. When down to 0, it can be
>> destroyed.
>>
>> Add to registry (e.g.: connect(2)):
>> ----
>> 1)	lock(registry)
>> 2)		new_conn refcount = 1;
>> 3)		add to registry
>> /* We could also make a lookup for duplicates. */
>> 4)	unlock(registry)
>>
>> Read/Write op. (e.g.: DNS Lookup):
>> ----
>> 1)	lock(registry)
>> 2)		conn = lookup
>> 3)		if conn:
>> 4)			atomic_inc(conn refcount)
>> 5)	unlock(registry)
>> 6)	[action using the conn]
>> 7)	if atomic_dec_return(conn refcount) == 0:
>> /*
>>  * This is safe because at this point the connection object is not visible
>>  * anymore to any thread so we can safely free the object after unlocking it.
>>  */
>> 8)		free conn
>>
>> Destroy from registry (e.g.: close(2)):
>> ----
>> 1)	lock(registry)
>> 2)		conn = lookup
>> 3)		if conn:
>> /*
>>  * Make sure the connection object is not visible once we release the registry
>>  * lock. Note that we DO NOT increment the refcount here because we are on the
>>  * destroy path so we have to make the refcount come down to 0.
>>  */
>> 4)			remove from registry
>> 5)	unlock(registry)
>> 6)	if atomic_dec_return(conn refcount) == 0:
>> 7)		free(conn)
> 
> I think these look good. They seem to lock a very small amount of code,
> which is good. I suppose this is a nitpicky question but
> regarding atomic_dec_return() and atomic_inc(), do they both return the
> updated counter value? Or is atomic_dec_return() the only one
> that does that (denoted by the _return suffix)?

Yes so "_return" means that it returns the result of the atomic action. This is
to avoid a race between an atomic action and the atomic read.

Ref: http://gcc.gnu.org/onlinedocs/gcc-4.8.1/gcc/_005f_005fatomic-Builtins.html

> 
> I think this is simple, and simple is good. I have some questions about
> the actual registry, but I'll defer asking them until they're more
> on-topic. I'll try to take another pass of this when I'm less asleep.

Please if you have questions about the registry, ask them :). I've not made any
final decisions yet for the data structure but an hash table makes the most
sense to me at the moment containing "connection object" indexed by socket fd
since we are sure it's unique during the lifetime process as long as we catch
the connect/close of course.

Thanks!
David

> 
> Thanks again!
> 
> - Matt
> _______________________________________________
> tor-dev mailing list
> tor-dev at lists.torproject.org
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 490 bytes
Desc: OpenPGP digital signature
URL: <http://lists.torproject.org/pipermail/tor-dev/attachments/20130615/268b5e68/attachment.pgp>