[tor-dev] Torsocks development status

Ian Goldberg iang at cs.uwaterloo.ca
Thu Jun 27 19:44:07 UTC 2013

On Thu, Jun 27, 2013 at 03:11:23PM -0400, David Goulet wrote:
> Ian Goldberg:
> > On Wed, Jun 26, 2013 at 03:55:58PM -0400, David Goulet wrote:
> >> Hi everyone,
> >>
> >> For those who don't know, I've been working on a new version of Torsocks
> >> in the last three weeks or so.
> >>
> >> https://lists.torproject.org/pipermail/tor-dev/2013-June/004959.html
> >>
> >> I just wanted to give a quick status report on the state of the development.
> >>
> >> The DNS resolution is working for domain name (PTR) and IPv4 address.
> >> Currently, Tor does not support IPv6 resolution but the torsocks code
> >> supports it.
> >>
> >> Hidden service onion address resolution is also working using a "dead IP
> >> range" acting as cookie that is sent back to the user and mapped to the
> >> .onion address on the hijacked connect().
> >>
> >> I've changed the configuration file (torsocks.conf) quite a bit to fit
> >> the style of tor (torrc). At this point, the tor address and port can be
> >> configured as well as the "dead IP range" mentioned above. More is coming
> >> but it is pretty simple for now.
> >>
> >> Logging is working, connection registry and thread safety as well. There
> >> is also a compat layer for mutexes and once I start porting the project
> >> to other *nix systems (BSD, OS X, ...) probably more subsystems will be
> >> added to that compat layer.
> >>
> >> So, in a nutshell, some libc calls still need to be implemented, plus
> >> *moar* tests and support for other OSes. I'm confident I'll have a beta
> >> version to present to the community in a couple of weeks (if nothing
> >> goes wrong).
> >>
> >> Feel free to browse the code, comment on it, contribute!, etc...
> >>
> >> https://github.com/dgoulet/torsocks/tree/rewrite
> > 
> > Are non-blocking sockets, select/poll/etc. (especially at connect()
> > time), and optimistic data on the to-do list?
> Yes! Good point, I should have posted the to-do list. So yes, non-blocking
> socket support is on it.
> Optimistic data is kind of tricky. I can use it for DNS resolution
> without a problem because torsocks controls the complete flow of data, from
> opening a SOCKS5 connection to closing it after the DNS response is received.
> For actual data (sendmsg, send, ...), however, a connect is needed first, so
> a connect() call would return "yes, OK, socket connected" when in fact that
> is not really true. So when the first data is sent, there is a possibility
> that the Tor connection has failed, or even that we block for an unknown
> amount of time during the send*/write() call.
> Now the question is whether this kind of behavior would be acceptable,
> meaning basically lying to the caller at connect(), possibly blocking in I/O
> calls, and returning something like ECONNRESET or ENOTCONN if the Tor SOCKS5
> connection fails.
> This is *really* tricky, especially with non-blocking sockets, if torsocks
> needs to make possibly blocking calls for the SOCKS5 replies during an I/O
> call from the caller that is not supposed to block. Furthermore, with pending
> data that *might* arrive at any time on the connection from the SOCKS5
> negotiation, the caller could put the file descriptor in poll(), wake up and
> try to receive the data when in fact it's the SOCKS5 reply... It's possible
> to handle that, but it seems a VERY intrusive behavior. Is optimistic data
> worth it here, given the complexity of handling it and the high intrusiveness?
> Cheers!
> David

It *is* kind of tricky.  (See #3711.)  But I don't think it's that much
trickier than properly handling non-blocking sockets in the first place.
For example:

- Application calls connect()
- Torsocks intercepts, calls connect()
- Now you have to do a fancy dance where the application is going to
  select() to wait for the connection to complete, but where torsocks
  has to get the connection to complete, *and* send the connect request,
  *and* wait for the connect reply.  (In fact, with optimistic data, you
  *don't* have to do that last step.)  So you have to play around a bit
  with the parameters to the select() call, etc.  The torsocks versions
  of select, poll, etc., have to notice, whenever they are called, any
  socket that is not yet fully end-to-end connected, and add "ready for
  write" / "ready for read" events for those sockets as appropriate.
  If libc_select returns ready for those sockets, handle them inside
  torsocks before returning to the caller.  (But no blocking!)  In the
  case you say above, where the application is polling for read, but
  it's really just the socks5 reply that's come in, the poll() in
  torsocks will need to read the reply first and mark the socket as
  fully connected.  If there's more data (or if any other socket is
  ready), then great, return to the caller.  If not, poll() again.
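
To make the dance above concrete, here is a minimal sketch in C of a
poll() wrapper along those lines.  It is not torsocks code: struct
ts_conn, ts_lookup(), ts_eat_reply() and ts_poll() are hypothetical
names, the connection registry is a plain array, and the SOCKS5 reply
is assumed to be the 10-byte form with an IPv4 bind address.  It also
assumes MSG_DONTWAIT is available, and it re-polls with the caller's
full timeout instead of tracking the time already spent.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Assumption: VER REP RSV ATYP + IPv4 BND.ADDR + BND.PORT = 10 bytes. */
#define SOCKS5_REPLY_LEN 10

enum ts_state { TS_AWAIT_REPLY, TS_CONNECTED, TS_FAILED };

/* Hypothetical per-socket record; not the real torsocks registry. */
struct ts_conn {
    int fd;
    enum ts_state state;
    unsigned char reply[SOCKS5_REPLY_LEN];
    size_t got;        /* reply bytes read so far */
    short app_events;  /* events the application asked for */
};

static struct ts_conn *ts_lookup(struct ts_conn *conns, size_t n, int fd)
{
    for (size_t i = 0; i < n; i++)
        if (conns[i].fd == fd)
            return &conns[i];
    return NULL;
}

/* Read whatever part of the SOCKS5 reply is available, never blocking. */
static void ts_eat_reply(struct ts_conn *c)
{
    assert(c->got < sizeof(c->reply));
    ssize_t r = recv(c->fd, c->reply + c->got,
                     sizeof(c->reply) - c->got, MSG_DONTWAIT);
    if (r > 0) {
        c->got += (size_t)r;
        if (c->got == sizeof(c->reply))
            c->state = (c->reply[0] == 5 && c->reply[1] == 0)
                           ? TS_CONNECTED : TS_FAILED;
    } else if (r == 0 || (errno != EAGAIN && errno != EWOULDBLOCK &&
                          errno != EINTR)) {
        c->state = TS_FAILED;
    }
}

int ts_poll(struct ts_conn *conns, size_t nconns,
            struct pollfd *fds, nfds_t nfds, int timeout)
{
    for (;;) {
        /* Sockets still negotiating: wait for the reply, not app events. */
        for (nfds_t i = 0; i < nfds; i++) {
            struct ts_conn *c = ts_lookup(conns, nconns, fds[i].fd);
            if (c && c->state == TS_AWAIT_REPLY) {
                c->app_events = fds[i].events;
                fds[i].events = POLLIN;
            }
        }

        int rc = poll(fds, nfds, timeout);
        int ready = 0;

        for (nfds_t i = 0; i < nfds; i++) {
            struct ts_conn *c = ts_lookup(conns, nconns, fds[i].fd);
            if (!c || c->state != TS_AWAIT_REPLY) {
                if (rc > 0 && fds[i].revents)
                    ready++;          /* a genuine event for the caller */
                continue;
            }
            fds[i].events = c->app_events;   /* undo the swap */
            if (rc <= 0)
                continue;
            if (fds[i].revents & POLLIN)
                ts_eat_reply(c);
            else if (fds[i].revents)
                c->state = TS_FAILED;
            if (c->state == TS_FAILED) {
                fds[i].revents = POLLERR;    /* let the caller see the error */
                ready++;
            } else {
                fds[i].revents = 0;          /* hide the internal event */
            }
        }

        if (rc < 0)
            return rc;
        if (ready > 0)
            return ready;
        if (rc == 0)
            return 0;                        /* timed out */
        /* Only SOCKS5 traffic was handled; poll() again. */
    }
}
```

If the reply and the first application bytes arrive together, the first
pass eats exactly the reply and the second pass reports the socket as
readable, which matches the "if not, poll() again" step.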

The only thing optimistic data changes is that (a) you don't wait for
the Tor SOCKS5 connected response, (b) you have to be ready to eat that
response when it comes, and (c) if the response is an error, you fail
the socket with ECONNRESET (or something similar).
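
Points (b) and (c) can be sketched as a recv() wrapper in C.  This is
only an illustration under stated assumptions, not the torsocks
implementation: struct od_conn and od_recv() are hypothetical names,
the SOCKS5 method negotiation is assumed to be finished already, the
connect() wrapper (not shown) is assumed to have sent the CONNECT
request and returned success right away, and the reply is assumed to
be the 10-byte form with an IPv4 bind address.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Assumption: VER REP RSV ATYP + IPv4 BND.ADDR + BND.PORT = 10 bytes. */
#define SOCKS5_REPLY_LEN 10

enum od_state { OD_REPLY_PENDING, OD_ESTABLISHED, OD_FAILED };

/* Hypothetical per-socket record for an optimistically connected socket. */
struct od_conn {
    int fd;
    enum od_state state;
    unsigned char reply[SOCKS5_REPLY_LEN];
    size_t got;   /* reply bytes read so far */
};

ssize_t od_recv(struct od_conn *c, void *buf, size_t len, int flags)
{
    /* (b) Eat the SOCKS5 reply before handing any bytes to the caller. */
    while (c->state == OD_REPLY_PENDING) {
        assert(c->got < sizeof(c->reply));
        /* Honor only the caller's MSG_DONTWAIT here: the reply must be
         * consumed for real, so MSG_PEEK and friends are not passed on. */
        ssize_t r = recv(c->fd, c->reply + c->got,
                         sizeof(c->reply) - c->got, flags & MSG_DONTWAIT);
        if (r < 0)
            return -1;            /* EAGAIN, EINTR, ... go to the caller */
        if (r == 0) {
            c->state = OD_FAILED; /* proxy closed before replying */
            break;
        }
        c->got += (size_t)r;
        if (c->got == sizeof(c->reply))
            c->state = (c->reply[0] == 5 && c->reply[1] == 0)
                           ? OD_ESTABLISHED : OD_FAILED;
    }

    /* (c) A failed connect surfaces as ECONNRESET on the first read. */
    if (c->state == OD_FAILED) {
        errno = ECONNRESET;
        return -1;
    }
    return recv(c->fd, buf, len, flags);
}
```

On a blocking socket the wrapper blocks only where the caller's own
recv() would have blocked anyway.  On a non-blocking one it returns
EAGAIN until the reply has arrived, so the caller sees nothing unusual.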

Is it worth it?  There's *significant* (like up to 33%) improvement in
time-to-first-byte latency for client-speaks-first protocols (like
HTTP).  I believe that's worth it.  But you're the one doing the
implementation.  ;-)

   - Ian
