[or-cvs] Commit incomplete howto document with missing bits and inte...

Mon Jul 11 19:10:30 UTC 2005

Update of /home/or/cvsroot/control/doc
In directory moria:/tmp/cvs-serv7756/doc

Added Files:
	howto.txt 
Log Message:
Commit incomplete howto document with missing bits and interface XXXXs

--- NEW FILE: howto.txt ---
$Id: howto.txt,v 1.1 2005/07/11 19:10:28 nickm Exp $

                Interfacing with Tor: Clients and Controllers
   Copyright 2005 Nick Mathewson -- see LICENSE for licensing information

0. About this document

   This document has instructions for writing programs to interface with
   Tor.  You should read it if you want to write a Tor controller, or if you
   want to make your programs work with Tor correctly.

0.1. Further reading

   You should probably have a good idea of what Tor does and how it works;
   see the main Tor documentation for more detail.

   If you want full specifications for the data formats and protocols Tor
   uses, see tor-spec.txt, control-spec.txt, and socks-extensions.txt, all of
   which are included with the Tor distribution.

1. Making a program use Tor

   Suppose you have a simple network application, and you want that
   application to send its traffic over Tor.  This is pretty simple to do:
   here's how.

     - Make sure your protocol is stream based.  If you're using TCP, you're
       fine; if you're using UDP or another non-TCP protocol, Tor can't cope
       right now.

     - Make sure that connections are unidirectional.  That is, make sure
       that your protocol can run with one host (the 'originating host' or
       'client') originating all the connections to the other (the
       'responding host' or 'server').  If the responding host has to open
       TCP connections back to the originating host, it won't be able to do
       so when the originating host is anonymous.

     - For anonymous clients: Get your program to support SOCKS4a or SOCKS5
       with hostnames.  Right now, when your clients open a connection, they
       probably do a two-step process of:
         * Resolve the server's hostname to an IP address.
         * Connect to the server.
       Instead, make sure that they can:
         * Connect to a local SOCKS proxy.
         * Tell the SOCKS proxy about the server's hostname and port.
           In SOCKS4a, this is done by sending these bytes, in order:
             0x04                 (socks version)
             0x01                 (connect)
             PORT                 (two bytes, most significant byte first)
             0x00 0x00 0x00 0x01  (fake IP address: tells proxy to use
                                   SOCKS4a)
             0x00                 (empty username field)
             HOSTNAME             (target hostname)
             0x00                 (marks the end of the hostname field)
         * Wait for the SOCKS proxy to connect to the server.
           In SOCKS4a, it will reply with these bytes in order:
             0x00                 (response version)
             STATUS               (0x5A means success; other values mean
                                   failure)
             PORT                 (not set)
             ADDRESS              (not set)

     - For hidden services: Make sure that your program can be configured to
       accept connections from the local host only.

   For more information on SOCKS, see references [1], [2], and [3].  For more
   information on Tor's extensions to the SOCKS protocol, see
   "socks-extensions.txt" in the Tor distribution.
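
   As a rough illustration of the SOCKS4a byte layout listed above, the
   Python sketch below builds a connect request and checks a reply.  The
   function names are made up for this example; they are not part of Tor or
   of any library.

```python
import struct


def build_socks4a_connect(hostname, port):
    """Return the bytes of a SOCKS4a CONNECT request for hostname:port."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    name = hostname.encode("ascii")
    if not name or b"\x00" in name:
        raise ValueError("hostname must be non-empty and contain no NUL bytes")
    return (
        struct.pack(">BBH", 0x04, 0x01, port)  # version, connect, port
        + b"\x00\x00\x00\x01"                  # fake IP address: use SOCKS4a
        + b"\x00"                              # empty username field
        + name + b"\x00"                       # hostname and its terminator
    )


def socks4a_reply_ok(reply):
    """Return True if an 8-byte SOCKS4a reply reports success (0x5A)."""
    if len(reply) != 8 or reply[0] != 0x00:
        raise ValueError("malformed SOCKS4a reply")
    return reply[1] == 0x5A
```

   A client would write the request to the local SOCKS port, read eight
   bytes back, and only then start speaking its own protocol on the same
   connection.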

1.1. Notes on DNS

   Note that above, we encourage you to use SOCKS4a or SOCKS5 with hostnames
   instead of using SOCKS4 or SOCKS5 with IP addresses.  This is because your
   program needs to make Tor do its hostname lookups anonymously.  If your
   program resolves hostnames on its own (by calling gethostbyname or a
   similar API), then it will effectively broadcast the names of the hosts it
   is about to connect to.
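
   With SOCKS5 the same idea applies: the request carries the name itself
   (address type 0x03 in RFC 1928 [3]) rather than an address the program
   looked up.  A minimal sketch, with a hypothetical function name, of the
   connect request that follows the SOCKS5 method negotiation:

```python
import struct


def build_socks5_connect_by_name(hostname, port):
    """Return a SOCKS5 CONNECT request that names the target host.

    Layout per RFC 1928: VER, CMD, RSV, ATYP, then a length-prefixed
    domain name and a two-byte port, most significant byte first.
    """
    name = hostname.encode("ascii")
    if not 0 < len(name) < 256:
        raise ValueError("hostname must be 1..255 bytes long")
    return (
        bytes([0x05, 0x01, 0x00, 0x03])  # version 5, connect, reserved, name
        + bytes([len(name)]) + name      # one length byte, then the hostname
        + struct.pack(">H", port)        # destination port
    )
```

   Sending address type 0x01 with four address bytes instead would mean the
   program had already resolved the name on its own.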

1.2. Notes on authentication

   If your service uses IP addresses to prevent abuse, you should consider
   switching to a different model.  Once your software works with Tor,
   annoying people may begin using Tor to conceal their IP addresses.  If the
   best abuse-prevention scheme you have is IP based, you'll be forced to
   choose between blocking all users who want privacy, and allowing abuse.
   If you've implemented a better authorization scheme, you won't have this
   problem.

1.3. Cleaning your protocol

   You aren't done just because your connections are anonymous.  You need to
   consider whether the application itself is doing things to compromise your
   users' anonymity.  Here are some things to watch out for:

   Information Leaks
     - Does your application include any information about the user
       in the protocol?

     - Does your application include any information about the user's
       computer in the protocol?  This can include not only the computer's IP
       address or MAC address, but also the version of the software, the
       processor type, installed hardware, or any other information that can
       be used to tell users apart.

     - Do different instances of your application behave differently?  If
       there are configuration options that make it easy to tell users apart,
       are they really necessary?

2. Writing a controller

   If you want your application to use Tor in a more fine-grained manner (and
   not just to anonymize your application's connections) you need to write a
   "controller".  A controller is a program that connects to Tor and sends it
   commands.  With a controller, you can examine and change Tor's
   configuration on the fly, change how circuits are built, and perform other
   operations.

   As of the most recent version (0.1.0.11), Tor does not have its controller
   interface enabled by default.  You need to configure it to listen on some
   local port by using the "ControlPort" configuration directive, either in
   the torrc file, like this:

       ControlPort 9100

   Or on the command line, like this:

       tor -controlport 9100

   Then your controller can connect to Tor.  But see the notes on
   authentication below (x.x).

   This document covers the Python and Java interfaces to Tor, and the
   underlying "v1" control protocol introduced in Tor version
   0.1.1.0. Earlier versions used an older and trickier control protocol which
   is not covered here; see "control-spec-v0.txt" for details.

2.1. Getting started

   When you're writing a controller, you can either connect to Tor's control
   port and send it commands directly, or you can use one of the libraries
   we've written to automate this for you.  Right now, there are libraries in
   Java and Python.

   First, you need to load the library and open a new connection to the Tor
   process.  In Java:

     import net.freehaven.tor.control.TorControlConnection;
     import java.net.Socket;

     public class Demo {
       public static final void main(String[] args) throws Exception {
         Socket s = new Socket("127.0.0.1", 9100);
         TorControlConnection conn = TorControlConnection.getConnection(s);
         conn.authenticate(new byte[0]); // See section x.x
         // ...
       }
     }

   In Python:

      import socket
      import TorCtl

      s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
      s.connect(("127.0.0.1", 9100))
      conn = TorCtl.get_connection(s)
      conn.authenticate("")  # See section x.x
      # ...

   The factory method that you use to create a connection will check whether
   the version of Tor you've connected to supports the newer ("v1")
   text-based control protocol or the older ("v0") binary control protocol.

   Using the v1 protocol, just connect to the control port and say:

      AUTHENTICATE

   (For more information on using the v1 protocol directly, see x.x)
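
   A minimal Python sketch of that exchange follows.  It assumes that v1
   commands are lines ending in CRLF and that a reply line beginning with
   "250 " means success; check control-spec.txt for the authoritative
   format.  The function name is hypothetical, and rfile and wfile are
   binary file objects wrapping the control socket, for example from
   s.makefile("rb") and s.makefile("wb").

```python
def authenticate_v1(rfile, wfile):
    """Send a bare AUTHENTICATE command and check for a success reply.

    Assumes Tor has no control password configured, as in the examples
    above.  Raises RuntimeError if Tor answers with anything but "250 ...".
    """
    wfile.write(b"AUTHENTICATE\r\n")
    wfile.flush()
    line = rfile.readline()
    if not line:
        raise EOFError("Tor closed the control connection")
    reply = line.decode("ascii", "replace").rstrip("\r\n")
    if not reply.startswith("250 "):
        raise RuntimeError("authentication failed: " + reply)
    return reply
```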

2.2. Configuration and information

   Now that you've got a connection to Tor, what can you do with it?

   One of the easiest operations is manipulating Tor's configuration
   parameters.  You can retrieve or change the value of any configuration
   variable by calling the appropriate method.

   In Java:

       // Get one configuration variable.
       String contact = conn.getConf("contact");
       // Get a set of configuration variables.
       Map values = conn.getConf(Arrays.asList(new String[]{
              "contact", "orport", "socksport"}));
       // Change a single configuration variable
       conn.setConf("BandwidthRate", "1 MB");
       // Change several configuration variables
       conn.setConf(Arrays.asList(new String[]{
              "HiddenServiceDir /home/tor/service1",
              "HiddenServicePort 80",
       }));
       // Flush the configuration to disk.
       conn.saveConf();

       XXXX getconf must handle multiple values correctly.

   In Python:

       # Get one configuration variable
       _, contact = conn.get_option("contact") #XXXX bad iface
       # Get a set of configuration variables.
       options = conn.get_option(["contact", "orport", "socksport"])
       # Change a single configuration variable
       conn.set_option("BandwidthRate", "1 MB")
       # Change several configuration variables
       conn.set_option([
              ("HiddenServiceDir", "/home/tor/service1"),
              ("HiddenServicePort", "80")])
       # Flush the configuration to disk.
       conn.save_conf()

   Talking to Tor directly:

       GETCONF contact
       GETCONF contact orport socksport
       SETCONF bandwidthrate="1 MB"
       SETCONF HiddenServiceDir=/home/tor/service1 HiddenServicePort=80
       SAVECONF

   For a list of configuration options recognized by Tor, see the main Tor
   manual page.
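
   If you build these command lines yourself, values that contain spaces
   need quoting, as in the bandwidthrate example above.  The helper below is
   a hypothetical sketch; it assumes that backslash-escaping of quotes and
   backslashes inside a quoted value is acceptable to Tor, which you should
   confirm against control-spec.txt.

```python
def format_setconf(settings):
    """Build one SETCONF line from a list of (key, value) pairs."""
    parts = []
    for key, value in settings:
        if "\r" in value or "\n" in value:
            raise ValueError("values must fit on one line")
        if any(c in value for c in ' \t"\\'):
            # Quote the value, escaping backslashes and double quotes.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            value = '"%s"' % escaped
        parts.append("%s=%s" % (key, value))
    return "SETCONF " + " ".join(parts)
```

   For instance, format_setconf([("bandwidthrate", "1 MB")]) produces the
   third command line shown above.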

2.2.1. Using order-sensitive configuration variables

   XXXX

2.3. Getting status information

   Tor exposes other status information beyond what is set in configuration
   options.  You can access this information with the "getInfo" method.
   In Java:

       // get a single value.
       String version = conn.getInfo("version");
       // get several values
       Map vals = conn.getInfo(Arrays.asList(new String[]{
          "addr-mappings/config", "version"}));

   In Python:

       # Get a single value
       version = conn.get_info("version")
       # Get several values
       vals = conn.get_info(["addr-mappings/config", "version"])

   Using the v1 control interface directly:

       GETINFO version
       GETINFO addr-mappings/config version

   For a complete list of recognized keys, see "control-spec.txt".
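
   When you talk to Tor directly, the answers come back as reply lines.  The
   sketch below assumes the v1 reply shapes described in control-spec.txt:
   "250-key=value" for a one-line value, "250+key=" followed by data lines
   and a lone "." for a multi-line value, and "250 OK" at the end.  The
   function name is hypothetical.

```python
def parse_getinfo_reply(lines):
    """Turn the lines of a GETINFO reply into a dict of key -> value.

    `lines` is a list of strings with the trailing CRLF already removed.
    """
    result = {}
    it = iter(lines)
    for line in it:
        code, kind, rest = line[:3], line[3:4], line[4:]
        if code != "250":
            raise RuntimeError("GETINFO failed: " + line)
        if kind == " ":                 # final "250 OK" line
            return result
        key, _, value = rest.partition("=")
        if kind == "+":                 # multi-line value, ends at "."
            data = []
            for body in it:
                if body == ".":
                    break
                # A leading "." on a data line is escaped as "..".
                data.append(body[1:] if body.startswith(".") else body)
            value = "\n".join(data)
        result[key] = value
    raise ValueError("reply ended without a final '250 ' line")
```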

2.4. Signals

   You can send named "signals" to the Tor process to have it perform
   certain recognized actions.  For example, the "RELOAD" signal makes Tor
   reload its configuration file.  (If you're used to Unix platforms, this
   has the same effect as sending a HUP to the Tor process.)

   In Java:

       conn.signal("RELOAD");

   In Python:

       conn.signal("RELOAD")

   Using the v1 control protocol:

       SIGNAL RELOAD

   The recognized signal names are:
       "RELOAD" -- Reload configuration information
       "SHUTDOWN" -- Start a clean shutdown of the Tor process
       "DUMP" -- Write current statistics to the logs
       "DEBUG" -- Switch the logs to debugging verbosity
       "HALT" -- Stop the Tor process immediately.

   (See control-spec.txt for an up-to-date list.)
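
   For readers coming from Unix, each name corresponds roughly to a
   conventional process signal.  The pairing recorded below is an assumption
   based on control-spec.txt; verify it against the version you target.  The
   names SIGNAL_EQUIVALENTS and signal_command are invented for this sketch,
   which also rejects unknown names before building the command line.

```python
# Control-protocol signal name -> roughly equivalent Unix signal.
SIGNAL_EQUIVALENTS = {
    "RELOAD": "HUP",
    "SHUTDOWN": "INT",
    "DUMP": "USR1",
    "DEBUG": "USR2",
    "HALT": "TERM",
}


def signal_command(name):
    """Return the v1 command line (with CRLF) for a recognized signal."""
    name = name.upper()
    if name not in SIGNAL_EQUIVALENTS:
        raise ValueError("unrecognized signal name: %r" % name)
    return "SIGNAL %s\r\n" % name
```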

2.5. Listening for events

   Tor can tell you when certain events happen.  To learn about these events,
   first you need to give the control connection an "EventHandler" object to
   receive the events of interest.  Then, you tell the Tor process which
   events it should send you.

   These examples intercept and display log messages.  In Java:

       import net.freehaven.tor.control.NullEventHandler;
       import net.freehaven.tor.control.EventHandler;
       // We extend NullEventHandler so that we don't need to provide empty
       // implementations for all the events we don't care about.
       // ...
       EventHandler eh = new NullEventHandler() {
          public void message(String severity, String msg) {
            System.out.println("["+severity+"] "+msg);
          }
       };
       conn.setEventHandler(eh);
       conn.setEvents(Arrays.asList(new String[]{
          "INFO", "NOTICE", "WARN", "ERR"}));

   In Python:

       class LogHandler:
           def msg(self, severity, message):
               print("[%s] %s" % (severity, message))
       conn.set_event_handler(LogHandler())
       conn.set_events(["INFO", "NOTICE", "WARN", "ERR"])

   Using the v1 protocol (see x.x for information on parsing the results):

       SETEVENTS INFO NOTICE WARN ERR

2.5.1. Kinds of events

    The following event types are currently recognized:

      CIRC: The status of a circuit has changed.
        These events include an ID string to identify the circuit, the new
        status of the circuit, and a list of the routers in the circuit's
        current path.  The possible status values are:
          LAUNCHED -- the circuit has just been started; no work has been
            done yet to build it.
          EXTENDED -- the circuit has just been extended a single step.
          BUILT -- the circuit is finished.
          FAILED -- the circuit could not be built, and has been abandoned.
          CLOSED -- a successfully built circuit is now closed.

      STREAM: The status of an application stream has changed.
        These events include a string to identify the stream, the new status
        of the stream, the ID of the circuit (if any) that the stream is
        using, and the destination of the stream.  Recognized status values
        are:
          NEW -- an application has asked for an anonymous connection
          NEWRESOLVED -- an application has asked for an anonymous hostname
              lookup
          SENTCONNECT -- the stream has been attached to a circuit, and we
              have sent a connection request down the circuit
          SENTRESOLVE -- the stream has been attached to a circuit, and we
              have sent a lookup request down the circuit
          SUCCEEDED -- the stream has been connected, or the lookup request
              has been answered
          FAILED -- the stream failed and cannot be retried
          CLOSED -- the stream closed normally
          DETACHED -- the stream was detached from its circuit, but could be
              reattached to another.

      ORCONN: The status of a connection to an OR has changed.
        These events include a string to identify the OR, and the status of
        the connection.  Current status values are:
          LAUNCHED -- we have started a connection to the OR
          CONNECTED -- we are successfully connected to the OR
          FAILED -- we could not successfully connect to the OR
          CLOSED -- an existing connection to the OR has been closed.

      BW: Amount of bandwidth used in the last second.
        These events include the number of bytes read, and the number of
        bytes written.

      INFO, NOTICE, WARN, ERR: Tor has logged a message.
        These events include the severity of the message, and its textual
        content.

      NEWDESC: A new server descriptor has been received.
        These events include a list of IDs for the servers whose descriptors
        have changed.

      ADDRMAP: Tor has added a new address mapping.
        These events include the address mapped, its new value, and the time
        when the mapping will expire.

   (See control-spec.txt for an up-to-date list.)
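
    To make the event descriptions concrete, here is a sketch of parsing two
    kinds of asynchronous event lines.  It assumes the v1 line formats
    "650 CIRC <id> <status> <path>" (path as a comma-separated list of
    routers) and "650 BW <read> <written>".  These layouts are assumptions
    to check against control-spec.txt, and parse_event is a hypothetical
    helper.

```python
def parse_event(line):
    """Parse one "650 ..." event line into (event_type, fields)."""
    parts = line.split(" ")
    if len(parts) < 2 or parts[0] != "650":
        raise ValueError("not an event line: %r" % line)
    kind, args = parts[1], parts[2:]
    if kind == "CIRC" and len(args) >= 2:
        path = args[2].split(",") if len(args) > 2 and args[2] else []
        return kind, {"id": args[0], "status": args[1], "path": path}
    if kind == "BW" and len(args) == 2:
        return kind, {"read": int(args[0]), "written": int(args[1])}
    # Log messages and anything unrecognized: keep the rest as raw text.
    return kind, {"text": " ".join(args)}
```

    A controller could dispatch on the returned event type to the matching
    handler method.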

2.5.2. Threading issues

    In the Python and Java control libraries, responses from the Tor
    controller are handled in a separate thread of execution.  Ordinarily,
    this thread is a "daemon thread" that exits when your other threads are
    finished.  This could be a problem if you want your main thread to stop,
    and have the rest of your program's functionality handled by events from
    the Tor control interface.  To make the controller thread stay alive when
    your other threads are finished, call the controller's "launch thread"
    method after you create the controller, and before you call the
    authenticate method.

    In Java:

        conn.launchThread(false);  // Not in daemon mode

    In Python:

        conn.launch_thread(daemon=0)  # Not in daemon mode
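
    The daemon/non-daemon distinction comes from the language runtime, not
    from Tor.  The standalone sketch below uses only Python's threading and
    queue modules to show it; start_reader is a hypothetical stand-in for a
    library's reader thread, not its actual implementation.

```python
import queue
import threading


def start_reader(events, handle, daemon=True):
    """Start a thread that passes queued items to handle() until None."""
    def loop():
        while True:
            item = events.get()
            if item is None:      # sentinel: stop the reader
                return
            handle(item)

    thread = threading.Thread(target=loop, daemon=daemon)
    thread.start()
    return thread
```

    With daemon=False, the interpreter keeps running until the reader thread
    returns, even after the main thread has finished.  With daemon=True, the
    reader is abandoned as soon as all non-daemon threads are done.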

2.6. Overriding directory functionality

   You can tell Tor about new server descriptors.  (Ordinarily, it learns
   about these from the directory server.)  In Java:

       // Get a descriptor from some source
       String desc = ...;
       // Tell Tor about it
       conn.postDescriptor(desc);

   In Python:

       # Get a descriptor from some source
       desc = ...
       # Tell Tor about it
       conn.post_descriptor(desc)
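
   Over the v1 protocol, a descriptor spans several lines, so it would be
   sent as a multi-line command.  The sketch below assumes the data encoding
   from control-spec.txt: a "+" before the keyword, CRLF line endings, an
   extra "." before any line that starts with ".", and a lone "." to finish.
   encode_multiline_command is a hypothetical helper.

```python
def encode_multiline_command(keyword, body):
    """Encode `body` as the data section of a "+KEYWORD" v1 command."""
    out = ["+" + keyword]
    for line in body.splitlines():
        # Escape lines starting with "." so they are not read as the end.
        out.append("." + line if line.startswith(".") else line)
    out.append(".")
    return ("\r\n".join(out) + "\r\n").encode("ascii")
```

   The bytes returned by encode_multiline_command("POSTDESCRIPTOR", desc)
   could then be written to the control connection.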

   [XXXX We need some way to adjust server status, and to tell tor not to
   download directories/network-status, and a way to force a download.]

2.7. Mapping addresses

   Sometimes it is desirable to map one address to another, so that a
   connection request to address A will result in a connection to address
   B.  For example, suppose you are writing an anonymized DNS resolver.  While
   you can ask Tor to resolve 

   [XXXX It would be nice to request address lookups from the controller
   without using SOCKS.]

2.7. Man

Mapaddress

extendcircuit, attachstream, closestream, redirectstream,
     closecircuit

x.x. Authentication and security

x.x. Getting started with the v1 control protocol

References:
 [1] http://archive.socks.permeo.com/protocol/socks4.protocol
 [2] http://archive.socks.permeo.com/protocol/socks4a.protocol
 [3] SOCKS5: RFC1928