[tor-dev] Design for an exit relay scanner: feedback appreciated

Damian Johnson atagar at torproject.org
Mon Dec 2 01:09:55 UTC 2013


> I now have similar code which is based on stem:
> https://github.com/NullHypothesis/exitmap

Hi Philipp, sorry about the long delay before responding. ExitMap looks great!

You might want to look into PEP8 [1], Python's de-facto style guide.
It's certainly up to you which bits you do/don't like, but coming
close will make your code more uniform with the rest of the Python
world. PyPI has a slick pep8 script you can run over your codebase
[2]. Personally I run this as part of the tests for Stem.

Would you be amenable to changes in that regard from me? This looks
like a fun project, so I'm a bit tempted to sink a weekend or two into
into seeing if I can simplify the codebase.

Some general code review feedback follows...

========================================
exitmap / circuit.py
========================================

> new = Circuit

Not especially pythonic, but works.

========================================
exitmap / circuitpool.py
========================================

> while (circuitID is None):

Parentheses aren't necessary.

> if len(self.exitRelays) == 0:

Can also be 'if not self.exitRelays:'.

> try:
>   circuitID = self.ctrl.new_circuit([const.FIRST_HOP, exitRelay],
>                                     await_build=awaitBuild)
> except (stem.InvalidRequest, stem.CircuitExtensionFailed,
>         stem.ControllerError) as error:

Stem's InvalidRequest and CircuitExtensionFailed extend
ControllerError, so this can be simplified to...

except stem.ControllerError as error:

Stem's exception hierarchy can be found on...

https://stem.torproject.org/api/control.html#exceptions-and-attribute-enums

Your _addCircuit() is soley used by _fillPool(), so personally I'd
probably do this as a generator...

def _generate_circuit(self, await_build):
  """
  Pops the top relay off our exitRelays and generates a circuit through it,
  going on to the next entry if it fails.

  :param bool await_build: block until the circuit is created if **True**

  :returns: :class:`~circuit.Circuit` for the circuit we establish
  """

  while self.exitRelays:
    exit_fingerprint = self.exitRelays.pop(0)

    logger.debug("Attempting to create circuit with '%s' as exit " \
                 "relay." % exit_fingerprint)

    try:
      circuit_id = self.ctrl.new_circuit(
        [const.FIRST_HOP, exit_fingerprint],
        await_build = await_build,
      )

      logger.debug("Created circuit #%s with '%s' as exit relay." %
                   (circuit_id, exit_fingerprint))

      yield circuit.Circuit(circuit_id)
    except stem.ControllerError as exc:
      logger.warning("Could not establish circuit with '%s'. " \
                     "Skipping to next exit (error=%s)." %
                     (exit_fingerprint, exc))

  logger.warning("No more exit relay fingerprints to create circuits with.")

def _fill_pool(self, await_build = False):
  if len(self.pool) == const.CIRCUIT_POOL_SIZE:
    return

  logger.debug("Attempting to refill the circuit pool to size %d." %
               const.CIRCUIT_POOL_SIZE)

  while self.exitRelays and len(self.pool) != const.CIRCUIT_POOL_SIZE:
    self.pool.append(self._generate_circuit(awaitBuild))

> # go over circuit pool once
> poolLen = len(self.pool)
> for idx in xrange(poolLen):
>
>     if idx >= len(self.pool):
>         return None
>
>     logger.debug("idx=%d, poolsize=%d" % (idx, len(self.pool)))
>
>     circuit = self.pool[idx] # that's a Circuit() object

Honestly I'm finding this class to be overcomplicated. Popping and
appending items between lists is making this more confusing than it
needs to be.

========================================
exitmap / command.py
========================================

Stem offers a friendlier method of calling commands. Mostly I intended
it for the system module's functions, but you might find it useful
here too...

https://stem.torproject.org/api/util/system.html#stem.util.system.call

========================================
exitmap / exitselector.py
========================================

> for desc in stem.descriptor.parse_file(open(consensus)):

Why read the consensus file directly? If you have a controller then
getting it via tor would be the best option. If not then fetching this
directly via the authorities is generally the easiest...

https://stem.torproject.org/api/descriptor/remote.html

> if not "Exit" in desc.flags:
>   continue

Perfectly fine, though stem does offer enums...

if stem.Flag.EXIT not in desc.flags:
  continue

> for (ip, port) in hosts:
>   if not desc.exit_policy.can_exit_to(ip, port):
>     continue

I don't think this'll actually work. The 'continue' will be for the
iteration over hosts.

========================================
exitmap / scanner.py
========================================

> stem.connection.authenticate_none(torCtrl)

There is no point in doing this unless you *only* want it to work
without authentication. If you opt for authenticate() instead then
this will also work for cookie auth.

Cheers! -Damian

[1] http://www.python.org/dev/peps/pep-0008/
[2] https://pypi.python.org/pypi/pep8


More information about the tor-dev mailing list