[tor-commits] [stem/master] Module for Tor manual information

atagar at torproject.org atagar at torproject.org
Sun Dec 6 21:57:11 UTC 2015


commit 53accd666eeceae94ab3ee7f54236c624fcfe174
Author: Damian Johnson <atagar at torproject.org>
Date:   Sat Nov 21 11:41:32 2015 -0800

    Module for Tor manual information
    
    Nyx, Erebus, and Stem's interpreter all have use for human readable information
    about tor's configuration options. Nyx has long parsed the man page for this
    but like most of Nyx it was pretty cobbled together. No tests, poor
    performance, and not very general purpose.
    
      https://trac.torproject.org/projects/tor/ticket/8251
    
    Moving the useful bits of Nyx over and expanding it to be an order of magnitude
    faster (by calling 'man -P cat' to avoid its pager) and have a great complement
    of tests.
    
    Note that the tests presently expect the tor man page in our test directory but
    it doesn't exist. I'll be addressing this in an upcoming commit, but I don't
    want to add the man page to our git history (no need to artificially inflate
    our repository size).
---
 docs/api.rst                 |    1 +
 docs/api/manual.rst          |    5 +
 docs/change_log.rst          |    1 +
 docs/contents.rst            |    1 +
 setup.py                     |    2 +-
 stem/manual.cfg              |  264 +++++++++++++++++++++++++++++++++++++
 stem/manual.py               |  298 ++++++++++++++++++++++++++++++++++++++++++
 test/settings.cfg            |    2 +
 test/unit/manual.py          |  183 ++++++++++++++++++++++++++
 test/unit/tor.1_with_unknown |   51 ++++++++
 10 files changed, 807 insertions(+), 1 deletion(-)
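
Reader's note (not part of the patch): the parsing approach described above,
splitting 'man -P cat' output on unindented header lines and then undoing the
line wrapping, can be sketched standalone. The names split_sections and
join_wrapped and the sample lines are hypothetical; this only approximates
what _get_categories and _join_lines in stem/manual.py do and is not their
actual implementation.

```python
# Hypothetical, simplified illustration; not stem's actual implementation.

# Sample lines shaped like pager-free man output (assumed for illustration).
SAMPLE = [
  'NAME',
  '       tor - The second-generation onion router',
  '',
  'DESCRIPTION',
  '       Tor is a connection-oriented anonymizing',
  '       communication service.',
  '',
  '       Users choose a path through a set of nodes.',
  '',
]


def split_sections(lines, indent = 7):
  """Maps each unindented header to the lines that follow it."""

  sections, header = {}, None

  for line in lines:
    if line and not line.startswith(' '):
      header = line.strip()
      sections[header] = []
    elif header is not None:
      # section bodies carry a fixed indentation that we drop
      sections[header].append(line[indent:] if line.startswith(' ' * indent) else line)

  # sections end with an extra empty line, so trim trailing blanks
  for body in sections.values():
    while body and body[-1] == '':
      body.pop()

  return sections


def join_wrapped(lines):
  """Joins wrapped lines with spaces, keeping blank lines as paragraph breaks."""

  paragraphs, current = [], []

  for line in lines:
    if line:
      current.append(line)
    elif current:
      paragraphs.append(' '.join(current))
      current = []

  if current:
    paragraphs.append(' '.join(current))

  return '\n\n'.join(paragraphs)


sections = split_sections(SAMPLE)
print(join_wrapped(sections['DESCRIPTION']))
```

Guarding against empty section bodies before inspecting the last line keeps
the sketch safe when one header directly follows another.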

diff --git a/docs/api.rst b/docs/api.rst
index 85cfdcb..208450b 100644
--- a/docs/api.rst
+++ b/docs/api.rst
@@ -15,6 +15,7 @@ Controller
 * **Types**
 
  * `stem.exit_policy <api/exit_policy.html>`_ - Relay policy for the destinations it will or won't allow traffic to.
+ * `stem.manual <api/manual.html>`_ - Information available about Tor from `its manual <https://www.torproject.org/docs/tor-manual.html.en>`_.
  * `stem.version <api/version.html>`_ - Tor versions that can be compared to determine Tor's capabilities.
 
 Descriptors
diff --git a/docs/api/manual.rst b/docs/api/manual.rst
new file mode 100644
index 0000000..7692673
--- /dev/null
+++ b/docs/api/manual.rst
@@ -0,0 +1,5 @@
+Manual
+======
+
+.. automodule:: stem.manual
+
diff --git a/docs/change_log.rst b/docs/change_log.rst
index 70bb05e..080bc54 100644
--- a/docs/change_log.rst
+++ b/docs/change_log.rst
@@ -45,6 +45,7 @@ The following are only available within Stem's `git repository
  * **Controller**
 
   * Dramatic, `300x performance improvement <https://github.com/DonnchaC/stem/pull/1>`_ for reading from the control port with python 3
+  * Added `stem.manual <api/manual.html>`_, which provides information available about Tor from `its manual <https://www.torproject.org/docs/tor-manual.html.en>`_ (:trac:`8251`)
   * :func:`~stem.connection.connect` and :func:`~stem.control.Controller.from_port` now connect to both port 9051 (relay's default) and 9151 (Tor Browser's default) (:trac:`16075`)
   * Added `support for NETWORK_LIVENESS events <api/response.html#stem.response.events.NetworkLivenessEvent>`_ (:spec:`44aac63`)
   * Added :func:`~stem.control.Controller.is_user_traffic_allowed` to the :class:`~stem.control.Controller`
diff --git a/docs/contents.rst b/docs/contents.rst
index ec41dda..930d4cf 100644
--- a/docs/contents.rst
+++ b/docs/contents.rst
@@ -34,6 +34,7 @@ Contents
    api/response
 
    api/exit_policy
+   api/manual
    api/version
 
    api/descriptor/descriptor
diff --git a/setup.py b/setup.py
index 9784441..002e982 100644
--- a/setup.py
+++ b/setup.py
@@ -17,6 +17,6 @@ distutils.core.setup(
   packages = ['stem', 'stem.descriptor', 'stem.interpreter', 'stem.response', 'stem.util'],
   keywords = 'tor onion controller',
   scripts = ['tor-prompt'],
-  package_data = {'stem.interpreter': ['settings.cfg'], 'stem.util': ['ports.cfg']},
+  package_data = {'stem': ['manual.cfg'], 'stem.interpreter': ['settings.cfg'], 'stem.util': ['ports.cfg']},
 )
 
diff --git a/stem/manual.cfg b/stem/manual.cfg
new file mode 100644
index 0000000..f7a29c0
--- /dev/null
+++ b/stem/manual.cfg
@@ -0,0 +1,264 @@
+################################################################################
+#
+# Information related to tor configuration options...
+#
+#   * manual.important   Most commonly used configuration options.
+#   * manual.summary     Short summary describing the option.
+#
+################################################################################
+
+manual.important BandwidthRate
+manual.important BandwidthBurst
+manual.important RelayBandwidthRate
+manual.important RelayBandwidthBurst
+manual.important ControlPort
+manual.important HashedControlPassword
+manual.important CookieAuthentication
+manual.important DataDirectory
+manual.important Log
+manual.important RunAsDaemon
+manual.important User
+
+manual.important Bridge
+manual.important ExcludeNodes
+manual.important MaxCircuitDirtiness
+manual.important SocksPort
+manual.important UseBridges
+
+manual.important BridgeRelay
+manual.important ContactInfo
+manual.important ExitPolicy
+manual.important MyFamily
+manual.important Nickname
+manual.important ORPort
+manual.important PortForwarding
+manual.important AccountingMax
+manual.important AccountingStart
+
+manual.important DirPortFrontPage
+manual.important DirPort
+
+manual.important HiddenServiceDir
+manual.important HiddenServicePort
+
+# General Config Options
+
+manual.summary.BandwidthRate Average bandwidth usage limit
+manual.summary.BandwidthBurst Maximum bandwidth usage limit
+manual.summary.MaxAdvertisedBandwidth Limit for the bandwidth we advertise as being available for relaying
+manual.summary.RelayBandwidthRate Average bandwidth usage limit for relaying
+manual.summary.RelayBandwidthBurst Maximum bandwidth usage limit for relaying
+manual.summary.PerConnBWRate Average relayed bandwidth limit per connection
+manual.summary.PerConnBWBurst Maximum relayed bandwidth limit per connection
+manual.summary.ConnLimit Minimum number of file descriptors for Tor to start
+manual.summary.ConstrainedSockets Shrinks sockets to ConstrainedSockSize
+manual.summary.ConstrainedSockSize Limit for the received and transmit buffers of sockets
+manual.summary.ControlPort Port providing access to tor controllers (nyx, vidalia, etc)
+manual.summary.ControlListenAddress Address providing controller access
+manual.summary.ControlSocket Socket providing controller access
+manual.summary.HashedControlPassword Hash of the password for authenticating to the control port
+manual.summary.CookieAuthentication If set, authenticates controllers via a cookie
+manual.summary.CookieAuthFile Location of the authentication cookie
+manual.summary.CookieAuthFileGroupReadable Group read permissions for the authentication cookie
+manual.summary.ControlPortWriteToFile Path for a file tor writes containing its control port
+manual.summary.ControlPortFileGroupReadable Group read permissions for the control port file
+manual.summary.DataDirectory Location for storing runtime data (state, keys, etc)
+manual.summary.DirServer Alternative directory authorities
+manual.summary.AlternateDirAuthority Alternative directory authorities (consensus only)
+manual.summary.AlternateHSAuthority Alternative directory authorities (hidden services only)
+manual.summary.AlternateBridgeAuthority Alternative directory authorities (bridges only)
+manual.summary.DisableAllSwap Locks all allocated memory so it can't be paged out
+manual.summary.FetchDirInfoEarly Keeps consensus information up to date, even if unnecessary
+manual.summary.FetchDirInfoExtraEarly Updates consensus information when it's first available
+manual.summary.FetchHidServDescriptors Toggles if hidden service descriptors are fetched automatically or not
+manual.summary.FetchServerDescriptors Toggles if the consensus is fetched automatically or not
+manual.summary.FetchUselessDescriptors Toggles if relay descriptors are fetched when they aren't strictly necessary
+manual.summary.Group GID for the process when started
+manual.summary.HttpProxy HTTP proxy for connecting to tor
+manual.summary.HttpProxyAuthenticator Authentication credentials for HttpProxy
+manual.summary.HttpsProxy SSL proxy for connecting to tor
+manual.summary.HttpsProxyAuthenticator Authentication credentials for HttpsProxy
+manual.summary.Socks4Proxy SOCKS 4 proxy for connecting to tor
+manual.summary.Socks5Proxy SOCKS 5 proxy for connecting to tor
+manual.summary.Socks5ProxyUsername Username for connecting to the Socks5Proxy
+manual.summary.Socks5ProxyPassword Password for connecting to the Socks5Proxy
+manual.summary.KeepalivePeriod Rate at which to send keepalive packets
+manual.summary.Log Runlevels and location for tor logging
+manual.summary.LogMessageDomains Includes a domain when logging messages
+manual.summary.OutboundBindAddress Sets the IP used for connecting to tor
+manual.summary.PidFile Path for a file tor writes containing its process id
+manual.summary.ProtocolWarnings Toggles if protocol errors give warnings or not
+manual.summary.RunAsDaemon Toggles if tor runs as a daemon process
+manual.summary.LogTimeGranularity Limits granularity of log message timestamps
+manual.summary.SafeLogging Toggles if logs are scrubbed of sensitive information
+manual.summary.User UID for the process when started
+manual.summary.HardwareAccel Toggles if tor attempts to use hardware acceleration
+manual.summary.AccelName OpenSSL engine name for crypto acceleration
+manual.summary.AccelDir Crypto acceleration library path
+manual.summary.AvoidDiskWrites Toggles if tor avoids frequently writing to disk
+manual.summary.TunnelDirConns Toggles if directory requests can be made over the ORPort
+manual.summary.PreferTunneledDirConns Avoids directory requests that can't be made over the ORPort if set
+manual.summary.CircuitPriorityHalflife Overwrite method for prioritizing traffic among relayed connections
+manual.summary.DisableIOCP Disables use of the Windows IOCP networking API
+manual.summary.CountPrivateBandwidth Applies rate limiting to private IP addresses
+
+# Client Config Options
+
+manual.summary.AllowInvalidNodes Permits use of relays flagged as invalid by authorities
+manual.summary.ExcludeSingleHopRelays Excludes relays that allow single hop connections
+manual.summary.Bridge Available bridges
+manual.summary.LearnCircuitBuildTimeout Toggles adaptive timeouts for circuit creation
+manual.summary.CircuitBuildTimeout Initial timeout for circuit creation
+manual.summary.CircuitIdleTimeout Timeout for closing circuits that have never been used
+manual.summary.CircuitStreamTimeout Timeout for shifting streams among circuits
+manual.summary.ClientOnly Ensures that we aren't used as a relay or directory mirror
+manual.summary.ExcludeNodes Relays or locales never to be used in circuits
+manual.summary.ExcludeExitNodes Relays or locales never to be used for exits
+manual.summary.ExitNodes Preferred final hop for circuits
+manual.summary.EntryNodes Preferred first hops for circuits
+manual.summary.StrictNodes Never uses nodes outside of Entry/ExitNodes
+manual.summary.FascistFirewall Only make outbound connections on FirewallPorts
+manual.summary.FirewallPorts Ports used by FascistFirewall
+manual.summary.HidServAuth Authentication credentials for connecting to a hidden service
+manual.summary.ReachableAddresses Rules for bypassing the local firewall
+manual.summary.ReachableDirAddresses Rules for bypassing the local firewall (directory fetches)
+manual.summary.ReachableORAddresses Rules for bypassing the local firewall (OR connections)
+manual.summary.LongLivedPorts Ports requiring highly reliable relays
+manual.summary.MapAddress Alias mappings for address requests
+manual.summary.NewCircuitPeriod Period for considering the creation of new circuits
+manual.summary.MaxCircuitDirtiness Duration for reusing constructed circuits
+manual.summary.NodeFamily Define relays as belonging to a family
+manual.summary.EnforceDistinctSubnets Prevent use of multiple relays from the same subnet on a circuit
+manual.summary.SocksPort Port for using tor as a Socks proxy
+manual.summary.SocksListenAddress Address from which Socks connections can be made
+manual.summary.SocksPolicy Access policy for the socks port
+manual.summary.SocksTimeout Time until idle or unestablished socks connections are closed
+manual.summary.TrackHostExits Maintains use of the same exit whenever connecting to this destination
+manual.summary.TrackHostExitsExpire Time until use of an exit for tracking expires
+manual.summary.UpdateBridgesFromAuthority Toggles fetching bridge descriptors from the authorities
+manual.summary.UseBridges Make use of configured bridges
+manual.summary.UseEntryGuards Use guard relays for first hop
+manual.summary.NumEntryGuards Pool size of guard relays we'll select from
+manual.summary.SafeSocks Toggles rejecting unsafe variants of the socks protocol
+manual.summary.TestSocks Provide notices for if socks connections are of the safe or unsafe variants
+manual.summary.WarnUnsafeSocks Toggle warning of unsafe socks connection
+manual.summary.VirtualAddrNetwork Address range used with MAPADDRESS
+manual.summary.AllowNonRFC953Hostnames Toggles blocking invalid characters in hostname resolution
+manual.summary.AllowDotExit Toggles allowing exit notation in addresses
+manual.summary.FastFirstHopPK Toggle public key usage for the first hop
+manual.summary.TransPort Port for transparent proxying if the OS supports it
+manual.summary.TransListenAddress Address from which transparent proxy connections can be made
+manual.summary.NATDPort Port for forwarding ipfw NATD connections
+manual.summary.NATDListenAddress Address from which NATD forwarded connections can be made
+manual.summary.AutomapHostsOnResolve Map addresses ending with special suffixes to virtual addresses
+manual.summary.AutomapHostsSuffixes Address suffixes recognized by AutomapHostsOnResolve
+manual.summary.DNSPort Port from which DNS responses are fetched instead of tor
+manual.summary.DNSListenAddress Address for performing DNS resolution
+manual.summary.ClientDNSRejectInternalAddresses Ignores DNS responses for internal addresses
+manual.summary.ClientRejectInternalAddresses Disables use of Tor for internal connections
+manual.summary.DownloadExtraInfo Toggles fetching of extra information about relays
+manual.summary.FallbackNetworkstatusFile Path for a fallback cache of the consensus
+manual.summary.WarnPlaintextPorts Toggles warnings for using risky ports
+manual.summary.RejectPlaintextPorts Prevents connections on risky ports
+manual.summary.AllowSingleHopCircuits Makes use of single hop exits if able
+
+# Server Config Options
+
+manual.summary.Address Overwrites address others will use to reach this relay
+manual.summary.AllowSingleHopExits Toggles permitting use of this relay as a single hop proxy
+manual.summary.AssumeReachable Skips reachability test at startup
+manual.summary.BridgeRelay Act as a bridge
+manual.summary.ContactInfo Contact information for this relay
+manual.summary.ExitPolicy Traffic destinations that can exit from this relay
+manual.summary.ExitPolicyRejectPrivate Prevents exiting connections on the local network
+manual.summary.MaxOnionsPending Decryption queue size
+manual.summary.MyFamily Other relays this operator administers
+manual.summary.Nickname Identifier for this relay
+manual.summary.NumCPUs Number of processes spawned for decryption
+manual.summary.ORPort Port used to accept relay traffic
+manual.summary.ORListenAddress Address for relay connections
+manual.summary.PortForwarding Use UPnP or NAT-PMP if needed to relay
+manual.summary.PortForwardingHelper Executable for configuring port forwarding
+manual.summary.PublishServerDescriptor Types of descriptors published
+manual.summary.ShutdownWaitLength Delay before quitting after receiving a SIGINT signal
+manual.summary.HeartbeatPeriod Rate at which an INFO level heartbeat message is sent
+manual.summary.AccountingMax Amount of traffic before hibernating
+manual.summary.AccountingStart Duration of an accounting period
+manual.summary.RefuseUnknownExits Prevents relays not in the consensus from using us as an exit
+manual.summary.ServerDNSResolvConfFile Overriding resolver config for DNS queries we provide
+manual.summary.ServerDNSAllowBrokenConfig Toggles if we persist despite configuration parsing errors or not
+manual.summary.ServerDNSSearchDomains Toggles if our DNS queries search for addresses in the local domain
+manual.summary.ServerDNSDetectHijacking Toggles testing for DNS hijacking
+manual.summary.ServerDNSTestAddresses Addresses to test to see if valid DNS queries are being hijacked
+manual.summary.ServerDNSAllowNonRFC953Hostnames Toggles if we reject DNS queries with invalid characters
+manual.summary.BridgeRecordUsageByCountry Tracks geoip information on bridge usage
+manual.summary.ServerDNSRandomizeCase Toggles DNS query case randomization
+manual.summary.GeoIPFile Path to file containing geoip information
+manual.summary.CellStatistics Toggles storing circuit queue duration to disk
+manual.summary.DirReqStatistics Toggles storing network status counts and performance to disk
+manual.summary.EntryStatistics Toggles storing client connection counts to disk
+manual.summary.ExitPortStatistics Toggles storing traffic and port usage data to disk
+manual.summary.ConnDirectionStatistics Toggles storing connection use to disk
+manual.summary.ExtraInfoStatistics Publishes statistic data in the extra-info documents
+
+# Directory Server Options
+
+manual.summary.AuthoritativeDirectory Act as a directory authority
+manual.summary.DirPortFrontPage Publish this html file on the DirPort
+manual.summary.V1AuthoritativeDirectory Generates a version 1 consensus
+manual.summary.V2AuthoritativeDirectory Generates a version 2 consensus
+manual.summary.V3AuthoritativeDirectory Generates a version 3 consensus
+manual.summary.VersioningAuthoritativeDirectory Provides opinions on recommended versions of tor
+manual.summary.NamingAuthoritativeDirectory Provides opinions on fingerprint to nickname bindings
+manual.summary.HSAuthoritativeDir Toggles accepting hidden service descriptors
+manual.summary.HidServDirectoryV2 Toggles accepting version 2 hidden service descriptors
+manual.summary.BridgeAuthoritativeDir Acts as a bridge authority
+manual.summary.MinUptimeHidServDirectoryV2 Required uptime before accepting hidden service directory
+manual.summary.DirPort Port for directory connections
+manual.summary.DirListenAddress Address the directory service is bound to
+manual.summary.DirPolicy Access policy for the DirPort
+manual.summary.FetchV2Networkstatus Get the obsolete V2 consensus
+
+# Directory Authority Server Options
+
+manual.summary.RecommendedVersions Tor versions believed to be safe
+manual.summary.RecommendedClientVersions Tor versions believed to be safe for clients
+manual.summary.RecommendedServerVersions Tor versions believed to be safe for relays
+manual.summary.ConsensusParams Params entry of the networkstatus vote
+manual.summary.DirAllowPrivateAddresses Toggles allowing arbitrary input or non-public IPs in descriptors
+manual.summary.AuthDirBadDir Relays to be flagged as bad directory caches
+manual.summary.AuthDirBadExit Relays to be flagged as bad exits
+manual.summary.AuthDirInvalid Relays from which the valid flag is withheld
+manual.summary.AuthDirReject Relays to be dropped from the consensus
+manual.summary.AuthDirListBadDirs Toggles if we provide an opinion on bad directory caches
+manual.summary.AuthDirListBadExits Toggles if we provide an opinion on bad exits
+manual.summary.AuthDirRejectUnlisted Rejects further relay descriptors
+manual.summary.AuthDirMaxServersPerAddr Limit on the number of relays accepted per ip
+manual.summary.AuthDirMaxServersPerAuthAddr Limit on the number of relays accepted per an authority's ip
+manual.summary.BridgePassword Password for requesting bridge information
+manual.summary.V3AuthVotingInterval Consensus voting interval
+manual.summary.V3AuthVoteDelay Wait time to collect votes of other authorities
+manual.summary.V3AuthDistDelay Wait time to collect the signatures of other authorities
+manual.summary.V3AuthNIntervalsValid Number of voting intervals a consensus is valid for
+manual.summary.V3BandwidthsFile Path to a file containing measured relay bandwidths
+manual.summary.V3AuthUseLegacyKey Signs consensus with both the current and legacy keys
+manual.summary.RephistTrackTime Discards old, unchanged reliability information
+
+# Hidden Service Options
+
+manual.summary.HiddenServiceDir Directory contents for the hidden service
+manual.summary.HiddenServicePort Port the hidden service is provided on
+manual.summary.PublishHidServDescriptors Toggles automated publishing of the hidden service to the rendezvous directory
+manual.summary.HiddenServiceVersion Version for published hidden service descriptors
+manual.summary.HiddenServiceAuthorizeClient Restricts access to the hidden service
+manual.summary.RendPostPeriod Period at which the rendezvous service descriptors are refreshed
+
+# Testing Network Options
+
+manual.summary.TestingTorNetwork Overrides other options to be a testing network
+manual.summary.TestingV3AuthInitialVotingInterval Overrides V3AuthVotingInterval for the first consensus
+manual.summary.TestingV3AuthInitialVoteDelay Overrides V3AuthVoteDelay for the first consensus
+manual.summary.TestingV3AuthInitialDistDelay Overrides V3AuthDistDelay for the first consensus
+manual.summary.TestingAuthDirTimeToLearnReachability Delay until opinions are given about which relays are running or not
+manual.summary.TestingEstimatedDescriptorPropagationTime Delay before clients attempt to fetch descriptors from directory caches
+
diff --git a/stem/manual.py b/stem/manual.py
new file mode 100644
index 0000000..52f25e6
--- /dev/null
+++ b/stem/manual.py
@@ -0,0 +1,298 @@
+# Copyright 2015, Damian Johnson and The Tor Project
+# See LICENSE for licensing information
+
+"""
+Provides information available about Tor from `its manual
+<https://www.torproject.org/docs/tor-manual.html.en>`_.
+
+**Module Overview:**
+
+::
+
+  is_important - Indicates if a configuration option is of particularly common importance.
+
+  Manual - Information about Tor available from its manual.
+   +- from_man - Retrieves manual information from its man page.
+
+.. versionadded:: 1.5.0
+"""
+
+import collections
+import os
+
+import stem.prereq
+import stem.util.conf
+import stem.util.enum
+import stem.util.log
+import stem.util.system
+
+try:
+  # added in python 2.7
+  from collections import OrderedDict
+except ImportError:
+  from stem.util.ordereddict import OrderedDict
+
+try:
+  # added in python 3.2
+  from functools import lru_cache
+except ImportError:
+  from stem.util.lru_cache import lru_cache
+
+Category = stem.util.enum.Enum('GENERAL', 'CLIENT', 'RELAY', 'DIRECTORY', 'AUTHORITY', 'HIDDEN_SERVICE', 'TESTING', 'UNKNOWN')
+ConfigOption = collections.namedtuple('ConfigOption', ['category', 'name', 'usage', 'summary', 'description'])
+
+CATEGORY_SECTIONS = {
+  'GENERAL OPTIONS': Category.GENERAL,
+  'CLIENT OPTIONS': Category.CLIENT,
+  'SERVER OPTIONS': Category.RELAY,
+  'DIRECTORY SERVER OPTIONS': Category.DIRECTORY,
+  'DIRECTORY AUTHORITY SERVER OPTIONS': Category.AUTHORITY,
+  'HIDDEN SERVICE OPTIONS': Category.HIDDEN_SERVICE,
+  'TESTING NETWORK OPTIONS': Category.TESTING,
+}
+
+
+@lru_cache()
+def _config():
+  """
+  Provides a dictionary for our manual.cfg. This has a couple of categories...
+
+    * manual.important (list) - list of lowercase configuration options
+      considered to be important
+
+    * manual.summary.* (str) - summary descriptions of config options, key uses
+      the lowercase configuration option
+  """
+
+  config = stem.util.conf.Config()
+  config_path = os.path.join(os.path.dirname(__file__), 'manual.cfg')
+
+  try:
+    config.load(config_path)
+    config_dict = dict([(key.lower(), config.get_value(key)) for key in config.keys()])
+    config_dict['manual.important'] = [name.lower() for name in config.get_value('manual.important', [], multiple = True)]
+    return config_dict
+  except Exception as exc:
+    stem.util.log.warn("BUG: stem failed to load its internal manual information from '%s': %s" % (config_path, exc))
+    return {}
+
+
+def is_important(option):
+  """
+  Indicates if a configuration option is of particularly common importance or not.
+
+  :param str option: tor configuration option to check
+
+  :returns: **bool** that's **True** if this is an important option and
+    **False** otherwise
+  """
+
+  return option.lower() in _config()['manual.important']
+
+
+class Manual(object):
+  """
+  Parsed tor man page. Tor makes no guarantees about its man page format so
+  this may not always be compatible. If not you can use the cached manual
+  information stored with Stem.
+
+  This does not include every bit of information from the tor manual. For
+  instance, I've excluded the 'THE CONFIGURATION FILE FORMAT' section. If
+  there's a part you'd find useful then `file an issue
+  <https://trac.torproject.org/projects/tor/wiki/doc/stem/bugs>`_ and we can
+  add it.
+
+  :var str name: brief description of the tor command
+  :var str synopsis: brief tor command usage
+  :var str description: general description of what tor does
+
+  :var dict commandline_options: mapping of commandline arguments to their description
+  :var dict signals: mapping of signals tor accepts to their description
+  :var dict files: mapping of file paths to their description
+
+  :var dict config_options: **ConfigOption** tuples for tor configuration options
+  """
+
+  def __init__(self, name, synopsis, description, commandline_options, signals, files, config_options):
+    self.name = name
+    self.synopsis = synopsis
+    self.description = description
+    self.commandline_options = commandline_options
+    self.signals = signals
+    self.files = files
+    self.config_options = config_options
+
+  @staticmethod
+  def from_man(man_path = 'tor'):
+    """
+    Reads and parses a given man page.
+
+    :param str man_path: path argument for 'man', for example you might want
+      '/path/to/tor/doc/tor.1' to read from tor's git repository
+    """
+
+    try:
+      man_output = stem.util.system.call('man -P cat %s' % man_path)
+    except OSError as exc:
+      raise IOError("Unable to run 'man -P cat %s': %s" % (man_path, exc))
+
+    categories, config_options = _get_categories(man_output), OrderedDict()
+
+    for category_header, category_enum in CATEGORY_SECTIONS.items():
+      _add_config_options(config_options, category_enum, categories.get(category_header, []))
+
+    for category in categories:
+      if category.endswith(' OPTIONS') and category not in CATEGORY_SECTIONS and category != 'COMMAND-LINE OPTIONS':
+        _add_config_options(config_options, Category.UNKNOWN, categories.get(category, []))
+
+    return Manual(
+      _join_lines(categories.get('NAME', [])),
+      _join_lines(categories.get('SYNOPSIS', [])),
+      _join_lines(categories.get('DESCRIPTION', [])),
+      _get_indented_descriptions(categories.get('COMMAND-LINE OPTIONS', [])),
+      _get_indented_descriptions(categories.get('SIGNALS', [])),
+      _get_indented_descriptions(categories.get('FILES', [])),
+      config_options,
+    )
+
+
+def _get_categories(content):
+  """
+  The man page is headers followed by an indented section. First pass gets
+  the mapping of category titles to their lines.
+  """
+
+  # skip header and footer lines
+
+  if content and 'TOR(1)' in content[0]:
+    content = content[1:]
+
+  if content and 'TOR(1)' in content[-1]:
+    content = content[:-1]
+
+  categories = {}
+  category, lines = None, []
+
+  for line in content:
+    # replace non-ascii characters
+    #
+    #   \u2019 - smart single quote
+    #   \u2014 - extra long dash
+    #   \xb7 - centered dot
+
+    char_for = chr if stem.prereq.is_python_3() else unichr
+    line = line.replace(char_for(0x2019), "'").replace(char_for(0x2014), '-').replace(char_for(0xb7), '*')
+
+    if line and not line.startswith(' '):
+      if category:
+        if lines and lines[-1] == '':
+          lines = lines[:-1]  # sections end with an extra empty line
+
+        categories[category] = lines
+
+      category, lines = line.strip(), []
+    else:
+      if line.startswith('       '):
+        line = line[7:]  # contents of a section have a seven space indentation
+
+      lines.append(line)
+
+  if category:
+    categories[category] = lines
+
+  return categories
+
+
+def _get_indented_descriptions(lines):
+  """
+  Parses the commandline argument and signal sections. These are options
+  followed by an indented description. For example...
+
+  ::
+
+    -f FILE
+        Specify a new configuration file to contain further Tor configuration
+        options OR pass - to make Tor read its configuration from standard
+        input. (Default: /usr/local/etc/tor/torrc, or $HOME/.torrc if that file
+        is not found)
+
+  There can be additional paragraphs not related to any particular argument,
+  but we ignore those.
+  """
+
+  options, last_arg = OrderedDict(), None
+
+  for line in lines:
+    if line and not line.startswith(' '):
+      options[line], last_arg = [], line
+    elif last_arg and line.startswith('    '):
+      options[last_arg].append(line[4:])
+
+  return dict([(arg, ' '.join(desc_lines)) for arg, desc_lines in options.items() if desc_lines])
+
+
+def _add_config_options(config_options, category, lines):
+  """
+  Parses a section of tor configuration options. These have usage information,
+  followed by an indented description. For instance...
+
+  ::
+
+    ConnLimit NUM
+        The minimum number of file descriptors that must be available to the
+        Tor process before it will start. Tor will ask the OS for as many file
+        descriptors as the OS will allow (you can find this by "ulimit -H -n").
+        If this number is less than ConnLimit, then Tor will refuse to start.
+
+
+        You probably don't need to adjust this. It has no effect on Windows
+        since that platform lacks getrlimit(). (Default: 1000)
+  """
+
+  last_option, usage, description = None, None, []
+
+  if lines and lines[0].startswith('The following options'):
+    lines = lines[lines.index(''):]  # drop the initial description
+
+  for line in lines:
+    if line and not line.startswith(' '):
+      if last_option:
+        summary = _config().get('manual.summary.%s' % last_option.lower(), '')
+        config_options[last_option] = ConfigOption(category, last_option, usage, summary, _join_lines(description).strip())
+
+      if ' ' in line:
+        last_option, usage = line.split(' ', 1)
+      else:
+        last_option, usage = line, ''
+
+      description = []
+    else:
+      if line.startswith('    '):
+        line = line[4:]
+
+      description.append(line)
+
+  if last_option:
+    summary = _config().get('manual.summary.%s' % last_option.lower(), '')
+    config_options[last_option] = ConfigOption(category, last_option, usage, summary, _join_lines(description).strip())
+
+
+def _join_lines(lines):
+  """
+  The man page provides line-wrapped content. Attempting to undo that. This is
+  close to a simple join, but we still want empty lines to provide newlines.
+  """
+
+  content = []
+
+  for line in lines:
+    if line:
+      if content and content[-1][-1] != '\n':
+        line = ' ' + line
+
+      content.append(line)
+    else:
+      if content and content[-1][-1] != '\n':
+        content.append('\n\n')
+
+  return ''.join(content)
diff --git a/test/settings.cfg b/test/settings.cfg
index fffbdcf..13fb342 100644
--- a/test/settings.cfg
+++ b/test/settings.cfg
@@ -135,6 +135,7 @@ pyflakes.ignore run_tests.py => 'unittest' imported but unused
 pyflakes.ignore stem/__init__.py => undefined name 'long'
 pyflakes.ignore stem/__init__.py => undefined name 'unicode'
 pyflakes.ignore stem/control.py => undefined name 'controller'
+pyflakes.ignore stem/manual.py => undefined name 'unichr'
 pyflakes.ignore stem/prereq.py => 'RSA' imported but unused
 pyflakes.ignore stem/prereq.py => 'asn1' imported but unused
 pyflakes.ignore stem/prereq.py => 'unittest' imported but unused
@@ -176,6 +177,7 @@ test.unit_tests
 |test.unit.exit_policy.rule.TestExitPolicyRule
 |test.unit.exit_policy.policy.TestExitPolicy
 |test.unit.version.TestVersion
+|test.unit.manual.TestManual
 |test.unit.tutorial.TestTutorial
 |test.unit.tutorial_examples.TestTutorialExamples
 |test.unit.response.add_onion.TestAddOnionResponse
diff --git a/test/unit/manual.py b/test/unit/manual.py
new file mode 100644
index 0000000..c9e682a
--- /dev/null
+++ b/test/unit/manual.py
@@ -0,0 +1,183 @@
+"""
+Unit tests for the stem.manual module. Test data comes from the following...
+
+  * test/unit/tor.1 - Tor version 0.2.8.0-alpha-dev (git-3c6782395743a089)
+"""
+
+import codecs
+import os
+import unittest
+
+import stem.manual
+import stem.util.system
+import test.runner
+
+from stem.manual import Category
+
+try:
+  # added in python 3.2
+  from functools import lru_cache
+except ImportError:
+  from stem.util.lru_cache import lru_cache
+
+TEST_MAN_PAGE = os.path.join(os.path.dirname(__file__), 'tor.1')
+
+EXPECTED_CATEGORIES = set([
+  'NAME',
+  'SYNOPSIS',
+  'DESCRIPTION',
+  'COMMAND-LINE OPTIONS',
+  'THE CONFIGURATION FILE FORMAT',
+  'GENERAL OPTIONS',
+  'CLIENT OPTIONS',
+  'SERVER OPTIONS',
+  'DIRECTORY SERVER OPTIONS',
+  'DIRECTORY AUTHORITY SERVER OPTIONS',
+  'HIDDEN SERVICE OPTIONS',
+  'TESTING NETWORK OPTIONS',
+  'SIGNALS',
+  'FILES',
+  'SEE ALSO',
+  'BUGS',
+  'AUTHORS',
+])
+
+EXPECTED_CLI_OPTIONS = set(['-h, -help', '-f FILE', '--allow-missing-torrc', '--defaults-torrc FILE', '--ignore-missing-torrc', '--hash-password PASSWORD', '--list-fingerprint', '--verify-config', '--service install [--options command-line options]', '--service remove|start|stop', '--nt-service', '--list-torrc-options', '--version', '--quiet|--hush'])
+EXPECTED_SIGNALS = set(['SIGTERM', 'SIGINT', 'SIGHUP', 'SIGUSR1', 'SIGUSR2', 'SIGCHLD', 'SIGPIPE', 'SIGXFSZ'])
+
+EXPECTED_OPTION_COUNTS = {
+  Category.GENERAL: 74,
+  Category.CLIENT: 86,
+  Category.RELAY: 47,
+  Category.DIRECTORY: 5,
+  Category.AUTHORITY: 34,
+  Category.HIDDEN_SERVICE: 11,
+  Category.TESTING: 32,
+  Category.UNKNOWN: 0,
+}
+
+EXPECTED_DESCRIPTION = """
+Tor is a connection-oriented anonymizing communication service. Users choose a source-routed path through a set of nodes, and negotiate a "virtual circuit" through the network, in which each node knows its predecessor and successor, but no others. Traffic flowing down the circuit is unwrapped by a symmetric key at each node, which reveals the downstream node.
+
+Basically, Tor provides a distributed network of servers or relays ("onion routers"). Users bounce their TCP streams - web traffic, ftp, ssh, etc. - around the network, and recipients, observers, and even the relays themselves have difficulty tracking the source of the stream.
+
+By default, tor will only act as a client only. To help the network by providing bandwidth as a relay, change the ORPort configuration option - see below. Please also consult the documentation on the Tor Project's website.
+""".strip()
+
+EXPECTED_FILE_DESCRIPTION = 'Specify a new configuration file to contain further Tor configuration options OR pass - to make Tor read its configuration from standard input. (Default: /usr/local/etc/tor/torrc, or $HOME/.torrc if that file is not found)'
+
+EXPECTED_BANDWIDTH_RATE_DESCRIPTION = 'A token bucket limits the average incoming bandwidth usage on this node to the specified number of bytes per second, and the average outgoing bandwidth usage to that same value. If you want to run a relay in the public network, this needs to be at the very least 30 KBytes (that is, 30720 bytes). (Default: 1 GByte)\n\nWith this option, and in other options that take arguments in bytes, KBytes, and so on, other formats are also supported. Notably, "KBytes" can also be written as "kilobytes" or "kb"; "MBytes" can be written as "megabytes" or "MB"; "kbits" can be written as "kilobits"; and so forth. Tor also accepts "byte" and "bit" in the singular. The prefixes "tera" and "T" are also recognized. If no units are given, we default to bytes. To avoid confusion, we recommend writing "bytes" or "bits" explicitly, since it\'s easy to forget that "B" means bytes, not bits.'
+
+
+@lru_cache()
+def man_content():
+  return stem.util.system.call('man -P cat %s' % TEST_MAN_PAGE)
+
+
+class TestManual(unittest.TestCase):
+  def test_is_important(self):
+    self.assertTrue(stem.manual.is_important('ExitPolicy'))
+    self.assertTrue(stem.manual.is_important('exitpolicy'))
+    self.assertTrue(stem.manual.is_important('EXITPOLICY'))
+
+    self.assertFalse(stem.manual.is_important('ConstrainedSockSize'))
+
+  def test_get_categories(self):
+    if stem.util.system.is_windows():
+      test.runner.skip(self, '(unavailable on windows)')
+      return
+
+    categories = stem.manual._get_categories(man_content())
+    self.assertEqual(EXPECTED_CATEGORIES, set(categories.keys()))
+    self.assertEqual(['tor - The second-generation onion router'], categories['NAME'])
+    self.assertEqual(['tor [OPTION value]...'], categories['SYNOPSIS'])
+    self.assertEqual(8, len(categories['DESCRIPTION']))  # check parsing of multi-line entries
+
+  def test_escapes_non_ascii(self):
+    if stem.util.system.is_windows():
+      test.runner.skip(self, '(unavailable on windows)')
+      return
+
+    def check(content):
+      try:
+        codecs.ascii_encode(content, 'strict')
+      except UnicodeEncodeError as exc:
+        self.fail("Unable to read '%s' as ascii: %s" % (content, exc))
+
+    categories = stem.manual._get_categories(man_content())
+
+    for category, lines in categories.items():
+      check(category)
+
+      for line in lines:
+        check(line)
+
+  def test_has_all_summaries(self):
+    if stem.util.system.is_windows():
+      test.runner.skip(self, '(unavailable on windows)')
+      return
+
+    test.runner.skip(self, 'coming soon!')  # TODO: yup, got a few to fill in...
+
+    manual = stem.manual.Manual.from_man(TEST_MAN_PAGE)
+    missing_summary = []
+
+    for config_option in manual.config_options.values():
+      if not config_option.summary and config_option.category != Category.TESTING:
+        missing_summary.append(config_option.name)
+
+    if missing_summary:
+      self.fail("The following config options are missing summaries: %s" % ', '.join(missing_summary))
+
+  def test_attributes(self):
+    if stem.util.system.is_windows():
+      test.runner.skip(self, '(unavailable on windows)')
+      return
+
+    manual = stem.manual.Manual.from_man(TEST_MAN_PAGE)
+
+    self.assertEqual('tor - The second-generation onion router', manual.name)
+    self.assertEqual('tor [OPTION value]...', manual.synopsis)
+    self.assertEqual(EXPECTED_DESCRIPTION, manual.description)
+
+    self.assertEqual(EXPECTED_CLI_OPTIONS, set(manual.commandline_options.keys()))
+    self.assertEqual('Display a short help message and exit.', manual.commandline_options['-h, -help'])
+    self.assertEqual(EXPECTED_FILE_DESCRIPTION, manual.commandline_options['-f FILE'])
+
+    self.assertEqual(EXPECTED_SIGNALS, set(manual.signals.keys()))
+    self.assertEqual('Tor will catch this, clean up and sync to disk if necessary, and exit.', manual.signals['SIGTERM'])
+
+    self.assertEqual(31, len(manual.files))
+    self.assertEqual('The tor process stores keys and other data here.', manual.files['/usr/local/var/lib/tor/'])
+
+    for category, expected_count in EXPECTED_OPTION_COUNTS.items():
+      self.assertEqual(expected_count, len([entry for entry in manual.config_options.values() if entry.category == category]))
+
+    option = manual.config_options['BandwidthRate']
+    self.assertEqual(Category.GENERAL, option.category)
+    self.assertEqual('BandwidthRate', option.name)
+    self.assertEqual('N bytes|KBytes|MBytes|GBytes|KBits|MBits|GBits', option.usage)
+    self.assertEqual('Average bandwidth usage limit', option.summary)
+    self.assertEqual(EXPECTED_BANDWIDTH_RATE_DESCRIPTION, option.description)
+
+  def test_with_unknown_options(self):
+    if stem.util.system.is_windows():
+      test.runner.skip(self, '(unavailable on windows)')
+      return
+
+    manual = stem.manual.Manual.from_man(TEST_MAN_PAGE + '_with_unknown')
+
+    self.assertEqual('tor - The second-generation onion router', manual.name)
+    self.assertEqual('', manual.synopsis)
+    self.assertEqual('', manual.description)
+    self.assertEqual({}, manual.commandline_options)
+    self.assertEqual({}, manual.signals)
+
+    self.assertEqual(2, len(manual.config_options))
+
+    option = [entry for entry in manual.config_options.values() if entry.category == Category.UNKNOWN][0]
+    self.assertEqual(Category.UNKNOWN, option.category)
+    self.assertEqual('SpiffyNewOption', option.name)
+    self.assertEqual('transport exec path-to-binary [options]', option.usage)
+    self.assertEqual('', option.summary)
+    self.assertEqual('Description of this new option.', option.description)
diff --git a/test/unit/tor.1_with_unknown b/test/unit/tor.1_with_unknown
new file mode 100644
index 0000000..b5e0b82
--- /dev/null
+++ b/test/unit/tor.1_with_unknown
@@ -0,0 +1,51 @@
+'\" t
+.\"     Title: tor
+.\"    Author: [see the "AUTHORS" section]
+.\" Generator: DocBook XSL Stylesheets v1.76.1 <http://docbook.sf.net/>
+.\"      Date: 10/03/2015
+.\"    Manual: Tor Manual
+.\"    Source: Tor
+.\"  Language: English
+.\"
+.TH "TOR" "1" "10/03/2015" "Tor" "Tor Manual"
+.\" -----------------------------------------------------------------
+.\" * Define some portability stuff
+.\" -----------------------------------------------------------------
+.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.\" http://bugs.debian.org/507673
+.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
+.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.ie \n(.g .ds Aq \(aq
+.el       .ds Aq '
+.\" -----------------------------------------------------------------
+.\" * set default formatting
+.\" -----------------------------------------------------------------
+.\" disable hyphenation
+.nh
+.\" disable justification (adjust text to left margin only)
+.ad l
+.\" -----------------------------------------------------------------
+.\" * MAIN CONTENT STARTS HERE *
+.\" -----------------------------------------------------------------
+.SH "NAME"
+tor \- The second\-generation onion router
+.SH "GENERAL OPTIONS"
+.PP
+\fBBandwidthRate\fR \fIN\fR \fBbytes\fR|\fBKBytes\fR|\fBMBytes\fR|\fBGBytes\fR|\fBKBits\fR|\fBMBits\fR|\fBGBits\fR
+.RS 4
+A token bucket limits the average incoming bandwidth usage on this node to the specified number of bytes per second, and the average outgoing bandwidth usage to that same value\&. If you want to run a relay in the public network, this needs to be
+\fIat the very least\fR
+30 KBytes (that is, 30720 bytes)\&. (Default: 1 GByte)
+
+
+With this option, and in other options that take arguments in bytes, KBytes, and so on, other formats are also supported\&. Notably, "KBytes" can also be written as "kilobytes" or "kb"; "MBytes" can be written as "megabytes" or "MB"; "kbits" can be written as "kilobits"; and so forth\&. Tor also accepts "byte" and "bit" in the singular\&. The prefixes "tera" and "T" are also recognized\&. If no units are given, we default to bytes\&. To avoid confusion, we recommend writing "bytes" or "bits" explicitly, since it\(cqs easy to forget that "B" means bytes, not bits\&.
+.RE
+.PP
+.SH "NEW OPTIONS"
+.PP
+\fBSpiffyNewOption\fR \fItransport\fR exec \fIpath\-to\-binary\fR [options]
+.RS 4
+Description of this new option.
+.RE
+.PP
+
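
For readers following the patch, here is a minimal standalone sketch of the line un-wrapping that _join_lines() performs in stem/manual.py above. The logic is copied from the diff; the name join_lines and the sample input are illustrative only and not part of the commit.

```python
def join_lines(lines):
  # Same logic as _join_lines() in the patch: wrapped lines are joined with
  # a single space, and a run of empty lines becomes one paragraph break.
  content = []

  for line in lines:
    if line:
      # Prefix a space unless we are at the start of a paragraph.
      if content and content[-1][-1] != '\n':
        line = ' ' + line

      content.append(line)
    else:
      # Collapse consecutive empty lines into a single blank line.
      if content and content[-1][-1] != '\n':
        content.append('\n\n')

  return ''.join(content)


# Hypothetical sample input, shaped like 'man -P cat' output after the
# leading indentation has been removed.
wrapped = [
  'The minimum number of file descriptors',
  'that must be available.',
  '',
  '',
  'You probably do not need',
  'to adjust this.',
]

print(join_lines(wrapped))
```

A trailing empty line in the input leaves a trailing paragraph break in the result, which is presumably why the caller in the patch applies .strip() to the joined description.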




