[tor-commits] [tor/master] Rewrite "common" overview into a "lib" overview.

nickm at torproject.org nickm at torproject.org
Mon Oct 14 19:56:57 UTC 2019


commit 8ef5d96c2e7c026feff3a4dd20f0096f6d8cf901
Author: Nick Mathewson <nickm at torproject.org>
Date:   Mon Oct 14 13:49:27 2019 -0400

    Rewrite "common" overview into a "lib" overview.
---
 doc/HACKING/design/01.00-lib-overview.md | 206 +++++++++++++++++++------------
 1 file changed, 128 insertions(+), 78 deletions(-)

diff --git a/doc/HACKING/design/01.00-lib-overview.md b/doc/HACKING/design/01.00-lib-overview.md
index 79a6a7b7d..08dec51a0 100644
--- a/doc/HACKING/design/01.00-lib-overview.md
+++ b/doc/HACKING/design/01.00-lib-overview.md
@@ -1,121 +1,171 @@
 
-## Utility code in Tor
+## Library code in Tor
 
-Most of Tor's utility code is in modules in the src/common subdirectory.
+Most of Tor's utility code is in modules in the `src/lib` subdirectory.  In
+general, this code is not necessarily Tor-specific, and may instead be
+useful to other applications.
 
-These are divided, broadly, into _compatibility_ functions, _utility_
-functions, _containers_, and _cryptography_.  (Someday in the future, it
-would be great to split these modules into separate directories.  Also, some
-functions are probably put in the wrong modules)
+This code includes:
 
-### Compatibility code
+  * Compatibility wrappers, to provide a uniform API across different
+    platforms.
 
-These functions live in src/common/compat\*.c; some corresponding macros live
-in src/common/compat\*.h.  They serve as wrappers around platform-specific or
-compiler-specific logic functionality.
+  * Library wrappers, to provide a tor-like API over different libraries
+    that Tor uses for things like compression and cryptography.
 
-In general, the rest of the Tor code *should not* be calling platform-specific
-or otherwise non-portable functions.  Instead, they should call wrappers from
-compat.c, which implement a common cross-platform API.  (If you don't know
-whether a function is portable, it's usually good enough to see whether it
-exists on OSX, Linux, and Windows.)
+  * Containers, to implement some general-purpose data container types.
 
-Other compatibility modules include backtrace.c, which generates stack traces
-for crash reporting; sandbox.c, which implements the Linux seccomp2 sandbox;
-and procmon.c, which handles monitoring a child process.
+The modules in `src/lib` are currently well-factored: each one depends
+only on lower-level modules.  You can see an up-to-date list of the
+modules sorted from lowest to highest level by running
+`./scripts/maint/practracker/includes.py --toposort`.
 
-Parts of address.c are compatibility code for handling network addressing
-issues; other parts are in util.c.
+As of this writing, the library modules are (from lowest to highest
+level):
 
-Notable compatibility areas are:
+   * `lib/cc` -- Macros for managing the C compiler and
+     language. Includes macros for improving compatibility and clarity
+     across different C compilers.
 
-   * mmap support for mapping files into the address space (read-only)
+   * `lib/version` -- Holds the current version of Tor.
 
-   * Code to work around the intricacies
+   * `lib/testsupport` -- Helpers for writing test-only code and for
+     supporting mocking in tests.
 
-   * Workaround code for Windows's horrible winsock incompatibilities and
-     Linux's intricate socket extensions.
+   * `lib/defs` -- Lowest-level constants used in many places across the
+     code.
 
-   * Helpful string functions like memmem, memstr, asprintf, strlcpy, and
-     strlcat that not all platforms have.
+   * `lib/subsys` -- Types used for declaring a "subsystem". A subsystem
+     is a module with support for initialization, shutdown,
+     configuration, and so on.
 
-   * Locale-ignoring variants of the ctypes functions.
+   * `lib/conf` -- Types and macros used for declaring configuration
+     options.
 
-   * Time-manipulation functions
+   * `lib/arch` -- Compatibility functions and macros for handling
+     differences in CPU architecture.
 
-   * File locking function
+   * `lib/err` -- Lowest-level error handling code: responsible for
+     generating stack traces, handling raw assertion failures, and
+     otherwise reporting problems that might not be safe to report
+     via the regular logging module.
 
-   * IPv6 functions for platforms that don't have enough IPv6 support
+   * `lib/malloc` -- Wrappers and utilities for memory management.
 
-   * Endianness functions
+   * `lib/intmath` -- Utilities for integer mathematics.
 
-   * OS functions
+   * `lib/fdio` -- Utilities and compatibility code for reading and
+      writing data on file descriptors (and on sockets, for platforms
+      where a socket is not a kind of fd).
 
-   * Threading and locking functions.
+   * `lib/lock` -- Compatibility code for declaring and using locks.
+      Lower-level than the rest of the threading code.
 
-=== Utility functions
+   * `lib/ctime` -- Constant-time implementations for data comparison
+     and table lookup, used to avoid timing side-channels from standard
+     implementations of memcmp() and so on.
 
-General-purpose utilities are in util.c; they include higher-level wrappers
-around many of the compatibility functions to provide things like
-file-at-once access, memory management functions, math, string manipulation,
-time manipulation, filesystem manipulation, etc.
+   * `lib/string` -- Low-level compatibility wrappers and utility
+     functions for string manipulation.
 
-(Some functionality, like daemon-launching, would be better off in a
-compatibility module.)
+   * `lib/wallclock` -- Compatibility and utility functions for
+     inspecting and manipulating the current (UTC) time.
 
-In util_format.c, we have code to implement stuff like base-32 and base-64
-encoding.
+   * `lib/osinfo` -- Functions for inspecting the version and
+     capabilities of the operating system.
 
-The address.c module interfaces with the system resolver and implements
-address parsing and formatting functions.  It converts sockaddrs to and from
-a more compact tor_addr_t type.
+   * `lib/smartlist_core` -- The bare-bones pieces of our dynamic array
+     ("smartlist") implementation. There are higher-level pieces, but
+     these ones are used by (and therefore cannot use) the logging code.
 
-The di_ops.c module provides constant-time comparison and associative-array
-operations, for side-channel avoidance.
+   * `lib/log` -- Implements the logging system used by all higher-level
+     Tor code.  You can think of this as the logical "midpoint" of the
+     library code: much of the higher-level code is higher-level
+     _because_ it uses the logging module, and much of the lower-level
+     code is specifically written to avoid having to log, because the
+     logging module depends on it.
 
-The logging subsystem in log.c supports logging to files, to controllers, to
-stdout/stderr, or to the system log.
+   * `lib/container` -- General-purpose containers, including dynamic arrays,
+     hashtables, bit arrays, weak-reference-like "handles", bloom
+     filters, and a bit more.
 
-The abstraction in memarea.c is used in cases when a large amount of
-temporary objects need to be allocated, and they can all be freed at the same
-time.
+   * `lib/trace` -- A general-purpose API for introducing
+     function-tracing functionality into Tor.  Currently not much used.
 
-The torgzip.c module wraps the zlib library to implement compression.
+   * `lib/thread` -- Threading compatibility and utility functionality,
+     other than low-level locks (which are in `lib/lock`) and
+     workqueue/threadpool code (which belongs in `lib/evloop`).
 
-Workqueue.c provides a simple multithreaded work-queue implementation.
+   * `lib/term` -- Code for terminal manipulation functions (like
+     reading a password from the user).
 
-### Containers
+   * `lib/memarea` -- A data structure for a fast "arena" style allocator,
+     where the data is freed all at once.  Used for parsing.
 
-The container.c module defines these container types, used throughout the Tor
-codebase.
+   * `lib/encoding` -- Implementations for encoding data in various
+     formats, datatypes, and transformations.
 
-There is a dynamic array called **smartlist**, used as our general resizeable
-array type.  It supports sorting, searching, common set operations, and so
-on.  It has specialized functions for smartlists of strings, and for
-heap-based priority queues.
+   * `lib/dispatch` -- A general-purpose in-process message delivery
+     system.  Used by `lib/pubsub` to implement our inter-module
+     publish/subscribe system.
 
-There's a bit-array type.
+   * `lib/sandbox` -- Our Linux seccomp2 sandbox implementation.
 
-A set of mapping types to map strings, 160-bit digests, and 256-bit digests
-to void \*.  These are what we generally use when we want O(1) lookup.
+   * `lib/pubsub` -- Code and macros to implement our publish/subscribe
+     message passing system.
 
-Additionally, for containers, we use the ht.h and tor_queue.h headers, in
-src/ext.  These provide intrusive hashtable and linked-list macros.
+   * `lib/fs` -- Utility and compatibility code for manipulating files,
+     filenames, directories, and so on.
 
-###  Cryptography
+   * `lib/confmgt` -- Code to parse, encode, and manipulate our
+     configuration files, state files, and so forth.
 
-Once, we tried to keep our cryptography code in a single "crypto.c" file,
-with an "aes.c" module containing an AES implementation for use with older
-OpenSSLs.
+   * `lib/crypt_ops` -- Cryptographic operations. This module contains
+     wrappers around the cryptographic libraries that we support,
+     and implementations for some higher-level cryptographic
+     constructions that we use.
 
-Now, our practice has become to introduce crypto_\*.c modules when adding new
-cryptography backend code.  We have modules for Ed25519, Curve25519,
-secret-to-key algorithms, and password-based boxed encryption.
+   * `lib/meminfo` -- Functions for inspecting our memory usage, if the
+     malloc implementation exposes that to us.
 
-Our various TLS compatibility code, wrappers, and hacks are kept in
-tortls.c, which is probably too full of Tor-specific kludges.  I'm
-hoping we can eliminate most of those kludges when we finally remove
-support for older versions of our TLS handshake.
+   * `lib/time` -- Higher-level time functions, including fine-grained and
+      monotonic timers.
 
+   * `lib/math` -- Floating-point mathematical utilities, including
+     compatibility code, and probability distributions.
 
+   * `lib/buf` -- A general-purpose queued buffer implementation,
+     similar to the BSD kernel's "mbuf" structure.
 
+   * `lib/net` -- Networking code, including address manipulation and
+     compatibility wrappers.
+
+   * `lib/compress` -- A compatibility wrapper around several
+     compression libraries, currently including zlib, zstd, and lzma.
+
+   * `lib/geoip` -- Utilities to manage geoip (IP to country) lookups
+      and formats.
+
+   * `lib/tls` -- Compatibility wrappers around the library (NSS or
+     OpenSSL, depending on configuration) that Tor uses to implement the
+     TLS link security protocol.
+
+   * `lib/evloop` -- Tools to manage the event loop and related
+     functionality, in order to implement asynchronous networking,
+     timers, periodic events, and other scheduling tasks.
+
+   * `lib/process` -- Utilities and compatibility code to launch and
+     manage subprocesses.
+
+### What belongs in lib?
+
+In general, if you can imagine some program wanting the functionality
+you're writing, even if that program had nothing to do with Tor, your
+functionality belongs in lib.
+
+If it falls into one of the existing "lib" categories, your
+functionality belongs in lib.
+
+If you are using platform-specific `#ifdef`s to manage compatibility
+issues among platforms, you should probably consider whether you can
+put your code into lib.

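The `lib/ctime` entry in the diff above mentions constant-time data comparison as a way to avoid the timing side-channels of a standard memcmp(). As a rough illustration of the idea only, here is a minimal sketch in C. The name `ct_mem_equal` is hypothetical and is not taken from Tor's source; the real module may be structured quite differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Tor's actual API.  Returns 1 if the two n-byte
 * buffers are equal and 0 otherwise.  The loop always visits all n bytes,
 * so the running time does not depend on where the first mismatch is. */
static int
ct_mem_equal(const void *a, const void *b, size_t n)
{
  const volatile uint8_t *pa = a;
  const volatile uint8_t *pb = b;
  uint8_t diff = 0;
  size_t i;

  for (i = 0; i < n; ++i) {
    /* Accumulate differences instead of returning at the first mismatch. */
    diff |= (uint8_t)(pa[i] ^ pb[i]);
  }

  /* diff is in 0..255.  Subtracting 1 as an unsigned 32-bit value wraps to
   * 0xFFFFFFFF only when diff == 0, so bit 8 is set only in that case. */
  return (int)(1 & (((uint32_t)diff - 1) >> 8));
}
```

An ordinary memcmp() may return as soon as it finds a differing byte, which can leak how long the matching prefix was; the sketch avoids both the early return and a data-dependent branch on the result.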

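The `lib/memarea` entry describes an "arena" style allocator whose data is freed all at once. The sketch below shows that general technique with a single fixed-size chunk. All names here (`arena_t`, `arena_new`, `arena_alloc`, `arena_free`, `ARENA_ALIGN`) are made up for illustration and are not Tor's API, and the 16-byte alignment is an assumption that suits common types on mainstream platforms.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed alignment for every allocation; must be a power of two. */
#define ARENA_ALIGN ((size_t)16)

/* Hypothetical single-chunk arena; illustrative only, not Tor's memarea. */
typedef struct arena_t {
  char *base;      /* start of the backing buffer */
  size_t capacity; /* total bytes in the buffer */
  size_t used;     /* bytes handed out so far */
} arena_t;

/* Create an arena backed by one buffer of `capacity` bytes, or NULL. */
static arena_t *
arena_new(size_t capacity)
{
  arena_t *a = malloc(sizeof(*a));
  if (!a)
    return NULL;
  a->base = malloc(capacity);
  if (!a->base) {
    free(a);
    return NULL;
  }
  a->capacity = capacity;
  a->used = 0;
  return a;
}

/* Bump-allocate sz bytes; returns NULL when the arena has no room left. */
static void *
arena_alloc(arena_t *a, size_t sz)
{
  size_t remaining = a->capacity - a->used;
  size_t rounded;
  void *p;

  if (sz > remaining)
    return NULL;
  p = a->base + a->used;
  /* Round up so the next allocation stays aligned; clamp at buffer end. */
  rounded = (sz + ARENA_ALIGN - 1) & ~(ARENA_ALIGN - 1);
  a->used += (rounded > remaining || rounded < sz) ? remaining : rounded;
  return p;
}

/* Release every allocation made from the arena in one step. */
static void
arena_free(arena_t *a)
{
  if (!a)
    return;
  free(a->base);
  free(a);
}
```

There is deliberately no per-object free: a parser can allocate many small temporary objects and then discard them together with one `arena_free()` call, which is the usage pattern the entry describes.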

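The "What belongs in lib?" section suggests that code relying on platform-specific `#ifdef`s is a candidate for lib. One way to read that advice, sketched here under assumptions: keep the `#ifdef` inside a small wrapper, so that callers are written once against a uniform API. Both function names below are hypothetical and do not come from Tor's source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical compatibility wrapper: the only place that knows which
 * characters separate path components on the current platform. */
static bool
compat_is_path_separator(char c)
{
#ifdef _WIN32
  /* Windows accepts both forward slashes and backslashes. */
  return c == '/' || c == '\\';
#else
  return c == '/';
#endif
}

/* Caller code stays free of #ifdefs.  Returns the offset of the final
 * path component within `path`. */
static size_t
path_basename_offset(const char *path)
{
  size_t i, start = 0;
  for (i = 0; path[i] != '\0'; ++i) {
    if (compat_is_path_separator(path[i]))
      start = i + 1;
  }
  return start;
}
```

With this split, the wrapper would live in a low-level compatibility module, while `path_basename_offset()` could sit in higher-level code and compile unchanged on every platform.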

