[tor-dev] Draft proposal: Tor Consensus Transparency

Linus Nordberg linus at nordberg.se
Sat Jul 5 00:05:03 UTC 2014


Hi Tor devs,

It's surprisingly hard to work on Tor during the Tor developer
meetings! My apologies for not publishing this text until now, despite
my repeated ranting about the subject over the last few days.

Well, here it is, in an early draft version. Thank you all who've
listened patiently and given valuable feedback. I welcome more feedback
from the list. Thanks in advance.

--8<---------------cut here---------------start------------->8---
Filename: xxx-tor-consensus-transparency.txt
Title: Tor Consensus Transparency
Author: Linus Nordberg
Created: 2014-06-28
Status: Draft

0. Introduction

   WARNING!!! EARLY DRAFT -- MISSING IMPORTANT BITS AND PIECES!

   This document describes how to provide and use public, append-only,
   untrusted logs containing Tor consensus documents, much like what
   Certificate Transparency [RFC6962] does for X.509 certificates. Tor
   relays and clients can then refuse to use a consensus not present in
   logs of their choosing.

   WARNING!!! EARLY DRAFT -- MISSING IMPORTANT BITS AND PIECES!

1. Overview

   Using a public, append-only, untrusted log like the history tree
   described in [CrosbyWallach], Tor clients and relays verify that
   consensus documents are present in one or more logs before using
   them.

   Consensus-users, i.e. Tor clients and relays, expect to receive one
   or more "proofs of inclusion" with new consensus documents. A proof
   of inclusion is a hash sum representing the tree head of a log,
   signed by the log's private key, and an audit path listing the nodes
   in the tree needed to recreate the tree head. Consensus-users are
   configured to use one or more logs by listing a log address and a
   public key for each log. This is used to verify that a given
   consensus document is present in a given log.

   Anyone can submit a properly formatted and signed consensus
   document to a log and get a signed proof of inclusion in
   return. Directory authorities should do this and include the proofs
   when serving consensus documents. Directory caches and
   consensus-users receiving a consensus not including a proof of
   inclusion submit the document and use the proof they receive in
   return.

   Auditing log behaviour and monitoring the contents of logs is
   performed in cooperation between the Tor network and external
   services. Directory caches act as log auditors with help from Tor
   clients gossiping about what they see. Directory authorities are
   good candidates for monitoring log content since they know what
   documents they have issued. Anybody can run both an auditor and a
   monitor though, which is an important property of the proposed
   system.

   Summary of proposed changes to Tor:

   - Directory authorities start submitting newly created consensuses
     to at least one public log.

   - Tor clients and relays receiving a consensus not accompanied by a
     proof of inclusion start submitting it to at least one public
     log.

   - Consensus-users start rejecting consensuses accompanied by an
     invalid proof of inclusion.

   - A new cell type LOG_GOSSIP is defined, for clients and relays to
     exchange information about tree heads seen and their validity.

   - Consensus-users send LOG_GOSSIP cells with seen tree heads to
     relays.

   - Relays validate tree heads received in LOG_GOSSIP cells (section
     3.2.2) and send results to consensus-users in LOG_GOSSIP cells.

2. Motivation

   Popping five boxes or factoring five RSA keys should not be ruled
   out as a possible attack against a subset of the Tor network. An
   attacker controlling a majority of the directory authorities'
   signing keys can, using man-in-the-middle or man-on-the-side
   attacks, serve consensus documents listing relays under their
   control. If mounted on a small subset of the network, the chance of
   detection is probably low. This proposal increases the cost of
   such an attack by raising the chance that it is detected.

   The complexity of the proposed solution is motivated by the value
   of the decentralisation given. Anybody can run their own log and
   use it. Anybody can audit any existing logs and verify their
   correct behaviour. This empowers people outside the group of Tor
   directory authority operators and the people who trust them on a
   personal basis.

3. Design

   Communication with logs is done over http(s) similar to what
   [RFC6962] defines. This proposal does not use [[the TLS data
   structures]] but instead structures based on [[FIXME]]. Parameters
   for POSTs and all responses are encoded as name/value pairs in JSON
   objects [RFC4627].

   Definitions:

   - Log id: The SHA-256 hash of the log's public key, to be treated
     as an opaque byte string identifying the log.
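
   As a minimal sketch of the log id definition above, the following
   computes the id from a log's public key. The encoding of the public
   key is not specified in this draft, so the input is simply treated
   as bytes; the helper names are illustrative only.

```python
import base64
import hashlib


def log_id(public_key: bytes) -> bytes:
    """Return the log id: the SHA-256 hash of the log's public key."""
    return hashlib.sha256(public_key).digest()


def log_id_b64(public_key: bytes) -> str:
    """Return the log id in base64, as used for 'id' in log responses."""
    return base64.b64encode(log_id(public_key)).decode("ascii")


# Placeholder bytes standing in for a real encoded public key (assumption).
example_key = b"hypothetical encoded public key"
example_id = log_id_b64(example_key)
```

   The id is only ever compared for equality, which is why it can be
   treated as an opaque byte string.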

3.1 Consensus submission

   Logs accept consensus submissions from anyone as long as the
   consensus is signed by a majority of the Tor directory authorities
   of the Tor network that the log is logging.

   [[TODO: Move most of this to "specification" section?]]

   Consensus documents are POSTed to the well-known URL

     https://<log server>/tct/v1/add

   Input:

     consensus: A consensus status document as defined in [dir-spec]
       section 3.4.1.

   Output:

     id: The log id, base64 encoded.

     tree_size: The size of the tree, in entries, in decimal.

     timestamp: The timestamp, in decimal.

     sha256_root_hash: The Merkle Tree Hash of a tree including the
       submitted entry, in base64.

     tree_head_signature: A TreeHeadSignature ([RFC6962] section 3.5)
       for the above data.

     audit_path: An array of base64-encoded Merkle tree nodes proving
       the inclusion of the submitted entry in the tree denoted by
       sha256_root_hash (see [RFC6962] section 2.1.1).

   The output is what we call a proof of inclusion.

   The tree_head_signature is signed with the private key of the log.
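
   The exchange above could be sketched as follows. This builds the
   POST without sending it and decodes a response into bytes and
   integers. The Content-Type header, base64 encoding for
   tree_head_signature, and the function names are assumptions that
   this draft does not yet pin down.

```python
import base64
import json
import urllib.request


def build_add_request(log_server: str, consensus: str) -> urllib.request.Request:
    """Build (but do not send) the POST to https://<log server>/tct/v1/add."""
    body = json.dumps({"consensus": consensus}).encode("utf-8")
    return urllib.request.Request(
        f"https://{log_server}/tct/v1/add",
        data=body,
        headers={"Content-Type": "application/json"},  # assumption
        method="POST",
    )


def parse_proof(response_body: bytes) -> dict:
    """Decode the JSON proof of inclusion returned by the log."""
    obj = json.loads(response_body)
    return {
        "id": base64.b64decode(obj["id"]),
        "tree_size": int(obj["tree_size"]),
        "timestamp": int(obj["timestamp"]),
        "sha256_root_hash": base64.b64decode(obj["sha256_root_hash"]),
        # Assumed to be base64, as in RFC 6962 responses.
        "tree_head_signature": base64.b64decode(obj["tree_head_signature"]),
        "audit_path": [base64.b64decode(n) for n in obj["audit_path"]],
    }


# Made-up response body, for illustration only.
fake_response = json.dumps({
    "id": base64.b64encode(b"\x01" * 32).decode(),
    "tree_size": 3,
    "timestamp": 1404518703,
    "sha256_root_hash": base64.b64encode(b"\x02" * 32).decode(),
    "tree_head_signature": base64.b64encode(b"sig").decode(),
    "audit_path": [base64.b64encode(b"\x03" * 32).decode()],
}).encode("utf-8")

request = build_add_request("log.example", "network-status-version 3\n")
proof = parse_proof(fake_response)
```

   Verifying tree_head_signature against the configured public key is
   left out here since the signature format is still to be specified.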

3.2 Consensus verification

3.2.1 Log entry membership

   Calculate a tree head from the hash of the received consensus and
   the audit path in the proof. Verify that it's identical to the tree
   head in the proof. This can easily be done by consensus-users for
   each received consensus.

   We now know that the consensus is part of a tree which the log
   claims to be The Tree. Whether this tree is the same tree that
   everybody else sees is unknown at this point.
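
   The membership check can be sketched as below, using the Merkle
   hashing from [RFC6962] section 2.1 (0x00 prefix for leaves, 0x01
   for interior nodes). Two things are assumptions: that the leaf
   input is the raw consensus bytes, and that the consensus-user
   knows the index of the entry in the tree (leaf_index), which the
   submission response in 3.1 does not currently carry.

```python
import hashlib


def leaf_hash(entry: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + entry).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(entry, leaf_index, tree_size, audit_path, root_hash):
    """Recompute the tree head from entry and audit path; compare to root."""
    if leaf_index >= tree_size:
        return False
    fn, sn = leaf_index, tree_size - 1
    r = leaf_hash(entry)
    for p in audit_path:
        if sn == 0:
            return False  # audit path longer than the tree is deep
        if fn & 1 or fn == sn:
            r = node_hash(p, r)  # sibling is on the left
            while fn and not fn & 1:
                fn >>= 1
                sn >>= 1
        else:
            r = node_hash(r, p)  # sibling is on the right
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root_hash


# A three-entry tree built by hand: root = H(H(h0, h1), h2).
entries = [b"consensus-0", b"consensus-1", b"consensus-2"]
h0, h1, h2 = (leaf_hash(e) for e in entries)
root = node_hash(node_hash(h0, h1), h2)

ok_first = verify_inclusion(entries[0], 0, 3, [h1, h2], root)
ok_last = verify_inclusion(entries[2], 2, 3, [node_hash(h0, h1)], root)
bad = verify_inclusion(b"forged", 2, 3, [node_hash(h0, h1)], root)
```

   The cost is one hash per audit path node, i.e. logarithmic in the
   size of the tree.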

3.2.2 Append-only property of the log

   Ask the log for a consistency proof between the received tree head
   and a previously known good tree head. The known good head can be
   the empty tree. [[TODO add text about how to deal with received
   heads that are older than the last known good tree.]] Communication
   with logs is done over http(s) [[as described in [RFC6962] section
   4 -- TODO specify protocol and encoding]].

   [[description of consistency verification goes here]]

   Tor relays may do this for tree heads received in LOG_GOSSIP cells
   and communicate results in the same cells. [[TODO: Do this
   synchronously or asynchronously?]] Relays cache results to minimise
   the need for communication with log servers and calculations.

   We now know that the received tree is a superset of the known good
   tree.
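
   Pending the description above, here is a sketch of how a
   consistency proof in the style of [RFC6962] section 2.1.2 could be
   checked. It assumes that this proposal ends up using the same
   proof layout as Certificate Transparency; the function name and
   its handling of the empty tree and of older heads are illustrative
   choices, not part of this draft.

```python
import hashlib


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_consistency(first, second, first_hash, second_hash, proof):
    """Check that the tree of size `second` extends the tree of size `first`."""
    if first > second:
        return False  # older received heads are an open TODO in the draft
    if first == 0:
        return True  # the empty tree is a prefix of every tree
    if first == second:
        return not proof and first_hash == second_hash
    path = list(proof)
    if not path:
        return False
    if (first & (first - 1)) == 0:
        path.insert(0, first_hash)  # old tree is a complete subtree
    fn, sn = first - 1, second - 1
    while fn & 1:
        fn >>= 1
        sn >>= 1
    fr = sr = path[0]
    for c in path[1:]:
        if sn == 0:
            return False
        if fn & 1 or fn == sn:
            fr = node_hash(c, fr)
            sr = node_hash(c, sr)
            while fn and not fn & 1:
                fn >>= 1
                sn >>= 1
        else:
            sr = node_hash(sr, c)
        fn >>= 1
        sn >>= 1
    return fr == first_hash and sr == second_hash and sn == 0


# Hand-built trees of sizes 2, 3 and 4 over four leaf hashes.
h = [hashlib.sha256(b"\x00" + bytes([i])).digest() for i in range(4)]
h01, h23 = node_hash(h[0], h[1]), node_hash(h[2], h[3])
root2, root3, root4 = h01, node_hash(h01, h[2]), node_hash(h01, h23)

ok_3_to_4 = verify_consistency(3, 4, root3, root4, [h[2], h[3], h01])
ok_2_to_4 = verify_consistency(2, 4, root2, root4, [h23])
bad = verify_consistency(3, 4, root3, root4, [h[2], h[2], h01])
```

   A relay acting on LOG_GOSSIP cells could cache the result keyed on
   the pair of tree heads.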

3.3 Log auditing

   A log auditor verifies that the log presents the same view to all
   its clients and its append-only property, i.e. that no entries once
   accepted by the log are ever changed or removed. [[TODO describe
   the Tor network's role in auditing a bit more than what's mentioned
   in 3.2.2]]

3.4 Log monitoring

   A log monitor verifies that the contents of the log are consistent
   with the rules of the Tor network, notably that all entries are
   properly formed and signed Tor consensus documents. Note that there
   can be more than one valid consensus document for a given point in
   time. One reason for this is that the number of signatures can
   differ due to consensus voting timing details. [[Are there more?]]

   [[TODO expand on monitoring strategies -- even if this is not part
   of proposed extensions to the Tor network it's good for
   understanding]]

3.5 Consensus-user behaviour

   Keep an on-disk cache of consensus documents. Mark them as being in
   one of three states:

   LOG_STATE_UNKNOWN -- don't know whether it's present in enough logs
                        or not
   LOG_STATE_LOGGED -- have seen good proof(s) of inclusion
   LOG_STATE_LOGGED_GOOD -- confident about the tree head representing
                            a good tree

   Newly arrived consensus documents start in LOG_STATE_UNKNOWN or
   LOG_STATE_LOGGED depending on whether they are accompanied by
   enough proofs or not. There are two possible state transitions:

   - LOG_STATE_UNKNOWN --> LOG_STATE_LOGGED: Seen enough proofs of
     inclusion verifying correctly according to section 3.2.1. The
     number of good proofs needed is a policy setting in the
     configuration of the consensus-user.

   - LOG_STATE_LOGGED --> LOG_STATE_LOGGED_GOOD: Seen enough gossiping
     to know that the tree head in the proof belongs to a known log.

   Consensuses in state LOG_STATE_UNKNOWN are not used but are instead
   submitted to one or more logs. This may take the consensus to
   LOG_STATE_LOGGED.

   Consensuses in state LOG_STATE_LOGGED are used despite not being
   fully verified with regard to logging. LOG_GOSSIP cells with the
   tree heads from received proofs are sent to relays for further
   verification. Clients send to all relays that they have a circuit
   to. Relays send to three random relays that they have a circuit
   to.
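
   The states and transitions in this section could be modelled
   roughly as below. The thresholds proofs_needed and gossip_needed
   stand in for policy settings; the draft only says "enough" for
   gossip, so that parameter and all function names are hypothetical.

```python
import enum
import random


class LogState(enum.Enum):
    UNKNOWN = "LOG_STATE_UNKNOWN"
    LOGGED = "LOG_STATE_LOGGED"
    LOGGED_GOOD = "LOG_STATE_LOGGED_GOOD"


def next_state(state, good_proofs, proofs_needed, gossip_seen, gossip_needed):
    """Apply the two possible transitions; states never move backwards."""
    if state is LogState.UNKNOWN and good_proofs >= proofs_needed:
        state = LogState.LOGGED
    if state is LogState.LOGGED and gossip_seen >= gossip_needed:
        state = LogState.LOGGED_GOOD
    return state


def usable(state):
    """Consensuses in LOG_STATE_UNKNOWN are submitted to logs, not used."""
    return state is not LogState.UNKNOWN


def gossip_targets(relays_with_circuit, is_relay, rng=random):
    """Clients gossip to all relays they have a circuit to, relays to three."""
    relays = list(relays_with_circuit)
    if not is_relay:
        return relays
    return rng.sample(relays, min(3, len(relays)))


# A consensus arriving with one good proof where one is required.
arrived = next_state(LogState.UNKNOWN, 1, 1, 0, 2)
confirmed = next_state(arrived, 1, 1, 2, 2)
```

   Keeping the thresholds as parameters reflects that the number of
   proofs needed is a configuration choice of each consensus-user.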

3.6 Relay behaviour when acting as an auditor

   TODO

3.7 Notable differences from Certificate Transparency

   - The data logged is "strictly time-stamped", i.e. ordered.

   - Much shorter lifetime of logged data -- a day rather than a
     year. Are the effects of this difference of importance only for
     "one-shot attacks"?

   - Directory authorities have consensus about what they're
     signing -- there are no "web sites knowing better".

   - Submitters are not in the same hurry as CAs and can wait minutes
     rather than seconds for a proof of inclusion.

4. Security implications

  TODO

5. Specification

  TODO

? Compatibility
? Implementation
? Performance and scalability notes

A. Open issues

   - handle all consensus flavours (i.e. microdescriptor consensuses)
   - don't use "consensus verification" since that's misleading
   - maybe add hash function agility, i.e. don't fixate SHA-256 (but see
     CT discussion about why not and TODO summarize it here)
   - add a blurb about the values of publishing logs as Tor hidden services
   - should relays gossip among themselves too?
   - discuss compromise of log keys
   - add 'version' and 'extensions' fields to the submission response?
   - maybe log votes as well

B. Acknowledgements

   This proposal leans heavily on [RFC6962]. Some definitions are
   copied verbatim from that document. Valuable feedback has been
   received from Ben Laurie and Karsten Loesing.

C. References

   [CrosbyWallach] http://static.usenix.org/event/sec09/tech/full_papers/crosby.pdf
   [dir-spec] https://gitweb.torproject.org/torspec.git/blob/HEAD:/dir-spec.txt
   [RFC4627] https://tools.ietf.org/html/rfc4627
   [RFC6962] https://tools.ietf.org/html/rfc6962
--8<---------------cut here---------------end--------------->8---

