[tor-scaling] update on next steps at the Tor project

Gaba gaba at torproject.org
Wed Oct 2 17:48:44 UTC 2019


Hi!

We have been busy shaping two projects to seek funding for. We submitted
one to a funder last week and are submitting the other one this week. I
want to share the main parts of both with you all (sorry for how long
this mail is).

1) Project Name: Building the foundation to improve Tor network performance

In this project, we will improve the utilities we use to monitor the Tor
network. This project prepares us to establish baseline measurements on
the network, the first critical step in evaluating, developing, tuning,
and deploying Tor network improvements.

Objective 1: Make operational improvements to existing OnionPerf
deployments and make it easier to deploy new OnionPerf instances

OnionPerf is one of the utilities that we use to measure the Tor
network. OnionPerf uses multiple processes and threads to download
random data through Tor while tracking the performance of those
downloads. The data collected through OnionPerf is used to visualize
changes in Tor client performance over time.

The Tor Metrics team (established with the support of MOSS) operates
four OnionPerf instances at geographically diverse locations to collect
baseline measurements on the Tor network, and much of this data is
publicly available on the Tor Metrics portal
(https://metrics.torproject.org).

As it stands, we don’t have the right tools to closely monitor the
operation of existing OnionPerf instances or to deploy upgrades and
custom configuration changes, which can make it difficult to run new
kinds of experiments or modify how measurements are taken. We also don’t
have an easy, straightforward way to deploy new OnionPerf instances so
that we can conduct a variety of experiments and collect sufficient
measurements on the network to inform performance improvements.

This objective will improve the operational procedures for administering
and monitoring existing OnionPerf deployments and make it easier to
deploy similar and custom setups in the future.

O1.1 Improve monitoring: We will produce a Nagios plugin for monitoring
OnionPerf instances to ensure that they are operating correctly.
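As a rough illustration of what such a check could look like (a sketch
only, not the planned plugin): the function below follows the standard
Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL,
3 UNKNOWN) and flags an instance whose most recent result file has gone
stale. The function name, the file path argument, and the thresholds are
all assumptions made for this example.

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_freshness(path, warn_secs=3600, crit_secs=7200, now=None):
    """Return (exit_code, message) based on how long ago `path` was modified.

    `path` is a hypothetical OnionPerf result file; the thresholds are
    placeholders and would need tuning for a real deployment.
    """
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError as err:
        # The file is missing or unreadable, so freshness cannot be judged.
        return UNKNOWN, f"UNKNOWN - cannot stat {path}: {err}"
    if age >= crit_secs:
        return CRITICAL, f"CRITICAL - last result is {age:.0f}s old"
    if age >= warn_secs:
        return WARNING, f"WARNING - last result is {age:.0f}s old"
    return OK, f"OK - last result is {age:.0f}s old"
```

A wrapper script would print the returned message on one line and exit
with the returned code, which is what Nagios reads.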

O1.2 Improve ease of deployment and maintenance: We will produce Ansible
tasks for deploying and managing deployments of OnionPerf instances,
which also allow for performing upgrades and custom configuration
changes. In this task, we will also make software changes that solve an
existing disk space issue on OnionPerf instances. Right now, an
OnionPerf instance keeps writing logs to its disk until that disk gets
full. To solve this problem, we will sync logs to another location and
delete them after a short while to keep storage requirements for
OnionPerf instances constant. Ideally, we would make these logs publicly
available.
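The "sync, then delete after a short while" idea could be sketched
roughly as follows. This is an assumption-laden example, not OnionPerf
code: the directory layout, the retention period, and the `is_synced`
callback (which stands in for whatever confirms that a log has been
copied to the other location) are all hypothetical.

```python
import os
import time


def prune_synced_logs(log_dir, max_age_secs, is_synced, now=None):
    """Delete old log files that have already been copied elsewhere.

    A file is removed only if it is older than `max_age_secs` AND the
    caller-supplied `is_synced(path)` returns True. Returns the sorted
    list of removed file names.
    """
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(log_dir)):
        path = os.path.join(log_dir, name)
        if not os.path.isfile(path):
            continue  # leave subdirectories alone
        age = now - os.path.getmtime(path)
        if age > max_age_secs and is_synced(path):
            os.remove(path)
            removed.append(name)
    return removed
```

Requiring both conditions means a failed sync never leads to data loss;
the disk simply fills more slowly until syncing recovers.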

O1.3 Create ability for developers to easily spin up their own OnionPerf
instances based on specific tor branches and configurations. In order to
gather sufficient network measurements and conduct a variety of
experiments with multiple instances of OnionPerf, it must be
straightforward and easy to spin up many OnionPerfs. With many OnionPerf
instances, we can either perform more measurements from more vantage
points, or perform many experimental measurements to compare with the
baseline measurements.


Objective 2: Expand the kinds of measurements OnionPerf can take by
making improvements to its codebase.

While the ability to spin up more OnionPerf instances will allow us to
collect even more measurements, we also need to expand the kinds of
measurements we can collect.

Under this objective, we will increase the number of different kinds of
measurements we can take on the Tor network by making changes to the
OnionPerf codebase. This will allow us to gain insight into the network
in ways that are currently unavailable.

We will also develop a way to distinguish these new measurements from
the currently deployed OnionPerf instances and other experimental
measurements. This will allow us to filter through the results and
analyze them in a meaningful way.

O2.1 Add instance metadata to JSON format: We need a way to distinguish
our current four long-term OnionPerf measurements that are automatically
published to the Metrics portal (https://metrics.torproject.org) from
short-term experimental measurements. In this task, we will add instance
metadata to OnionPerf’s JSON results format in order to differentiate
each experiment; we will store that data along with the actual
measurement data in a separate, single archive.
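To make the idea concrete, here is one possible shape for such a
document. None of the field names below come from OnionPerf's actual
results format; they are placeholders chosen for this sketch.

```python
import json


def wrap_with_metadata(measurements, instance_name, kind, tor_branch):
    """Bundle measurement data with metadata describing its origin.

    All keys are hypothetical. `kind` is expected to be either
    "long-term" or "experimental".
    """
    return {
        "metadata": {
            "instance": instance_name,
            "kind": kind,
            "tor_branch": tor_branch,
        },
        "data": measurements,
    }


def is_experimental(document):
    """True if the document came from a short-term experiment."""
    return document["metadata"]["kind"] == "experimental"


doc = wrap_with_metadata(
    [{"timestamp": 1570000000, "seconds": 1.8}],
    instance_name="example-instance",
    kind="experimental",
    tor_branch="example-branch",
)
serialized = json.dumps(doc, sort_keys=True)
```

With the metadata stored next to the data, an analysis script can drop
experimental runs before comparing against the long-term baseline.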

O2.2 Develop at least one new OnionPerf model: We will evaluate new
OnionPerf models that will allow us to measure different network
workloads (e.g., ping/echo service or bulk download models). We will
deploy at least one instance of these permanently. Deploying a new
OnionPerf model allows us to gain new insight into the Tor network by
taking measurements that are currently unavailable.

O2.3 Implement guard node support for OnionPerf, with at least one
instance using guard nodes. Guard node support will allow us to measure
Tor performance closer to how clients experience it.


Objective 3: Make improvements to the way we analyze performance metrics.
While Objectives 1 and 2 make it possible for us to take new kinds of
measurements on the Tor network, this objective enables us to parse,
graph, and analyze this new data. Under this objective, we will develop
and improve developer-facing tools, including graphs that can filter out
different measurements depending on the context of a given network
experiment.

The tools we develop under this objective will help developers to
understand, compare, and analyze the measurements collected from the
tools built in the first two objectives of this project. These tools
will help visualize the results of network experiments and will inform
development decisions we make going forward.

O3.1 Develop developer-facing tooling to quickly graph baseline
performance metrics. In order for developers to evaluate performance
metrics as we make network improvements, we will create developer tools
that allow us to produce CDFs of these OnionPerf metrics, restricted to
the periods of time during which live network experiments are running.
These graphing tools will also work with OnionPerfs that are attached to
the Tor network simulator Shadow, allowing us to compare the live
network to the simulator.
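For readers less familiar with CDFs, the computation itself is small.
The sketch below selects measurements inside a time window and returns
the empirical CDF as (value, cumulative fraction) pairs. The record keys
`timestamp` and `seconds` are assumptions for this example and do not
reflect OnionPerf's real output.

```python
def within_window(records, start, end):
    """Return download times for records with start <= timestamp < end."""
    return [r["seconds"] for r in records if start <= r["timestamp"] < end]


def empirical_cdf(samples):
    """Return sorted (value, cumulative_fraction) pairs for `samples`.

    The i-th smallest of n samples (counting from 1) gets fraction i / n.
    An empty input yields an empty list.
    """
    ordered = sorted(samples)
    n = len(ordered)
    return [(value, (i + 1) / n) for i, value in enumerate(ordered)]


records = [
    {"timestamp": 100, "seconds": 3.0},
    {"timestamp": 150, "seconds": 1.0},
    {"timestamp": 250, "seconds": 9.0},  # outside the window used below
]
cdf = empirical_cdf(within_window(records, 100, 200))
```

Plotting the pairs from a baseline window and from an experiment window
on the same axes gives the kind of comparison described above.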

O3.2 Include additional OnionPerf filters: We will add the ability to
filter out arbitrary relays from arbitrary time periods of historical
OnionPerf data and compare the performance metrics to the baseline for
that period (as CDF graphs). For example, we will add the ability to
select or exclude relays that appear in OnionPerf paths in any position
or a specific position, based on:
- Tor software version or operating system,
- Consensus weight as raw numerical value or percentile cutoff,
- Relay flag like Fast, Stable, Exit, or Guard and their combinations,
- Circuit build time as absolute time value or percentile cutoff, or
- Lists of relay fingerprints or IP net masks.
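A filter along these lines could be sketched as below. The data layout
(a `path` list of relay dicts with `fingerprint`, `flags`, and `weight`
keys) is invented for this example, and the percentile helper uses the
nearest-rank method, which is only one of several possible definitions.

```python
import math


def percentile_cutoff(values, pct):
    """Nearest-rank percentile of a non-empty list, for 0 < pct <= 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def exclude_paths(measurements, predicate, position=None):
    """Drop measurements whose path contains a relay matching `predicate`.

    With position=None every hop is checked; otherwise only the relay at
    that index in the path (0 = first hop) is checked.
    """
    kept = []
    for m in measurements:
        if position is None:
            relays = m["path"]
        else:
            relays = m["path"][position:position + 1]
        if not any(predicate(relay) for relay in relays):
            kept.append(m)
    return kept


# Hypothetical relays and measurements.
relay_a = {"fingerprint": "A", "flags": {"Guard", "Fast"}, "weight": 100}
relay_b = {"fingerprint": "B", "flags": {"Fast"}, "weight": 10}
relay_c = {"fingerprint": "C", "flags": {"Exit", "Fast"}, "weight": 50}
measurements = [
    {"id": 1, "path": [relay_a, relay_b, relay_c]},
    {"id": 2, "path": [relay_b, relay_a, relay_c]},
]

# Exclude paths whose first hop has consensus weight below the median.
median = percentile_cutoff([100, 10, 50], 50)
heavy_first_hop = exclude_paths(
    measurements, lambda r: r["weight"] < median, position=0
)
```

Expressing each criterion as a predicate keeps the selection logic in
one place, so version, flag, weight, and fingerprint filters can be
combined by composing small functions.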


--------

2) Project name: Making the Tor network faster and more reliable for
users in the Global South


In this proposed project, the Tor Project aims to make the Tor network
faster for users with slow connections and old devices by: streamlining
the tuning of the network; deploying smarter methods for balancing
traffic, bootstrapping, and building initial circuits; evaluating and
implementing promising congestion control, load balancing, and
scalability research; and proactively detecting, diagnosing, and
resolving user-facing performance issues.

Objectives

Objective 1: Improve user-facing performance metrics by streamlining the
tuning of the Tor network. We have developed and deployed a series of
improvements to the Tor network over the years, but as the network has
changed, so has the impact of these improvements. We need to tune these
improvements on the live Tor network as it exists today (larger, more
users) so that our users can experience faster connections. By creating
an easy way to tune and re-tune, we will be able to make user-facing
performance improvements now, as well as easily replicate these actions
in the future, ensuring we are always bringing performance improvements
to our users through continuous evaluation and adjustment.

O1.1: Optimize user-facing performance by tuning parameters of
previously deployed Tor network improvements.

O1.2: Ensure quick deployment of performance enhancements by improving
how we test and deploy efficiency adjustments to the network.

O1.2.1: Calibrate the network simulators Shadow and NetMirage to
evaluate potential solutions to performance issues.

O1.2.2: Develop mechanisms to safely coordinate and run live network
performance experiments.

O1.3 Build a strategy to make sure Tor relays are updated with our
user-facing performance improvements faster so our users can quickly
experience benefits.

O1.3.1: Create new support policies and distribution packages for relay
operators.

O1.3.2: Improve mechanisms used to notify relay operators of software
upgrades.

Objective 2: Decrease latency for end users by deploying smarter load
balancing, bootstrapping, and circuit building mechanisms. We can make
significant changes to user-facing performance metrics and to
user-perceived performance delays by improving the way traffic is
balanced across the Tor network, improving first connection
bootstrapping, and tuning the way the network predictively builds the
fastest circuits.

O2.1: Reduce the number of slow and extremely slow sessions for our
users by developing and deploying load balancing improvements.

O2.1.1: Improve how the fastest Tor relays are selected for each connection.

O2.1.2: Ensure network security improvements do not negatively impact
network speeds.

O2.1.3: Optimize entry relay usage to better leverage all available
network bandwidth.

O2.1.4: Improve load balancing for select pluggable transports and
bridges so that all users who need to circumvent censorship against Tor
have a fast connection and so that a surge of users (e.g., during a
censorship event) won’t negatively impact connection speeds.

O2.2 Reduce user-perceived delays by developing smarter ways of
bootstrapping first connections and building circuits.

O2.2.1: Investigate and implement usability fixes for issues that make
Tor’s first connection feel slow.

O2.2.2: Test and tune the way the network predictively builds the
fastest circuits.

Objective 3: Evaluate and implement a selected research solution on
congestion control, load balancing, or scalability with the highest
impact for our users. A community of academic researchers surrounds
Tor and our tools. Many have published or are actively researching
solutions to network-level performance issues that cause sluggishness
for users. Under this objective, we will evaluate selected peer-reviewed
literature (Adaptive Stream Timeout, Conflux, Walking Onions, and
Congestion Aware Tor) to find the research with the highest impact on
our performance metrics, and implement it in the live Tor network.

O3.1: Evaluate performance improvements presented in research literature.

O3.2: Implement promising performance improvements from the evaluation
in O3.1.

Objective 4: Improve our ability to proactively detect, diagnose, and
resolve user-facing performance issues. Under this objective, we will
improve the way we monitor the health of the Tor network. The more
insight we have into the health of the network, the faster we can detect
and diagnose performance-related problems that impact our users. The
goal of this objective is to identify user-facing performance problems
we can immediately address and to improve our ability to respond quickly
to future performance issues.

O4.1: Improve and implement network health monitoring and scanning.

O4.1.1: Make, streamline, and monitor network and Exit node reachability
scanners.

O4.1.2: Implement developer-facing tools to monitor relay health.

O4.1.3: Implement reports for relay operators to understand the health
of their relay.

O4.1.4: Implement relay descriptor changes for relays to self-report
overload, problems, and diagnostic information, and deploy targeted
user-facing solutions to these issues.

O4.2: Find and fix performance-impacting issues and bugs discovered from
monitoring and scanning.


cheers,
gaba
-- 
Project Manager: Network, Anti-Censorship and Metrics teams
gaba at torproject.org
she/her are my pronouns
GPG Fingerprint EE3F DF5C AD91 643C 21BE  8370 180D B06C 59CA BD19

