tor-dev
This near-book-length mail is a pre-proposal to help us review previous
congestion control ideas for Tor, and brainstorm new ones. My goal is to
turn this material into one or more proposals and provide ways of
evaluating these options. With proper voodoo, maybe we can resurrect a
couple oldies from the graveyard of mixnet lore, and remix them into
some Vaporwave bangers. If we're real lucky with some fresh ideas, maybe
we can even make a hip hop Internet streaming chart buster from this mix.
But long mail is looong, ok? It's sooo long it has its own soundtrack.
Grab your favorite links from the references at the end, maybe print
everything out, find the nearest sofa or bed, and curl up with it all.
Flip on some Pretty Lights. Let's get that flow Finally Moving. Turn off
the tv! The flow is More Important Than Michael Jordan. Listen: the flow
is Hot Like Sauce. I hope you're an Organ Donor - Extended Overhaul
might be necessary. We're going on a mental Midnight Voyage by Ghostland
Observatory. But don't worry, we'll make it by 4 AM, without being
trapped there, I swear. Hopefully that's not Pushing Time with ya Ample
Mammal(s). Relax: there will still be time for some Codeine(g) Dreaming.
If you finish reading all of it before you go ROCKABYE BABY, you're
definitely a High Roller. All of this is almost exactly an hour of
Bumpin' In The Voodoo with Manic Focus.
If you'd rather take your time and chillax with something Muy Tranquilo
but less Gramatik, just put on some Blank Banshee. But don't fall
asleep, or George Clanton will Kill You In Bed.
Ready? Ok, back to serious business.
Motivation: Load balancing (TorFlow/sbws), circuit build timeout cutoffs
(CBT), and QoS (KIST+EWMA) have provided huge perf wins for Tor. Tuning
and improving these systems will continue to provide latency
improvements in the average case, but congestion control is the missing
piece we need for the network to operate properly at high utilization
levels. Congestion control is also necessary to increase the throughput
of the network even at low utilization levels. Historically, Tor
performance shows high correlation to network utilization levels, and I
believe this is largely due to the effects of ephemeral congestion:
https://lists.torproject.org/pipermail/tor-scaling/2019-June/000051.html
In fact, Section 6.3 of
https://www.freehaven.net/anonbib/cache/murdoch-pet2008.pdf uses
Pollaczek-Khinchin queuing theory to show that expected Tor queue
latency is proportional to network utilization divided by spare
capacity, if queues are not otherwise bounded somehow (by congestion
control).
Summary: The rest of this post is organized as follows: First, I go over
TCP and ECN, because I build upon those for a couple new ideas. Then, I
summarize Tor's current SENDME flow control and review its many
failings. Then, I review past attempts at congestion control for Tor in
the research literature. Then, I propose four new candidate ideas.
Finally, I conclude the post with some ideas for evaluation and further
analysis.
Unless you are a congestion control expert who is also deeply familiar
with Tor's many failed attempts in this area, I strongly recommend
wading through this post in order. The summaries are meant to get you up
to speed without having to do quite as much reading as I did, and they
are filled with in-line URL references in case you want to dig deeper.
If you still want to skip ahead to the highlights, search this mail for
[PROPOSAL_CANDIDATE]. There are four such sections. Those sections build
on other ideas, though. For those, search for [TCP_HISTORY],
[BOOTLEG_RTT_TOR], [FORWARD_ECN_TOR], and [BACKWARD_ECN_TOR]. Search for
[TRACK_LISTING] to find a summary table of all the new ideas from this
post, with their associated key properties.
Crucially absent from this pre-proposal post is the closely related
discussion of QoS/scheduling, which we need to make decisions about
which circuit(s) to select for delivering congestion control signals,
when congestion occurs. For now, this post just handwaves this piece as
"use either EWMA or something RED-like". There is much literature on
these mechanisms, enough to warrant separate treatment, but thankfully
the choice of which to use is orthogonal to the choice of congestion
control signal delivery mechanism.
-I. History of Internet Congestion Control [TCP_HISTORY]
In TCP, each endpoint maintains send and receive buffers of packets,
called windows. The receive window holds packets until a contiguous
chunk is received, at which point data is delivered to the application
and an acknowledgement packet is sent to the other end. The send window
size is doubled every ack until the first packet is dropped, after which
it is increased by 1 for each acked packet, and halved for each drop.
This is called Slow Start with AIMD (Additive-Increase
Multiplicative-Decrease). Packets that are not acked are retransmitted
from the send window buffer after a timeout of one RTT (Round Trip
Time). Drops are caused by intermediate routers' fixed-size queues being
full, but can also be caused by poor network conditions (which leads to
sub-optimal throughput). In this way, TCP responds to congestion
anywhere on the path, and takes around one RTT to detect said congestion.
Even this detailed summary is a simplified model. In reality, TCP is way
more complicated than that. The real thing has like 23 states, and a
whole bunch of cruft from the 80s and 90s that we'll want to cut out of
our mix. We just need the key hooks.
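As a toy model, the window evolution described above (double per ack until the first drop, then +1 per ack and halve per drop) can be sketched like this; the function name and event encoding are invented for illustration:

```python
# Toy model of the window evolution summarized above: Slow Start
# (double per ack) until the first drop, then AIMD (+1 per ack,
# halve per drop). Names and event encoding are illustrative.

def next_window(cwnd, seen_drop, event):
    """Return (cwnd, seen_drop) after one 'ack' or 'drop' event."""
    if event == "drop":
        return max(1, cwnd // 2), True   # Multiplicative Decrease
    if not seen_drop:
        return cwnd * 2, False           # Slow Start: doubling
    return cwnd + 1, True                # Additive Increase

cwnd, seen_drop = 1, False
for ev in ["ack", "ack", "ack", "drop", "ack", "ack"]:
    cwnd, seen_drop = next_window(cwnd, seen_drop, ev)
# window trace: 1 -> 2 -> 4 -> 8, drop -> 4, then 5, then 6
```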
Decades later, Explicit Congestion Notification (ECN) was proposed to
allow routers to explicitly set a flag on TCP packets to signal that
their queue is getting full, instead of dropping packets. For abstruse
compatibility and reliability reasons, when a router adds this flag to a
packet, it is sent all the way to one endpoint. That endpoint *then*
re-adds the flag to acks going in the other direction, all the way to
the other endpoint, who then backs off as if it detected a packet drop.
This is called Forward ECN: https://tools.ietf.org/html/rfc3168#section-6.1
Unfortunately, because of the requirement to bounce the congestion
notification flag all the way off the other endpoint, it *still* takes
Forward ECN at least one RTT to respond to congestion, and so almost no
intermediate routers have bothered to deploy it -- the gains are only
marginal.
Also interesting for our purposes is the vastly simplified Backwards ECN
that uses separate ICMP signaling to eliminate the endpoint echo RTT for
much faster responsiveness to congestion:
https://tools.ietf.org/html/draft-salim-jhsbnns-ecn-00
Backwards ECN was never widely deployed either, because ICMP is easy to
spoof+spam, requires extra packet overhead, many routers filter it, and
no strong authentication binds it to the TCP connection.
But enough history! Onward!
@. Status quo of the Tor flow: SENDME sadness and pain
Tor has a window-based flow control system called SENDMEs. This system
does not provide congestion control, as window sizes are hard-coded
constants. I believe it is important to review this system in detail, so
we can avoid making any of the same mistakes again. Bear with me.
Recall that Tor circuits are 3 hops long when contacting the Internet
(Guard, Middle, Exit), and 7 hops long when a client contacts an onion
service. TCP is used for the connections between these hops. One or more
reliable streams are multiplexed inside each circuit.
Each endpoint of a circuit (client and Exit, or client and onion
service) maintains a remaining-to-send cell count for each circuit and
stream, which is refilled when it receives a SENDME from the other end.
Each time it sends a cell, it decrements this count. When this count
reaches 0 without receiving corresponding SENDMEs, it will stop sending
data. The initial value of these counts is 1000 cells for circuits and
500 cells for streams, and each SENDME refills 100 cell counts for
circuits, and 50 cell counts for streams. Crucially: no backpressure
happens on interior relay connections for this windowing system.
Instead, we just queue.
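A minimal sketch of that accounting, using the constants from the text (the class and method names are my own, not tor's):

```python
# Minimal sketch of SENDME window accounting as described above.
# Class and method names are invented; the constants are from the text.

class SendmeWindow:
    def __init__(self, initial, increment):
        self.remaining = initial      # cells we may still send
        self.increment = increment    # cells refilled per SENDME

    def can_send(self):
        return self.remaining > 0

    def note_cell_sent(self):
        self.remaining -= 1           # decrement per data cell sent

    def note_sendme(self):
        self.remaining += self.increment

circuit = SendmeWindow(initial=1000, increment=100)
stream = SendmeWindow(initial=500, increment=50)

for _ in range(500):                  # exhaust the stream window...
    stream.note_cell_sent()
assert not stream.can_send()          # ...and sending must stop
stream.note_sendme()                  # one SENDME refills 50 cells
```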
If you have a TCP background, you may find it easier to mentally replace
"SENDME" with "ACK", and you won't be far off. It's just that our acks
always ack a fixed number of cells, and the window size is also fixed.
In steady state, window updates of 50 or 100 cells per SENDME result in
a throughput upper bound inversely proportional to half the circuit RTT.
This is 2*CELL_SIZE*50/RTT for streams, and 2*CELL_SIZE*100/RTT for
circuits. For streams with a ~100msec RTT, this is ~500Kbytes/sec.
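Plugging in numbers, assuming the classic 512-byte Tor cell size (the formula above leaves CELL_SIZE unspecified):

```python
# Worked instance of the stream throughput bound above, assuming the
# classic 512-byte Tor cell size (an assumption; the text leaves
# CELL_SIZE unspecified).
CELL_SIZE = 512        # bytes per cell (assumption)
STREAM_SENDME = 50     # cells acked per stream-level SENDME
RTT = 0.100            # seconds; the ~100msec example from the text

stream_bound = 2 * CELL_SIZE * STREAM_SENDME / RTT
# i.e. the ~500Kbytes/sec ceiling quoted above
```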
This means that once the Tor network has enough capacity for RTT to be
close to link transit time (ie: no queue delay), adding more Tor relays
will *not* make Tor faster. This also means that onion service
throughput will almost always be much lower than Exit throughput. The
RTT is much much higher for 7 hop onion circuits than for 3 hop exits,
so even if there is plenty of spare network capacity, Onion service
sites will *never* download as fast as their clearweb equivalents with
our current flow control.
All of this is done just to provide flow control, so that data flow can
stop in the event of endpoint stalls and/or interior relay failure. It
does not actually provide fairness or limit congestion when multiple
clients are used. Additionally, because nothing proves the endpoint has
read the data, a client that does not read data but keeps sending
SENDMEs to the Exit can deliberately force queues to build up in the
Guard node (due to the client-blocked TCP layer), to induce OOM
conditions and crashes. We need to upgrade all clients and all Exit
relays to fix this problem, which will take quite some time:
https://gitweb.torproject.org/torspec.git/tree/proposals/289-authenticated-…
Even with honest or authenticated behavior, the SENDME protocol means
that each circuit can queue up to 1000 cells at an overloaded bottleneck
Tor router, which load balancing can help alleviate, but can't correct
in transient and degenerate cases. This is why high capacity Exits have
obscene memory requirements when Exits are scarce (ie: below ~1/3 total
network throughput), despite there being no memory leaks.
This also tells us that Tor relays *cannot* scale to support larger
numbers of users without a corresponding linear increase in memory
requirements.
As a side effect of this protocol, it is possible for the client and the
Exit to compute the RTT of a circuit by measuring time between every
50th or 100th sent cell and the associated SENDME arrival. Some previous
work makes use of this to try to detect congestion, but it is not a
property we want to keep, if we can avoid it. No deployed code uses it,
and its presence is worrisome:
http://people.cs.ksu.edu/~eyv/papers/latency_leak-ccs07.pdf
https://www.robgjansen.com/publications/howlow-pets2013.pdf
To further complicate things, these SENDME windows only apply to
end-to-end data cells on the circuit. Tor also supports other non-data
end-to-end cells, as well as non-end-to-end data cells (so-called "leaky
pipe topology", which is used for circuit padding), which are not
counted against SENDME windows. Both of these details have also caused
us problems in the SENDME design. Still, for now, we will ignore both of
these details. Full proposal treatment will need to consider them, though.
I. Vaporwaving to ghosts in the anonymity graveyard
Ok, so let's party like it's 2010 and go chase some ghosts of past
anonymity literature and lore.
In my review of this area, I have identified the following causes of
death for previous congestion control attempts in Tor:
* Side channels and other anonymity risks
* Slow or limited responsiveness to queue pressure
* Poor fairness properties
* Poor throughput
* Lack of stream flow control
* Endpoint fairness cheating ability
* Deployment costs
Ghost 1: Drop Signaling
Cause of Death: Side channel issues; deployment cost
Early proposals for congestion control for Tor attempted to simply
tunnel OS TCP streams. This uses the OS's native implementation of TCP
Slow Start and AIMD as congestion control methods. These attempts all
quickly died, due to OS TCP stack fingerprinting problems (ie: nmap):
https://murdoch.is/papers/tor11datagramcomparison.pdf
Two years ago, I tried to make the case for using QUIC as a drop-in
replacement for Tor circuits, and use QUIC's streams to carry Tor
streams. Such a network could leverage QUIC's TCP-like drop-signaled
congestion control algorithm internally, and terminate QUIC at the Exit
node, transforming the QUIC streams into TCP connections initiated from
the Exit. This avoids the TCP fingerprinting issues. In fact, even
without a full datagram network, Tor could still deploy relay-only drop
signaling if we added windowing and retransmission at the circuit layer:
https://lists.torproject.org/pipermail/tor-dev/2018-March/013026.html
Unfortunately, the present/absent bit vector of missing packets in this
window is a communication channel between malicious Tor relays or
Internet routers and the Exit node. This effect was an obscure piece of
ancient mix network lore until Nick Mathewson publicly documented it on
tor-dev a little over a year ago:
https://lists.torproject.org/pipermail/tor-dev/2018-November/013562.html
I was the voice of optimism in that grim post, but there is not a lot of
hope to be had. It still seems to me that adding cover traffic will
reduce the bandwidth of this side channel, but it is a deeply unsolved
traffic analysis problem to quantify the level of cover traffic needed,
on top of the engineering expense of protocol and cryptographic
revamping needed to support cell drops at intermediate relays.
It may also be possible to add a MAC scheme that prevents excessive
drops by intermediate relays and/or internet routers, but then we may
lose a lot of congestion signaling capability and responsiveness.
Aside: This drop side channel is much worse for VPN systems that blindly
tunnel OS-level TCP. Intermediate Internet routers between the VPN
client and the VPN server can use this side channel to send information
to any Internet router after the VPN server, via exposed TCP sequence
numbers. In a Tor-like system with Exit stream termination, the Exit
relay must be one of the side channel participants.
For more fun VPN TCP side channels, see also
https://seclists.org/oss-sec/2019/q4/122 and
https://tools.ietf.org/html/rfc6040.
Ghost 2: Datagram Tor with uTP/LEDBAT Signaling
Cause of Death: Responsiveness; fairness/cheating; side channels
uTP/LEDBAT is the BitTorrent transport that measures the RTT of a
connection, and backs off if high latency is detected. The theory is
that latency above the maximum acceptable value (100ms by spec)
indicates that router queues have formed, due to competing traffic
causing a bottleneck. Typically these queues form at poor-quality
consumer edge routers that queue way too much for their link capacity
and user counts.
https://tools.ietf.org/html/rfc6817
https://research.torproject.org/techreports/libutp-2013-10-30.pdf
Because LEDBAT's goal is to use the link fully, and yield capacity in
presence of competing traffic, LEDBAT congestion control requires that
target latency upper bounds be known, and also expects some queuing to
occur to cause this latency. If network drops still occur, it also uses
TCP-like behavior, with similar side channel issues for our usecase. If
inherent path latency ever exceeds the LEDBAT target max, throughput
will plummet to near-zero. This is bad for Tor, as our path latency is
highly variable, especially between 7 hop onion circuits vs 3 hop exit
circuits.
Additionally, malicious LEDBAT clients can cheat by tolerating a
*larger* max delay than other clients. They will thus accept larger
queue sizes, which allow larger windows, which allow marginally better
throughput, at the expense of more congestion latency for everyone.
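A delay-based controller in the LEDBAT style can be sketched as below; the gain and clamping are heavily simplified from RFC 6817, and the function name is invented. It also shows why the cheating above works: privately raising TARGET lets a client keep growing its window while honest peers back off.

```python
# Simplified LEDBAT-style update (loosely after RFC 6817): the window
# grows while measured queuing delay is under a fixed target, and
# shrinks once delay exceeds it. Gain and clamping are illustrative.
TARGET = 0.100  # seconds; the spec's max acceptable queuing delay

def ledbat_update(cwnd, base_rtt, current_rtt, gain=1.0):
    queuing_delay = current_rtt - base_rtt       # delay we induced
    off_target = (TARGET - queuing_delay) / TARGET
    return max(1.0, cwnd + gain * off_target)    # never below 1 cell

# A cheater who privately uses a larger TARGET tolerates more queue,
# and therefore keeps a larger window than honest peers.
```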
Ghost 3: Congestion Aware Tor
Cause of Death: Hazy anonymity analysis; responsiveness; throughput
Congestion Aware Tor uses more detailed per-node latency estimates
to measure transient congestion-induced delay and respond to it by
altering path selection:
https://www.cypherpunks.ca/~iang/pubs/Congestion_Aware_FC12.pdf
The downside of Congestion Aware Tor is that rather than managing the
congestion window, it suggests simply migrating circuits and changing
path weights. This has hazy effects on anonymity, as the set of viable
paths in the network is reduced by some hard-to-predict amount. Our
Circuit Build Timeout code (CBT) is able to do this kind of path pruning
in a significantly more precise manner by setting the circuit timeout
such that nearly exactly 80% of all possible network paths are used, but
it does not continually perform this timing measurement once the circuit
is built.
We could build a hybrid system with CBT-style math, with continual
measurements, with circuit migration, and with conflux multipath, but
this will still require retransmission buffers for reliable circuit
migration, will be slower to respond than true window control, and can't
actually cap congestion. Worse, circuit migration won't reduce
congestion at Exit nodes (you have to keep the same Exit or streams will
die).
Even with these changes, it also will not remove the throughput cap, or
improve onion service throughput.
Ghost 4: N23 Tor (aka DefenestraTor)
Cause of Death: Stream control; side channels; poor throughput
N23 Tor is described in Section 4.2 of the DefenestraTor paper:
https://cseweb.ucsd.edu/~savage/papers/PETS11.pdf
Basically, each router limits queue length for each circuit to N2 + N3,
which are consensus and circuit parameters, respectively. N3 is tuned
per circuit by latency measures similar to Congestion Aware Tor, but is
still capped at 500 per circuit.
This system died primarily because we could not figure out how to bolt
stream flow control back on to it. But that is a solvable problem (and
any circuit-level congestion control system we deploy must solve it):
https://lists.torproject.org/pipermail/tor-dev/2012-November/004138.html
https://lists.torproject.org/pipermail/tor-dev/2012-November/004143.html
I was not involved in the evaluation of N23 because I was designing and
building Tor Browser at the time, but upon diving into it now, I have
many additional concerns.
First: Because queues are per-circuit but backpressure is
per-connection, the backpressure suffers from multiplexing information
leak issues. If a circuit manages to fill its N23 limit, the only way
for the system to have backpressure to enforce N23 queue sizes is to
stop reading entirely on that circuit's upstream TCP connection,
stalling *all* other circuits on *only* that particular connection. This
property is extremely worrisome from an anonymity standpoint, as it
gives a client adversary a mechanism to experimentally stall pairwise
router connections to probe for the path taken by a long-lived active
target circuit (such as an onion service intro point).
Second: It's not clear to me how credits are sent in both directions on
a circuit, so that the congestion control can be applied in both
directions. Does anyone remember? (Does anyone care?)
Third: At the end of the day, the system still has total queue lengths
that scale in proportion to the number of circuits on the relay, rather
than being globally fixed and controlled through fairness properties. In
my view, this is the final nail in the coffin: it's not an improvement
for scalability, actually decreases throughput (due to the new lower
window cap), and can't control congestion enough to reduce the long-tail
latency.
Indeed, the perf CDFs from the paper show all of these effects, if you
look closely.
Ghost 5: RTT Tor [BOOTLEG_RTT_TOR]
Cause of Death: Sender cheating; Small max window size
Buried in the DefenestraTor paper is an unnamed system that uses RTT to
estimate congestion and update windows accordingly, much like LEDBAT. It
is described in Section 4.1 of
https://cseweb.ucsd.edu/~savage/papers/PETS11.pdf
I'm going to call this system RTT Tor. RTT Tor works by tracking the min
and max RTT observed on a circuit, using the SENDME cell counting side
effect. It then picks a threshold timeout T, at a tuneable point in
between the min and max observed RTT. If the most recently measured RTT
exceeds T, the circuit is declared congested. Unlike LEDBAT, RTT Tor
dynamically chooses T rather than using a hardcoded target RTT, which
makes it usable on Tor.
The window starts at 100 cells. So long as the RTT stays below T, the
window grows by an additional 100 cells every SENDME. If the RTT exceeds
T, the window is cut in half.
Window size max is capped at 1000 cells. The window cannot go below 100
cells.
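The window update described above can be sketched as follows; "alpha" is my own name for the paper's tuneable point between the min and max observed RTT:

```python
# Sketch of the Section 4.1 window update summarized above. "alpha" is
# an invented name for the tuneable point between min and max RTT.
def rtt_tor_update(cwnd, rtt, rtt_min, rtt_max, alpha=0.5):
    T = rtt_min + alpha * (rtt_max - rtt_min)  # congestion threshold
    if rtt > T:
        cwnd //= 2       # congested: halve the window
    else:
        cwnd += 100      # uncongested: grow 100 cells per SENDME
    return min(1000, max(100, cwnd))  # floor and cap from the text
```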
This system depends upon the Exit node's ability to measure RTT to the
client in order to have a downstream window. This capability has been
shown to impact client anonymity:
http://people.cs.ksu.edu/~eyv/papers/latency_leak-ccs07.pdf
https://www.robgjansen.com/publications/howlow-pets2013.pdf
Furthermore, endpoints can cheat by choosing a higher T value or
otherwise acting as if their window size is increasing when it is not.
However, while the paper does not describe a solution for such cheating,
it is possible to detect if one endpoint is honest. Because both
endpoints can measure the RTT, both endpoints can track what their
peer's window should be growing to based on this algorithm, and compare
that to an estimate of their peer's window size. The peer's window size
can be estimated by counting the number of cells that arrive in RTT/2
amount of time.
If the RTT is asymmetric or highly variable, this detection mechanism
will have false positives, but we can at least reduce the ability and
incentive to cheat so long as one of the endpoints is honest.
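A rough version of that check might look like this; the slack factor is an assumption I've added to absorb the RTT asymmetry and variance caveat:

```python
# Rough sketch of the honest-endpoint check described above: estimate
# the peer's window from cells observed in half an RTT and compare it
# to what the growth algorithm permits. The slack factor is invented
# to absorb RTT asymmetry and variance.
def peer_looks_honest(allowed_window, cells_in_half_rtt, slack=1.25):
    # a peer sends at most its window per RTT, so ~window/2 per RTT/2
    estimated_window = cells_in_half_rtt * 2
    return estimated_window <= allowed_window * slack
```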
But if both endpoints cheat, intermediate routers have no way of
limiting the data that is in flight or can queue up, except for closing
the circuit through the circuit OOM killer.
Hence the system still must impose a safe cell window max. It must also
impose this max size because some circuits may have an RTT_min that is
not much different from the RTT_max, even though the circuit is very
congested (because it is continually very congested). In this scenario,
the system would keep adding more congestion until the circuit OOM
killer kicks in.
But, as far as options go, this is not a bad one. Plus, it only requires
the client and the Exit to support it.
II. Fresh Remixes and New Ideas
So what if we took some hits from the 90s and remixed their hooks, Tor
style?
Let's start with the simplest, most TCP-like scheme we can come up with,
and then discuss potential variants and improvements to them, and see
which of these ideas mix together well.
Remix 1: Forward ECN [FORWARD_ECN_TOR]
What if we tried to make something as close to TCP ECN as possible for Tor?
Let's use an initial window size of 100 cells, and make our SENDMEs into
100 cell ACKs.
Let's create a RELAY_COMMAND_QUEUE_PRESSURE relay cell command that can
be sent by a relay towards a client whenever the sum total of queued
cells exceeds some limit. Which circuit gets the cell can be chosen by
EWMA or some other QoS algorithm (ie: RED-like weighted random choice,
as per TCP ECN). The cell would indicate the direction that the circuit
was loudest on (ie: client-bound or exit-bound). When the client gets
the cell, if the direction is exit-bound, the client halves its outbound
window. If the direction is client-bound, the client sends the cell back
to the Exit node, who then halves its client-bound SENDME send window.
(Recall that Tor's "relay commands" are encrypted to/from the client,
with Tor's per-hop circuit keys. This means that extra relay cells can
be injected by any hop in the circuit towards the client, and
intermediate relays cannot read these cells' relay commands. As we will
soon see in [BACKWARD_ECN_TOR] and related ideas, using cleartext "cell
commands" instead can help make things more efficient by allowing
signaling of congestion without requiring extra cell injection, but this
also introduces more side channel risk).
We can use Slow Start with AIMD here: Before the first
RELAY_COMMAND_QUEUE_PRESSURE, outbound windows would double every
window. After the first RELAY_COMMAND_QUEUE_PRESSURE, they would
increase by say 100, every window. Or something similar.
To reduce the potential for side channel injection, the client can
ensure that RELAY_COMMAND_QUEUE_PRESSURE cells do not arrive more often than
once per window. To prevent patterns from trivially being encoded on
these cells, the client can delay relaying them until they are on a K %
M == 0 boundary, for M=5 or 10 or so (high M will reduce responsiveness
to congestion). The client can also refuse to relay these cells if they
would cause the Exit's window to drop below some minimum size (say 128).
Intuition tells me the relays should use total outbound queue size for
deciding *when* to send these cells, and use specific circuit queue
length and/or activity to decide *which* circuit to send them on. In
this way, we globally limit queue size (per hop latency) regardless of
the number of circuits, and still enforce fairness among these circuits.
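That when/which split might look like the following sketch; the global limit is a hypothetical consensus parameter, and picking the longest per-circuit queue stands in for whatever EWMA/RED-like choice a full design would use:

```python
# Sketch of the when/which intuition above: a relay-wide queue total
# decides *when* to signal, and per-circuit queue length decides
# *which* circuit gets the RELAY_COMMAND_QUEUE_PRESSURE cell. The
# limit is a hypothetical consensus parameter; choosing the longest
# queue stands in for a real EWMA or RED-like choice.
GLOBAL_QUEUE_LIMIT = 10000  # cells (invented for illustration)

def pick_pressure_circuit(circuit_queues):
    """circuit_queues maps circuit id -> queued cell count."""
    if sum(circuit_queues.values()) <= GLOBAL_QUEUE_LIMIT:
        return None  # total queue is fine: send no signal
    return max(circuit_queues, key=circuit_queues.get)
```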
However, picking parameters and algorithms for all of this is a research
problem, or at least requires much more QoS RFC and research literature
review. There may be additional optimizations too, like sending these
cells on circuits of any TCP connection that starts blocking or
exceeding per-connection queue limits.
While this will work if clients are well-behaved, this system has a few
drawbacks.
First, it will be slow to respond to client-induced download congestion
at middle nodes, since we have to wait a full circuit RTT for the client
to relay cells to the Exit. Since the Exit is responsible for the
download window for web-like clients, this is also the direction that
needs the most congestion control, which is unfortunate.
Second, clients can easily cheat, even by malfunction: all they have to
do is refuse or forget to relay cells to an Exit, and they get faster
downloads at the expense of more network congestion for everyone. We
could do what RTT Tor did and cap the window size at 1000, but then
we're back in the situation where throughput is artificially capped at
some bizarre function of circuit RTT, and worst-case latency will not
improve as much as we want. We could rely more heavily on the circuit
OOM killer, and kill circuits that queue too much on the assumption that
they are cheating/malfunctioning, but tuning this to avoid false
positives may be tricky.
Third, while this system could technically be deployed even if some
nodes do not support it yet, this is risky, as nodes who do not support
it will experience excessive congestion at worst, and no improvement at
best.
Fourth, onion services are tricky to handle. Because of the way
rendezvous circuits are joined, this system will send
RELAY_COMMAND_QUEUE_PRESSURE towards the client for the client-side of
the rend circuit, and towards the service for the service end of the
circuit. Then, either of these sides would echo that signal as a
RELAY_COMMAND_QUEUE_PRESSURE *all* the way to the other side. Either or
both could cheat in this case, and again, the only recourse we have is
for each relay to enforce strict circuit queue length limits via the
circuit OOM killer.
Remix 2: Forward ECN mixed with RTT Tor [FORWARD_ECN_RTT_TOR]
[PROPOSAL_CANDIDATE]
Ok, [FORWARD_ECN_TOR] sounds decent, but because the client mediates
everything, it will be slow to respond to client-destined download
congestion, and clients can cheat. Also, if some relays don't support
the signal yet, windows may grow too large for their actual congestion
level.
What if we *also* mix in [BOOTLEG_RTT_TOR], so that an endpoint's send
window can only grow if *both* conditions hold: the measured circuit RTT
stays below RTT Tor's T threshold, *and* there are no congestion signals
coming from [FORWARD_ECN_TOR]?
The addition of the signaling cell from [FORWARD_ECN_TOR] also allows us
to be more robust in the cheating detection described in
[BOOTLEG_RTT_TOR], and eliminate false positives. If either endpoint
detects a window that is growing too large for their measured circuit
RTT (by counting the number of cells arriving in that RTT, and comparing
that to what the window should be based on the RTT window growth
algorithm), it can send a warning shot RELAY_COMMAND_QUEUE_PRESSURE,
explicitly telling the cheater to back off. If the window still grows
because the other endpoint ignores this warning shot, it can close the
circuit.
This system is fully incrementally deployable: it can be used if only
the client and Exit support it. It is resilient to cheating so long as
Exits are honest, without false positives, and even without intermediate
relay support.
While we still aren't doing better than 1 RTT response to congestion, we
now have a good answer for cheating that doesn't have any false positive
risk for good actors. Furthermore, because we also have explicit
congestion signaling, we no longer have to worry as much about imposing
a max window size, to protect circuits for which RTT_min is close to
RTT_max due to persistent congestion. Good actors on these circuits will
back off, and for actors acting badly enough to add additional
congestion, RTT_max will increase enough for them to eventually be detected.
Downside: Since onion services have no Exit node, if we want to use the
RTT mechanism to mitigate cheating for them, we must terminate
congestion control at the Rendezvous point for this proposal idea, so
that onion service clients and services can't collude to get faster
service. The RP would then be responsible for examining each half of the
spliced circuit's congestion windows, and sending ECN signals down
whichever side had a larger window. This may or may not be worth the
additional complexity, as opposed to just employing the circuit OOM
killer if circuit queues get too long (due to cheating).
The only remaining downside I see is that we are relying on baking the
ability to make RTT measurements into the protocol.
Still, I think this is worth turning into its own proposal. Does anyone
see any other issues, or have any suggestions?
Remix 3: Backward ECN [BACKWARD_ECN_TOR]
Ok so the [FORWARD_ECN_RTT_TOR] mix is pretty tight. We've got an
incrementally deployable system that is somewhat resistant to cheaters,
and responds to congestion within an RTT. Can we do better? Yes.
Remember that Backward ECN IETF draft that we covered in [TCP_HISTORY]
earlier? It had that RTT/2 response with a hyphy hook. Let's sample that.
First: just like the other proposals, we still need to use SENDMEs, to
ack received packets. However, since we are not measuring RTT here, it
does not have to be exactly every 100 cells. The SENDME can be sent at a
randomized cell count to obscure RTT, with a different randomized ack
cell count value included in the SENDME.
In addition to RELAY_COMMAND_QUEUE_PRESSURE, which is sent to the client
and can't be read by any other intermediate relays, let's make a new
*cell* command type, CELL_COMMAND_RELAY_QUEUE_PRESSURE. Because this is
a cell command, it can be read by intermediate relays, and can be set by
an intermediate relay on any cell heading towards the Exit, simply by
flipping cell_t.command from CELL_COMMAND_RELAY to
CELL_COMMAND_RELAY_QUEUE_PRESSURE, so long as all relays in the circuit
understand this new command. This also avoids sending any additional
empty cells when congestion occurs in the common Web download direction,
which is nice.
We should still respond to congestion with Slow Start AIMD: when the
client or exit gets this relay cell or cell command, it cuts its send
window in half in that direction. Otherwise, windows grow during
transmission similarly to TCP (ie: double the window size each window
until the first congestion cell, then grow linearly).
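The window management above can be sketched as follows (a hypothetical helper, not Tor's actual implementation): exponential growth per window until the first congestion signal, a multiplicative halving on each signal, and linear growth afterwards.

```python
# Sketch of Slow Start with AIMD, evaluated once per full window.
def update_window(win, congestion_signal, in_slow_start):
    """Return (new_window, still_in_slow_start)."""
    if congestion_signal:
        return max(win // 2, 1), False   # multiplicative decrease
    if in_slow_start:
        return win * 2, True             # double each window (slow start)
    return win + 1, False                # additive increase afterwards
```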
The downside of these cells being a new end-to-end cell_t.command is
that it opens up side channel attacks we saw with the RELAY_EARLY attack:
https://blog.torproject.org/tor-security-advisory-relay-early-traffic-confi…
The attack is to flip this cell_t.command field on a sequence of cells
to encode a bitstring, to enable the Exit to send data to the Guard. So
in this mix, let's still use RELAY_COMMAND_QUEUE_PRESSURE towards the
client. We will only use the relay-visible cell_t.command
CELL_COMMAND_RELAY_QUEUE_PRESSURE towards the Exit. This means we only
need to worry about Guard to Exit side channels, which require much more
information to be useful (ie: they must encode client IP address or some
other unique client identifier).
Note that because all intermediate relays can see the cell_t.command,
they can enforce similar properties as clients did in [FORWARD_ECN_TOR],
to limit side channels in the Guard to Exit direction. For instance,
they can ensure that these commands do not get sent more often than once
per window of client-bound incoming cells. They can also enforce that
the Exit-bound cell_t.command = CELL_COMMAND_RELAY_QUEUE_PRESSURE in the
outgoing direction is only set on K % M == 0 cell count boundaries, for
low M=5 or 10. Note that this K % M spacing actually works better for
longer circuits, since there is more chance for other relays to flip the
congestion bit at each M spacing position, which damages the reliability
of the side channel signal.
When middle relays enforce that the window is not allowed to drop below
win_min=128, a malicious Guard can only inject
log2(win_size)-log2(win_min) of these cells in a burst. Once the middle
node detects that the window size would be below the legal minimum, it
knows either cheating is happening, or a side channel is being used. For
a typical win_size of ~8k (ie: 16X faster throughput than current Tor)
and a win_min=128, this detection will be triggered in ~6 properly
spaced fake CELL_COMMAND_RELAY_QUEUE_PRESSURE commands.
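The burst-detection arithmetic above checks out as follows (numbers taken from the text):

```python
# Starting from a window of ~8k cells with win_min=128, count how many
# halvings a malicious Guard can inject before the window would fall
# below the legal minimum and the middle node flags the circuit.
import math

win_size, win_min = 8192, 128
halvings_before_detection = int(math.log2(win_size) - math.log2(win_min))
# log2(8192) - log2(128) = 13 - 7 = 6 fake pressure commands
```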
However, in the AIMD steady state for this side channel, a malicious
guard can attempt to keep the window size as close to win_min as
possible, without going below it. Since the window size grows linearly
after the first ECN signal, the guard gets to send an additional
CELL_COMMAND_RELAY_QUEUE_PRESSURE cell_t.command around once every win_min
client-bound incoming cells. So long as no other relay is also sending
congestion control signals, the guard can keep flipping bits roughly
this often (though the K % M == 0 requirement would restrict the
placement of these cells).
Downsides: Even just 6 bits of reliable information can be damaging if
target client anonymity sets are already small, and so long as
client-bound incoming cells keep arriving, there are still more
opportunities to flip the cell_t.commands from the Guard to the Exit.
The big question here is can these 6+ cell_t.command flips be turned
into a large side channel, despite K % M == 0 placement enforcement? And
can this side channel be made reliable? If so (and it seems likely that
this is so), then this whole idea needs revision (see
[EDGE_FORWARD_ECN_TOR] and [START_BACKWARD_ECN_TOR] for possible revisions).
We also need all intermediate relays to upgrade to properly forward the
CELL_COMMAND_RELAY_QUEUE_PRESSURE cell command just like
CELL_COMMAND_RELAY. Also, clients can still cheat by ignoring the
encrypted RELAY_COMMAND_QUEUE_PRESSURE. On the plus side, clients who
cheat can at best get faster upload speeds. They cannot get faster
download speeds unless the Exits also decide to cheat.
Finally, the story for cheating onion service endpoints is similarly
complicated: Do we have the RP mediate, or fall back to the circuit OOM
killer yet again?
Remix 4: Backward ECN mixed with RTT Tor [BACKWARD_ECN_RTT_TOR]
If we really are worried about upload cheating, we can mix RTT Tor back
in, just like we did for [FORWARD_ECN_RTT_TOR], and use its cheating
detection. An endpoint that detects abnormally large window sizes (based
on arrival cell counts per RTT) could send the
RELAY_COMMAND_QUEUE_PRESSURE warning shot to reduce the window.
So now we have a system that responds in RTT/2 when supported by all
intermediate relays, and falls back to RTT response in the presence of
one cheater endpoint or lack of support by intermediate relays.
Unfortunately, this still has the side channel issues of
[BACKWARD_ECN_TOR], and so I have not tagged it as a proposal candidate.
Remix 5: Backward ECN with Transparent Trap [BACKWARD_ECN_TRAP_TOR]
Ok so the "SENDME ack exactly 100 cells" 2-step beat is getting a little
tired. Can we devise any ways to trap cheaters without having a
predictable RTT measurement baked into the protocol, and still respond to
congestion in RTT/2?
There is a way to defeat the wizard. But there be dragons in this trap
house, and they really want to soul bond using side channels while
everybody is watching. Hence I have not flagged this as a proposal
candidate. Plus, as these mixed metaphors and obscure references
suggest, this remix is very busy and complicated. Still, let's consider
it because it may be useful to remix later.
Let's take [BACKWARD_ECN_TOR], and instead of using
RELAY_COMMAND_QUEUE_PRESSURE towards the client, let's use
cell_t.command=CELL_COMMAND_RELAY_QUEUE_PRESSURE in both directions.
This allows us to perform the same kinds of side channel and cheating
checks at the middle node in the Exit to Guard direction, and also
avoids sending any extra relay cells during congestion.
Just as in [BACKWARD_ECN_TOR], let's randomize the cell count at which
we send each SENDME, and include a different randomized ack count in the
SENDME, so that each endpoint can add cells back to the window for that
SENDME, but cannot compute RTT. Let's also still use Slow Start with
AIMD (double the window if there are no congestion signals, halve the
window on congestion signals, and grow it linearly after that).
However, this is still not sufficient to stop Exit to Guard side
channels! The Exit to Guard direction is different than the Guard to
Exit direction: relays can inject arbitrary amounts of full relay cells
for the client in the Guard direction, whereas they could not do so in
the Guard to Exit direction. In Tor (and other mixnets), this property
is called "leaky pipe topology", and is what allows us to send
RELAY_COMMAND_DROP padding cells from the middle relay to the client
(and vice-versa).
This means that the combination of these injected cells plus congestion
control can be used to encode a pattern from Exit to Guard, and subvert
the checks at the middle node, because relays can inject dummy cells
sent towards the client to meet the win_size and M spacing requirements.
The good news is that the client-side circuit padding system only allows
RELAY_COMMAND_DROP from a hop for which the client has negotiated a
padding machine. This means that Exit nodes cannot inject padding for
assistance in any side channel usage, regardless of the congestion
control scheme we choose.
The bad news is that vanilla Tor allows the presence of invalid
protocol-violating cells, and does not close the circuit for these
cases. Best case, it issues a log message. Worst case: it silently drops
them. Thankfully, the vanguards addon is designed such that any cell
that is not specifically marked as valid by Tor counts as a dropped
cell, after which the circuit will be closed. This is a fail-closed
defense. If new code is added for cell processing that forgets to flag
cells as properly processed by calling circuit_read_valid_data(), those
cells count as false positive dropped cells, rather than false
negatives. This impacts reliability, but has the advantage that we won't
shoot ourselves in the foot by adding new code that is too permissive
without noticing that this code simply does not work properly because
circuits get closed when it is exercised.
All this means we will need to port this defense into Tor itself,
though, or dummy cells can be injected by the Exit for side channel
assistance in congestion control schemes or other cases:
https://github.com/mikeperry-tor/vanguards/blob/master/README_TECHNICAL.md#…
But what about legit padding causing the Guard to think that the win_size
and M spacing requirements are violated? Well, so long as any relay with
an active padding machine stores the congestion signals so that it can
delay them such that they are sent on the next K % M == 0 boundary after
padding is applied, these limits can still be enforced, without
revealing the quantity of padding. Additionally, any guard node
congestion caused by the sum of real+padding traffic can be controlled
under this scheme, because the middle relay can observe the guard's
congestion signals and decrease padding appropriately.
Downsides: If preventing *any* side channel bits from flowing from the
Exit to the Guard is important, this scheme likely does not accomplish
that. It does seem that Exit to Guard side channels are far more
dangerous than Guard to Exit, since all the Exit has to say is "hey this
client looked at something I think is interesting", whereas the Guard
has to communicate a unique identifier for the anonymity set, so that
concern alone may kill this idea.
This scheme is also complicated, and any implementation errors will
bring back the RELAY_EARLY side channel attack in the Exit to Guard
direction, even if the math might claim otherwise.
Remix 6: Edge-Forward ECN [EDGE_FORWARD_ECN_TOR]
[PROPOSAL_CANDIDATE]
The common problem with both [BACKWARD_ECN_TOR] and
[BACKWARD_ECN_TRAP_TOR] is that they enable side channels by one or both
of the edges of the circuit (Guard to Exit, or Exit to Guard).
Forward ECN mitigates these side channels, because the edge no longer
gets to encode its congestion signal at specific points in a traffic
pattern. This works better than win_size and M position limiting by
itself, and it can be used in combination with client-side window size
checks and M position limiting, too.
If middle nodes could somehow reliably detect the Guard and Exit
endpoints of a circuit, then the middle node(s) could force one or both
endpoints to use Forward ECN [FORWARD_ECN_TOR]. Then the circuit edges
would not be able to communicate with each other by directly encoding
congestion signals in deterministic traffic patterns.
This is easier said than done. In practice, middles can tell that they
are middles because they do not get any authentication cells from the
client, and they are not asked to open any streams. But what we really
need is for middles to be able to tell if their *neighbors* are the
circuit edges, because these edges are the only ones that we actually
need to forbid from using cell_t.command from BACKWARD_ECN_TOR; it is
fine if multiple middles use cell_t.command (for example, in onion
service circuits).
The only way I can think of to reliably detect circuit edges from the
middle position is to forbid either Exit or Guard nodes from being
used as middles, so that middles could know when they were next to one
of these node types, and therefore next to the edge of the circuit. Then
middles could forbid these edges' use of the cleartext
CELL_COMMAND_RELAY_QUEUE_PRESSURE. This only-middles-as-middles
stratification requirement has been proposed before for several other
reasons, so maybe it is not a horrible idea?
This is more tricky for onion service circuits, though. For those, we
could require that sensitive positions (such as HSDIR, RP, and IP) also
be Exit-flagged nodes?
Or maybe there is another clever way to detect just the edges of a
circuit from the middle position?
Even if there is, this proposal still has some cheating risk, as clients
can refuse to relay these edge congestion signals to get marginally
better throughput. Again, RTT measurements can be preserved here as a
backstop against cheating, but for the additional anonymity risk.
Otherwise we have to fall back to the circuit OOM killer.
Remix 7: Start Backward ECN [START_BACKWARD_ECN_TOR]
[PROPOSAL_CANDIDATE]
The Slow Start with AIMD window management for all of these systems
means that we most need the quick RTT/2 response to the congestion early
on in the Slow Start phase, while the window size is growing
exponentially. This is also when the network is most vulnerable to cheaters.
So let's allow only 1 (or maybe 2) CELL_COMMAND_RELAY_QUEUE_PRESSURE
cell_t.commands per circuit in the Guard to Exit direction as per
[BACKWARD_ECN_TOR], and maybe even 1 CELL_COMMAND_RELAY_QUEUE_PRESSURE
cell_t.command in the Exit to Guard direction from
[BACKWARD_ECN_TRAP_TOR]. After these cell_t.command limits are hit,
middle nodes forbid any further use of this cell type on that circuit,
and enforce that all circuit member relays switch to [FORWARD_ECN_TOR]
or [FORWARD_ECN_RTT_TOR] from then on (depending on how concerned we are
about cheating vs disclosing circuit RTT).
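The middle-node enforcement described above can be sketched as a per-circuit budget (all counters and limits hypothetical; the text suggests 1, or maybe 2, toward the Exit):

```python
# Per-circuit budget of cleartext pressure cell commands per direction;
# once exhausted, the circuit is forced onto the forward-ECN scheme.
LIMITS = {"toward_exit": 1, "toward_guard": 1}

def on_pressure_cell(seen_counts, direction):
    """Return 'forward' while within budget, else 'switch_to_forward_ecn'."""
    seen_counts[direction] = seen_counts.get(direction, 0) + 1
    if seen_counts[direction] > LIMITS[direction]:
        return "switch_to_forward_ecn"
    return "forward"
```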
This is my all-around favorite of the options. The only downside is that
we have to choose between disclosing RTT, or risking some cheating, but
since the cheating can now only happen after slow start has finished, it
provides less of a speed advantage. If clients cheat with the goal of
causing memory exhaustion at relays, the circuit OOM killer can kick in.
Remix 8: Your idea here!
Anyone have any other old lore to resurrect? Any new ideas?
Bonus B-Side Track: Stream Flow Control [PROPOSAL_CANDIDATE]
All of the systems discussed in this post address the problem of
circuit-level congestion control, but do not perform any stream-level
flow control. While we were prototyping N23, Andreas Krey pointed out
that a single blocked stream could consume the entire circuit window
with unread packets, causing the surprising result that other active
streams on the same circuit will also stall:
https://lists.torproject.org/pipermail/tor-dev/2012-November/004138.html
He then went on to propose some alternatives: XON/XOFF, bundling stream
window updates in circuit SENDMEs, or just killing any blocked streams
that have too many unread cells:
https://lists.torproject.org/pipermail/tor-dev/2012-November/004143.html
XON/XOFF is the simplest option, but we will need heuristics to avoid
excessive XON/XOFF chatter on application streams that frequently block
in normal operation. It will also have an RTT/2 delay before it actually
causes the other end to stop sending. If the blocked stream is sending
cells at the full bandwidth of the circuit, this could very well be a
full circuit window worth of cells before the XOFF makes it across.
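One way to limit the XON/XOFF chatter mentioned above is a hysteresis gate (all thresholds hypothetical): XOFF is only sent once the unread-cell queue passes a high watermark, and XON only after it drains below a lower one.

```python
# Sketch of an XON/XOFF sender gate with hysteresis thresholds.
XOFF_THRESHOLD = 500   # unread cells queued for the blocked stream
XON_THRESHOLD = 100

def flow_control_signal(queued_cells, sending_allowed):
    """Return ('XOFF'|'XON'|None, new sending_allowed state)."""
    if sending_allowed and queued_cells >= XOFF_THRESHOLD:
        return "XOFF", False
    if not sending_allowed and queued_cells <= XON_THRESHOLD:
        return "XON", True
    return None, sending_allowed
```

The gap between the two watermarks keeps a stream that hovers around one threshold from generating a signal on every cell.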
We can just ack those cells with a circuit-level SENDME even though they
were not delivered to their blocked stream, but we have to be careful to
authenticate what we read in this case, so we do not re-introduce the
unauthenticated SENDME attack:
https://gitweb.torproject.org/torspec.git/tree/proposals/289-authenticated-…
Can we learn anything from the layered flow control in QUIC? It is
excessively complicated:
https://docs.google.com/document/d/1F2YfdDXKpy20WVKJueEf4abn_LVZHhMUMS5gX6P…
III. Evaluation
First, does anyone have any strong objections to any of the remixes
tagged with PROPOSAL_CANDIDATE? It will save time if we can decide right
from the start that any are completely unacceptable. Bonus points if we
can mod them back into being acceptable, or mod one of the remixes that
are not tagged with PROPOSAL_CANDIDATE into being acceptable.
We also need some criteria to help us determine how we're going to
compare the remaining ones. Here is a condensed [TRACK_LISTING], for
quick review/comparison and criteria brainstorming:
______________________________________________________________________
| MIX | CHEATING | RESPONSE| SIDE | UPGRADE |
| TRACK | DETECTION | TIME | CHANNELS | PATH? |
|~~~~~~~~~~~~~~~~~~~~~~|~~~~~~~~~~~|~~~~~~~~~|~~~~~~~~~~~~|~~~~~~~~~~|
|BOOTLEG_RTT_TOR | Exit RTT |100c+RTT | RTT | Exits |
|FORWARD_ECN_TOR | Circ OOM | RTT | None? | Full Net |
|FORWARD_ECN_RTT_TOR | Exit RTT | RTT | RTT |Exits;Full|
|BACKWARD_ECN_TOR |Middles;OOM| RTT/2 | Guard->Exit| Full Net |
|BACKWARD_ECN_RTT_TOR |Middles;RTT|RTT/2;RTT| Guard->Exit|Exits;Full|
|BACKWARD_ECN_TRAP_TOR | Middles | RTT/2 |Guard<->Exit| Full Net |
|EDGE_FORWARD_ECN_TOR |Middles;OOM| 2*RTT/3 | None? | Full Net |
|START_BACKWARD_ECN_TOR|Middles;OOM|RTT/2;RTT| Low/None? | Full Net |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
What are the things we need to decide if a scheme is acceptable? Here's
a laundry list. Please do add any others you think of:
- Side channel accounting
- Interaction with non-data traffic, padding traffic, and leaky-pipes
- Anonymity analysis as per
https://www.robgjansen.com/publications/howlow-pets2013.pdf
- No/minimal cheating
- No throughput cap
- Quick responsiveness (at least RTT; RTT/2 is better)
- Fairness analysis (may be a separate proposal on QoS/EWMA)
- Onion service deployment
- Simulation methodology that reflects live deployment
- Deployment costs
- Upgrade path
Must-Read References:
Cultural etymology of Tor's end-of-decade donations campaign aesthetic:
https://en.wikipedia.org/wiki/Vaporwave
https://blog.torproject.org/better-internet-possible-ive-seen-it
Tor congestion latency is proportional to network utilization divided by
spare_capacity:
Section 6.3 of https://www.freehaven.net/anonbib/cache/murdoch-pet2008.pdf
RTT-based Congestion Control for Tor:
Section 4.1 of https://cseweb.ucsd.edu/~savage/papers/PETS11.pdf
TCP ECN: https://tools.ietf.org/html/rfc3168#section-6.1
Backward ECN: https://tools.ietf.org/html/draft-salim-jhsbnns-ecn-00
Stream Flow Control Issues and Ideas:
https://lists.torproject.org/pipermail/tor-dev/2012-November/004138.html
https://lists.torproject.org/pipermail/tor-dev/2012-November/004143.html
Side channels in circ command cells:
https://blog.torproject.org/tor-security-advisory-relay-early-traffic-confi…
--
Mike Perry
Hi,
Here is an initial draft of Proposal 311: Relay IPv6 Reachability.
This proposal includes:
* relay IPv6 ORPort extends, and
* relay IPv6 ORPort reachability checks.
This is the first of 3 proposals:
* Proposal 311: Relay IPv6 Reachability
* Proposal 312: Automatic Relay IPv6 Addresses
* Proposal 313: Relay IPv6 Statistics
(I haven't written the others yet!)
I also want to make some minor changes to Proposal 306, so that bridge
IPv6 behaviour stays in sync with client IPv6 behaviour.
There are still a few TODO items in the proposal, mostly about Tor's
current behaviour. If you know the answers, please let me know.
The full text is included below, and it is also available as a GitHub
pull request:
https://github.com/torproject/torspec/pull/103
The related tickets are #24404 (proposal) and #24403 (implementation):
https://trac.torproject.org/projects/tor/ticket/24404
https://trac.torproject.org/projects/tor/ticket/24403
Please feel free to reply on this list, or via GitHub pull request
comments.
Filename: 311-relay-ipv6-reachability.txt
Title: Tor Relay IPv6 Reachability
Author: teor
Created: 22-January-2020
Status: Draft
Ticket: #24404
0. Abstract
We propose that Tor relays and bridges should check the reachability of
their IPv6 ORPort, before publishing their descriptor. To check IPv6 ORPort
reachability, relays and bridges need to be able to extend circuits via
other relays, and back to their own IPv6 ORPort.
1. Introduction
Tor relays and bridges currently check the reachability of their IPv4
ORPort and DirPort before publishing them in their descriptor. But relays
and bridges do not test the reachability of their IPv6 ORPorts.
However, Directory Authorities make direct connections to relay IPv4 and
IPv6 ORPorts, to test each relay's reachability. Once a relay has been
confirmed as reachable by a majority of authorities, it is included in the
consensus. (Currently, 6 out of 9 Directory Authorities perform IPv4 and
IPv6 reachability checks. The others just check IPv4.)
The Bridge authority makes direct connections to bridge IPv4 ORPorts, to
test each bridge's reachability. Depending on its configuration, it may also
test IPv6 ORPorts. Once a bridge has been confirmed as reachable by the
bridge authority, it is included in the bridge networkstatus used by
BridgeDB.
Many relay and bridge operators don't know when their relay's IPv6 ORPort is
unreachable. They may not find out until they check [Relay Search], or
their traffic may drop. For new operators, it might just look like Tor
simply isn't working, or it isn't using much traffic. IPv6 ORPort issues
are a significant source of relay and bridge operator support requests.
Implementing IPv6 ORPort reachability checks will provide immediate, direct
feedback to operators in the relay or bridge's logs. It also enables future
work, such as automatically discovering relay and bridge addresses for IPv6
ORPorts (see [Proposal 312]).
2. Scope
This proposal modifies Tor's behaviour as follows:
Relays:
* circuit extension,
* OR connections for circuit extension,
* reachability testing.
Bridges:
* reachability testing only.
This proposal does not change client behaviour.
When this proposal describes Tor's current behaviour, it covers all
supported Tor versions (0.3.5.7 to 0.4.2.5), as of January 2020.
3. Allow Relay IPv6 Extends
To check IPv6 ORPort reachability, relays and bridges need to be able to
extend circuits via other relays, and back to their own IPv6 ORPort.
We propose that relays start to extend some circuits over IPv6 connections.
We do not propose any changes to bridge extend behaviour.
3.1. Current IPv6 ORPort Implementation
Currently, all relays and bridges must have an IPv4 ORPort. IPv6 ORPorts
are optional for relays and bridges.
Tor supports making direct IPv6 OR connections:
* from directory authorities to relay ORPorts,
* from the bridge authority to bridge ORPorts,
* from clients to relay and bridge ORPorts.
Tor relays and bridges accept IPv6 ORPort connections. But IPv6 ORPorts are
not currently included in extend requests to other relays. And even if an
extend cell contains an IPv6 ORPort, bridges and relays will not extend
via an IPv6 connection to another relay.
Instead, relays will extend circuits:
* Using an existing authenticated connection to the requested relay
(which is typically over IPv4), or
* Over a new connection via the IPv4 ORPort in an extend cell.
If a relay receives an extend cell that only contains an IPv6 ORPort, the
extend typically fails.
3.2. Relays Extend to IPv6 ORPorts
We propose that relays make some connections via the IPv6 ORPorts in
extend cells.
Relays will extend circuits:
* using an existing authenticated connection to the requested relay
(which may be over IPv4 or IPv6), or
* over a new connection via the IPv4 or IPv6 ORPort in an extend cell.
Since bridges try to imitate client behaviour, they will not adopt this new
behaviour, until clients begin routinely connecting via IPv6. (See
[Proposal 306].)
3.2.1. Making IPv6 ORPort Extend Connections
Relays can make a new connection over IPv6 when:
* there is no existing authenticated connection to the requested relay,
and
* the extend cell contains an IPv6 ORPort.
If these conditions are satisfied, and the extend cell also contains an
IPv4 ORPort, we propose that the relay choose between an IPv4 and an IPv6
connection at random.
If the extend cell does not contain an IPv4 ORPort, we propose that the
relay connects over IPv6. (Relays should support IPv6-only extend cells,
even though they are not used to test relay reachability in this proposal.)
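The selection rule above can be sketched as follows (a hypothetical helper, not Tor's actual code): with no existing authenticated connection, pick uniformly at random when the extend cell carries both families, and use whichever single ORPort is present otherwise.

```python
# Sketch of extend-address selection per section 3.2.1.
import random

def choose_extend_address(ipv4_orport, ipv6_orport, rng=random):
    if ipv4_orport and ipv6_orport:
        return rng.choice([ipv4_orport, ipv6_orport])
    # IPv6-only (or IPv4-only) extend cells use the single port present.
    return ipv6_orport or ipv4_orport
```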
A successful IPv6 connection also requires that:
* the requested relay has an IPv6 ORPort.
But extending relays must not check the consensus for other relays' IPv6
information. Consensuses may be out of date, particularly when relays are
doing reachability checks for new IPv6 ORPorts.
See section 3.3.2 for other situations where IPv6 information may be
incorrect or unavailable.
3.2.2. No Tor Client Changes
Tor clients currently include IPv4 ORPorts in their extend cells, but they
do not include IPv6 ORPorts.
We do not propose any client IPv6 extend cell changes at this time.
The Tor network needs more IPv6 relays, before clients can safely use
IPv6 extends. (Relays do not require anonymity, so they can safely use
IPv6 extends to test their own reachability.)
We also recommend prioritising client to relay IPv6 connections
(see [Proposal 306]) over relay to relay IPv6 connections, because client
IPv6 connections have a direct impact on users.
3.3. Alternative Extend Designs
We briefly mention some potential extend designs, and the reasons that
they were not used in this proposal.
(Some designs may be proposed for future Tor versions, but are not necessary
at this time.)
3.3.1. Future Relay IPv6 Extend Behaviour
Random selection of extend ORPorts is a simple design, which supports IPv6
ORPort reachability checks.
However, it is not the most efficient design when:
* both relays meet the requirements for IPv4 and IPv6 extends,
* a new connection is required,
* the relays have either IPv4 or IPv6 connectivity, but not both.
In this very specific case, this proposal results in an average of 1
circuit extend failure per new connection. (Because relays do not try to
connect to the other ORPort when the first one fails.)
If relays try both the IPv4 and IPv6 ORPorts, then the circuit would
succeed. For example, relays could try the alternative port after a 250ms
delay, as in [Proposal 306]. This design results in an average circuit delay
of up to 125ms per new connection, rather than failure.
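The expected-delay figure above follows from the random selection: half of new connections pick the broken address family and wait out the retry delay, half connect immediately.

```python
# Quick check of the average extra delay under random ORPort selection,
# assuming one of the two address families is unreachable.
retry_delay_ms = 250
avg_extra_delay_ms = 0.5 * 0 + 0.5 * retry_delay_ms  # = 125.0 ms
```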
However, partial relay connectivity should be uncommon. And relays keep
connections open long-term, so new relay connections are a small proportion
of extend requests.
Therefore, we defer implementing any more complex designs. Since we propose
to use IPv6 extends to test relay reachability, occasional circuit extend
failures have a very minor impact.
3.3.2. Future Bridge IPv6 Extend Behaviour
When clients automatically connect to relay IPv4 and IPv6 ORPorts by
default, bridges should also adopt this behaviour. (For example,
see [Proposal 306].)
3.3.3. Rejected Extend Designs
Some designs may never be suitable for the Tor network.
We rejected designs where relays check the consensus to see if other
relays support IPv6, because:
* relays may have different consensuses,
* the extend cell may have been created using a version of the
[Onion Service Protocol] which supports IPv6, or
* the extend cell may be from a relay that has just added IPv6, and is
testing the reachability of its own ORPort (see Section 4).
We avoided designs where relays try to learn if other relays support IPv6,
because these designs:
* are more complex than random selection,
* potentially leak information between different client circuits,
* may enable denial of service attacks, where a flood of incorrect extend
cells causes a relay to believe that another relay is unreachable on an
ORPort that actually works, and
* require careful tuning to match the typical interval at which network
connectivity is actually changing.
4. Check Relay and Bridge IPv6 ORPort Reachability
We propose that relays and bridges check their own IPv6 ORPort reachability.
To check IPv6 ORPort reachability, relays and bridges extend circuits via
other relays, and back to their own IPv6 ORPort.
IPv6 reachability failures may result in a relay or bridge refusing to
publish its descriptor, if enough existing relays support IPv6 extends.
4.1. Current Reachability Implementation
Relays and bridges check the reachability of their IPv4 ORPorts and
DirPorts, and refuse to publish their descriptor if either reachability
check fails.
IPv4 ORPort reachability checks succeed when any create cell is received on
any inbound OR connection. The check succeeds, even if the cell is from an
IPv6 ORPort, or a circuit built by a client.
Directory Authorities make direct connections to relay IPv4 and IPv6
ORPorts, to test each relay's reachability. Relays that fail either
reachability test, on enough directory authorities, are excluded from the
consensus.
The Bridge authority makes direct connections to bridge IPv4 ORPorts, to
test each bridge's reachability. Depending on its configuration, it may also
test IPv6 ORPorts. Bridges that fail either reachability test are excluded
from BridgeDB.
4.2. Checking IPv6 ORPort Reachability
We propose that testing relays (and bridges) select some IPv6 extend-capable
relays for their reachability circuits, and include their own IPv4 and IPv6
ORPorts in the final extend cells on those circuits.
The final extending relay will extend to the testing relay:
* using an existing authenticated connection to the testing relay
(which may be over IPv4 or IPv6), or
* over a new connection via the IPv4 or IPv6 ORPort in the extend cell.
The testing relay will confirm that test circuits can extend to both its
IPv4 and IPv6 ORPorts.
4.2.1. Selecting the Final Extending Relay
IPv6 ORPort reachability checks require an IPv6 extend-capable relay as
the second-last hop of reachability circuits. (The testing relay is the
last hop.)
IPv6 extend-capable relays must have:
* Relay subprotocol version 3 (or later), and
* an IPv6 ORPort.
(See section 5.1 for the definition of Relay subprotocol version 3.)
The other relays in the path do not require any particular protocol
versions.
4.2.2. Extending from the Second-Last Hop
Testing relays should put both their IPv4 and IPv6 ORPorts in the extend
cell for the final extend in their reachability circuits.
Supplying both ORPorts makes these extend cells indistinguishable from
future client extend cells.
If reachability succeeds, the testing relay (or bridge) will accept the
final extend on one of its ORPorts, selected at random by the extending
relay (see section 3.2.1).
4.2.3. Separate IPv4 and IPv6 Reachability Flags
Testing relays (and bridges) will record reachability separately for IPv4
and IPv6 ORPorts, based on the ORPort that the test circuit was received on.
4.2.4. No Changes to DirPort Reachability
We do not propose any changes to relay IPv4 DirPort reachability checks at
this time.
The following configurations are currently not supported:
* bridge DirPorts, and
* relay IPv6 DirPorts.
Therefore, they are also out of scope for this proposal.
4.3. Refuse to Publish Descriptor if IPv6 ORPort is Unreachable
If an IPv6 ORPort reachability check fails, relays (and bridges) should log
a warning.
In addition, if IPv6 ORPort reachability checks are reliable (based on the
number of relays in the network that support IPv6 extends), relays whose
IPv6 checks fail should refuse to publish their descriptors.
4.3.1. Refusing to Publish the Descriptor
We set a threshold of consensus relays for reliable IPv6 ORPort checks:
* at least 30 relays, and
* at least 1% of the total consensus weight,
must support IPv6 extends.
We chose these parameters so that the number of relays is triple the
number of directory authorities, and the consensus weight is high enough
to support occasional reachability circuits.
In small networks with:
* less than 2000 relays, or
* a total consensus weight of zero,
the threshold should be the minimum tor network size to test reachability:
* at least 2 relays, excluding this relay.
(Note: we may increase this threshold to 3 or 4 relays if we discover a
higher minimum during testing.)
If the current consensus satisfies this threshold, testing relays (and
bridges) that fail IPv6 ORPort reachability checks should refuse to publish
their descriptors.
To ensure an accurate threshold, testing relays should exclude:
* the testing relay itself, and
* relays that they will not use in testing circuits,
from the:
* relay count, and
* the numerator of the threshold percentage.
Typically, relays will be excluded if they are in the testing relay's:
* family,
* IPv4 address /16 network,
* IPv6 address /32 network (a requirement as of Tor 0.4.0.1-alpha),
unless EnforceDistinctSubnets is 0.
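The threshold logic above might be sketched as follows. This is a simplification under stated assumptions: only the family exclusion is shown (the /16 and /32 subnet rules are omitted), and all field names are hypothetical, not tor's internal API:

```python
def ipv6_checks_reliable(consensus, me):
    """Return True if IPv6 ORPort reachability checks are reliable enough
    that failing them should block descriptor publication.
    'consensus' is a list of relay dicts; fields are illustrative."""
    total_weight = sum(r["weight"] for r in consensus)

    # Exclude this relay, and relays it will not use in test circuits,
    # from the relay count and the numerator of the percentage.
    usable = [r for r in consensus
              if r["fingerprint"] != me["fingerprint"]
              and r["fingerprint"] not in me.get("family", ())]
    capable = [r for r in usable
               if r["relay_protover"] >= 3 and r.get("ipv6_orport")]

    # Small-network fallback: fewer than 2000 relays, or zero weight.
    if len(consensus) < 2000 or total_weight == 0:
        return len(capable) >= 2

    capable_weight = sum(r["weight"] for r in capable)
    return len(capable) >= 30 and capable_weight >= 0.01 * total_weight
```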
As a useful side-effect, these different thresholds for each relay family
will reduce the likelihood of the network flapping around the threshold.
If flapping has an impact on the network health, directory authorities
should set the AssumeIPv6Reachable consensus parameter. (See the next
section.)
4.3.2. Add AssumeIPv6Reachable Option
We add an AssumeIPv6Reachable torrc option and consensus parameter.
If IPv6 ORPort checks have bugs that impact the health of the network,
they can be disabled by setting AssumeIPv6Reachable=1 in the consensus
parameters.
If IPv6 ORPort checks have bugs that impact a particular relay (or bridge),
they can be disabled by setting "AssumeIPv6Reachable 1" in the relay's
torrc.
This option disables IPv6 ORPort reachability checks, so relays publish
their descriptors if their IPv4 ORPort reachability checks succeed.
The default for the torrc option is "auto", which checks the consensus
parameter. If the consensus parameter is not set, the default is "0".
"AssumeReachable 1" overrides all values of "AssumeIPv6Reachable",
disabling both IPv4 and IPv6 ORPort reachability checks.
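The option-resolution rules above can be sketched as a small function (a sketch only; parameter names and the call shape are assumptions, not tor's configuration code):

```python
def effective_assume_ipv6_reachable(torrc, consensus_param,
                                    assume_reachable=False):
    """Resolve AssumeIPv6Reachable.
    torrc is 0, 1, or "auto"; consensus_param is 0, 1, or None (unset)."""
    if assume_reachable:
        # "AssumeReachable 1" overrides all values of AssumeIPv6Reachable,
        # disabling both IPv4 and IPv6 ORPort reachability checks.
        return True
    if torrc != "auto":
        return bool(torrc)
    if consensus_param is not None:
        return bool(consensus_param)
    return False  # default when neither torrc nor the consensus sets it
```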
4.4. Optional Efficiency and Reliability Changes
We propose some optional changes for efficiency and reliability, and
describe their impact.
Some of these changes may be more appropriate in future releases, or
along with other proposed features.
4.4.1. Extend IPv6 From All Supported Second-Last Hops
The testing relay (or bridge) puts both IPv4 and IPv6 ORPorts in its final
extend cell, and the receiving ORPort is selected at random by the
extending relay (see sections 3.2.1 and 4.2). Therefore, approximately half
of IPv6 ORPort reachability circuits will actually end up confirming IPv4
ORPort reachability.
We propose this optional change, to improve the rate of IPv6 ORPort
reachability checks:
If the second-last hop of an IPv4 ORPort reachability circuit supports IPv6
extends, testing relays may put the IPv4 and IPv6 ORPorts in the extend
cell for the final extend.
As the number of relays that support IPv6 extends increases, this change
will increase the number of IPv6 reachability confirmations. In the ideal
case, where the entire network supports IPv4 and IPv6 extends, IPv4 and IPv6
ORPort reachability checks would require a similar number of circuits.
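This optional change amounts to a small decision rule when building the final extend cell. The sketch below is illustrative only; the purpose strings and field names are assumptions:

```python
def final_extend_orports(circuit_purpose, second_last_hop, me):
    """Choose which of our ORPorts go in the final extend cell of a
    reachability circuit, with the optional section 4.4.1 change applied.
    All names here are hypothetical, not tor's internal API."""
    hop_can_extend_ipv6 = (second_last_hop["relay_protover"] >= 3
                           and second_last_hop.get("ipv6_orport"))
    if circuit_purpose == "ipv6_reachability":
        return (me["ipv4_orport"], me["ipv6_orport"])
    if circuit_purpose == "ipv4_reachability" and hop_can_extend_ipv6:
        # Optional change: IPv4 test circuits also offer the IPv6 ORPort
        # when the second-last hop supports IPv6 extends.
        return (me["ipv4_orport"], me["ipv6_orport"])
    return (me["ipv4_orport"], None)
```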
4.4.2. Close Existing Connections Before Testing Reachability
When a busy relay is performing reachability checks, it may already have
established inbound or outbound connections to the second-last hop in its
reachability test circuits. The extending relay may use these connections
for the extend, rather than opening a connection to the target ORPort
(see sections 3.2 and 4.2.2).
Bridges only establish outbound connections to other relays, and only over
IPv4 (except for reachability test circuits). So they are still potentially
affected by this issue.
We propose these optional changes, to improve the efficiency of IPv4 and
IPv6 ORPort reachability checks:
Testing relays (and bridges):
* close any outbound connections to the second-last hop of reachability
circuits, and
* close inbound connections to the second-last hop of reachability
circuits, if those connections are not using the target ORPort.
Even though it is unlikely that bridges will have inbound connections to
a non-target ORPort, bridges should still do inbound connection checks, for
consistency.
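The closing rules above can be sketched as a filter over the relay's connection list (a sketch under assumed field names, not tor's connection code):

```python
def connections_to_close(connections, second_last_hop, target_orport):
    """Connections a testing relay (or bridge) should close before a
    reachability test, so the final extend opens a fresh connection to
    the target ORPort. Field names are illustrative."""
    to_close = []
    for c in connections:
        if c["peer"] != second_last_hop:
            continue
        if c["direction"] == "outbound":
            # Any outbound connection to the second-last hop is closed.
            to_close.append(c)
        elif c["direction"] == "inbound" and c["local_orport"] != target_orport:
            # Inbound connections are closed only if they do not already
            # use the target ORPort.
            to_close.append(c)
    return to_close
```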
These changes are particularly important if a relay is connected to all
other relays in the network, but only over IPv4. (Or in the future, only
over IPv6.)
We expect that these changes will slightly increase the number of relay
re-connections, but reduce the number of reachability test circuits
required to confirm reachability.
4.4.3. Accurately Identifying Test Circuits
The testing relay (or bridge) may confirm that the create cells it is
receiving are from its own test circuits, and that test circuits are
capable of returning create cells to the origin.
Currently, relays confirm reachability if any create cell is received on
any inbound connection (see section 4.1). Relays do not check that the
circuit is a reachability test circuit, and they do not wait to receive the
returned created cell. This behaviour has resulted in difficult-to-diagnose
bugs on some rare relay configurations.
We propose these optional changes, to improve the accuracy of IPv4 and
IPv6 ORPort reachability checks:
Testing relays may:
* check that the create cell is received from a test circuit
(by comparing the received cell to the cells sent by test circuits),
* check that the create cell is received on an inbound connection
(this is existing behaviour),
* if the create cell from a test circuit is received on an outbound
connection, destroy the circuit (rather than returning a created cell),
and
* check that the created cell is returned to the relay on a test circuit
(by comparing the remote address of the final hop on the circuit, to
the local IPv4 and IPv6 ORPort addresses).
TODO: work out how to efficiently match inbound create cells to test
circuits.
4.4.4. Allowing More Relay IPv6 Extends
Currently, clients, relays, and bridges do not include IPv6 ORPorts in their
extend cells.
In this proposal, we only make relays (and bridges) extend over IPv6 on
the final hop of test circuits. This limited use of IPv6 extends means that
IPv6 connections will still be uncommon.
We propose these optional changes, to increase the number of IPv6
connections between relays:
To increase the number of IPv6 connections, relays that support IPv6
extends may want to use them for all hops of their own circuits. Relays
make their own circuits for reachability tests, bandwidth tests, and
ongoing preemptive circuits. (Bridges can not change their behaviour,
because they try to imitate clients.)
We propose a torrc option and consensus parameter SendIPv6CircuitExtends,
which is only supported on relays (and not bridges or clients). This option
makes relays send IPv4 and IPv6 ORPorts in all their extend cells, when
supported by the extending and receiving relay. (See section 3.2.1.)
TODO: Is there a shorter name, that isn't easily confused with enabling
support for other nodes to perform IPv6 extends via this relay?
The default value for this option is "auto", which checks the consensus
parameter. If the consensus parameter is not set, it defaults to "0" in
the initial release.
Once IPv6 extends have had enough testing, we may enable
SendIPv6CircuitExtends on the network. The consensus parameter will be set
to 1. The default will be changed to "1" (if the consensus parameter is not
set).
We defer any client (and bridge) changes to a separate proposal, to be
implemented when there are more IPv6 relays in the network. But we note
that relay IPv6 extends will provide some cover traffic when clients
eventually use IPv6 extends in their circuits.
As a useful side effect, increasing the number of IPv6 connections in the
network makes it more likely that an existing connection can be used for
the final hop of a relay IPv6 ORPort reachability check.
4.5. Alternate Reachability Designs
We briefly mention some potential reachability designs, and the reasons that
they were not used in this proposal.
4.5.1. Removing IPv4 ORPorts from Extend Cells
We avoid designs that only include IPv6 ORPorts in extend cells, and remove
IPv4 ORPorts.
Only including the IPv6 ORPort would provide slightly more specific
reachability check circuits. However, we don't need IPv6-only designs,
because relays continue trying different reachability circuits until they
confirm reachability.
IPv6-only designs also make it easy to distinguish relay reachability extend
cells from other extend cells. This distinguisher will become more of an
issue as IPv6 extends become more common in the network (see sections 4.2.2
and 4.4.4).
Removing the IPv4 ORPort also provides no fallback, if the IPv6 ORPort is
actually unreachable. IPv6-only failures do not affect reachability checks,
but they will become important in the future, as other circuit types start
using IPv6 extends.
IPv6-only reachability designs also increase the number of special cases in
the implementation. (And the likelihood of subtle bugs.)
These designs may be appropriate in future, when there are IPv6-only bridges
or relays.
5. New Relay Subprotocol Version
We reserve Tor subprotocol "Relay=3" for:
* relays that support IPv6 extends, and
* relays and bridges that support IPv6 ORPort reachability checks,
as described in this proposal.
5.1. Tor Specification Changes
We propose the following changes to the [Tor Specification], once this
proposal is implemented.
Adding a new Relay subprotocol version lets testing relays identify other
relays that support IPv6 extends. It also allows us to eventually recommend
or require support for IPv6 extends on all relays (and bridges).
Append to the Relay version 2 subprotocol specification:
Relay=2 has limited IPv6 support:
* Clients do not include IPv6 ORPorts in EXTEND2 cells.
* Relays do not initiate IPv6 connections in response to
EXTEND2 cells containing IPv6 ORPorts.
However, relays accept inbound connections to their IPv6 ORPorts,
and will extend circuits via those connections.
"3" -- relays support extending over IPv6 connections in response to an
EXTEND2 cell containing an IPv6 ORPort. Relays and bridges perform
IPv6 ORPort reachability checks. Client behaviour does not change.
This subprotocol is advertised by all relays and bridges, regardless
of their configured ORPorts. But relays without a configured IPv6
ORPort may refuse to extend over IPv6. And bridges always refuse to
extend over IPv6, because they try to imitate client behaviour.
A successful IPv6 extend requires:
* Relay subprotocol version 3 (or later) and an IPv6 ORPort on the
extending relay,
* an IPv6 ORPort in the EXTEND2 cell, and
* an IPv6 ORPort on the accepting relay.
(Because different tor instances can have different views of the
network, these checks should be done when the path is selected.
Extending relays should only check local IPv6 information, before
attempting the extend.)
This subprotocol version is described in proposal 311, and
implemented in Tor 0.4.4.1-alpha. (TODO: check version after code is
merged).
6. Test Plan
We provide a quick summary of our testing plans.
6.1. Test IPv6 ORPort Reachability and Extends
We propose to test these changes using chutney networks with AssumeReachable
disabled. (Chutney currently enables AssumeReachable by default.)
We also propose to test these changes on the public network with a small
number of relays and bridges.
Once these changes are merged, volunteer relay and bridge operators will be
able to test them by:
* compiling from source,
* running nightly builds, or
* running alpha releases.
6.2. Test Existing Features
We will modify and test these existing features:
* IPv4 ORPort reachability checks
We do not plan on modifying these existing features:
* Relay reachability retries
TODO: Do relays re-check their own reachability? How often?
* Relay canonical connections
* "Too many connections" warning logs
But we will test that they continue to function correctly, and fix any bugs
triggered by the modifications in this proposal.
6.3. Test Legacy Relay Compatibility
We will also test IPv6 extends from newer relays (which implement this
proposal) to older relays (which do not). Although this proposal does not
create these kinds of circuits, we need to check for bugs and excessive
logs in older tor versions.
7. Ongoing Monitoring
To monitor the impact of these changes, relays should collect basic IPv4
and IPv6 connection and bandwidth statistics (see [Proposal 313]).
We may also collect separate statistics on connections from:
* clients (and bridges, because they act like clients), and
* other relays (and authorities, because they act like relays).
We do not propose to collect additional statistics on:
* bridges,
* circuit counts, or
* failure rates.
Collecting statistics like these could impact user privacy.
References:
[Onion Service Protocol]: In particular, Version 3 of the Onion Service Protocol supports IPv6: https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt
[Proposal 306]: One possible design for automatic client IPv4 and IPv6 connections is at https://gitweb.torproject.org/torspec.git/tree/proposals/306-ipv6-happy-eye… (TODO: modify to include bridge changes with client changes)
[Proposal 312]: https://gitweb.torproject.org/torspec.git/tree/proposals/312-auto-relay-ipv… (TODO)
[Proposal 313]: https://gitweb.torproject.org/torspec.git/tree/proposals/313-relay-ipv6-sta… (TODO)
[Relay Search]: https://metrics.torproject.org/rs.html
[Tor Specification]: https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt
--
(End)
Hi. We are rolling out the vanguard plugin for our users and wanted to
understand some options we can enable.
* In many parts of the Security README setting *circ_max_megabytes* is
recommended. Though it is discouraged for usecases involving Onionshare
and Securedrop which we support. What is a reasonable limit to set? What
happens is the max ceiling gets hit? Does it permanently disrupt the
upload/download?
* "High load onion services may consider using 4 layer2 guards by
changing the *num_layer2_guards* option in the configuration file
<https://github.com/mikeperry-tor/vanguards/blob/master/vanguards-example.co…>,
but going beyond that is not recommended."
Does this benefit clients too? We would like to enable options that
mimic the configuration used by actual high load onion services to
provide them with more cover.
Thanks Cecylia for your great explanation.
Greetings tor-dev!
This email is a discussion on adding tracing to little-t tor. Tracing can be a
very abstract notion, so I'll start with a quick overview of what it is, what
we can achieve, and some use cases within tor. Then I'll go over a last point,
which is safety.
This email doesn't go into the technical details of userspace tracing on how
and what will be done to add it to tor. That is for another discussion.
1. Overview
Long story short, you can see tracing as a specific type of logging: it
records information about the application at runtime using tracepoints
(similar to logging statements) so the data can be used later. But the main
differences from logging are in two areas: performance and API stability.
Usually, tracing implies high performance: it adds very little overhead to
the application, in order to disrupt its normal behavior as little as
possible. This is extremely useful when you want to catch race conditions or
performance bottlenecks.
Userspace tracers usually have an "in-process library", which in short means
that it records raw data from the application and moves it to an outside
buffer. That buffer is then emptied to disk or to the network by the
out-of-process component of the tracer, so the data can be analyzed after
collection. So all a tracer does, when a tracepoint is hit within the
application, is copy some data into a buffer and yield back to the
application.
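As a toy illustration of this in-process buffer idea (not a real tracer, and not tor code; all names are invented for the example): the tracepoint only copies raw values into a bounded buffer and returns immediately, while formatting and analysis happen outside the hot path.

```python
from collections import deque
import time

# Toy in-process trace buffer: bounded, so the hot path never blocks on
# disk or network. A real tracer would export this out-of-process.
TRACE_BUFFER = deque(maxlen=65536)

def tracepoint(event, **fields):
    # No string formatting on the hot path: store raw data plus a timestamp.
    TRACE_BUFFER.append((time.monotonic(), event, fields))

def drain():
    """Consumer side: empty the buffer for on-disk or network export,
    where the data can be formatted and analyzed after collection."""
    out = list(TRACE_BUFFER)
    TRACE_BUFFER.clear()
    return out
```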
The other part is API stability. Logs (say, at DEBUG level) usually don't
have strict stability requirements between released versions. But tracing
events (tracepoints) are exposed to the outside, for tracers to hook onto and
for people to run analysis tools on the recorded data. Thus, stability is
strongly encouraged: what a tracepoint exposes, once released as stable,
should really not change much over time.
With a proper abstraction in the application, we can offer stable tracepoints
onto which a variety of tracers can hook at runtime. It is all about
providing an interface to the outside world.
2. Why Tracing in Tor
The tor software is a very complex beast. It has dozens of subsystems with
various interactions between them. One of tor's main jobs is to relay data
as fast as possible in order to keep latency low. This means there are code
paths considered "fast path", which must remain light and fast. One example
is the crypto code that is hit for each cell.
Tracing comes in extremely handy for hunting down race conditions,
performance issues, or even multithreading problems. On a fast relay (say,
25MB/s), if we wanted to record cell timings in order to hunt down such
issues, it simply can _not_ be done with debug-level logging, since that
slows tor down considerably and also fills the disk in a matter of minutes.
And using the control port is not a good solution, for two main reasons:
each event requires string formatting, and the control port is part of the
main loop. So anything you ask to go out on the control port adds overhead
to the overall behavior of tor, which is not good when you are hunting down
races.
One concrete example where tracing was used in tor in the past is the
rewrite of the cell scheduler (KIST). In order to measure cell timings
within tor so that bottleneck issues could be found, tracing had to be
added so that millions of events could be recorded within a few minutes on
a fast relay in production.
Such high-pressure situations are where tracing comes in handy. Tracing was
also used recently to find onion service v3 reachability issues: in order to
correlate connection-, cell-, and circuit-level problems with the
higher-level HS subsystem, we recorded events in all those subsystems,
matched them by their precise timing (which tracing offers), and analyzed
the results after collecting the data.
3. Safety Discussion
On to the last point I wanted to raise. If we allow anyone to record very
low-level data from tor, there is an obvious safety question that must be
raised.
Over the years, I've talked about tracing with many people in Tor, and the
consensus was always that it should never be enabled in production. As in,
the packages shipped by Tor or by distros should _never_ build the
tracepoints. In other words, it should be considered a development-only
option: not just an option, but compiled _out_ in production, so that one
has to explicitly build the tracepoints into tor.
For example (nothing final, just to show the idea):
$ ./configure --enable-tracing
I personally think that should be enough: the presence of the code upstream
won't stop people from using it (bad or not), but we can prevent it from
being in any legit Tor packages out there. Think of it a bit like the
obsolete Tor2web option, which was never enabled in any packages published
by the Tor Project or distros; one had to explicitly enable it at configure
time.
The ControlPort is allowed in production, and if a malicious actor gets
access to it, it's game over. I see tracing the same way, but at least we
can control its availability as a feature, which we can't do for the
ControlPort as of today.
Any feedback is very welcome! Concerns, questions, thoughts.
Cheers!
David
--
AOrq46damX3clZogjR9FlXTru90GV9IT5Rq/J0EzVSA=
OK to mix vanguards from packages.debian.org with Tor from deb.torproject.org repository?
by Patrick Schleizer 17 Jan '20
I am considering to install vanguards by default in Whonix.
Is it sane to mix the 'tor' package from the deb.torproject.org buster
repository with the packages.debian.org buster version of 'vanguards', or do
you foresee any issues?
Cheers,
Patrick
Goal: We (Whonix) are researching optional bridge hosting for our users
to thwart web fingerprinting. Snowflake makes the most sense since no
NAT hole-punching is needed. Correct me if I'm wrong here, because if
that were possible with obfs4 or meek, it would save a lot of work.
We now know acting as a bridge makes the user act as a guard node and
not just a rendezvous to one.[1]
[1]
https://tor.stackexchange.com/questions/3636/what-is-the-relationship-betwe…
Some questions to help with implementation:
* Does the user's own data go through just two hops as well, or is it
sent to the guard node they chose before deciding to run as a bridge?
How do we configure Tor to do the former, if it isn't?
* Are there plans to create signed debs for snowflake client/server so
we can use it with Debian's tor daemon?
* Do Tor Browser bundles with the snowflake addon also include the
server component?
* Do Alpha bundles have this code yet?
* When are these bundles expected to arrive to stable?
* Is it possible to interact headlessly with the snowflake server
component via commandline? How?
* How can we run TBB headlessly so users don't mistakenly interact with
it on the gateway?
15 Jan '20
Hi tor-dev@ mailing list,
Sorry for the many-months delay in updating Prop306.
I have updated Prop306, which is the IPv6 Happy Eyeballs proposal.
The GitHub PR is here: https://github.com/torproject/torspec/pull/98
The Trac ticket is here:
https://trac.torproject.org/projects/tor/ticket/29801
Some of the older discussion on Prop306 can be seen on the thread here:
https://lists.torproject.org/pipermail/tor-dev/2019-August/013959.html
Could some of you please review this proposal?
-Neel
===
https://www.neelc.org/
Dear all,
Please find attached our final version of Proposal 295. This version
has two changes compared to the previous one:
1. It fixes a vulnerability introduced in the previous iteration which
was the result of making the authentication layer stateless. Since there
is no freshness entering into the first layer, the same plaintext would
have resulted in the same ciphertext, thus allowing an adversary to
distinguish. This is now fixed by restoring the running digest for the
authentication layer as well, thus making it stateful again.
2. It adds an option for forward secrecy. The approach here is similar
to the one taken by Proposal 308 by replacing the encryption key of the
first layer after successfully processing the message. If this approach
is taken, there is no need to keep the authentication layer stateful
anymore.
I'll be leaving the mailing list to reduce noise in my mailbox. If you
need anything from me regarding this proposal just make sure to CC me on
the email and I'll be happy to answer.
Wishes,
Tomer
Hi!
When trying to update tor's geoip databases the other day I found that
MaxMind's GeoLite2 database is not available for download anymore. The
reason is:
https://blog.maxmind.com/2019/12/18/significant-changes-to-accessing-and-us…
We do not have a MaxMind account (yet). As of now, we cannot update
tor's geoip files nor the files used by Onionoo.
Should we try to get somebody who knows more about licenses and legal
stuff to review their GeoLite2 EULA and tell us if it's okay for us to
sign up for a MaxMind account? A possible downside would be that whoever
wants to verify that we didn't mess with their database when converting
it to our format would have to sign up for an account, too.
An alternative is to find another, truly open data source instead of the
MaxMind databases (#25542, #26585). However, this could eat up more time than we
currently have available, and we should have something ready in a few
weeks from now. I'm not sure how we would squeeze this into the metrics
team schedule, so we might need help with this.
Thoughts on the two alternatives? What else did I miss?
All the best,
Karsten