Hi everyone,
Operation Onymous, the anecdotes about it (I don't think the DoS was a DoS), the wording of the related legal documents, and the previous CMU research all make me think that traffic confirmation attacks are now widely used in practice. Other cat-and-mouse implementation vulnerabilities may be diversions or parallel construction.
This kind of attack would mean it's game over for hidden services that use HTTP or other low-latency protocols.
Has there been research on integrating high-latency message delivery protocols with the hidden service model of location hiding? The SecureDrop or Pynchon Gate protocols sound like good starting points. I would love to participate, and encourage everyone to start in this direction (in your copious free time ;).
Mansour
On 09/11/14 18:33, Mansour Moufid wrote:
Has there been research on integrating high-latency message delivery protocols with the hidden service model of location hiding? The SecureDrop or Pynchon Gate protocols sound like good starting points. I would love to participate, and encourage everyone to start in this direction (in your copious free time ;).
This issue has just come up on the guardian-dev list, so we're moving the conversation over here. Context quoted below.
On Thu, Nov 20, 2014, at 09:46 AM, Michael Rogers wrote:
On 20/11/14 14:21, Nathan of Guardian wrote:
If we simply use Tor as a low-latency transport for asynchronous messaging then we're limited to Tor's threat model, i.e. we can't prevent traffic confirmation attacks. If we revive one of the remailers or build a new system then we're limited to a small number of users, i.e. a small anonymity set. So ideally we'd find some way of adding high-latency mix-like features to Tor.
How much difference in latency are we talking about? Can we just introduce some sort of randomness or delay into our existing stacks/protocols?
If we add delays at the application layer then those delays will be the same all along the Tor circuit. So from the point of view of an adversary doing a traffic confirmation attack against Tor, the delays are irrelevant: the adversary sees the same pattern of delays at both ends of the circuit, so the ends are still correlated with each other.
To decorrelate the traffic entering Tor from the traffic leaving Tor we need to delay the traffic at each hop. Ideally we'd go further than that and decouple high-latency traffic from circuits, so that traffic could enter Tor on one circuit and leave on another circuit, long after the first circuit was closed. But that's a much harder problem than adding a delay at each hop, I think.
Done right, this could provide a large anonymity set for the high-latency users and improve the traffic analysis resistance of Tor for the low-latency users at the same time, by providing a pool of latency-insensitive traffic to smooth out the bursty low-latency traffic between relays.
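A toy simulation makes the correlation argument concrete (the distributions and parameters here are invented for illustration, not drawn from Tor):

```python
import random
import statistics

def corr(xs, ys):
    # Pearson correlation between two equal-length sequences.
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) * sx * sy)

random.seed(1)
# Bursty inter-cell gaps as seen where traffic enters the circuit.
entry_gaps = [random.expovariate(1.0) for _ in range(5000)]

# Application-layer delay: the same pattern of delays appears at both
# ends, so the gaps are unchanged and the ends stay perfectly correlated.
exit_app_delay = entry_gaps[:]

# Independent per-hop delay: each gap is perturbed by fresh noise at the
# relay, so the entry and exit patterns decorrelate.
exit_per_hop = [g + random.expovariate(0.5) for g in entry_gaps]

print(corr(entry_gaps, exit_app_delay))  # close to 1.0: still correlated
print(corr(entry_gaps, exit_per_hop))    # well below 1
```

The delay added at the relay has to be random and independent per hop; a constant or end-to-end delay leaves the pattern intact.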
I think this really makes the case for why a native Tor-based messaging channel/layer/link/substrate should be implemented.
Great! Maybe we should move this discussion to the thread on tor-dev that Mansour Moufid started recently?
Cheers, Michael
Michael Rogers michael@briarproject.org writes:
On 09/11/14 18:33, Mansour Moufid wrote:
Has there been research on integrating high-latency message delivery protocols with the hidden service model of location hiding? The SecureDrop or Pynchon Gate protocols sound like good starting points. I would love to participate, and encourage everyone to start in this direction (in your copious free time ;).
This issue has just come up on the guardian-dev list, so we're moving the conversation over here. Context quoted below.
Yes, integrating low-latency with high-latency anonymity is a very interesting problem. Unfortunately, I haven't had any time to think about it.
For people who want to think about it, there is the "Blending different latency traffic with alpha-mixing" paper. Roger mentioned that one of the big challenges of making the paper usable with Tor is switching from the message-based approach to a stream-based one.
Other potential papers are "Stop-and-Go-MIX" by Kesdogan et al. and "Garbled Routing (GR): A generic framework towards unification of anonymous communication systems" by Madani et al. But I haven't looked into them at all...
Another interesting question IMO, is whether we can build wrappers so that we can do WWW/XMPP/etc. over higher-latency anonymity systems.
If anyone wants to do some research and present some preliminary results or ideas here, it would be awesome. We can also turn it into a blog post for blog.torproject.org if it's good!
On 25/11/14 12:45, George Kadianakis wrote:
Yes, integrating low-latency with high-latency anonymity is a very interesting problem. Unfortunately, I haven't had any time to think about it.
For people who want to think about it, there is the "Blending different latency traffic with alpha-mixing" paper. Roger mentioned that one of the big challenges of making the paper usable with Tor is switching from the message-based approach to a stream-based one.
Other potential papers are "Stop-and-Go-MIX" by Kesdogan et al. and "Garbled Routing (GR): A generic framework towards unification of anonymous communication systems" by Madani et al. But I haven't looked into them at all...
Two of these papers were also mentioned in the guardian-dev thread, so I guess we're thinking along similar lines.
Alpha mixes and stop-and-go mixes are message-oriented, which as you said raises the question of how to integrate them into Tor. Judging by the abstract of the garbled routing paper (paywalled), it's a hybrid design combining message-oriented and circuit-oriented features. I think there might also be scope for circuit-oriented designs with higher latency than Tor currently provides, which might fit more easily into the Tor architecture than message-oriented or hybrid designs.
A circuit-oriented design would aim to prevent an observer from matching the circuits entering a relay with the circuits leaving the relay. In other words it would prevent traffic confirmation at each hop, and thus also end-to-end.
At least four characteristics can be used to match circuits entering and leaving a relay: start time, end time, total traffic volume and traffic timing. The design would need to provide ways to mix a circuit with other circuits with respect to each characteristic.
The current architecture allows start and end times to be mixed by pausing at each hop while building or tearing down a circuit. However, each hop of a given circuit must start earlier and end later than the next hop.
Traffic volumes can also be mixed by discarding padding at each hop, but each hop must carry at least as much traffic as the next hop (or vice versa for traffic travelling back towards the initiator). This is analogous to the problem of messages shrinking at each hop of a cypherpunk mix network, as padding is removed but not added.
There's currently no way to conceal traffic timing: each relay forwards cells as soon as it can.
Here's a crude sketch of a design that allows all four characteristics to be mixed, with fewer constraints than the current architecture. Each hop of a circuit must start earlier than the next hop, but it can end earlier or later, carry more or less traffic, and have different traffic timing.
The basic idea is that the initiator chooses a traffic pattern for each direction of each hop. The traffic pattern is described by a distribution of inter-cell delays. Each relay sends the specified traffic pattern regardless of whether it has any data to send, and regardless of what happens at other hops.
Whenever a relay forwards a data cell along a circuit, it picks a delay from the specified distribution, adds it to the current time, and writes the result on the circuit's queue. When the scheduler round-robins over circuits, it skips any circuits with future times written on them. If a circuit's time has come, the relay sends the first queued data cell if there is one; if not, it sends a single-hop padding cell.
Flow control works end-to-end in the same way as any other Tor circuit: single-hop padding cells aren't included in the circuit's flow control window.
When tearing down the circuit, the initiator tells each relay how long to continue sending the specified traffic pattern in each direction. Thus each hop may stop sending traffic before or after the next hop.
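The per-relay scheduling rule above can be sketched roughly as follows (a minimal illustration, not Tor code; the class and field names are invented for this sketch):

```python
import random
import time

class Circuit:
    """Toy model of one circuit at a relay (names invented for this sketch)."""
    def __init__(self, mean_delay):
        self.mean_delay = mean_delay       # parameter of the initiator-chosen distribution
        self.queue = []                    # data cells waiting to be forwarded
        self.next_send = time.monotonic()  # time written on the circuit's queue

    def pick_delay(self):
        # An exponential inter-cell delay: one candidate single-parameter distribution.
        return random.expovariate(1.0 / self.mean_delay)

def schedule(circuits, now):
    """One round-robin pass: send on every circuit whose time has come."""
    sent = []
    for c in circuits:
        if c.next_send > now:
            continue                          # skip circuits with future times written on them
        if c.queue:
            sent.append((c, c.queue.pop(0)))  # first queued data cell
        else:
            sent.append((c, "PADDING"))       # single-hop padding cell
        c.next_send = now + c.pick_delay()    # pattern is independent of data availability
    return sent
```

Because `next_send` advances whether or not data was available, the link's timing follows the chosen distribution rather than the wheat.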
Even this crude design has multiple parameters, so its anonymity properties may not be easy to reason about. Even if we restrict traffic patterns to a single-parameter distribution such as the exponential, we also have to consider the pause time at each hop while building circuits and the 'hangover time' at each hop while tearing them down. But I think we can mine the mix literature for some ideas to apply - and probably some attacks against this first attempt at a design as well.
Cheers, Michael
I’m interested in helping out with this, mostly because we’ll want it for Pond: https://pond.imperialviolet.org/
I’ve read the alpha-mixing paper, but not the others, so I’ll check ’em out.
Jeff
On Tue, Dec 9, 2014 at 4:40 PM, Michael Rogers michael@briarproject.org wrote:
At least four characteristics can be used to match circuits entering and leaving a relay: start time, end time, total traffic volume and traffic timing. The design would need to provide ways to mix a circuit with other circuits with respect to each characteristic.
Traffic volumes can also be mixed by discarding padding at each hop,
There's currently no way to conceal traffic timing - each relay forwards cells as soon as it can.
Guilty of tl;dr here, but given the easily trackable characteristics listed above, I'm not seeing a benefit to anything other than filling all links with chaff, which then hides all of those parameters but one...
Then as far as passive observation goes, you're left with only packet clocking, for which you can use your bucket queueing randomizers to defeat at no more than the cost of their average jitter width X the number of hops.
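grarpamp's latency estimate (average jitter width times the number of hops as a ceiling) is easy to sanity-check numerically; the hop count and jitter bound below are invented for illustration:

```python
import random

hops = 3        # illustrative circuit length
width = 0.050   # seconds: upper bound of the per-hop random jitter

# Each hop adds an independent uniform delay in [0, width], so the expected
# extra latency is hops * width / 2, and the hard ceiling is hops * width.
expected = hops * width / 2

random.seed(2)
samples = [sum(random.uniform(0, width) for _ in range(hops))
           for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(f"expected {expected * 1e3:.1f} ms, simulated mean {mean * 1e3:.1f} ms")
```

So with, say, a 50 ms jitter bound over three hops, the cost is roughly 75 ms on average and never more than 150 ms, which is the kind of budget this trade-off would have to fit in.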
http://en.wikipedia.org/wiki/Asynchronous_Transfer_Mode
Granted, it does require a 'get what you pay for' mentality of purchasing or giving a dedicated pipe of bandwidth, but that already exists today among relays. And if the return is more resistance to a global passive adversary (GPA), who can currently do something as simple as counting bytes over time and pumping your hidden service while being its guard, well then, so what?
It's also conceivable that nodes could choose to carry a 'chaff capable' flag. And that other nodes, and clients, could choose to select and participate in that when building links.
Note, there's no change to the low-latency circuit based expectations here at all, other than ability of CPU to process chaff operations. Though that might be easier if using a UDP or packet switched model.
I'm sure someone has already explored chaff in papers. Is it valid, as opposed to just requiring lots of new work to implement?
I can't see any other way to have both low latency and hide the talkers other than filling bandwidth committed links with talkers. And when you want to talk, just fill in your voice in place of the fake ones you'd otherwise send. That seems good against the GPA above.
The other question is how, similar to Sybil attacks, chaff holds up when an ACTIVE adversary runs nodes and can therefore discriminate wheat from chaff on those nodes, and perhaps communicate that among their nodes on the backend, reducing things back to the byte-counting attack above. I'm thinking it's not any worse than today's Sybil and/or untrusted nodes, and thus not a factor in deciding whether to implement chaff.
On 10/12/14 00:14, grarpamp wrote:
Guilty of tldr here, yet similarly, with the easily trackable characteristics firstly above, I'm not seeing a benefit to anything other than filling all links with chaff which then hides all those parameters but one...
I'm not opposed to this approach, but "filling all links" isn't as simple as it sounds:
* Which links should carry chaff? We can't send chaff between every pair of relays, especially as the network grows.
* How much chaff should we send on each link? At present relays don't divide their bandwidth between links ahead of time - they allocate bandwidth where it's needed. The bandwidth allocated to each link could be a smoothed function of the load - but then we need to make sure that instantaneous changes in the function's output don't leak any information about instantaneous changes in its input. That isn't trivial.
* Is it acceptable to add any delay at all to low-latency traffic? My assumption is no - people already complain about Tor being slow for interactive protocols. So we'd still need to mark certain circuits as high-latency, and only those circuits would benefit from chaff.
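The smoothing concern in the second point can be sketched as an exponential moving average over the observed load (a hypothetical illustration; all names are invented, and note that any dependence on load still leaks some information about that load):

```python
class LinkAllocator:
    """Hypothetical smoothed per-link bandwidth allocator.

    The allocated rate is an exponential moving average of the observed
    load, so instantaneous changes in load are damped before they show
    up in the amount of wheat + chaff on the wire."""
    def __init__(self, alpha=0.01, floor=10_000):
        self.alpha = alpha    # smoothing factor: smaller reacts slower, leaks less
        self.floor = floor    # minimum rate (bytes/s) so an idle link stays filled
        self.rate = float(floor)

    def update(self, observed_load):
        """Feed one load sample; return the rate to fill with wheat + chaff."""
        self.rate = (1 - self.alpha) * self.rate + self.alpha * observed_load
        return max(self.rate, self.floor)
```

Making the smoothing slow enough to hide bursts, while fast enough to carry them, is exactly the non-trivial part.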
Once you fill in those details, the chaffing approach looks pretty similar to the approach I suggested: the relay treats low-latency circuits in the same way as now, allocates some bandwidth to high-latency traffic, and uses that bandwidth for data or padding depending on whether it has any data to send.
I can't see any other way to have both low latency and hide the talkers other than filling bandwidth committed links with talkers. And when you want to talk, just fill in your voice in place of the fake ones you'd otherwise send. That seems good against the GPA above.
The alternative to filling the link with talkers is mixing the talkers, i.e. preventing the adversary from matching the circuits entering a relay with the circuits leaving it. But as I said above, when you get down to the details these approaches start to look similar - perhaps they're really just different ways of describing the same thing.
Some papers on this topic that I haven't read for a while:
http://acsp.ece.cornell.edu/papers/VenkTong_Anonymous_08SSP.pdf
http://www.csl.mtu.edu/cs6461/www/Reading/08/Wang-ccs08.pdf
https://www.cosic.esat.kuleuven.be/publications/article-1230.pdf
Cheers, Michael
On Thu, Dec 11, 2014 at 8:26 AM, Michael Rogers michael@briarproject.org wrote:
- Which links should carry chaff?
First you need it to cover the communicating endpoints' entry links at the edges. But if your endpoints aren't generating enough traffic to saturate the core, or even worse if there aren't enough talking clients to multiplex each other through shared entry points densely enough, that's bad. So all links that any node has established to any other nodes seem to need to be chaffed.
We can't send chaff between every pair of relays, especially as the network grows.
Right, a full mesh is not necessary, only chaff the established links. Whether that includes today's idle prebuilt paths, or only upon actual user traffic, no idea yet.
- How much chaff should we send on each link?
Today, all nodes have an idea that the most bw you're ever going to get out of the system anyways is up to your pipe capacity, whether you let it free run, or you set a limit... all within your personal or purchased limits. So you just decide how much you can bear, set your committed rate, and fill it up.
At present, relays don't divide their bandwidth between links ahead of time - they allocate bandwidth where it's needed. The bandwidth allocated to each link could be a smoothed function of the load
This sounds like picking some chaff ratio (even a ratio function) and scaling up the overall link bandwidth as needed to carry enough overall wheat within that. Not sure if that works to cover the 'first, entries, GPA watching there' point above. It seems too bursty and user-session-driven at those places, or the ratio/scale function is too slow to accept fast wheat demand. So you need a constant flow at a committed access rate (CAR) to hide all styles of wheat demand in. I scrap the ethernet thinking and recall clocked ATM. You interleave your wheat in place of the base fixed-rate chaff of the link as needed. You negotiate the bandwidth at the switch (node).
but then we need to make sure that instantaneous changes in the function's output don't leak any information about instantaneous changes in its input.
This is the point of filling the links fulltime, you don't see any such ripples. (Maybe instantaneous pressure gets translated into a new domain of some nominal random masking jitter below. Which may still be a bit ethernet-ish.)
That isn't trivial.
What needs work is the bandwidth negotiation protocol between nodes. You've set the CAR on your box; now how do you divide it among established links? Does it reference other CARs in the consensus? Does it accept subtractive link requests until full? Does it span just one hop or the full path? Is it part of each node's first-hop link extension relative to itself as a circuit builds its way through? Are there PVCs and SVCs, and then the recalculation as nodes and paths come and go? How far in does the wheat/chaff recognition go? Do you need to drop any node that doesn't keep up RX/TX at the negotiated rate, as then doing something nefarious?
- Is it acceptable to add any delay at all to low-latency traffic? My assumption is no - people already complain about Tor being slow for interactive protocols.
No fixed delays, because yes, it's annoyingly additive, and you probably can't clock packets onto the wire from userland Tor precisely enough for [1]. Recall the model... a serial bucket stream. Random jitter within an upper bound is different from a fixed delay. Good question. [1: There was something I was trying to mask or prevent with jitter; I'll try to remember.]
So we'd still need to mark certain circuits as high-latency, and only those circuits would benefit from chaff.
High latency feels more like a class of service... not tied to selective chaffing on/off. I'll try to reread these parts from you.
Once you fill in those details, the chaffing approach looks pretty similar to the approach I suggested: the relay treats low-latency circuits in the same way as now, allocates some bandwidth to high-latency traffic, and uses that bandwidth for data or padding depending on whether it has any data to send.
I can't see any other way to have both low latency and hide the talkers other than filling bandwidth committed links with talkers. And when you want to talk, just fill in your voice in place of the fake ones you'd otherwise send. That seems good against the GPA above.
The alternative to filling the link with talkers is mixing the talkers, i.e. preventing the adversary from matching the circuits entering a relay with the circuits leaving it.
Well, there's the packet-switched routing approach (mixed talking in that regard) vs. circuit-switched. You can do chaff with either. But as noted way at the top, when you are the last hop and/or the net's not busy, you've no one to mix with, and must flood chaff.
But as I said above, when you get down to the details these approaches start to look similar - perhaps they're really just different ways of describing the same thing.
Perhaps.
Some papers on this topic that I haven't read for a while:
http://acsp.ece.cornell.edu/papers/VenkTong_Anonymous_08SSP.pdf
http://www.csl.mtu.edu/cs6461/www/Reading/08/Wang-ccs08.pdf
https://www.cosic.esat.kuleuven.be/publications/article-1230.pdf
I'll look to see if what I'm thinking is described.
Resurrecting a thread from last year...
On 11/12/14 16:05, grarpamp wrote:
On Thu, Dec 11, 2014 at 8:26 AM, Michael Rogers michael@briarproject.org wrote:
- Which links should carry chaff?
First you need it to cover the communicating endpoints' entry links at the edges. But if your endpoints aren't generating enough traffic to saturate the core, or even worse if there aren't enough talking clients to multiplex each other through shared entry points densely enough, that's bad. So all links that any node has established to any other nodes seem to need to be chaffed.
Are you proposing that chaff would be sent end-to-end along circuits? (That's what "generating enough traffic to saturate the core" seems to imply.) If so, that would raise a number of problems:
1. Chaff would start and end at the same time for all hops of a given circuit.
2. Each hop of a given circuit would carry at least as much traffic away from the initiator as the next hop, and at most as much traffic towards the initiator as the next hop (where traffic = wheat + chaff in this context).
3. A delay introduced at one point in a circuit (e.g. by inducing congestion) would be visible along the rest of the circuit, potentially revealing the path taken by the circuit.
The mechanism I proposed doesn't suffer from these problems.
- How much chaff should we send on each link?
Today, all nodes have an idea that the most bw you're ever going to get out of the system anyways is up to your pipe capacity, whether you let it free run, or you set a limit... all within your personal or purchased limits. So you just decide how much you can bear, set your committed rate, and fill it up.
That tells you how much chaff to send in total, but not how much to send on each link.
At present, relays don't divide their bandwidth between links ahead of time - they allocate bandwidth where it's needed. The bandwidth allocated to each link could be a smoothed function of the load
This sounds like picking some chaff ratio (even a ratio function) and scaling up the overall link bandwidth as needed to carry enough overall wheat within that. Not sure if that works to cover the 'first, entries, GPA watching there' point above. It seems too bursty and user-session-driven at those places, or the ratio/scale function is too slow to accept fast wheat demand. So you need a constant flow at a committed access rate (CAR) to hide all styles of wheat demand in. I scrap the ethernet thinking and recall clocked ATM. You interleave your wheat in place of the base fixed-rate chaff of the link as needed. You negotiate the bandwidth at the switch (node).
Here it sounds like you're proposing hop-by-hop chaff, not end-to-end?
but then we need to make sure that instantaneous changes in the function's output don't leak any information about instantaneous changes in its input.
This is the point of filling the links fulltime, you don't see any such ripples. (Maybe instantaneous pressure gets translated into a new domain of some nominal random masking jitter below. Which may still be a bit ethernet-ish.)
A relay can't send chaff to every other relay, so you can't fill all the links fulltime. The amount of wheat+chaff on each link must change in response to the amount of wheat.
The question is, do those changes only occur when circuits are opened and closed - in which case the endpoint must ask each relay to allocate some bandwidth for the lifetime of the circuit, as in my proposal - or do changes occur in response to changes in the amount of wheat, in which case we would need to find a function that allocates bandwidth to each link in response to the amount of wheat, without leaking too much information about the amount of wheat?
That isn't trivial.
What needs work is the bandwidth negotiation protocol between nodes. You've set the CAR on your box; now how do you divide it among established links? Does it reference other CARs in the consensus? Does it accept subtractive link requests until full? Does it span just one hop or the full path? Is it part of each node's first-hop link extension relative to itself as a circuit builds its way through? Are there PVCs and SVCs, and then the recalculation as nodes and paths come and go? How far in does the wheat/chaff recognition go? Do you need to drop any node that doesn't keep up RX/TX at the negotiated rate, as then doing something nefarious?
It seems like there are a lot of unanswered questions here. I'm not saying the idea I proposed is perfect, but it does avoid all these questions.
- Is it acceptable to add any delay at all to low-latency traffic? My assumption is no - people already complain about Tor being slow for interactive protocols.
No fixed delays, because yes, it's annoyingly additive, and you probably can't clock packets onto the wire from userland Tor precisely enough for [1]. Recall the model... a serial bucket stream. Random jitter within an upper bound is different from a fixed delay. Good question. [1: There was something I was trying to mask or prevent with jitter; I'll try to remember.]
If I understand right, you were using jitter to disguise the fine-grained timing of packets, so that packets arriving at relay couldn't easily be matched with packets leaving it. Packet departures would be triggered by a clock rather than by packet arrivals.
My suggestion was similar in that respect: packet departure times would be drawn from a random distribution.
Either way, decoupling the arrival and departure times necessarily involves delaying packets, which isn't acceptable for Tor's existing low-latency use cases. So we're looking at two classes of traffic here: one with lower latency, and the other with better protection against traffic confirmation.
So we'd still need to mark certain circuits as high-latency, and only those circuits would benefit from chaff.
High latency feels more like a class of service... not tied to selective chaffing on/off. I'll try to reread these parts from you.
Thanks, class of service is a better way to put it.
Cheers, Michael
On Thu, Jan 1, 2015 at 9:24 AM, Michael Rogers michael@briarproject.org wrote:
Are you proposing that chaff would be sent end-to-end along circuits?
No. That wouldn't work. Crypto of circuits would blind middles to any end to end padding, and they would not be able to contribute control back to you, let alone anonymously without knowing you were using them. Circuits also need to move around and share and cross links randomly on demand, further making intermediate fill control an impossible task. There'd also be regions in the mesh with too little/much fill, and unfilled capacity mismatches at the endpoints.
Here it sounds like you're proposing hop-by-hop chaff, not end-to-end?
"Hop by hop" would be different in that, yes, it follows the same path as your "end to end" onion encrypted private data circuit, but is not encapsulated within it, rather negotiated in some other band along the way. That still seems quite difficult to design and manage.
I'm proposing each node fill in a star pattern with other nodes one hop away.
(That's what "generating enough traffic to saturate the core" seems to imply.)
No. If you have a 10k node mesh and too few are talking across it, a quiet day, it becomes easier to determine paths and endpoint pairs of those who are.
That tells you how much chaff to send in total, but not how much to send on each link.
No. Buy or allocate 1Mbit of internet for your tor. You now have to fill that 1Mbit with tor. So find enough nodes from consensus who also have a need to fill some part of their own quota and start passing fill with them.
You'd have to think about whether to use one of them as your own guard or make another filled connection to your real guard.
A relay can't send chaff to every other relay
I previously said you are not opening connections to every relay in the consensus full matrix, only enough to both have enough connections (say three), and fill your dedicated rate.
so you can't fill all the links fulltime.
You'd have to think about how people with 64k vs 128k vs 512k vs 1Mbit vs 1Gbit could spread their needs and abilities for chaff around the whole cloud (to make sure that at least every edge client is full all the time). That's in control, or maybe even the consensus. Are there 10 1Mbit nodes around to fill 1 10Mbit node? Do nodes have to scale their commitments dynamically somehow in a way that does not follow the path of real wheat? I don't know yet.
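As a toy illustration of spreading fill quotas around (a made-up greedy allocator, not a proposal; real negotiation would be a two-way protocol):

```python
def allocate_fill(my_quota_kbit, peers):
    """Greedily commit chaff bandwidth against peers' spare quotas.
    `peers` maps peer name -> spare kbit/s it still needs filled.
    Returns ({peer: committed_kbit}, leftover_kbit). Illustrative only."""
    commitments = {}
    remaining = my_quota_kbit
    # Fill against the peers with the most spare capacity first.
    for name, spare in sorted(peers.items(), key=lambda kv: -kv[1]):
        if remaining <= 0:
            break
        take = min(spare, remaining)
        if take > 0:
            commitments[name] = take
            remaining -= take
    return commitments, remaining
```

For the "10 1Mbit nodes filling 1 10Mbit node" question above: `allocate_fill(10000, {...ten 1000-kbit peers...})` leaves no quota unfilled, but any mismatch in total supply and demand shows up as nonzero leftover, which is exactly the dynamic-commitment problem.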
The amount of wheat+chaff on each link must change in response to the amount of wheat.
Of course, wheat occupies space in a stream of buckets you'd otherwise be filling with chaff.
The question is, do those changes only occur when circuits are opened and closed - in which case the endpoint must ask each relay to allocate some bandwidth for the lifetime of the circuit
No, I don't think it should depend on circuit flux, that's way too complex. Only upon node flux. That's much more stable and just one easy hop away.
or do changes occur in response to changes in the amount of wheat, in which case we would need to find a function that allocates bandwidth to each link in response to the amount of wheat, without leaking too much information about the amount of wheat?
If the node capacities of the variety network three paragraphs up showed the need for a leak free dynamic commitment function, then yes. If not, then all you need is to know how many out of 100 chaff buckets you need to swap out for wheat based on demand of circuits you know are going through you.
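The "swap chaff buckets for wheat" case is simple enough to sketch (frame size and names invented):

```python
from collections import deque

BUCKETS_PER_FRAME = 100  # fixed link rate: 100 cells per interval

def build_frame(wheat_queue):
    """Emit a constant-size frame: queued wheat cells first, every
    remaining slot padded with chaff. The frame size never varies,
    so an observer learns nothing from the link rate."""
    frame = []
    while wheat_queue and len(frame) < BUCKETS_PER_FRAME:
        frame.append(("wheat", wheat_queue.popleft()))
    while len(frame) < BUCKETS_PER_FRAME:
        frame.append(("chaff", None))
    return frame
```

Wheat throughput is capped at the fixed frame rate, which is the point: demand above it queues, demand below it is hidden by chaff.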
It seems like there are a lot of unanswered questions here.
Yes there are :) I'm just putting out potential elements I figured might fit the puzzle if people actually sat down and tried to solve link padding in a network such as Tor. (Unfortunately I don't tend to sit very long.)
If I understand right, you were using jitter to disguise the
I think that was it. It might, if needed, also hide the time it takes your CPU to recalc the wheat ratio in response to demand of a new circuit.
Either way, decoupling the arrival and departure times necessarily involves delaying packets, which isn't acceptable for Tor's existing low-latency use cases.
Disagree as before. The average round trip time between a tor client and clearnet via an exit is ~1000ms plus. Adding 8 hops x 5ms avg = 80ms max of jitter isn't going to turn a network into a store and forward yawn session.
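Reading "5ms avg" as uniform jitter in [0, 10ms] per hop (an assumption; the distribution isn't specified above), the latency budget works out to:

```python
def jitter_budget(hops, max_per_hop_ms):
    """Worst-case and expected total added latency for uniform
    per-hop jitter drawn from [0, max_per_hop_ms], hops independent."""
    worst = hops * max_per_hop_ms
    expected = hops * max_per_hop_ms / 2.0
    return worst, expected
```

So 8 hops at 10ms max each gives 80ms worst case and about 40ms expected, small against a ~1000ms round trip.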
So we're looking at two classes of traffic here: one with lower latency, and the other with better protection against traffic confirmation.
Whether or not such jitter is a useful defense needs to be evaluated. It may be possible to path through relays in the consensus that do add jitter to fill, or through ones that don't add jitter, in order to get minimum latency for irc/voice/video. A SocksPort setting perhaps.
I'll try to reread these parts from you.
When I'm sitting down :) I've got 4 mails of same subject from you on tor-dev, reference therein to guardian-dev. I think I got to all your questions though.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
On 04/01/15 09:45, grarpamp wrote:
That tells you how much chaff to send in total, but not how much to send on each link.
No. Buy or allocate 1Mbit of internet for your tor. You now have to fill that 1Mbit with tor. So find enough nodes from consensus who also have a need to fill some part of their own quota and start passing fill with them.
You'd have to think about whether to use one of them as your own guard or make another filled connection to your real guard.
To be clear, are you suggesting that each relay and each client should pick some relays from the consensus and exchange chaff with them, and clients should also exchange chaff with their guards?
If that's what you're suggesting, then what happens if a client wants to extend a circuit from relay A to relay B, but A and B aren't exchanging chaff with each other?
The amount of wheat+chaff on each link must change in response to the amount of wheat.
Of course, wheat occupies space in a stream of buckets you'd otherwise be filling with chaff.
Are you saying that wheat can only be sent over links that have been chosen to carry chaff?
Cheers, Michael
On Mon, Jan 5, 2015 at 2:17 PM, Michael Rogers michael@briarproject.org wrote:
To be clear, are you suggesting that each relay and each client should pick some relays from the consensus and exchange chaff with them, and clients should also exchange chaff with their guards?
I'm saying you definitely have to have your guard link[s] filled with a steady stream of ATM-like buckets, where you fill with your wheat as needed but can achieve no more wheat throughput than the fixed rate of the bucket stream. And so does every other edge node to a depth of at least one hop inside the net.

Beyond that... because your blob might be the only one passing through your guard on a quiet day/path (whose far side to the middle node is also observed), every node on the net has to have its links filled. And since every link has to be filled, you can just pick nodes out of the consensus to peer with for additional fill purposes if negotiations with your guards don't result in you filling the CAR of the overall tor pipe on your machine.

It'll help if you think of links to your peers not just as potential paths for building circuits, but as bandwidth commitments before that. Since every node has its own first hops and is doing their own fill there, you don't have to worry about what they do, just cover your own and donate any excess capacity you have to whatever node on the net comes asking for help to fill theirs.
If you don't fill all the links, your downloading, page loading, and keystroking are going to be passively observable as those blobs enter, transit, and leave the net.
If that's what you're suggesting, then what happens if a client wants to extend a circuit from relay A to relay B, but A and B aren't exchanging chaff with each other?
This doesn't happen. You have a lower layer of nodes doing fill guided by knowledge of who their own guards are (in this model relays must have notions of their own guards / first hops). Circuits are then strung unaffected and unaware over that as usual. Relays know the difference between their own action doing p2p for fill, and non fill (circuit) data trying to leave them... so they can make room in their existing first hop links, or negotiate new ones with where that data is trying to go. Yes, if for some reason the relay could not negotiate its underlying first hop bandwidth for that, the circuit extension itself would die (perhaps EBUSY from the circuit side) and the client would try another path.
Are you saying that wheat can only be sent over links that have been chosen to carry chaff?
The point is to hide the fact that you're sending bytes (wheat), at what time, in what amount, and with what properties, that might be identifiably observed again, tied to your destination on the other side of the net. So yes, of course, by definition of link filling.
Sorry I've no graphics or complete scheme yet, for that matter this could all be rubbish.
If anyone knows of networks (whether active, defunct or discredited) that have used link filling, I'd like a reference. Someone out there has to have at least coded one for fun.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
On 08/01/15 06:03, grarpamp wrote:
If that's what you're suggesting, then what happens if a client wants to extend a circuit from relay A to relay B, but A and B aren't exchanging chaff with each other?
This doesn't happen. You have a lower layer of nodes doing fill guided by knowledge of who their own guards are (in this model relays must have notions of their own guards / first hops). Circuits are then strung unaffected and unaware over that as usual. Relays know the difference between their own action doing p2p for fill, and non fill (circuit) data trying to leave them... so they can make room in their existing first hop links, or negotiate new ones with where that data is trying to go.
Thanks for the explanation.
If relays A and B negotiate a new link between them when a client wants to extend a circuit from A to B, then A and B must each subtract some bandwidth from their existing links to allocate to the new link (since they're already using their full bandwidth allowance, by design).
I suspect the details of how that reallocation is done will be important for anonymity. If bandwidth is subtracted from existing links without taking into account how much wheat the existing links are carrying, then any circuits using those links will feel the squeeze - the adversary will be able to tell when a relay's opening a new link by building a circuit through the relay, filling the circuit with wheat, and waiting for its throughput to get squeezed.
On the other hand, if bandwidth is subtracted from existing links in such a way that existing wheat is never affected - in other words, if you only reallocate the spare bandwidth - then it's possible for an adversary observing a relay to find out how much wheat each link is carrying by asking the relay to negotiate new links until it says no because it can't reallocate any more spare bandwidth, at which point any links that weren't requested by the adversary are now carrying nothing but wheat.
If anyone knows of networks (whether active, defunct or discredited) that have used link filling, I'd like a reference. Someone out there has to have at least coded one for fun.
PipeNet was a proposal for an onion-routing-like network with constant-rate traffic: http://cypherpunks.venona.com/date/1998/01/msg00878.html
Tarzan was an onion-routing-like network in which each relay exchanged constant-rate traffic with a fixed set of other relays called its mimics, and circuits could only be constructed over links between mimics: http://pdos.csail.mit.edu/tarzan/docs/tarzan-ccs02.pdf http://pdos.csail.mit.edu/tarzan/docs/tarzan-thesis.pdf
George Danezis looked at the anonymity properties of paths chosen from a restricted graph rather than a complete graph (this was in the context of mix networks, but the findings may also be relevant to onion routing): http://www.freehaven.net/anonbib/cache/danezis:pet2003.pdf
Cheers, Michael
On Mon, Jan 19, 2015 at 6:08 PM, Michael Rogers michael@briarproject.org wrote:
Thanks for the explanation.
I think I have one more comment to address in this subsection...
If anyone knows of networks (whether active, defunct or discredited) that have used link filling, I'd like a reference. Someone out there has to have at least coded one for fun.
[list of relevant papers]
Chakravarty et al of Aug 2013 were going to go back (page 13) and write and test dummy traffic as a defense to this passive observation. Being recent, some followup and sharing there may be opportune... Netflow Traffic Analysis of Anonymity Networks http://hdl.handle.net/10022/AC:P:21763
Michael Rogers:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
On 25/11/14 12:45, George Kadianakis wrote:
Yes, integrating low-latency with high-latency anonymity is a very interesting problem. Unfortunately, I haven't had any time to think about it.
For people who want to think about it there is the "Blending different latency traffic with alpha-mixing" paper. Roger mentioned that one of the big challenges of making the paper usable with Tor, is switching from the message-based approach to stream-based.
Other potential papers are "Stop-and-Go-MIX" by Kesdogan et al. and "Garbled Routing (GR): A generic framework towards unification of anonymous communication systems" by Madani et al. But I haven't looked into them at all...
Two of these papers were also mentioned in the guardian-dev thread, so I guess we're thinking along similar lines.
Alpha mixes and stop-and-go mixes are message-oriented, which as you said raises the question of how to integrate them into Tor. Judging by the abstract of the garbled routing paper (paywalled), it's a hybrid design combining message-oriented and circuit-oriented features. I think there might also be scope for circuit-oriented designs with higher latency than Tor currently provides, which might fit more easily into the Tor architecture than message-oriented or hybrid designs.
A circuit-oriented design would aim to prevent an observer from matching the circuits entering a relay with the circuits leaving the relay. In other words it would prevent traffic confirmation at each hop, and thus also end-to-end.
At least four characteristics can be used to match circuits entering and leaving a relay: start time, end time, total traffic volume and traffic timing. The design would need to provide ways to mix a circuit with other circuits with respect to each characteristic.
The current architecture allows start and end times to be mixed by pausing at each hop while building or tearing down a circuit. However, each hop of a given circuit must start earlier and end later than the next hop.
Traffic volumes can also be mixed by discarding padding at each hop, but each hop must carry at least as much traffic as the next hop (or vice versa for traffic travelling back towards the initiator). This is analogous to the problem of messages shrinking at each hop of a cypherpunk mix network, as padding is removed but not added.
There's currently no way to conceal traffic timing - each relay forwards cells as soon as it can.
Here's a crude sketch of a design that allows all four characteristics to be mixed, with fewer constraints than the current architecture. Each hop of a circuit must start earlier than the next hop, but it can end earlier or later, carry more or less traffic, and have different traffic timing.
The basic idea is that the initiator chooses a traffic pattern for each direction of each hop. The traffic pattern is described by a distribution of inter-cell delays. Each relay sends the specified traffic pattern regardless of whether it has any data to send, and regardless of what happens at other hops.
Whenever a relay forwards a data cell along a circuit, it picks a delay from the specified distribution, adds it to the current time, and writes the result on the circuit's queue. When the scheduler round-robins over circuits, it skips any circuits with future times written on them. If a circuit's time has come, the relay sends the first queued data cell if there is one; if not, it sends a single-hop padding cell.
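In toy Python (hypothetical names, nothing like Tor's actual scheduler), the idea might look like:

```python
import random
from collections import deque

class Circuit:
    """One direction of one circuit at a relay, with an
    initiator-chosen inter-cell delay distribution."""
    def __init__(self, mean_delay_ms, seed=0):
        self.rng = random.Random(seed)
        self.mean = mean_delay_ms
        self.queue = deque()       # queued data cells
        self.next_send_ms = 0.0    # time written on the circuit

def schedule(circuits, now_ms):
    """One round-robin pass: skip circuits whose written time lies in
    the future; otherwise send a queued data cell if there is one,
    else a single-hop padding cell, then draw the next delay."""
    sent = []
    for c in circuits:
        if c.next_send_ms > now_ms:
            continue  # future time written on the circuit: skip it
        cell = c.queue.popleft() if c.queue else "PADDING"
        sent.append(cell)
        c.next_send_ms = now_ms + c.rng.expovariate(1.0 / c.mean)
    return sent
```

The circuit's output pattern depends only on the chosen distribution, never on whether data happens to be queued, which is what hides the traffic timing from an external observer.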
You might also like the "Adaptive Padding" defense: http://freehaven.net/anonbib/cache/ShWa-Timing06.pdf. It implements pretty much what you describe here. It is one of my current low-resource favorites.
Marc Juarez actually has a reference implementation of this defense as a pluggable transport: https://bitbucket.org/mjuarezm/obfsproxy-wfpadtools/
It is a generalization of the Adaptive Padding scheme, and is described here: https://gitweb.torproject.org/user/mikeperry/torspec.git/tree/proposals/idea...
Our goal is to study this as a defense for Website Traffic Fingerprinting, but I also believe it has promise for frustrating E2E correlation.
Unfortunately, hidden services also represent the hardest form of E2E correlation - where the adversary actually gets to control when you connect, and to a large degree what you send (both in terms of amount, and traffic pattern). In this sense, it actually resembles a Global Active Adversary more than a GPA. It may even be harder than a generic GAA, because of the "adversary also decides when you send data" property.
I am unfortunately not optimistic about simple low-overhead padding being successful here. At the very least, this problem will require something more like a congestion-sensitive "always pad if there is spare capacity available to grow the CWIND, but no data". We have an approximation of this defense, too, in the form of the "Basket" pluggable transport: https://lists.torproject.org/pipermail/tor-dev/2014-December/007977.html
If the adversary is *also* capable of Guard node coercion/compromise, then we also want something like "Basket" to pad all the way to the middle node, but to accomplish that, we actually need fair end-to-end flow control for clients (and by extension, a datagram/UDP transport).
(Though also note: We should not need such end-to-end flow control to use Adaptive Padding all the way to the middle node, because we can approximate fairness using statistics and consensus load balancing information).
So the padding approach is a long, long road for hidden services. Luckily, most of it is also stuff we want for other reasons, anyway.
On Thu, 8 Jan 2015 03:25:52 -0800 Mike Perry mikeperry@torproject.org wrote:
I am unfortunately not optimistic about simple low-overhead padding being successful here. At the very least, this problem will require something more like a congestion-sensitive "always pad if there is spare capacity available to grow the CWIND, but no data". We have an approximation of this defense, too, in the form of the "Basket" pluggable transport: https://lists.torproject.org/pipermail/tor-dev/2014-December/007977.html
I believe most of BuFLO's shortcomings documented in Cai, X., Nithyanand, R., Johnson R., "New Approaches to Website Fingerprinting Defenses" 5.A. apply to the currently proposed defense, though some of the problems have been solved via CS-BuFLO/Basket.
CS-BuFLO as implemented in Basket without application assistance (to terminate padding early) has an absolutely gargantuan amount of bandwidth overhead, and the smarter Basket variant that doesn't have stupid amounts of sender side buffering isn't portable (for the same reasons that the new KIST code isn't).
If the adversary is *also* capable of Guard node coercion/compromise, then we also want something like "Basket" to pad all the way to the middle node, but to accomplish that, we actually need fair end-to-end flow control for clients (and by extension, a datagram/UDP transport).
Indeed. This is something I'd consider fun, but is really hard to do correctly, since the CS part ideally should be global across all connections (Basket gets that for free since there's only one TCP connection per guard, and only one guard).
(Though also note: We should not need such end-to-end flow control to use Adaptive Padding all the way to the middle node, because we can approximate fairness using statistics and consensus load balancing information).
None of the schemes I've seen proposed so far fit well into Tor as it is now, due to the fact that multiple circuits are multiplexed over a single channel (that enforces in-order-reliable delivery semantics). HOL blocking is a thing that needs to be considered. Solving this undoubtedly has really interesting anonymity impacts that I haven't given much thought to.
Another issue with all of the padding schemes that I currently don't have a solid suggestion for is how to actually detect malicious peers that don't pad/pad incorrectly.
I know Marc Juarez is going to be systematically evaluating defenses as part of his research work, so perhaps he can provide more insight into algorithm selection.
Regards,
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
On 09/01/15 14:40, Yawning Angel wrote:
I believe most of BuFLO's shortcomings documented in Cai, X., Nithyanand, R., Johnson R., "New Approaches to Website Fingerprinting Defenses" 5.A. apply to the currently proposed defense, though some of the problems have been solved via CS-BuFLO/Basket.
Thanks for the pointer to an excellent paper. The single-hop padding scheme I suggested is closer to CS-BuFLO than BuFLO: it operates over TCP, doesn't inject padding when the TCP connection is congested, and allows the initiator to decide when to close each hop of the circuit (similar to CS-BuFLO's early termination).
CS-BuFLO as implemented in Basket without application assistance (to terminate padding early) has an absolutely gargantuan amount of bandwidth overhead, and the smarter Basket variant that doesn't have stupid amounts of sender side buffering isn't portable (for the same reasons that the new KIST code isn't).
Why do you need stupid amounts of buffering? Bursts of data from the application can be smoothed out by keeping a small buffer and making the application block until there's space in the buffer - as TCP does, for example.
In general I don't see any need for a padding scheme to touch anything below the TLS layer, or buffer any more data at the endpoints or relays than is already buffered.
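A toy model of the small-buffer-with-backpressure idea (capacity and names invented; a real implementation would block the application thread rather than return a flag):

```python
from collections import deque

class SmoothingBuffer:
    """Small bounded buffer: the application is blocked (modelled
    here as write() returning False) when it's full, so bursts are
    smoothed at a constant drain rate without unbounded
    sender-side buffering."""

    def __init__(self, capacity_cells=8):
        self.capacity = capacity_cells
        self.cells = deque()

    def write(self, cell):
        if len(self.cells) >= self.capacity:
            return False  # backpressure: caller must retry later
        self.cells.append(cell)
        return True

    def drain_one(self):
        """Called at the constant link rate; a cell or None."""
        return self.cells.popleft() if self.cells else None
```

This is the same mechanism TCP uses to smooth application bursts into a paced byte stream, just one layer up.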
None of the schemes I've seen proposed so far fit well into Tor as it is now, due to the fact that multiple circuits are multiplexed over a single channel (that enforces in-order-reliable delivery semantics). HOL blocking is a thing that needs to be considered. Solving this undoubtedly has really interesting anonymity impacts that I haven't given much thought to.
I don't see why padding needs to make multiplexing more complicated. We already have logic for multiplexing circuits over a TLS connection. Padding cells don't behave any differently from data cells in terms of head-of-line blocking.
Another issue with all of the padding schemes that I currently don't have a solid suggestion for is how to actually detect malicious peers that don't pad/pad incorrectly.
Is it necessary to detect peers that aren't padding correctly? In adaptive padding, if a relay detects a gap in the incoming traffic along a circuit, it doesn't try to assign blame for the existence of the gap - it just fills it. Likewise for the single-hop padding scheme I suggested: each relay is responsible for normalising its output, not speculating about why its input was or wasn't normalised.
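A toy sketch of that "just fill the gap, don't assign blame" behaviour (threshold and names invented):

```python
MAX_GAP_MS = 10.0  # normalise output: never let a gap exceed this

def normalise(arrival_times_ms, end_ms):
    """Fill any gap in the incoming cell schedule with padding,
    without caring why the gap exists. Returns (time, kind) pairs
    forming the normalised outgoing schedule."""
    out, t = [], 0.0
    arrivals = iter(sorted(arrival_times_ms))
    nxt = next(arrivals, None)
    while t <= end_ms:
        if nxt is not None and nxt <= t + MAX_GAP_MS:
            out.append((nxt, "data"))   # forward the real cell
            t = nxt
            nxt = next(arrivals, None)
        else:
            t += MAX_GAP_MS             # gap detected: pad it
            out.append((t, "padding"))
    return out
```

Whether the upstream peer padded correctly or not, the output never shows a gap longer than the threshold, which is the sense in which each relay only normalises its own output.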
Cheers, Michael
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
On 08/01/15 11:25, Mike Perry wrote:
You might also like the "Adaptive Padding" defense: http://freehaven.net/anonbib/cache/ShWa-Timing06.pdf. It implements pretty much what you describe here. It is one of my current low-resource favorites.
Thanks for the link! I agree that what I proposed is very similar to adaptive padding. The major difference is that with adaptive padding, the padding cells travel to the end of the circuit, whereas with single-hop padding they only travel to the next relay.
There are pros and cons to both approaches. Adaptive padding protects against a malicious relay that manipulates the flow of data along a circuit in order to see whether the flow of data at a colluding downstream relay is affected. With adaptive padding, the first relay downstream from the point of manipulation will inject padding to smooth out the flow, and the colluding downstream relay won't be able to distinguish the injected padding cells from data cells. But this defense doesn't work if the adversary controls the downstream endpoint rather than a downstream relay, as the endpoint can distinguish padding from data.
Single-hop padding doesn't prevent colluding relays from discovering that they belong to the same circuit, but it may provide better protection than adaptive padding against an external adversary, because it makes it possible to have a different start time, end time and traffic rate for each hop of a given circuit. It's also compatible with Tor's existing cell encryption and end-to-end flow control, whereas it's not clear to me that the same's true for adaptive padding.
Marc Juarez actually has a reference implementation of this defense as a pluggable transport: https://bitbucket.org/mjuarezm/obfsproxy-wfpadtools/
It is a generalization of the Adaptive Padding scheme, and is described here: https://gitweb.torproject.org/user/mikeperry/torspec.git/tree/proposals/idea...
Our goal is to study this as a defense for Website Traffic Fingerprinting, but I also believe it has promise for frustrating E2E correlation.
Great! Preventing end-to-end correlation of hidden service traffic is my main goal here.
Unfortunately, hidden services also represent the hardest form of E2E correlation - where the adversary actually gets to control when you connect, and to a large degree what you send (both in terms of amount, and traffic pattern). In this sense, it actually resembles a Global Active Adversary more than a GPA. It may even be harder than a generic GAA, because of the "adversary also decides when you send data" property.
That's a really tough problem, but I think it's also worthwhile to consider the easier problem of preventing E2E correlation when the client and the hidden service are cooperating - for example, a Pond client and server that want unlinkability rather than mutual anonymity.
For that use case we may be able to find a relatively simple, low-overhead solution that doesn't depend on datagram transports, new flow control algorithms, etc.
Cheers, Michael