On Tue, Dec 9, 2014 at 4:40 PM, Michael Rogers michael@briarproject.org wrote:
On 25/11/14 12:45, George Kadianakis wrote:
Yes, integrating low-latency with high-latency anonymity is a very interesting probleml. Unfortunately, I haven't had any time to think about it.
For people who want to think about it there is the "Blending different latency traffic with alpha-mixing" paper. Roger mentioned that one of the big challenges of making the paper usable with Tor, is switching from the message-based approach to stream-based.
Other potential papers are "Stop-and-Go-MIX" by Kesdogan et al. and "Garbled Routing (GR): A generic framework towards unification of anonymous communication systems" by Madani et al. But I haven't looked into them at all...
Two of these papers were also mentioned in the guardian-dev thread, so I guess we're thinking along similar lines.
At least four characteristics can be used to match circuits entering and leaving a relay: start time, end time, total traffic volume and traffic timing. The design would need to provide ways to mix a circuit with other circuits with respect to each characteristic.
Traffic volumes can also be mixed by discarding padding at each hop,
There's currently no way to conceal traffic timing - each relay forwards cells as soon as it can.
Guilty of tldr here, yet similarly, with the easily trackable characteristics firstly above, I'm not seeing a benefit to anything other than filling all links with chaff which then hides all those parameters but one...
Then as far as passive observation goes, you're left with only packet clocking, for which you can use your bucket queueing randomizers to defeat at no more than the cost of their average jitter width X the number of hops.
http://en.wikipedia.org/wiki/Asynchronous_Transfer_Mode
Granted it does require a 'get what you pay for' dedicated pipe of bandwidth [purchasing/giving] mentality, but at least among relays that already exists today, and if that is returned by gaining more resistance to GPA who can currently do something as simple as counting bytes in a time and pumping your hidden services while being their guard, well then, so what?
It's also conceivable that nodes could choose to carry a 'chaff capable' flag. And that other nodes, and clients, could choose to select and participate in that when building links.
Note, there's no change to the low-latency circuit based expectations here at all, other than ability of CPU to process chaff operations. Though that might be easier if using a UDP or packet switched model.
I'm sure someone has explored chaff in papers already. Is it valid? As opposed to just requiring lots of new work to implement.
I can't see any other way to have both low latency and hide the talkers other than filling bandwidth committed links with talkers. And when you want to talk, just fill in your voice in place of the fake ones you'd otherwise send. That seems good against the GPA above.
The other question is how, similar to Sybil, does chaff hold up when an ACTIVE adversary runs nodes and can therefore discriminate wheat/chaff on those nodes and perhaps communicate that among their nodes on the backend to reduce things back to being counting bytes in time above. I'm thinking it's not any worse than today's Sybil and/or untrusted nodes, and thus not a factor in considering implementing chaff.