[tor-talk] Mix, stream, VPN, traffic analysis, overlays, bandwidth buys (re: relay weighting)
grarpamp at gmail.com
Fri Feb 21 10:25:16 UTC 2020
On 2/17/20, Mirimir <mirimir at riseup.net> wrote:
> On 02/17/2020 05:16 AM, Roger Dingledine wrote:
>> But I'll turn it around, and point out that many systems (e.g. most VPNs)
>> are centralized, that is, the number is 100 percent.
> Yes, a VPN service is for sure 100% centralized, regarding ownership and
> management. And more generally, VPN services generally are probably
> about as centralized at the AS level as Tor is, for basically the same
> reasons. For some VPN services, I've found that most servers are
> actually located in a few cities (Nuland, Los Angeles, Prague and
> Vancouver) <https://restoreprivacy.com/virtual-server-locations/>
>> (You might turn it back around and say that VPNs are companies and you
>> have an agreement with them so nothing will go wrong. That's a good
>> point too, though that trust should only go so far. It's not clear to
>> me which one is the shakier argument. :)
A good contract against logging, retaining, sharing,
that is actually upheld, can be quite valuable.
The problem comes in verifying the implementation
of those statements, and in divining when any upholding
will be thrown out the window for higher priorities.
> Well, that's too iffy for me. Which is why I use nested VPN chains. It's
> a crude parody of Tor, for sure. But I can do 6-7 hops with decent
> latency and throughput, using a different VPN service for each hop. Paid
> with multiply mixed Bitcoin, and using dynamically changing paths.
VPN chains provide the same fundamentals as tor does...
circuit based, tunneled, client-to-exit crypto.
In fact, people were chaining things around the net well
before both tor and the rise of vpn services. Tor just made it
easy by wrapping a management engine over a bunch of
nodes, with enhancements coming from also being the
node software itself.
VPNgate makes things easy too. With a new influx of help
from various communities, vpngate could be greatly expanded...
dynamic tunnel pathing, more nodes, even hosted along with
tor, i2p, etc, paid with crypto. Added bonus of UDP and binding
inbound, providing termination from other networks.
Things begin to get interesting at that point as another choice
that has fairly well known properties.
Like tor, surely not a next generation design, but an alternative.
> And then there's
> which is a real thing.
> Although, sadly enough, for now limited to Android and iOS.
It's opensource, so it can get to native Linux and BSD before long.
Might even be able to run it in droid emulator and push traffic through
it that way.
There's a lot of crazy things coming from blockchain land :)
>> It's times like this where I wish the world knew how to do mixing with
>> streams. That is, there is a whole field out there on how to build
>> stronger anonymity designs, based on mix-nets, but nobody knows how to
>> do that safely when users generate flows of messages rather than just
>> a single message.
Can you or anyone link to papers that make a general proof of that?
ie: Not to papers that merely present weaknesses in some specific
proposed approach to date, but to something that proves "doing that"
is not possible. Proof of negative can steer work in other areas.
Safely... against traffic analysis bugaboo.
Stream... generally, not a single packet, but two or more generated
under and belonging to some user app context in a line.
While it's likely not impossible to carve up a transfer,
spray it across some magic random routing cloud,
and reassemble it on the other side in as near to real time as
some torrent sequence reassembly buffer mechanism will allow...
it's definitely not going to happily "stream" like YouTube
unless the cloud has zero packet loss, which it won't,
leading to segment re-requests or a fallback to store-and-forward.
Whether "stream" or "message" or otherwise, a mix still
needs to generate fill so that *PA's cannot simply match
up two endpoint usage patterns.
Given a fill that can defeat that, you can then ride such happy
stream circuits on top of it, within and as part of it, as needed
up to your bandwidth commitment.
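
A minimal sketch of that fill idea, assuming fixed-size cells and a
fixed number of cell slots per tick as the "bandwidth commitment".
The names (CELL_SIZE, SLOTS_PER_TICK, fill_slots) are made up for
illustration and are not from tor or any real mix implementation.
Every tick emits the same number of cells: queued user data takes
slots first, random chaff fills whatever is left.

```python
import os
from collections import deque

CELL_SIZE = 512       # assumed fixed cell size in bytes (hypothetical)
SLOTS_PER_TICK = 4    # assumed bandwidth commitment: cells sent per tick


def fill_slots(queue, slots=SLOTS_PER_TICK, cell_size=CELL_SIZE):
    """Return exactly `slots` cells for one tick.

    Real payloads are taken from `queue` first; remaining slots get
    random chaff. Cells are tagged (is_real, payload) only so the sketch
    can be inspected. On a real link both kinds would be encrypted so
    that an observer cannot tell them apart.
    """
    cells = []
    for _ in range(slots):
        if queue:
            payload = queue.popleft()
            # Assumes the caller already chunked data to <= cell_size.
            if len(payload) > cell_size:
                raise ValueError("payload larger than one cell")
            cells.append((True, payload.ljust(cell_size, b"\x00")))
        else:
            cells.append((False, os.urandom(cell_size)))
    return cells


if __name__ == "__main__":
    q = deque([b"hello", b"world"])
    for tick in range(3):
        out = fill_slots(q)
        real = sum(1 for is_real, _ in out if is_real)
        print(f"tick {tick}: {len(out)} cells, {real} real, {len(out) - real} chaff")
```

The cell count per tick never changes with user activity, which is
the property the fill is after. The cost is that the link burns its
full committed rate whether or not anyone is using it.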
> What about Garlic routing? I know that I2P doesn't yet implement actual
> content mixing. But I've seen the claim that using unidirectional
> connections should allow that. Maybe the key point is that they've been
> saying that for years. Or maybe it's just that they're a small team.
>> Making sure that people
>> who want to contribute a lot of bandwidth can actually do it is really
What percent of relays, relative to both node and operator counts,
pay by the Byte actually transferred, vs for an unmetered bitrate contract?
Same question for end user client non-relay nodes?
Answers even potentially gathered from among all the overlay networks.
Any actual studies there?
>> declare that no relay family should get more than 10% of the total
>> consensus weight for any relay role (guard, exit, etc). By adopting a
>> policy like that, we could accidentally *increase* the total weight that
>> actual bad relays receive, thus providing yet another incentive for
>> attackers to assign their families incorrectly.
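
A toy worked example of that effect, as a sketch only. It assumes the
excess weight taken from capped families is redistributed to everyone
else in proportion to their existing weight, which is an assumption
here, not how the directory authorities actually compute consensus
weights. The function name cap_shares and all the numbers are
hypothetical. Attacker relays are modeled as not declaring a family,
so each one looks like a small independent operator.

```python
def cap_shares(weights, cap=0.10):
    """Cap each entry's share of total weight at `cap`.

    `weights` maps a declared family (or a lone relay) to raw weight.
    Excess from capped entries is redistributed proportionally among
    uncapped ones, repeating until no entry exceeds the cap.
    """
    total = sum(weights.values())
    shares = {k: w / total for k, w in weights.items()}
    capped = set()
    while True:
        over = [k for k, s in shares.items()
                if k not in capped and s > cap + 1e-12]
        if not over:
            return shares
        capped.update(over)
        free = 1.0 - cap * len(capped)
        uncapped_total = sum(w for k, w in weights.items() if k not in capped)
        if uncapped_total == 0 or free < 0:
            raise ValueError("cap cannot be satisfied with these weights")
        for k, w in weights.items():
            shares[k] = cap if k in capped else free * w / uncapped_total


def example_weights():
    # Two large honest families that declare themselves correctly...
    w = {"famA": 30, "famB": 20}
    # ...40 small honest relays, and 10 attacker relays with no family set.
    w.update({f"honest-{i}": 1 for i in range(40)})
    w.update({f"evil-{i}": 1 for i in range(10)})
    return w


if __name__ == "__main__":
    w = example_weights()
    before = sum(v for k, v in w.items() if k.startswith("evil")) / sum(w.values())
    after = sum(s for k, s in cap_shares(w).items() if k.startswith("evil"))
    print(f"undeclared attacker share: {before:.0%} before cap, {after:.0%} after")
```

In this made-up network the two honest families drop from 30% and 20%
to 10% each, and the undeclared attacker relays go from 10% to 16% of
the total without adding a single relay.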
>> See also the tickets on whether MyFamily is a harmful idea, because
>> it pulls traffic away from honest relay operators and sends it to
>> So in summary: (a) yes we should get more relays and more capacity, and
>> (b) yes it is super important for us to get better at making the consensus
>> weights accurate and predictable and well-understood, but also (c) there
>> are a bunch of interconnected reasons why these two steps are important
It may help to try adding each proposed type of change, each
ranked by potential impact, into bins labeled *PA traffic analysis,
performance, jurisdictional and other diversity, sybil, etc, to
analyze what they do to the bins by their weight in each.
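
One way that binning could be sketched, purely as an illustration.
The bin weights, the change names, and the per-bin impact scores are
all made-up placeholders, not estimates of any real proposal. Positive
numbers mean the change helps that bin, negative means it hurts.

```python
# Hypothetical relative importance of each bin (sums to 1.0).
BIN_WEIGHTS = {
    "traffic_analysis": 0.4,
    "performance": 0.2,
    "diversity": 0.2,
    "sybil": 0.2,
}

# Hypothetical proposed changes with placeholder impact per bin.
CHANGES = {
    "change A": {"traffic_analysis": 2, "performance": -1},
    "change B": {"diversity": 1, "sybil": 1},
    "change C": {"traffic_analysis": -2, "performance": 3},
}


def score(effects, bin_weights):
    """Weighted sum of one change's impact across all bins."""
    return sum(w * effects.get(b, 0) for b, w in bin_weights.items())


def rank(changes, bin_weights):
    """Change names ordered from highest to lowest weighted score."""
    return sorted(changes, key=lambda n: score(changes[n], bin_weights),
                  reverse=True)


def bin_totals(changes, bin_weights):
    """Net unweighted impact landing in each bin if all changes are applied."""
    return {b: sum(e.get(b, 0) for e in changes.values()) for b in bin_weights}


if __name__ == "__main__":
    for name in rank(CHANGES, BIN_WEIGHTS):
        print(f"{name}: {score(CHANGES[name], BIN_WEIGHTS):+.2f}")
    print(bin_totals(CHANGES, BIN_WEIGHTS))
```

The per-bin totals show where a set of changes piles up or cancels
out, while the ranking depends entirely on the bin weights chosen, so
it's worth rerunning with different weights to see how stable the
ordering is.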
>> Though, hm. In the sense that Tor's security comes down to probabilities,
>> it's not obvious that 20% of the network is much worse than 10% of the
>> The actual probabilities depend on the specific attack we're talking about
Those of us with years of experience in the bad relay area,
and of course in other fields in general, each have our own
estimated percentages and odds ranges therein, and a sense of how
they've changed over time.