On 03/06/2016 05:21 AM, nusenu wrote:
Moritz wrote:
Maybe this is better taken to tor-relays.
Ok.
url to the tor-dev thread: https://lists.torproject.org/pipermail/tor-dev/2016-March/010473.html
Brian didn't say anything about planned deployment locations. If _all_ relays are within a single /16 network you might skip MyFamily altogether, but I assume they are not.
To help better explain this to folks on this mailing list: this started as a discussion of how to play nicely with the network. To be a bit less coy, I work for a Linux vendor (CoreOS). Some of our users run small groups of servers (1-10); many others run clusters of 1000 machines or more. When folks deploy machines, the goal is to keep the configurations as minimal and static as possible, so as to avoid treating each config as a unicorn. When I deploy a cluster of machines for a workload, they're generally split between Google Compute Engine, Amazon Web Services, and physical machines across a number of different datacenters.
I was originally inquiring as to how best to inform the network that these machines are all related, so as to be able to:

1) expand the network with a number of hosts
2) avoid the appearance of a Sybil attack
3) simplify the configuration so that /more/ operators could duplicate the work
This originally focused on the use of relay families (MyFamily). At the same time, I'm coming to the community both to give an anecdote of how we (and our users) deploy software and to make sure we're providing the maximum benefit to the network.
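To make the configuration angle concrete, the naive approach with today's MyFamily is to template the same family line onto every node from a single list of fingerprints. A rough sketch in Python (the fingerprint file and its name are hypothetical, and the fingerprints only exist after the relays have started once):

# Sketch: render the single MyFamily line shared by every relay's torrc.
# "fingerprints.txt" is a hypothetical file holding one 40-hex-character
# relay fingerprint per line, collected after the relays first come up.
with open("fingerprints.txt") as f:
    fps = [line.strip() for line in f if line.strip()]

# Tor tolerates a relay listing its own fingerprint, so the same line can
# be appended to every node's torrc unchanged.
print("MyFamily " + ",".join("$" + fp for fp in fps))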
In my case, machines have a lifecycle. They come and they go.
Out of curiosity: what percentage of them do you expect to be online concurrently, and starting when? Are you planning to rekey when they "come back", or to resume with the former keys?
This is generally up to operators. In my case, machines generally *never* "come back." This is to say that I treat all machines as untrusted (the caveat to this is below) and I treat the provisioning process as vital to ensuring a minimally tampered state.
On 03/05/2016 10:31 PM, Brian "redbeard" Harrington wrote:
"Lets say you are about to deploy 100 relays within the next week." - Take this an order of magnitude greater and we're on the right track with the correct scale. It is a regular occurrence for our users to deploy 500 to 5000 nodes at a time.
This is why I said "and maybe set yourself an upper boundary as to how big you want to grow"
A single entity deploying 5000 relays isn't very sane at the current network size, I guess, but instead of speaking in relay counts, using CW fraction or exit/guard probability as the upper boundary makes more sense. <10% might be a worthy upper boundary for exit/guard probability.
The biggest (known) exit operator is currently at 7-8% exit probability.
teor wrote:
And there's likely some limit on MyFamily or on descriptor size that would stop you listing 1000 fingerprints.
That is actually another good use case for replacing the current MyFamily design with something that scales better with family size, like Mike's proposed design (#5565), but we have not seen declared families that big so far, so it has not been a problem in practice.
https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n364
Server descriptors may not exceed 20,000 bytes in length; [...] If they do, the authorities SHOULD reject them.
So the max family size would be something around 400 relays?
(20000 - 1250) / 42 = 446
(1250 bytes was the size of a sample non-exit descriptor without a family line; each listed fingerprint adds roughly 42 bytes)
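A quick check of that back-of-the-envelope number, using the same assumptions (20 KB descriptor limit, ~1250-byte base descriptor, ~42 bytes per family entry):

# Rough upper bound on family size given the 20,000-byte descriptor limit.
DESCRIPTOR_LIMIT = 20000  # bytes, per dir-spec.txt
BASE_DESCRIPTOR = 1250    # bytes, sample non-exit descriptor without family
BYTES_PER_ENTRY = 42      # "$" + 40 hex characters + separator

print((DESCRIPTOR_LIMIT - BASE_DESCRIPTOR) // BYTES_PER_ENTRY)  # -> 446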
Generating 1000 relay keys and coordinating that key distribution dance across the same number of nodes (more than likely in highly distributed environments) seems to raise more questions than it answers (securing the keys for those nodes, securely distributing them, etc.).
What problems do you expect when generating and transferring 1000 relay keys (besides the descriptor limit)? ... but before trying to solve any problems it is probably best to answer the question of whether a single entity should run >5% CW fraction at all.
In my normal use case (i.e. non-Tor processes), secret keys are never transferred to a host. This means that (in the case of PKCS#7 & PKCS#11) key distribution is handled via CSRs rather than traditional "distribution" (copying of files). In fact, members of our team are in the process of attempting to ratify this for some large-scale distributed computing projects (https://github.com/kubernetes/kubernetes/pull/20439).
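As a concrete (and non-Tor-specific) sketch of that pattern with recent versions of the Python `cryptography` library: the keypair is generated on the node and only the CSR ever leaves it (the hostname below is made up for illustration):

# Sketch of the CSR pattern: the private key is generated on the node
# itself and never copied anywhere; only the signing request is sent out.
from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

csr = (
    x509.CertificateSigningRequestBuilder()
    .subject_name(x509.Name([
        x509.NameAttribute(NameOID.COMMON_NAME, u"node-0001.example.com"),
    ]))
    .sign(key, hashes.SHA256())
)

# Only this PEM blob goes to the signer; the key stays on the machine.
print(csr.public_bytes(serialization.Encoding.PEM).decode())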
This doesn't answer the question of whether "a single entity should run >5% CW fraction at all." I humbly agree that a single entity should probably *not* operate that much of the network, but the reality of the situation is that when operators have the *ability* to run that much of the network, they may. The easiest way to counter this is to expand the size of the network, making it harder for any one operator to reach that 5% mark. As an operating system vendor, by minimizing the deployment process we could make that expansion easier.
There are about 7000 relays in total, with over 1000 of them (almost 40% of the capacity) at only three ASes.
Top 3 ASes currently account for 32% cw fraction. https://compass.torproject.org/#?exit_filter=all_relays&links&sort=c...
but the top 1000 relays account for >72% cw fraction
https://compass.torproject.org/#?exit_filter=all_relays&links&sort=c...
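For anyone who wants to recompute those fractions without Compass, here's a rough sketch against the Onionoo details API (the field names are my reading of the Onionoo protocol and may vary between versions):

# Rough sketch: sum consensus weight fraction per AS from Onionoo.
# Assumes the details documents expose "as" (or "as_number" on older
# versions) and "consensus_weight_fraction" for each relay.
import json
import urllib.request
from collections import defaultdict

URL = "https://onionoo.torproject.org/details?type=relay&running=true"

with urllib.request.urlopen(URL) as resp:
    relays = json.loads(resp.read().decode("utf-8"))["relays"]

by_as = defaultdict(float)
for r in relays:
    asn = r.get("as") or r.get("as_number") or "unknown"
    by_as[asn] += r.get("consensus_weight_fraction", 0.0)

for asn, frac in sorted(by_as.items(), key=lambda kv: kv[1], reverse=True)[:3]:
    print("%s: %.1f%%" % (asn, frac * 100))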
The final piece of this is that we are currently building attestation of running Linux containers using the TPM. While these are being attested by individual signers today, the obvious long-term goal would be to help the current signers of the Tor binaries use this process in addition to signed tarballs, which would then give a mechanism for full cross-distro runtime attestation in a reproducible way. Walk before running, though: let's make sure this can play nicely with the network.
--redbeard