Hi all,
I was asked by George to submit my comments about the proposal and to suggest
suitable RandomX parameters for this PoW scheme.
> For our dynamic PoW system to work, we will need to be able to compare PoW
> tokens with each other. To do so we define a function:
> unsigned effort(uint8_t *token)
> which takes as its argument a hash output token, and returns the number of
> leading zero bits on it.
This definition makes the effort exponential, i.e. the computational resources
required to reach one notch higher effort increase by a factor of 2 each time.
I suggest using a linear effort, defined as the quotient of a bitstring of all
1s divided by the hash value:
== Example A:
effort(00000001100010101101) = 11111111111111111111 / 00000001100010101101
or in decimal:
effort(6317) = 1048575 / 6317 = 165.
This definition of effort has the advantage of directly expressing the expected
number of hashes that the client had to calculate to reach the effort.
With the exponential definition, we could have an equivalent linear effort of
either 128 (7 leading zeroes) or 256 (8 leading zeroes), while the linear
definition provides smoother classification of PoW results.
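For illustration, here is a minimal sketch of the linear effort calculation from
Example A in plain C (the 20-bit width is just the toy value from the example):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    const uint32_t MAX_RESULT = (1u << 20) - 1; /* 11111111111111111111 binary = 1048575 */
    uint32_t hash = 6317;                       /* 00000001100010101101 in binary */
    uint32_t effort = MAX_RESULT / hash;        /* integer division -> 165 */
    printf("effort(%u) = %u\n", hash, effort);
    return 0;
}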
> The EXT_FIELD content format is:
>
> POW_VERSION [1 byte]
> POW_NONCE [32 bytes]
>
I suggest using a 16-byte nonce value, which is more than sufficient for the
target security level and has the benefit of halving the size of the replay
cache.
Since we expect the seed to be valid for around 3 hours (as proposed), even if
the service receives 1 million proofs per second and each proof has an effort
of 1 million, the number of nonces submitted by clients will only reach
about 10^10. With a 108-bit solution space (subtracting 20 bits as the search
space per client), the probability that two clients accidentally submit the same
nonce is roughly 10^-13 (by the birthday bound, p ≈ n^2 / 2^109 ≈
10^20 / (6.5 * 10^32); see [REF_BIRTHDAY]).
Additionally, I suggest adding the client's effort for convenience:
The updated EXT_FIELD content format would be:
POW_VERSION [1 byte]
POW_EFFORT [4 bytes]
POW_NONCE [16 bytes]
Including the effort has 2 benefits:
1. In case the Intro Priority Queue is full, the service doesn't need to
waste time verifying PoW solutions that have effort lower than the last
element in the queue. While an attacker can always lie about the actual
effort of their nonce value, I think this field can still save some CPU
cycles when the service is under heavy load.
2. The service can conveniently verify the reported effort with
the following inequality:
POW_EFFORT * pow_function(POW_NONCE, SEED) <= MAX_RESULT
where MAX_RESULT is the highest possible output from the pow_function.
In the case of Example A, that would be:
165 * 6317 = 1042305 <= 1048575
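As a sketch, the check could be written as a small helper (hypothetical, not
from the proposal) that is equivalent to the inequality above but avoids any
overflow concerns with wider hash outputs:

#include <stdint.h>

/* Non-zero iff pow_effort * hash_value <= max_result (the inequality above). */
int effort_is_valid(uint64_t pow_effort, uint64_t hash_value, uint64_t max_result) {
    if (hash_value == 0)
        return 1; /* a zero hash satisfies any effort claim */
    return pow_effort <= max_result / hash_value;
}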
> Similar to how our cell scheduler works, the onion service subsystem will
> poll the priority queue every 100ms tick and process the first 20 cells from
> the priority queue (if they exist). The service will perform the rendezvous
> and the rest of the onion service protocol as normal.
I suggest using a different selection method rather than always taking
the first 20 requests.
Selecting cells from the front of the queue actually minimizes the effort that
an attacker needs to expend to completely block access to the service.
The attacker can always submit the minimum effort required to occupy the first
20 slots in each 100 ms window. This can be done by submitting just 200 requests
per second, observing how many circuits are successfully opened, and adjusting
the effort until no other request can go through. This is the "Total overwhelm
strategy" described in § 5.1.1.
The following examples show how selecting the first N cells from the queue is
unfair. I will use N = 4 for clarity. E denotes some value of effort.
== Example B:
ATTACKER LEGITIMATE CLIENTS
___________ _________ ____________ ______________
/ \ / \
+--------+--------+--------+--------+----+----+----+----+----+----+----+----+
| 2E | 2E | 2E | 2E | E | E | E | E | E | E | E | E |
+--------+--------+--------+--------+----+----+----+----+----+----+----+----+
^ ^ ^ ^
selected selected selected selected
Here the attacker and the legitimate clients have each expended a total effort
of 8E. All of the attacker's cells get selected, while none of the legitimate
clients' requests get through.
Instead, I suggest using stochastic universal sampling [REF_SUS], where
the probability that a cell gets selected is proportional to its effort.
== Example C:
Here the total effort in the queue is 16E, so we first select a random value in
the interval [0, 4E) and then select 4 evenly spaced cells:
ATTACKER LEGITIMATE CLIENTS
___________ _________ ____________ ______________
/ \ / \
+--------+--------+--------+--------+----+----+----+----+----+----+----+----+
| 2E | 2E | 2E | 2E | E | E | E | E | E | E | E | E |
+--------+--------+--------+--------+----+----+----+----+----+----+----+----+
^ ^ ^ ^
selected selected selected selected
In this case, 2 cells are selected from each group, which is fairer considering
that each group expended the same effort.
== Example D:
Now if the attacker wanted to block access to all legitimate clients, they would
need to at least quadruple their total PoW effort (and there would still be
a chance that a legitimate client gets selected from the end of the queue):
ATTACKER LEGITIMATE CLIENTS
__________________________ ______________________ ______________
/ \ / \
+--------------+--------------+---------------+--------------+-+-+-+-+-+-+-+-+
| 8E | 8E | 8E | 8E |E|E|E|E|E|E|E|E|
+--------------+--------------+--------------+---------------+-+-+-+-+-+-+-+-+
^ ^ ^ ^
selected selected selected selected
Note: With linear effort, the original selection method becomes even worse
because the attacker can occupy the front of the queue with an effort of just
E+1 per cell rather than 2E.
In some cases, stochastic universal sampling can select a single element
multiple times, which is not a problem for genetic algorithms, but we want to
avoid it here. In those cases, I suggest restarting the selection from the
beginning of the next cell and shortening the sampling interval according to
the remaining portion of the queue. See the following example.
== Example E:
+---------------+-------------+--------------+------------+-+-+-+-+-+-+-+-+-+-+
| 16E | 8E | 6E |E|E|E|E|E|E|E|E|E|E|
+---------------+-------------+--------------+------------+-+-+-+-+-+-+-+-+-+-+
^ ^ ^ ^
selected selected selected selected
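For reference, a minimal sketch of the plain SUS step in C (the queued_cell_t
type and all field names are hypothetical, and the restart-and-shorten
refinement from Example E is only noted in a comment rather than implemented):

#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { uint64_t effort; /* ... other intro cell fields ... */ } queued_cell_t;

/* Select up to n_select cell indices out of n_cells, with probability
   proportional to effort.  Returns the number of indices written to out. */
size_t sus_select(const queued_cell_t *cells, size_t n_cells,
                  size_t *out, size_t n_select) {
    uint64_t total = 0;
    for (size_t i = 0; i < n_cells; i++)
        total += cells[i].effort;
    if (total == 0 || n_select == 0)
        return 0;
    double interval = (double)total / (double)n_select;
    /* One random starting point in [0, interval), then evenly spaced pointers. */
    double pointer = ((double)rand() / ((double)RAND_MAX + 1.0)) * interval;
    size_t selected = 0;
    uint64_t cumulative = 0;
    for (size_t i = 0; i < n_cells && selected < n_select; i++) {
        cumulative += cells[i].effort;
        while (selected < n_select && pointer < (double)cumulative) {
            /* Plain SUS may pick the same cell twice here; the refinement above
               restarts from the next cell with a shortened interval instead. */
            out[selected++] = i;
            pointer += interval;
        }
    }
    return selected;
}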
> In particular, the service starts with a default suggested-effort value of 15.
>
> Everytime the service handles an introduction request from the priority queue
> in [HANDLE_QUEUE], the service compares the request's effort to the current
> suggested-effort value. If the new request's effort is lower than the
> suggested-effort, set the suggested-effort equal to the effort of the new
> request.
>
> Everytime the service trims the priority queue in [HANDLE_QUEUE], the service
> compares the request at the trim point against the current suggested-effort
> value. If the trimmed request's effort is higher than the suggested-effort,
> set the suggested-effort equal to the effort of the new request.
I think the default suggested-effort is a bit too high. Assuming each attempt
takes around 1 ms to calculate, the client would need, on average,
2^15 ms ≈ 33 seconds to find a solution using 1 CPU core. I suggest specifying
a minimum effort instead. Requests with effort < MIN_EFFORT and requests without
the PROOF_OF_WORK extension would be treated as having effort = 1 for
the purposes of the sampling algorithm.
Secondly, the proposed method of calculating the suggested-effort is susceptible
to gaming by attackers. Since the service can only update the value in
the directory once every 'hs-pow-desc-upload-rate-limit' seconds, an attacker
could pause the attack briefly just before the descriptor gets updated, which
would cause the published suggested-effort value to be too low despite an
ongoing attack.
I suggest taking the median effort of the selected cells during each 100 ms
window. For the examples above, that would be:
Example C: 1.5E
Example D: 8E
Example E: 7E
Then I would take the median of these values over the directory update period.
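A sketch of the per-window median (hypothetical helper; with an even number of
selected cells the two middle values are averaged, which is how Example C
yields 1.5E):

#include <stdint.h>
#include <stdlib.h>

static int cmp_effort(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Median of the efforts selected in one 100 ms window; assumes n > 0. */
double median_effort(uint64_t *efforts, size_t n) {
    qsort(efforts, n, sizeof(efforts[0]), cmp_effort);
    if (n % 2 == 1)
        return (double)efforts[n / 2];
    return (efforts[n / 2 - 1] + efforts[n / 2]) / 2.0;
}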
>
> 5. Attacker strategies
>
Additional attacks:
5.1.2 PoW Spam Attack
The attacker may try to spam many requests with random values of POW_NONCE,
requiring the service to waste cycles verifying the invalid proofs. No such
request would make it into the Intro Queue, but it may still be a viable DoS
strategy depending on proof verification time and the number of intro requests
that can be practically delivered through the network.
5.1.3 Precomputed PoW attack
The attacker may precompute many valid PoW nonces and submit them all at once
before the current seed expires, temporarily overwhelming the service even when
using only a single computer.
> 4.2. Seed expiration issues
>
> As mentioned in [DESC_POW], the expiration timestamp on the PoW seed can
> cause issues with clock skewed clients. Furthermore, even not clock skewed
> clients can encounter TOCTOU-style race conditions here.
>
> The client descriptor refetch logic of [CLIENT_TIMEOUT] should take care of
> such seed-expiration issues, since the client will refetch the descriptor.
>
I suggest using 2 concurrent seeds, i.e. accepting PoW from both the current
and the previous seed epoch. We use this approach in Monero. It would, however,
require adding the seed value to the proof-of-work extension field and would
also double the memory requirements for verification with RandomX.
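As a rough sketch of what the service-side lookup could look like (the 4-byte
seed_head identifier and all names here are assumptions; the proposal would need
to define how a proof references its seed):

#include <stdint.h>
#include <string.h>
#include "randomx.h"

typedef struct {
    uint8_t seed_head[4];  /* identifier of the seed this VM was initialized with */
    randomx_vm *vm;
} pow_seed_slot_t;

static pow_seed_slot_t live_seeds[2];  /* current and previous seed epoch */

/* Returns the VM matching the seed referenced by the proof, or NULL to reject. */
randomx_vm *vm_for_proof(const uint8_t seed_head[4]) {
    for (int i = 0; i < 2; i++) {
        if (live_seeds[i].vm != NULL &&
            memcmp(live_seeds[i].seed_head, seed_head, 4) == 0)
            return live_seeds[i].vm;
    }
    return NULL;  /* unknown or expired seed */
}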
> The proposal suggests argon2, and Mike has been looking at Randomx. However,
> after further consideration and speaking with some people (props to Alex
> Biryukov), it seems like those two functions are not well fitted for this
> purpose, since they are memory-hard both for the client and the service. And
> since we are trying to minimize the verification overhead, so that the
> service can do hundreds of verifications per second, they don't seem like
> good fits.
Asymmetric functions like Equihash, Cuckoo Cycle [REF_CUCKOO] and MTP have the
advantage of being very fast to verify, but they run much faster on GPUs and
specialized hardware, so I don't think they are particularly suitable for this
purpose.
When we designed RandomX to be used as the PoW algorithm by Monero, we selected
the parameters very conservatively to maximize the ASIC resistance of
the algorithm. That's because it is very difficult to change the PoW algorithm
once it's deployed.
Tor is not limited by this, so we can be much more aggressive when configuring
RandomX. I suggest using a configuration that gives the fastest possible
verification time without completely breaking the algorithm.
In particular, the following parameters should be set differently from Monero:
RANDOMX_ARGON_SALT = "RandomX-TOR-v1"
The unique RandomX salt means we do not need to use a separate salt as PoW input
as specified in § 3.2.
RANDOMX_ARGON_ITERATIONS = 1
RANDOMX_CACHE_ACCESSES = 4
RANDOMX_DATASET_BASE_SIZE = 1073741824
RANDOMX_DATASET_EXTRA_SIZE = 16777216
These 4 changes reduce the RandomX Dataset size to ~1 GiB, which allows
the number of iterations to be reduced from 8 to 4. The combined effect of this
is that Dataset initialization becomes 4 times faster, which is needed due to
more frequent updates of the seed (Monero updates once per ~3 days).
RANDOMX_PROGRAM_COUNT = 2
RANDOMX_SCRATCHPAD_L3 = 1048576
Additionally, reducing the number of programs from 8 to 2 makes the hash
calculation about 4 times faster, while still providing resistance against
program filtering strategies (see [REF_RANDOMX_PROGRAMS]). Since there are
4 times fewer writes, we also have to reduce the scratchpad size. I suggest
using a 1 MiB scratchpad as a compromise between scratchpad write density and
memory hardness. Most x86 CPUs will perform roughly the same with either a
512 KiB or a
1024 KiB scratchpad, while the larger size provides higher resistance against
specialized hardware, at the cost of possible time-memory tradeoffs (see
[REF_RANDOMX_TMTO] for details).
Lastly, we reduce the output of RandomX to just 8 bytes:
RANDOMX_HASH_SIZE = 8
64-bit preimage security is more than sufficient for proof-of-work and it allows
the result to be treated as a little-endian encoded unsigned integer for easy
effort calculation.
RandomX would be used as follows:
The service will select a 32-byte POW_SEED and initialize the cache and
the dataset:
randomx_flags flags = RANDOMX_FLAG_DEFAULT | RANDOMX_FLAG_FULL_MEM; /* fast mode */
randomx_cache *myCache = randomx_alloc_cache(flags);
randomx_init_cache(myCache, POW_SEED, 32);
randomx_dataset *myDataset = randomx_alloc_dataset(flags);
randomx_init_dataset(myDataset, myCache, 0, randomx_dataset_item_count());
randomx_vm *myMachine = randomx_create_vm(flags, NULL, myDataset);
Then in order to validate a PoW token, we could use something like this:
/* High 64 bits of a 64x64-bit product (GCC/Clang); not part of the RandomX API. */
static uint64_t mulh(uint64_t a, uint64_t b) {
    return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

int validateProof(uint32_t pow_effort, void* pow_nonce) {
    uint64_t result;
    /* With RANDOMX_HASH_SIZE = 8, the hash fits a little-endian uint64_t. */
    randomx_calculate_hash(myMachine, pow_nonce, 16, &result);
    /* Accept iff pow_effort * result < 2^64, i.e. the reported effort is honest. */
    if (mulh(pow_effort, result) == 0) {
        return 1;
    }
    return 0;
}
I suggest setting MIN_EFFORT = 10000, which takes about 1 second on my laptop.
Requests with pow_effort < MIN_EFFORT would not be validated.
I have collected some performance figures for the fast mode with the above
RandomX configuration (~1 GiB of memory is required):
H/s = hashes per second
== CPUs:
Intel Core i3-3220
- 1 thread 700 H/s
- 3 threads 1400 H/s
Intel Xeon (dual core VPS, Sandy Bridge, unknown model)
- 1 thread 2000 H/s
- 2 threads 4000 H/s
Intel Core i5-2500K (stock)
- 1 thread 2200 H/s
- 4 threads 8200 H/s
Intel Core i7-8550U (laptop)
- 1 thread 2700 H/s
- 8 threads 10000 H/s
Intel Core i7-9850H (laptop)
- 1 thread 3100 H/s
- 12 threads 16000 H/s
AMD Ryzen 1700 @ 3300MHz
- 1 thread 2300 H/s
- 16 threads 23800 H/s
AMD Ryzen 3700X @ 3300MHz
- 1 thread 2500 H/s
- 16 threads 27500 H/s
AMD Epyc 7702P
- 1 thread 2100 H/s
- 128 threads 139000 H/s
== GPUs:
NVIDIA GeForce GTX 1660 Ti (credits to SChernykh, see [REF_RANDOMX_CUDA])
- 3072 intensity 2600 H/s
According to the above results, the time to verify a single hash is around
400-500 μs. A mid-range GPU has similar performance to a single CPU core.
Most CPUs made since 2011 have similar per-core performance, except for low-end
CPUs without hardware AES support.
References:
[REF_BIRTHDAY]: https://en.wikipedia.org/wiki/Birthday_attack#Mathematics
[REF_SUS]: https://en.wikipedia.org/wiki/Stochastic_universal_sampling
[REF_CUCKOO]: https://github.com/tromp/cuckoo
[REF_RANDOMX_PROGRAMS]:
https://github.com/tevador/RandomX/blob/master/doc/design.md#12-the-easy-pr…
[REF_RANDOMX_TMTO]: https://github.com/tevador/RandomX/issues/65
[REF_RANDOMX_CUDA]: https://github.com/SChernykh/RandomX_CUDA
Hi everyone,
The end of the Google Summer of Code period has arrived, and you can
find my GSoC final report for the CAPTCHA Monitoring project here [1].
This was my first time working with an active open source community
and I enjoyed it a lot! The feedback you have provided made the
experience worthwhile and exciting. I couldn't finish implementing all
of the features I planned, so I will be around and I plan to stay
active. I want to thank my mentors GeKo & arma and everyone who helped
me with the project!
Best,
Barkin
[1] https://gitlab.torproject.org/woswos/CAPTCHA-Monitor/-/wikis/GSoC-2020
cc tor-dev for real this time :)
On 8/31/20 17:23, Jim Newsome wrote:
> Hi David, (cc tor-dev in case others are interested or know anything
> about it)
>
> I'm told you're the local expert on the linux perf tool :). I'm using it
> to profile sleep times in Shadow, using the instructions here:
> https://perf.wiki.kernel.org/index.php/Tutorial#Profiling_sleep_times
>
> perf unfortunately splits the data both by what the scheduler is
> switching from (which I want) and what it's switching to (which I
> don't). e.g. I have two separate lines:
>
> + 13.46% 13.46% prev_comm=shadow prev_pid=24996 prev_prio=120
> prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120
> + 11.57% 11.57% prev_comm=shadow prev_pid=24996 prev_prio=120
> prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120
>
> I'd rather these were merged into a single "prev_comm=shadow
> prev_pid=24996 prev_prio=120 prev_state=S" bucket.
>
> Any ideas?
>
(Sending this email again because I failed to copy tor-dev@.)
On Mon, Aug 17, 2020 at 12:16:08PM -0700, Philipp Winter wrote:
> Hi Matt,
>
> We recently started experimenting with the Salmon social bridge
> distributor:
> https://gitlab.torproject.org/tpo/anti-censorship/bridgedb/-/issues/31873
>
> We are now exploring the possibility of storing some Salmon-related data
> on a user's computer and are wondering what our options are. The data
> we're talking about is a lightweight, signed, and encrypted blurb that
> contains a user's social graph, proxies, and registration ID.
>
> One option to store this data is Tor's data directory but that doesn't
> seem ideal because Salmon isn't a PT and technically has nothing to do
> with Tor. Is Tor Browser an option here? Or does the "disk avoidance"
> design goal mean that we don't get to store anything at all? A last
> resort option would be to simply hand the blurb to the user and ask them
> to store it somewhere but we would like to find a more usable way to
> handle this.
>
> Thanks,
> Philipp
Forwarding with permission from Nick, since he sent it to only a sekrit
list, and it should go to a public list.
Onward and upward,
--Roger
----- Forwarded message from Nick Mathewson <nickm@torproject.org> -----
Date: Fri, 14 Aug 2020 13:52:02 -0400
From: Nick Mathewson <nickm@torproject.org>
To: network-team <network-team@lists.torproject.org>
Subject: [network-team] Hand-off notes for Nick's vacation
Hi!
You can also read this at
https://gitlab.torproject.org/nickm/notes/-/wikis/Vacation2020 -- it
is a list of stuff I thought might be helpful or important to think
about while I'm away.
This is me signing off for the next while. As documented in the
calendar, I'll be back on 14 September. If you need to contact me in
the meanwhile, please use Signal or personal email. I'll be routing
all mailing lists to a "Post-Vacation" folder.
I'm going to use this email to checkpoint a few things before I go
off, so it'll be easier to pick them up. I've asked George to help me
write it.
I've sorted this list in order of decreasing priority.
# 0.4.4
We are planning to release a stable 0.4.4 release on 15 September.
That's the day after I get back. I just put out an 0.4.4.4-rc release
on 13 August.
There are still some 044-must/044-should issues that remain open.
I think we have several possible choices here:
* If there are no further risky changes in 0.4.4 between now and 15
September, we can just release 0.4.4 stable on that date.
* If we want to do further risky changes between now and 15
September, it would probably be okay to plan for another release
candidate on 1 September. That would let me put out a stable when I
get back.
* If you don't do a release candidate between now and when I get
back, but there _are_ risky changes, then we should put out a release
candidate when I'm back, and we can plan to put out the stable 2-3
weeks after that.
I have updated the ReleasingTor.md documentation in the master branch
to reflect current practice.
There is a ticket at
https://gitlab.torproject.org/tpo/core/team/-/issues/11 to track
things we need to do for the release. Please make sure that the
"must" and "should" tickets get triaged and handled (or not handled)
as appropriate.
# Caitlin's internship
Caitlin has been working on
https://gitlab.torproject.org/tpo/core/tor/-/issues/40066 -- there is a
patch with a merge request. Please make sure to give them all the help
they need, and answer their questions promptly and usefully: if I
understand correctly, their internship ends at the end of the month.
# Sponsor 55
Sponsor 55 will come to an end over the weekend. For a list of what
we've done, have a look at
https://gitlab.torproject.org/tpo/core/team/-/wikis/NetworkTeam/Sponsor55Re…
. I sent a copy of this pad to gaba and david under the heading
"sponsor 55 - what was worked on (so we can start having stuff
together for the report)".
There are remaining tickets labeled with "sponsor 55". I believe that
none of them are must-do items for the sponsor. But some of them are
things we need to do before we can release 0.4.5, and some of them are
things that we should do while the code is still fresh in our heads.
I will try to sort these out as much as I can before I go, but there
may be remaining tickets for you to sort.
https://gitlab.torproject.org/tpo/core/team/-/issues/12 is the
tracking ticket for removing the S55 label from the remaining tickets.
# Gitlab best practices
IMPORTANT: Make sure you are seeing new tickets in tpo/core/tor that
do not mention you specifically. This may not be the default. You
can do this either by looking at the complete list of tickets every
day or two, then skipping to the end... or by updating the settings in
"User settings > notifications" at
https://gitlab.torproject.org/profile/notifications .
When somebody submits a patch but not a merge request, please ask them
to do a merge request if they can, and open one for them if they
can't.
If a volunteer has been sitting on a needs_revision patch for a long
time, consider asking them for permission to take it over and fix it
up.
# 0.4.5 best practices
Please remember how to handle tickets that are possibly backportable.
When making a Merge Request:
1. Make the branch against the target maint-0.x.y branch.
2. Make your merge request against the target maint-0.x.y branch.
3. Note that the MR is backportable in your MR message.
When merging, if it is even _possible_ we want to backport more:
1. Give the MR the "Backport" label.
2. Un-assign yourself from the MR: you don't need to review it any more.
3. Put the MR into the latest milestone to which it has not been backported.
# Tickets and MRs assigned to Nick
There are a bunch of tickets still assigned to me. Please feel free to
take over any of them that you want. I'm going to go through them and
unassign any that I think should be handled before I'm back, with
notes.
If you review an MR that I wrote and it needs changes, please consider
making those changes yourself or getting somebody else to do so, if
you feel that you can. I won't mind, and it's probably better if this
code doesn't rot for a whole month.
# CI
We now have CI running on gitlab, with debian-stable. It covers (I
believe) hardened builds, gcc, clang, stem, chutney, distcheck, and
documentation. It's implemented in .gitlab-ci.yml and in scripts/ci/ .
Feel free to hack on it, but make sure to make any changes to
maint-0.3.5 and merge them forward from there: it was really
unpleasant to have our appveyor and travis files diverge among
versions.
There are more changes to make, and please feel free to move ahead
with this stuff: I've tried to update stuff on trac as we go.
https://gitlab.torproject.org/tpo/core/tor/-/issues/40048 is the
ticket for moving to gitlab CI.
Also please remember that super-slow CI is not our friend. You can
make hardened builds run much faster by merging tpo/core/tor#40088,
once you trust it. :)
# 0.4.5 planning
Please feel free to make progress on planning and coding stuff for
045, but keep in mind that technical debt and leftovers from 044 would
also be very smart things to work on.
The ticket where our discussions have been happening is:
https://gitlab.torproject.org/tpo/core/team/-/issues/7
# TROVE-2020-005
We have an open low-severity issue tracked as tpo/core/tor#40080
(private) and tpo/core/tor#40086 (public). More thought there would
always be valuable. If it is indeed low-severity, then it's okay to
merge it in 044 if you like it and are confident in it. But please be
careful: the code is subtle.
# Tor ticket triage
I closed our original ticket triage activity, and opened a new one:
https://gitlab.torproject.org/tpo/core/team/-/issues/10
Feel free to do as much or as little here as you think good, and/or to
come up with different ideas.
# Blog post comments
We're supposed to moderate comments on all our blog posts, and I just
put out a release (0.4.4.4-rc) on 13 August. Please keep an eye on
the comments there and approve the ones that are on-topic.
# Defer till later: rename 'master', autostyle the code.
We're planning to rename "master" to "main", and to get clang-format
running happily on our code. Both of these are deferred till I get
back in September.
----- End forwarded message -----
Hi,
It is not clear to me whether ExcludeNodes and other similar options (ExcludeExitNodes)
can be used multiple times in a torrc file.
Are all of the lines merged, as with multiple "MyFamily" lines, or
how does it behave?
thanks,
nusenu
--
https://mastodon.social/@nusenu
Hello everyone,
Onion service version two (v2) key pairs were used for more purposes
than simply facilitating the establishment of rendezvous circuits; in
particular, third-party applications used these keys in numerous ways.
Similarly, version three (v3) onion service keys are being re-used in
similar (and new) ways. However, the (re)use of v3 long-term keys is not
obviously safe in all situations. How should (and should not) these
keys be used, such that the security of the onion service is preserved?
I'll briefly summarize v3 keys (for those who don't want to read through
the spec), and then I'll sketch two alternative use cases along with a
straw man proposal. The long-term v3 identity keys are ed25519 keys (see
[0] for a nice write up of ed25519, in general). The generated key
material is serialized externally (outside of tor) in two (similar)
formats: written on-disk [2] and from the control port [3].
The secret (private) key is the (64-byte) SHA-512 digest of a 32-byte
seed value. The 64-byte digest is the "expanded form". The seed is
thrown away. The public key is obtained by taking the first 32 bytes of
that SHA-512 hash, clamping some of the bits, and performing a scalar
multiplication with the (fixed) base point. This key is the onion
service long-term identity key.
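For illustration, here is a sketch of this expansion using libsodium (the
function name is mine; crypto_scalarmult_ed25519_base() performs the clamping
described above internally):

#include <sodium.h>

/* Expand a 32-byte seed into the (LH | RH) secret key and the public identity key. */
int v3_identity_from_seed(const unsigned char seed[32],
                          unsigned char secret[64],
                          unsigned char public_key[32]) {
    /* secret = SHA-512(seed); the seed itself is then thrown away. */
    crypto_hash_sha512(secret, seed, 32);
    /* public = clamp(LH) * B, where LH is the first 32 bytes of the digest. */
    return crypto_scalarmult_ed25519_base(public_key, secret);
}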
Creating and verifying a signature using the above keys (respectively)
follow standard EdDSA. However, (at this time) Tor does not use these
keys directly for signing messages. All messages are signed using an
ephemeral ed25519 key, and that ephemeral key is certified by a blinded
ed25519 key derived from the long-term key pair. The blinded keys are
computed using a specified blinding scheme [4]. All messages signed
using the ephemeral key are prefixed with a context-specific string. In
summary, long-term keys are used for deriving a short-term blinded key,
and that short-term blinded key is used for certifying an ephemeral
signing key.
For computing the blinded key, the first 32 bytes of the long-term
secret key (LH) are multiplied with a blinding factor (h*a mod l), see
the specification for the value of **h** [4]. This becomes LH'
(LH-prime). The second 32 bytes of the secret key (RH) are concatenated
with a string prefix and then the SHA3-256 digest is computed of the
concatenated string. The first 32 bytes of the resulting digest become
RH' (RH-prime). LH' and RH' are used as regular ed25519 secret keys for
signing and verifying messages following EdDSA.
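A sketch of that blinding step using libsodium's scalar arithmetic (sha3_256()
is only a stand-in declaration for any SHA3-256 implementation, since libsodium
does not provide one, and the blinding factor h is assumed to be already
computed and reduced mod l per [4]):

#include <string.h>
#include <sodium.h>

void sha3_256(unsigned char out[32], const unsigned char *in, size_t inlen); /* external */

/* Derive the blinded secret key LH' | RH' from the expanded secret key LH | RH. */
void blind_secret_key(const unsigned char secret[64],
                      const unsigned char h[32],   /* blinding factor, reduced mod l */
                      unsigned char blinded[64]) {
    /* LH' = (h * LH) mod l */
    crypto_core_ed25519_scalar_mul(blinded, h, secret);
    /* RH' = SHA3-256(RH_BLIND_STRING | RH) */
    static const char RH_BLIND_STRING[] = "Derive temporary signing key hash input";
    unsigned char buf[sizeof(RH_BLIND_STRING) - 1 + 32];
    memcpy(buf, RH_BLIND_STRING, sizeof(RH_BLIND_STRING) - 1);
    memcpy(buf + sizeof(RH_BLIND_STRING) - 1, secret + 32, 32);
    sha3_256(blinded + 32, buf, sizeof(buf));
}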
Tor's EdDSA signature is "R|S", R concatenated with S (the message is
not included in the signature).
The safest usage of the long-term keys for alternative purposes that I see is
to derive a (fixed/deterministic) blinded key pair using the same scheme that
Tor uses; signing/verification then simply follow the same process as Tor,
except that the derived keys need not rotate
periodically (is this true?). The derived key should be used for
certifying a freshly generated ed25519 key, which is used in the
application protocol. For example, if I want to use a key for code
signing such that it is bound to my onion service key, then I could
derive a certifying key by following Tor's derivation scheme, by
substituting:
BLIND_STRING = "Derive temporary signing key" | INT_1(0)
N = "key-blind" | INT_8(period-number) | INT_8(period_length)
with
BLIND_STRING = "Derive code signing key" | INT_1(0)
N = "code-sigining-key-blind" | "v0" | "YYYY-MM-DD" | INT_8(validity_period)
for computing the blinding factor. Where "v0" is a version tag.
YYYY-MM-DD is an arbitrary date, but it can be used for rotating signing
keys in the future. INT_8(validity_period) may be used for specifying
the number of days after YYYY-MM-DD at which time previously unverified
signatures using this key should be considered invalid (where INT_8(0)
could indicate "never expire").
And substituting
RH_BLIND_STRING = "Derive temporary signing key hash input"
with
RH_BLIND_STRING = "Derive code signing key hash input"
for computing RH'.
A signature must include "v0" and the values used in "YYYY-MM-DD" and
INT_8(validity_period), such that the client can derive the correct
blinded public key for verification when starting from the long-term
identity key. The signature should be over a certification of an
independently generated ed25519 key pair. This new key pair (along with
the certification) can be used for providing message integrity within
the application's protocol. If, instead, the derived key is used
directly for signing, and the application needs the keys online for
signing messages, then this risks the security of the long-term key, as
well. The blinding scheme allows for (partially) recovering the
long-term secret key from the derived secret key.
Another example use case comes from Jeremy Rand where the onion service
key is used in a root CA certificate, and a leaf certificate (signed by
the CA cert) is used by the application.
Following from the previous example, (most likely) the CA certificate
should not be signed directly using the onion service's long-term secret
key. However, a derived key could be used in the CA certificate and the
leaf cert could contain an ephemeral key (in exactly the same way that
tor certifies ephemeral keys using the derived blinded signing key).
This idea appears to be a concrete design of how the above (abstract)
key certification could be implemented, and it could be a format that
tor natively supports.
The above process seems like a lot to ask of application developers.
Can we make it easier for them?
Open questions:
1) Going back to the long-term secret key, can LH and RH be used
directly in EdDSA without reducing the security and unlinkability of
the blinded keys?
2) Should other use cases of the long-term keys only derive distinct
(blinded) keys, instead of using the long-term keys directly?
3) If other use cases should only use derived keys, then is there an
alternative derivation scheme for when unlinkability between derived
keys is not needed (without reducing the security properties of the
onion service blinded keys), and is allowing linkability
useful/worthwhile?
4) Is the above example derivation scheme safe if different
applications tweak the above prefix strings in similar ways?
5) Should Tor simply derive one blinded key that can be used by all
alternative applications? Is that safe?
I'd like to thank Nick Mathewson and David Goulet for their comments on
an earlier version of this mail.
Thanks,
Matt
[0] https://blog.mozilla.org/warner/2011/11/29/ed25519-keys/
[1] https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt#n2267
[2] A tagged value: "== ed25519v1-secret: type0 =="\0\0\0 | (64-byte
SHA-512 hash of the seed)
[3] https://gitweb.torproject.org/torspec.git/tree/control-spec.txt#n1746
[4] https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt#n2267