Comments on proposals 121, 142, and 143.

Karsten Loesing karsten.loesing at gmx.net
Sat Jul 19 13:54:24 UTC 2008


Hi Nick,

thank you very much for your detailed comments!

Let's see if I can answer at least some of them.

| PROPOSAL 121: Hidden Service Authentication
|
| [...]
|
|  - In 1.2, rather than making two kinds of INTRODUCE1 cells and using
|    voodoo and duct tape to tell them apart, introduce a INTRODUCE1V
|    relay cell type [...]

Added in the last revision of proposal 121 (together with the other
changes described here). See changes in Section 1.2.

|  - The replay avoidance approach can be far better.  Instead of the
|    approach in the proposal, which still allows a number of replays,
|    try including a timestamp and a nonce in the authenticated portion
|    of the INTRODUCE2 cell.  Require the timestamp to be no more than T
|    seconds in the past or future, and require that the nonce has not
|    been used for the last 2*T seconds.  This requires less storage,
|    and prevents all replays.  [Instead of a nonce, we can and should
|    use a cryptographic hash of the rendezvous cookie, or the g^x data
|    from the INTRODUCE2 cell, or the entire introduce2 cell contents.]

The rendezvous cookie itself seems to make the most sense here. It's
already 20 octets long, so there is no reason to apply a hash function
to it. See Section 1.3 for the improved replay prevention, which is now
based on your suggestion.

(On second thought, I'm not 100% sure that there are no situations in
which the same rendezvous cookie can be re-sent in a correct protocol
run. So there is a slight chance that the above will change to H(g^x)
instead of the rendezvous cookie.)
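
To make this concrete, here is a rough Python sketch of the check I
have in mind (my own illustration, not code from the proposal or from
Tor; T and all names are made up): accept an INTRODUCE2 cell only if
its timestamp is within T seconds of the local clock and its
rendezvous cookie has not been seen during the last 2*T seconds.

    import time

    T = 300  # hypothetical clock-skew window in seconds

    # maps rendezvous cookie (20 octets) -> time it was last seen
    seen_cookies = {}

    def accept_introduce2(timestamp, rend_cookie, now=None):
        """Return True if the cell passes the freshness/replay check."""
        now = time.time() if now is None else now

        # Reject cells whose timestamp is too far in the past or future.
        if abs(now - timestamp) > T:
            return False

        # Forget cookies older than 2*T; they can no longer collide
        # with any cell that passes the timestamp check above.
        for cookie, seen in list(seen_cookies.items()):
            if now - seen > 2 * T:
                del seen_cookies[cookie]

        # Reject replays: the same cookie seen within the last 2*T seconds.
        if rend_cookie in seen_cookies:
            return False

        seen_cookies[rend_cookie] = now
        return True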

| Karsten is revising section 2 a bit as well to discuss some motivation
| issues, and we're going to figure out (but not necessarily build for
| 0.2.1.x) an authorization system that scales to more users better than
| that of proposal 121's section 2.  Such a system may not provide the
| same security as the one in section 2: the goal is to do better than
| the status quo for security, and better than section 2 for
| scalability.

Yeah, and while revising Section 2 I had the idea of adding a third
protocol that does not make use of any client authorization, but that
prevents the introduction point from accessing the service. This was
also described in proposal 142, but it makes sense to include the idea
in 0.2.1.x already, without the combination of introduction and
rendezvous point. It adds a new security feature for common hidden
services without client authorization at almost no cost: the
introduction point won't be able to access the service and decide
whether it wants to serve it or not. The earlier we introduce this
feature, the earlier we can phase out hidden services that don't have
it. The additional complexity is minimal. See Section 2.1 for details
about the protocol.

I also added a more scalable version of an authorization protocol in
Section 2.2. As discussed, it lacks the nice security feature of hiding
service activity, but in return it scales. There is just one thing I'm
still unsure about: the protocol requires encrypting introduction-point
data to multiple symmetric keys. Most things I found on the Net were
hybrid approaches, so I conceived some crypto magic to handle it. You
probably want to have a look at that part. If it's just crap, maybe
consider it an informal description of the properties we might want and
tell me which existing crypto approach I have missed. :)
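
In case it helps the review, the general pattern I had in mind is the
usual session-key trick, just with symmetric keys only: encrypt the
introduction-point data once under a fresh session key, then encrypt
that session key separately under each client's descriptor cookie.
Here is a rough Python sketch of that pattern (purely illustrative; it
uses the pyca/cryptography package, assumes 16-byte descriptor
cookies, and leaves out integrity protection, so it is not the exact
construction from Section 2.2):

    import os
    from cryptography.hazmat.backends import default_backend
    from cryptography.hazmat.primitives.ciphers import (
        Cipher, algorithms, modes)

    def aes_ctr(key, iv, data):
        """Encrypt (or decrypt) data with AES-128 in counter mode."""
        cipher = Cipher(algorithms.AES(key), modes.CTR(iv),
                        backend=default_backend())
        enc = cipher.encryptor()
        return enc.update(data) + enc.finalize()

    def encrypt_intro_points(intro_point_data, descriptor_cookies):
        """Encrypt intro-point data so that any one of the given
        16-byte descriptor cookies can decrypt it."""
        session_key = os.urandom(16)
        iv = os.urandom(16)
        body = aes_ctr(session_key, iv, intro_point_data)

        # Wrap the session key once per authorized client.
        wrapped_keys = []
        for cookie in descriptor_cookies:
            key_iv = os.urandom(16)
            wrapped_keys.append(key_iv + aes_ctr(cookie, key_iv,
                                                 session_key))

        return wrapped_keys, iv, body

A client holding one of the cookies would try it against each wrapped
key (or locate its entry by some identifier) and then decrypt the body.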

If there are no big conceptual issues (like encrypting introduction
points to multiple descriptor cookies), my plan is to implement all
three protocols as hidden service protocol version 3. IMHO it makes more
sense to get this all done in 0.2.1.x. The two authorization protocols
are really alternatives that can both be justified. And if we implement
proposal 142 in 0.2.2.x, it would be compatible with all three
authorization types of this proposal.

| We'll also want to think here about what to do when we want
| interactive authorization protocols, or to support methods requiring
| more data than can fit in the space currently available in Tor cells.

The only ideas that came to my mind are "add new cell types" and "split
data among multiple cells". Both seem achievable, and as soon as we
conceive a protocol that needs to exceed either of these two limits,
the extensions can be implemented. If you don't mind, I'd rather spend
my time on implementing the stuff for 0.2.1.x and defer the exact
specification of these two items to a proposal that describes such a
protocol. I added a short description of the limitations as Section
1.6, together with an idea of how to work around them.
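
To illustrate the second idea: splitting data among multiple cells is
plain fragmentation. A minimal Python sketch (my own illustration; 498
bytes is the usual relay-cell payload size, and the one-byte "more
fragments follow" flag is just an assumption):

    RELAY_PAYLOAD_SIZE = 498  # usable payload bytes in a relay cell

    def split_into_cells(data):
        """Split data into relay-cell-sized fragments. Each fragment
        is prefixed with one byte: 1 if more fragments follow, 0
        otherwise."""
        chunk_size = RELAY_PAYLOAD_SIZE - 1  # reserve a byte for the flag
        chunks = [data[i:i + chunk_size]
                  for i in range(0, len(data), chunk_size)] or [b""]
        return [bytes([1 if i < len(chunks) - 1 else 0]) + chunk
                for i, chunk in enumerate(chunks)]

    def reassemble_cells(cells):
        """Inverse of split_into_cells()."""
        return b"".join(cell[1:] for cell in cells)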

| PROPOSAL 142: Combine Introduction and Rendezvous Points
|
| [...]
|
|   - Chris -- it would be helpful if you could summarize more detail
|     from your thesis about the relevant timing issues.  Since the
|     point of this proposal is to reduce latency, we really need to get
|     all the measurement we can of its efficacy.  (If you can include a
|     URL for the thesis and a page reference, that would help people
|     who don't have a copy on hand.)

The measurements you are looking for are included in Section 1 of the
NLnet mid-July report:

http://freehaven.net/~karsten/hidserv/discussion-2008-07-15.pdf

Christian might also make his diploma thesis, which contains a more
detailed description of his changes, available somewhere. We haven't
talked about that yet, but we will do so next week.

|   - As I read it, I don't see how the proposal results in a separate
|     circuit existing from the hidden server ("Bob") to the client. [...]

You raise an important point here that we weren't aware of before.
Christian implemented the protocol described in proposal 142 only to
measure the performance of single requests. AFAIK the hidden-server-side
introduction circuit was thrown away after a single use and rebuilt.
This needs to change for proposal 142, of course.

The only efficient solution that comes to my mind is to tunnel multiple
"inner circuits" through the circuit between introduction point and
hidden service. That might add a fourth layer to the terminology of
connection, circuit, and stream, though.
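
To make the "inner circuits" idea slightly more concrete: the circuit
between introduction point and hidden service would carry cells for
several logical circuits, distinguished, for example, by a small
inner-circuit identifier in front of each tunneled payload. A toy
Python sketch of such a multiplexing layer (purely my own illustration;
nothing here corresponds to existing Tor code):

    import struct

    class InnerCircuitMux:
        """Multiplex several "inner circuits" over one outer circuit
        by prefixing each payload with a 2-byte inner-circuit ID."""

        def __init__(self, send_on_outer_circuit):
            self.send = send_on_outer_circuit  # callback taking raw bytes
            self.handlers = {}                 # inner circuit ID -> callback

        def open_inner_circuit(self, inner_circ_id, handler):
            self.handlers[inner_circ_id] = handler

        def send_cell(self, inner_circ_id, payload):
            self.send(struct.pack("!H", inner_circ_id) + payload)

        def receive(self, data):
            inner_circ_id, = struct.unpack("!H", data[:2])
            self.handlers[inner_circ_id](data[2:])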

I'm afraid of the additional code complexity for this one. So it might
turn out that we can't handle it within the NLnet project and have to
concentrate on other design changes to make hidden services faster.
We'll have to decide that by mid-August. Maybe there will be new
insights by then.

| PROPOSAL 143: Distributed storage improvements
|
| This is an omnibus proposal with 8 separate ideas.  Going one by one:
|
|  1. Report Bad Directory Nodes
|
|     This seems like a fine idea, though the additional complexity is
|     not insignificant.
|
|     I worry that a clever adversary could distinguish a publication
|     attempt from an HS authority.  After all, hidden services do not
|     generally upload the same descriptor twice: if somebody sends you
|     a descriptor shortly after you received the same descriptor, and
|     it's a descriptor you're trying to censor, you can tell it's the
|     authority.

Agreed, the adversary might be able to tell publication attempts coming
from a DA apart from those of a hidden service. The question is whether
this matters that much. The adversary still can't tell *fetch* requests
from the DA apart from those coming from real clients. What should the
adversary do? Deny the existence of the descriptor to carry out its
attack, or return it to satisfy the DA? That's quite a gamble. If the
consequence is that the rogue hidden service directory serves the
descriptor afterwards, that is all we wanted to achieve. We might not
have confirmed that the directory is bad, but the service is not
blocked.

After all, this measure was designed to defend against adversaries
trying to block a specific hidden service, not against adversaries
blocking all descriptors indiscriminately. An adversary doing the
latter is an annoyance, but that's part of the reason we have
replication. What is more dangerous is an adversary that starts a
number of hidden service directories 24+ hours in advance with the
appropriate IDs to become responsible for all descriptors of a hidden
service and then blocks those descriptors. It sounds like a huge
effort, but if you can block a popular hidden service that way, why not
try it?

The original reason for the re-publication performed by the DA was so
that a rogue hidden service can't blame a correct hidden service
directory. Therefore the DA needs to publish the descriptor itself to
be sure that it was actually published. Maybe we should even drop the
30-minute delay in re-publishing the descriptor by the DA. Then the
(potentially rogue) hidden service directory will know what the DAs are
up to, but so be it.

|     The "blacklist all nodes in the same /24 or /16" rules seem far
|     too harsh: they let an adversary cut out huge swaths of the
|     network using only one or two targeted hosts.

Heh, maybe that was just me imagining that the average adversary
controls a /16 net. ;) Okay, what would be better numbers here? /28 and /24?

|     The voting rule listed makes the BadHSDir flag follow different
|     rules from all other networkstatus flags.  This would require a
|     version bump in the voting method.

Okay, are there any reasons not to bump the voting method version?

|  2. Publish Fewer Replicas
|
|     This is worthwhile, but no reason is given to think that the 85.7%
|     reliability figure will hold given future networks and network
|     conditions. It would be better to look into adaptive solutions
|     that will continue to work no matter what the reliability is in
|     the future.  See my recent comments on proposal 151: most apply
|     here.

That's a fine idea! The directory authorities could vote on the
currently required replication rate and put that number in the
consensus. That way hidden services and clients would learn how many
replicas should exist.

What about the following plan: we skip the actual process of
determining the replication rate and put the fixed number 4 in the
consensus, which works for the data from January to March 2008. Hidden
services and clients start using this value beginning with hidden
service protocol version 3. At a later time we teach the DAs some smart
way to calculate the optimal replication rate and include that instead
of the static number 4. That could well happen in 0.2.2.x, but it would
be recognized by 0.2.1.x hidden services and clients as well.
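
To illustrate how a consensus-supplied replica count would be used,
here is a rough Python sketch of the v2 descriptor-ID computation with
the replica index as the varying input (based on my reading of
rend-spec; the field widths and details are from memory and should be
double-checked):

    import hashlib
    import struct
    import time

    def descriptor_ids(permanent_id, descriptor_cookie, n_replicas,
                       now=None):
        """Compute the descriptor IDs of all replicas of a hidden
        service descriptor; n_replicas would come from the consensus
        (e.g. 4, as suggested above) instead of a hard-coded constant.
        permanent_id is the 10-byte permanent service identifier, and
        descriptor_cookie is b"" if no cookie is used."""
        now = int(time.time()) if now is None else now
        # Rotate the time period gradually over the day, depending on
        # the first byte of the permanent ID.
        time_period = (now + (permanent_id[0] * 86400) // 256) // 86400

        ids = []
        for replica in range(n_replicas):
            secret_id_part = hashlib.sha1(
                struct.pack("!I", time_period) + descriptor_cookie +
                bytes([replica])).digest()
            ids.append(hashlib.sha1(permanent_id + secret_id_part)
                       .digest())
        return ids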

|  3. Change Default Value of Being Hidden Service Directory
|
|     Seems entirely reasonable.  Overdue, even. :)

Great! Maybe we can convince some more 0.2.0.x relay operators to enable
that option, too.

|     BTW, how many of the numbers in the rest of this proposal are
|     derived from the existing HSDir nodes?  If the number of HSDir
|     nodes is small, then most of the measurements in the rest of this
|     proposal are based on a worryingly small sample set.

None of the measurements are based on the current set of hidden
service directories. The measurements I have performed are based on the
assumption that all relays with an open dir port and 24+ hours of
uptime are hidden service directories.

|  4. Make Descriptors Persistent on Directory Nodes
|
|     Plausible, but measurements are needed to make sure this is a good
|     idea.  If a server goes down, how often does it occur that it
|     starts up again in time to serve the hidden service descriptors
|     it's holding?  If the odds are good, this is a good idea.
|     Otherwise, not?

I can only guess here, but don't you think that a certain share of
relays will be restarted within 3 hours after going down? Maybe for an
update to a new kernel, a short power outage, or something like that?

Are there reasons to avoid making descriptors persistent, apart from
the necessary implementation effort? I could evaluate the measurements
to tell you that x% of all servers going down are back within three
hours. But if we don't care whether descriptors are persistent or not,
I'd simply implement it. :)

|  5. Store and Serve Descriptors Regardless of Responsibility
|
|     Good idea.  We need an answer for DOS attacks here, though.

Hmm, you mean an adversary generating arbitrary valid descriptors to
overload a hidden service directory? That's quite an expensive DoS
attack, because generating descriptors requires some public-key
operations, and descriptors need to contain an up-to-date timestamp in
order to be accepted.

What we can do is introduce an upper limit for stored descriptors, like
5,000. With every descriptor requiring slightly more than 2K of
storage, a hidden service directory would never spend more than, say,
12M on descriptors. And if the hidden service directory then fails the
"bad directory node" test, so be it.

On the other hand, I'm never sure whether such DoS protections do more
good or more harm. It would give an adversary a better-defined target:
she would only need to send 5K descriptors to every directory node and
all hidden services would be offline. That's not what we wanted,
either.

Or did you mean something different by DoS protection?

|  6. Avoid periodic descriptor re-publication.
|
|     Good idea.  Seems obviously correct to me.

Yay!

|  7. Discard Expired Descriptors
|
|     Good idea.  Should descriptors contain an expiration time?

No, that's not required. There is a function get_seconds_valid() that
lets a directory node easily compute the remaining validity of a
descriptor while parsing and validating it.

|  8. Shorten Client-side descriptor fetch history
|
|     I don't understand this one fully, I think.

Hmm, let me rephrase: the current logic for clients trying to fetch
all replicas of a descriptor is to memorize a) which descriptor ID they
have requested at b) which directory node and c) when. If a fetch
request fails and they are looking for another replica of the
descriptor, they know that they don't have to try the first directory
node again.

However, the 15 minutes for which this information is memorized, as we
conceived it back then, can really be a long time. If a service is
started 6 minutes after the client has asked, that service will appear
unavailable to the client for another 9 minutes. This mostly affects
testing situations, for example when people try to access their own or
a friend's service to see whether it's working. That's why these 15
minutes should be reduced to 5 minutes. That should still be long
enough to avoid ending up in an infinite request loop.
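
Here is a rough Python sketch of that client-side fetch history with
the shortened expiry (my own illustration; all names are made up):

    import time

    FETCH_HISTORY_SECONDS = 5 * 60  # proposed value, down from 15 minutes

    class FetchHistory:
        """Remember which descriptor IDs were requested from which
        directory nodes, and when, so that failed directories are not
        retried within the expiry window."""

        def __init__(self, expiry=FETCH_HISTORY_SECONDS):
            self.expiry = expiry
            self.requests = {}  # (descriptor ID, directory node) -> time

        def record(self, descriptor_id, directory_node, now=None):
            now = time.time() if now is None else now
            self.requests[(descriptor_id, directory_node)] = now

        def already_tried(self, descriptor_id, directory_node, now=None):
            now = time.time() if now is None else now
            # Drop entries older than the expiry window.
            self.requests = {k: t for k, t in self.requests.items()
                             if now - t <= self.expiry}
            return (descriptor_id, directory_node) in self.requests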

This item does not have top priority, though. So, if you say you don't
understand this one fully because you don't see a clear benefit in it,
we can also leave it out.


Phew, I think that's it. Thanks again for your feedback, Nick! My
feeling is that all three proposals are a lot more sophisticated now.
That's great! :)

--Karsten




