Some comments on adversaries, etc.

Paul Syverson syverson at itd.nrl.navy.mil
Thu Aug 1 22:24:12 UTC 2002


Hi all,

Here are some comments/a draft of something/I don't know.  I would
like to make it cleaner, but wanted to let others see it and (again)
need to put it down for a few days or so for other things.

aloha,
Paul

----------------------------------------------------------------
I have been thinking alot about two things of late:
Adversaries and Architectures

What I say here is not a direct response to the various postings
people have had about padding etc., and some was written before I
saw some of them. But overall, it is certainly influenced
by them, and by some conversation with Roger and Rachel at the FTN
PI meeting. This is actually an attempt to pull several fractional
notes together that I have scribble since last week.


\section{Adversary}

I have a basic problem with the idea of global passive adversaries.
As an academic exercise, it seems fine, but it is hard for me to
imagine an adversary that is powerful enough to be global but weak
enough to be entirely passive. Somebody (Roger?) was saying to me
recently that it generally easier to modulate communication on a given
wire than to eavesdrop on it. I bow to any greater applied expertise,
but that makes sense to me. The only way I could see something global
and passive making sense is if the adversary cannot do anything active
for fear of detection. Again, this sounds too weak. In what context is
our intrusion detection that good, or can we detect any network insider
manipulation whatsoever? Where does that leave us? The global passive
adversary is a fairly clean notion so perhaps it should still be
pursued for abstract analysis purposes, but I need way more convincing
than I've seen to design against it. So, I am thinking that in many
ways we are currently barking up the wrong creek without a paddle to
stand on. If you think I am right, then the rest of you need to change
the direction of your conversation. If you think I am wrong, please
shout at me in meaningful ways. It's probably more subtle than that
but I'm just going to plow on. (Inside joke to Roger, "It depends.")

I propose the following (incomplete) list of adversary parameters
(I originally wrote `taxonomy' instead of `list', but it is clear
that some of these overlap. Fixing that, if desired, can be done later.)

Active vs. Passive

Distribution

   -local
   -fractional
      -absolute fraction
      -fixed number in unbounded set
   -global


Confirming vs. searching

   This is a standard one for onion routing, clearly linked to both
   distribution and insider vs. outsider, but somehow it seemed like a
   preferable way to break things. If we ultimately find it more
   useful to define it in terms of those other things, fine.

Insider vs. outsider (wrt network elements)
Insider vs. outsider (wrt path elements)

static vs. roving

growth (none vs. fixed bound vs. tick bound, i.e., growth_bound/tick
        of some kind)

Where does a roving adversary make sense? Where a global passive?
Where a global active? Where a local confirming? Lots of questions
here. 

On the active vs. passive there are probably lots of refinements:

   - tagging of packets                                                        
                                                                               
   - modulating flow an individual circuit                                     
                                                                               
       * breaking and individual circuit                                       
                                                                               
   - modulating flow on a multiplexed circuit                                  
                                                                               
       * breaking a thick pipe (erstwhile known as the destroy flood
         problem)
                                                                             
It would probabl be good to set out these threats fairly well.
Then I think we can design against them.

I would like to alter basics of onion routing as little as
possible. See what we can defeat, e.g., simply vi
a topology, what we can't, and what it would take to change
things.

By local confirming, I mean an adversary who is interested in
particular communicants, e.g., suspects a particular client or
site of going to a particular host. Perhaps that's not completely
local since it must be at two locations: observing at both, or active
at one and observing at the other, or some combination of these.
This seems like a real threat that OR might ought to guard against.

Consider some entity running capable of running their own OR on an
enclave firewall, they want to make sure that if they go to some
Web site the second node can't confirm that they are going to that
Web site. We should assume the Web site capable of inducing volume
and timing signatures on the connection. What could we do against that.
An architecture is discussed below.


\section{Architecture}

I believe that all of the padding/dummy schemes that we have discussed
are completely broken against an active adversary capable of inducing
any timing signatures. I'm going to go to another extreme and think
about no padding but how to do synchronization. Different
tradeoffs. This will probably be less palatable than padding for
clients but more palatable for their carriers/ISPs etc. Something to
keep in mind is that our intended market should be the strongly
paranoid. If you aren't strongly paranoid, you should just use the
anonymizer. If you are, you should be willing to pay a price (as small
as we can make it, but a price). We should also avoid
tragedy-of-the-commons problems if possible, which I think most of the
padding schemes do not. End of current slew of polemical comments.

\subsection{Web/FTP/etc Architecture}

Have a server that loads any desired file sliced into reasonable sized
chunks.

No problem with it doing that, everybody knows how to get to it and
what it does. It's completely visible. Let's leave integrity of data
it does this to aside for the moment.

So, make an OR connection to the slice server requesting a file.  If
reply onions are available you can give it one for when it has
downloaded and processed the file.  If not, you can just go back and
check after a reasonable time.  You then download the pieces, spacing
out the requests as much as needed to avoid obvious correlations back
to you.  If desired you could have either a finished message (might
need some authentication relation to the original request.) after
which data can be cleared out from the server.

We should be able to do something to make it so that the server
doesn't know what he's holding or even if things are related. It
should be an easier problem than censorship resistant storage, but
related.  Server gets the instruction, downloads the thing, cuts and
obscures it, then notifies the client. Client might send a bunch of
onions. These are used to send the pieces to several other
servers. Last encryption in each onion is for the client. Client can
then download from the various places. Add details.

This is moderately complex, but does allow relatively fast download
for someone who is pretty paranoid such that just going through
OR (mixminion?) isn't enough already.

\subsection{Synchronous Cascade to Slice Server}

Here clients send in query cells at a uniform rate. Cascade head only
forwards to next cascade member when all queries from a fixed input
set are received. Some flexibility might be possible by having a
threshold

   If some small number of clients have not sent by a certain time,
   the `mix' can still fire. Question about whether those circuits
   need to now be dropped for the duration of the
   session. Probably. Once some lower bound of involvement is crossed,
   the session terminates (suspends?). Have to watch how this is
   handled to avoid the sort of attacks that have been found on Crowds
   reforming---or figure out if that's even an issue, are we
   vulnerable to this type of long-term profiling.

Responses also are handled in a uniform way. Each circuit is sent the
same number of cells at each time, and the tail only fires when all
the circuits are filled. Here we might be able to get away with
padding from the end proxy (or slice server), but we need to be
careful that these can't gain information by noting repeat queries for
the stuff not returned. This might be OK given the threshold ideas
above.

\subsection{Cascades vs. Free Routes}


Stuff in penultimate section above will diminish the ability to
confirm a download if people can't tell what is being confirmed. The
location of the particular slice server could be hidden
also. Send an OR request to a server to download the stuff/slice it
up/and put it in the appropriate place via an onion given to it. Then
make an OR connection to the storage site(s) to download the
pieces. Note this is weaker than censorship resistant publishing
because we are assuming that nobody is trying to provide general
access to something that should be irremovable. It is only the
requester that should be able to get it. At this point it becomes
pretty hard for a given Web site to confirm who requested it.

This and our more elaborate location-protected server example
(discussed amongst some of us in other contexts) are ones where free
routes seems appropriate.  You don't want somebody to be able to
search around and find the endpoints.  Another example would be the
road warrior.

Cascades make it trivial to confirm the endpoints modulo some complete
guarantee of synchronization from the adversary's perspective, that is
he can't see any asynchronicity (it doesn't matter whether this is
achieved via padding as long as the adversary can't break that
apparent synchronization). In general, cascades seem to hose location
protection, so this couldn't help w/ location protected servers and
would undermine such dual usage (by dual usage I mean for traffic
analysis resistant file downloading and servers that are DoS resistant
via location protection).

Another topology concern is that just having a cascade seriously
limits the the anonymity set. We could have a hydra scheme (short free
routes into a cascade) but I'm concerned about timing attacks if the
synchronization isn't carried all the way to the client.  I think as
long as clients are ping pong about queries and responses we might be
OK. Maybe we can optimize in some way, but not now.

This make me realize I never described my two headed hydra cascade
notion: basically short free route into a high bandwidth cascade to a
short free route at the other end.  Can we leverage the advantages of
both (free routes and cascades) by doing this?


Other notes I didn't know where to stick, but didn't want to throw
away yet:

Talked tao Roger and Rachel about padding at FTN PM 7/23/2002

I sketched my idea about why sending padding to all your neighbors when you
send true data to any of them essentially telegraphs the whole route to any
badguy ORs in the network (assuming a clique). Otherwise this might be limited
by the badguy neighbors rather than the whole network. They agreed that this
is bad. Rachel described this as causing little poofs as the connection is
established across the network nodes.

We were trying to figure out both a realistic threat that was countered by
padding and then figure out if the padding helped.

Some scenarios

Somebody is watching cnn.com, say some guy in China.  In this context, onion
routers are going to be verboten anyway.  If he can use stego, triangle boy,
what have you to get past the censors then he is already pretty much home and
doesn't need us.

A road warrior who is logging into his home enclave with OR on the OR firewall.
Nobody is likely to be watching his connection to the network because they
don't expect him there.

Someone is gather intelligence on some web site owned by the
adversary.  (This one was discussed above.)  Note that somebody who
owns no nodes but can induce timing attacks on OR links from the
intelligence gatherer's OR can learn all: she doesn't need to
own any nodes.

Timing signatures seems like a tough thing to beat. Some of the things I
thought might help don't seem to.
-constant padding connections to a set of "friends" either neighbors or nodes
that we make OR connections to. Bad guys can still do the timing thing.
-likewise a frequency hopping scheme: Roger explained how this would still
allow strong timing signals

If you are willing to take a performance hit, you can put enough noise in at
honest nodes to complicate this. Roger argued for batching and delaying. I
argued for more random noise, but I think he was probably right.
(Looking back, I gues this is what lead to the design sketched above.)


Roger says:
Can we build a feasible synchronous system? Synchronizing for whole network
is O(n^2) but with cascades it's only O(n).

Is this feasible for us?  Biggest problem may be e.g. the timing
signature that any public info source puts on its responses. If the
first and last nodes in the cascade collaborate then any padding to
deal with this will allow them to reveal all (unless users generate
the padding).  Likewise for closing circuits that site doesn't send
data fast enough for.

P: Where is the padding removed here?

R: The last node. And it must add (to be indistinguishable) padding in on
the way back.

P: That might do it as long as the last node can't tag it for the
first. But... if the penultimate node were responsible for padding
management.  No, that won't work because the last node can just put in
its own (tagged?)  padding that will look like real data to the
penultimate node, thus the penultimate node will not pad. If padding
at the last node won't do it, we're probably sunk.  If the crypto
works, and there is no need to decide whether a packet is valid (other
than at the user), tagging can never be detected except by the user,
who is not the adversary.


This is the old saw about: if the endpoints are compromised you're toast, if
not you're OK.




More information about the tor-dev mailing list