[tor-teachers] Surveillance, Privacy, Networks and Encryption

Paul Syverson paul.syverson at nrl.navy.mil
Mon Jan 11 20:24:11 UTC 2016


Only had time for a very cursory glance, but here are a few comments.

typo
"the first relay decrpts the communication and sees that"
decrypts

Despite what you see in the press "Tor" is for "The Onion Routing"
not "The Onion Router"
See https://www.torproject.org/docs/faq.html.en#WhyCalledTor
Or for more detail, p. 129 of 
https://www.acsac.org/2011/program/keynotes/syverson.pdf

More importantly, Tor is not a mix network. A mix is not a network of
proxies. A mix is something that takes in a bunch of messages and
spits them out in a way that destroys association of which input
message went with which output message. Onion routers might mix
messages but typically do not, in particular the onion routers that
comprise Tor's relays do not do any mixing. Mixes may operate alone or
in a network to distribute trust. Mixes assembled in a network may be
arranged as cascades (fixed and shared routes for all messages
entering the network together) or use free routes (just as an onion
routing network such as Tor does).
See also p. 125 of the above paper as well as others.

HTH,
Paul


On Mon, Jan 11, 2016 at 04:42:54PM +0000, Hugo Maxwell Connery wrote:
> Hi,
> 
> See attached for a blurb on the internet and tor.  About 6.5 pages.
> 
> All comments welcome.
> 
> Hugo Connery

> 
> Surveillance, Privacy, Networks and Encryption
> 
> 
> The Internet, which was created by the US Dept. of Defence
> as a fallback communications network during the cold war,
> has become a very successful network.  It now carries
> a very large share of all inter-human communication
> (4th generation mobile phone networks use the internet
> as a backbone, being packet-switched rather than circuit
> switched, as do email, internet relay chat and on-line business),
> and serves as a "library in your home/work place" (search). 
> 
> The success of the Internet is a clear argument in support
> of standards, by which disparate systems can inter-operate.
> One reads the "here is how it works" document and 
> can then have one's own thing work with the other things
> without any understanding of how the other things work.
> 
> It is worth comparing the expansion and growth of 
> the Internet with that of other networks, like Google and Facebook,
> as all are successful but each uses different mechanisms
> for its growth.  The core understanding is that the
> value of a network is the number of "people" using it.
> This reflects the "competition is a sin" mantra of 
> oligarchs.  More on that below. **
> 
> Additionally, the huge concentration of communication
> via a single (well, multiple but consistent) mechanism
> leads those who wish to understand humans by their interaction
> to wish to have access to those communications.  Thus,
> advertisers and intelligence agencies must have access
> to either the raw communications or summaries thereof
> that fit their needs.  There must be other actors too:
> polling agencies, large corporations etc.
> 
> Moore's law, the prediction that computing would continuously
> improve in speed and capacity, doubling every 18 months,
> has largely been confirmed for two decades.  The result of
> this is that it is easier to capture the communications of
> an entire population and filter that information after capture
> than to target specific communications.  This results in
> mass surveillance -- it is cheaper.
> 
> Given the above, the author hopes that it is clear that
> the Internet is now a global surveillance platform.
> 
> If one understands the risks of global surveillance one
> may wish to counter that threat.  A little thought can
> quickly clarify the risk: imagine that someone knew 
> the date, time, and place of every single internet
> search that you have ever made.  They would surely know
> much about you, and that information would be useful to
> some groups (see above for advertisers and intelligence
> agencies).
> 
> But, to counter the threat, one must understand the 
> Internet itself.  Not all of it, just how it works.
> Thus, one must read and understand the standards documents.
> 
> The following is the author's best effort at rendering
> the contents of those documents understandable by persons
> with little expertise in the engineering fields of 
> computing and network communications.
> 
> Let's start with the most trivial thing that persons do
> on the internet -- they start a browser, and that browser
> loads the default page for a particular internet site.
> 
> What happens is as follows:
> 
> 1. The computer operating system that you are using
> asks its long term storage device (disk) for the program
> (the browser) and loads that into the short term storage
> (memory) of the computer, and then starts running that
> process.
> 
> 2. The program asks the operating system how it
> can turn internet names (e.g. site.net)
> into numbers, so that the program can then use the internet
> communication protocols to talk to that number (address).
> 
> 3. The program then submits the name (site.net) to the
> place that the operating system told it to use to turn
> the name into a number, and waits for an answer.
> 
> 4. Upon receiving an answer (a number) the program then
> starts talking to the number (which it thinks is "site.net")
> and asks it for its default "page".  This "page" is a 
> document written in HTML which the program (browser) can
> then display.  The communication from the browser runs
> through the operating system (which controls the network
> interface -- wire or wireless) and reaches the end point
> (site.net's number) and the remote site replies.  In all
> of these communications between the browser and the end
> point each piece of communication displays the source of 
> the communication (the address of the computer running
> the browser) and the end point of the communication (the
> number that the browser got from the name to number translation).
> 
> And there is the surveillance.  Each piece of communication
> on the internet describes in clear readable form the addresses
> of both the source (your computer) and the end point (site.net).
> 
> Thus, anyone watching the communication knows that one address
> is talking to the other.  This was true from the earliest
> incarnations of the internet up until the publication of
> this text.  Anyone watching the "wire" (backbone communications 
> channel) knows who is talking to what.
> 
> To understand further, we must look at "how" your computer
> is talking to the site.  Internet based communications use
> a model called the seven layered stack.  This is one of 
> the most beautiful things created by computer/network
> communications science.  We need only understand a portion
> of it.  Note that it is this "OSI model" that allows your
> browser to talk to web sites via wired networks, wireless
> networks, mobile phone networks and many others.  All of
> the dirty details are hidden from the browser and handled
> at other layers.
> 
> One of those layers is the "encryption" layer (more precisely,
> the "presentation" layer).  The end point to which your application
> is talking may offer (or demand) some form of encryption.  If
> this is the case, your application (the browser or other program)
> may be able to meet this offer or requirement.  If so, then
> the people watching the wire will know where/what you are 
> communicating with, but not what you are saying.
> 
> The most common form of Internet based encryption happens
> via the https protocol.  This is different from the older
> but, unfortunately, still too commonly used http protocol.
> It is the 's' in https that makes the difference.
> 
> In http the watcher knows both who is talking to whom and
> what they are saying.  In https the watcher only knows who
> is talking to whom but cannot understand the conversation
> (unless they know how to break the encryption).
> 
> To clarify this, imagine that you are visiting a site which
> has recipes.  In http the watcher knows both that you are
> visiting the recipe site, and which recipe you are looking
> at.  In https the watcher only knows that you are visiting
> the recipe site, but not which recipe you are viewing.  In
> a banking analogy, they know that you are visiting a specific 
> bank, but know nothing about which account or transaction you are
> accessing/executing.
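
The http/https difference described above can be sketched in a few
lines of Python.  This is only an illustration, not how any real
capture tool works: the Packet class, the watcher_view function and
the addresses (taken from the reserved documentation ranges) are all
made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src: str         # address of the computer running the browser
    dst: str         # address returned by the name-to-number lookup
    payload: str     # what is being asked for
    encrypted: bool  # True models https, False models http


def watcher_view(packet: Packet) -> dict:
    """Return what someone watching the wire can read from one packet."""
    return {
        "src": packet.src,  # always readable
        "dst": packet.dst,  # always readable
        "payload": "<unreadable>" if packet.encrypted else packet.payload,
    }


# Hypothetical traffic to the recipe site, once over http, once over https.
http_packet = Packet("192.0.2.10", "198.51.100.7",
                     "GET /recipes/apple-pie", encrypted=False)
https_packet = Packet("192.0.2.10", "198.51.100.7",
                      "GET /recipes/apple-pie", encrypted=True)

print(watcher_view(http_packet))
print(watcher_view(https_packet))
```

In both cases the watcher reads the two addresses; only the payload
differs.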
> 
> So, to a person who is concerned about surveillance, the 
> use of http should be of great concern.  Https offers far
> more security (which is what the 's' stands for) but is 
> still of some concern -- they know who/what you are 
> communicating with.
> 
> Before continuing on the remaining surveillance problem
> of watchers knowing what/who you are communicating with, it
> behooves me to expand on the 'security' in https.  It does
> 3 things.  Encryption (ensuring that watchers don't understand
> what is being said) is only one of them.  The acronym,
> fittingly, is CIA: Confidentiality (that's the encryption),
> Integrity and Authenticity.  Integrity guarantees that 
> if the 'watchers' (or anyone in the network path) modify
> the data sent then the entire communication will fail.
> This ensures that you get what they sent, and vice versa.
> Authenticity means that you are really talking to who
> you think you are.  It is possible for watchers (or others)
> to fake the communications and sit in the middle and 
> forward messages back and forth between you and your
> desired end point.  This is called a Man in the Middle
> (MITM) attack, and is deadly.  You think all is fine,
> but the attacker is reading and/or changing anything
> and you don't know.  Authenticity is still a major problem
> for the internet, but its discussion is beyond the scope
> of this article.  But, be confident that encryption and
> integrity are problems that have been solved by the
> cryptographic community.  Problems continue to arise,
> but they are largely due to outdated solutions continuing
> to be used.
> 
> The solutions to the "we know who you are talking to" problem 
> are known as proxies, VPNs or mix networks.
> 
> A proxy is a computer which will shuttle messages on your
> behalf.  You talk to the proxy and say "I want to access
> site.net".  The proxy takes this communication and says
> to site.net "I want your default page" and when it receives
> that information from site.net sends that back to you.
> 
> Now, the watcher sees you talking to the proxy rather
> than site.net.  If you are using an unencrypted protocol
> to talk to the proxy, the watcher knows everything; that
> you are using a proxy, what you really want to talk to,
> and what answer you want, and what the site sent back.
> With encryption, the watcher needs also to be able to 
> watch the proxy to get much of this information, but if
> they can watch the proxy, there is no value gained.
> 
> The specific risk case is that the watcher controls the
> proxy, in which case they can identify the sender and 
> recipient, and if the communications are unencrypted,
> the communications too.
> 
> A VPN is just like a proxy, but it automatically involves
> encryption between you and the VPN.  Again, this is of
> no value if the watcher can also watch the VPN; they
> know you are talking to the VPN, and with timing analysis
> can know what you really wanted to communicate with, and
> what you asked/what the response was if the communication
> is unencrypted.
> 
> A mix network is a collection of computers that extend 
> the idea of a proxy to multiple steps.  Thus, instead of 
> just shuttling your communication via one intermediary
> (a single proxy) via the mix network you shuttle it via
> multiple intermediaries.  Thus, to do surveillance the 
> watchers need to watch all the proxies, which makes life
> harder for them, as the mix network's proxies may be located
> in very geographically distributed locations.  The "proxies"
> in a mix network are known as relays (i.e.
> they relay the communication sent by you amongst themselves).
> 
> All modern mix networks also mandate encryption between
> you and the start of the mix network, and between each
> relay of the mix network.
> 
> The largest and most used modern mix network is called
> Tor (The Onion Router).  Tor is a volunteer community
> driven network, to which anyone can contribute.  Anyone
> includes both people who wish to help maintain some form
> of anonymity for internet usage, and their adversaries
> (i.e. intelligence agencies and law enforcement).  This 
> is a problem that the Tor community is aware of and tries
> to combat (abusive participation).
> 
> Tor has a collection of partially trusted computers called 
> the "Directory Authorities".  These computers know the 
> collection of relays which make up the network.
> 
> When you connect to Tor the following happens:
> 
> 1. you contact a directory authority and ask for a list
>  of all of the nodes in the network
> 
> 2. you select 3 nodes from that list, preferring to stick
> with the first one if you have used it before and it is
> still there
> 
> 3. you contact each of those nodes, in order, through 
> the first, and ask for their cryptographic information.
> With this, you have formed a "circuit" from you via
> 3 relays, and have the information to be able to encrypt
> communication from you to each of those relays.
> 
> 4. you visit a site:
> 
> a) your request is encrypted to the 3rd (last) relay so
> that it can decrypt that request and send your communication
> onwards to the site
> 
> b) you form a request for the second relay to forward
> some communication to the third relay.  That request to be
> forwarded is the previously encrypted communication to the 
> third relay.  You then encrypt that whole thing with the 
> cryptographic details for the second relay.
> 
> c) as above, you create an encapsulated, encrypted communication
> for the first relay which asks it to send the above on to
> the second relay.
> 
> d) you send the above information to the first relay.
> 
> e) the first relay decrpts the communication and sees that
> it should send something that it does not understand to the
> second relay.  It does so, and the second relay decrypts 
> what it gets.  It sees a request to forward an encrypted
> communication that it cannot understand to the third relay.
> It does so.  The third relay decrypts what it receives and 
> it knows what you wanted to do -- get the default page from
> site.net.  It contacts site.net and gets that information.
> 
> f) the third relay encrypts the data from site.net with 
> the key it shares with you and sends it to the second relay.
> The second relay cannot read it; it adds its own layer of
> encryption and sends the result on to the first relay.  The
> first relay adds a third layer and sends that to you.
> You remove all three layers and the page is displayed.
> 
> As you can see in steps a), b) and c) the client (you)
> is multiply encrypting communication to different parts
> of your circuit through the Tor network.  The "layer on 
> layer" encryption is what gives Tor its name: The Onion
> Router.
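
Steps a) to e) above can be sketched as a toy in Python.  This is
emphatically not Tor's actual protocol or cryptography: the cipher is
a throwaway XOR keystream built from SHA-256, and the key values,
relay names and the wrap/peel helpers are all invented for the
illustration.  It only shows the "layer on layer" idea, where each
relay can remove exactly one layer and learns only the next hop.

```python
import hashlib
import json


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from `key` (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(key: bytes, next_hop: str, inner: bytes) -> bytes:
    """Add one layer: 'forward `inner` to `next_hop`', encrypted under `key`."""
    envelope = json.dumps({"next": next_hop, "inner": inner.hex()}).encode()
    return toy_crypt(key, envelope)


def peel(key: bytes, blob: bytes):
    """Remove one layer; returns (next hop, still-wrapped inner message)."""
    envelope = json.loads(toy_crypt(key, blob))
    return envelope["next"], bytes.fromhex(envelope["inner"])


# Hypothetical per-relay keys the client obtained while building the circuit.
keys = {"relay1": b"key-one", "relay2": b"key-two", "relay3": b"key-three"}
request = b"GET default page"

# Steps a), b), c): wrap for the third relay first, then second, then first.
onion = wrap(keys["relay3"], "site.net", request)
onion = wrap(keys["relay2"], "relay3", onion)
onion = wrap(keys["relay1"], "relay2", onion)

# Steps d), e): each relay peels its own layer and forwards the rest.
hop, blob, route = "relay1", onion, []
while hop in keys:
    route.append(hop)
    hop, blob = peel(keys[hop], blob)

print(route, hop, blob)
```

Only after the third layer is removed does the original request and
its destination become visible.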
> 
> The end property is that the only part of the entire exchange
> that knows who is talking to who is you.  You know yourself,
> the three relays, and the end point (5 things talking).  But,
> each relay only knows two things.  The first relay knows you
> and the second relay.  The second relay knows the first and 
> the third (but not you or the end point), the third relay 
> knows the second relay and the end point (but not you or the 
> first relay).
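
The "each relay only knows two things" property can be written down
directly.  A minimal sketch, with the path and the neighbours helper
invented for the illustration:

```python
def neighbours(path):
    """Map each intermediate node to the (previous, next) hops it can see."""
    return {
        node: (path[i - 1], path[i + 1])
        for i, node in enumerate(path)
        if 0 < i < len(path) - 1
    }


# The five parties from the paragraph above.
path = ["you", "relay1", "relay2", "relay3", "site.net"]
knows = neighbours(path)

for relay, (previous, following) in knows.items():
    print(f"{relay} sees {previous} and {following}")
```

No single relay's pair contains both "you" and "site.net".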
> 
> For a watcher to get useful information out of this set up
> they need to watch your first relay and your third relay.
> With that, and using timing, they can with some probability
> determine that you were talking to a specific end point, and
> if the end point was not using encryption, what you were 
> saying.
> 
> There are a number of possible attacks against the Tor 
> (or other mix) network(s).  But, before that is considered,
> take a moment to think.  
> 
> * If you are not using Tor and you are using http, then any
> watcher at any point in the communication knows everything.
> 
> * If you are not using Tor but are using https, then every watcher
> at any point knows who you are talking with.
> 
> * With Tor and just http the watchers need to watch two things,
> the first and third relay, which are statistically likely to be
> geographically disparate.  If they do this, and can do the timing
> correlation analysis then they can have some confidence that it
> was you talking to the end point and have the same confidence in
> what you said.
> 
> * If you are using Tor and https, the only entities that know
> what you said are you and the end point, and watchers who can
> surveil both the first and third relay and who can do the timing
> correlation have some probability of confidence that you were
> saying something unknown to a specific end point.
> 
> A reader may have noticed that the weakness of this mix network
> is the timing itself.  The original mix networks were designed
> for email.  They would wait until a buffer of messages was filled
> (or some long timeout occurred) and then send the messages on in
> mixed order.  This is a high latency (i.e. messages might take
> some time to go through) network.  Tor is a low latency anonymity
> network, and is thus always vulnerable to timing attacks by
> a global adversary (who still has to do considerable work).
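
The buffer-then-shuffle behaviour described above can be sketched as
follows.  This is a simplified illustration, not the design of any
particular deployed mix: the ThresholdMix class and its threshold
parameter are made up, and the timeout case is left out.

```python
import random


class ThresholdMix:
    """Hold messages until `threshold` have arrived, then flush them shuffled."""

    def __init__(self, threshold, rng=None):
        self.threshold = threshold
        self.buffer = []
        self.rng = rng or random.Random()

    def receive(self, message):
        """Accept one message; return the flushed batch, or [] while waiting."""
        self.buffer.append(message)
        if len(self.buffer) < self.threshold:
            return []  # high latency: nothing leaves until the batch is full
        batch, self.buffer = self.buffer, []
        self.rng.shuffle(batch)  # output order no longer reflects arrival order
        return batch


mix = ThresholdMix(threshold=4, rng=random.Random(0))
outputs = [mix.receive(m) for m in ["a", "b", "c", "d"]]
print(outputs)
```

The first three calls return nothing; the fourth releases all four
messages at once in shuffled order, which is what removes the timing
link between an input and an output.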
> 
> The above Tor use of encryption and the addition of deliberate
> delays is the best known form of anonymity network.  Assuming that
> the software is accurately doing what it purports, and that the
> inter-network encryption is strong, and that the end point
> supports strong encryption, then it is almost impossible to
> identify the sender, the recipient and the message.  The ability
> to decipher the message depends on the end point's choice 
> of encryption (and security of its key).  The sender and recipient's
> identity are secured by the mix network and its delays and 
> encryption.
> 
> To attack the Tor network, there are various options.  The
> biggest hurdle is the design of the network itself.  One can
> deploy sufficient network monitoring equipment to monitor the
> entire internet (difficult) and then perform correlation analysis
> (expensive).  The cost of doing both to de-anonymise all Tor
> traffic is prohibitive, which is supported by the fact that there
> is no evidence of a Tor deanonymisation that was not helped by
> people using poor "operational security" practices (more on that
> later).
> 
> The obvious attack is against the software itself.  Tor is a
> very successful network, but is developed by relatively few
> people.  This makes it more difficult to gain acceptance in
> the community and then submit a software
> change that would not be checked sufficiently.  Possible, but
> difficult.
> 
> The next most obvious attack has happened several times; 
> create a large number of Tor relays and submit them to the 
> network.  This is the "timing" attack, but instead of just
> watching the network, one participates in it.  This is analogous
> to the "watcher actually runs the proxy" problem noted above,
> and again highlights the importance of the authenticity 
> problem (which largely remains unsolved).
> 
> Say you build and submit enough relays to control a third of
> the Tor relays.  On pure probability you will control both the
> first and third relay in a third times a third of the circuits
> through the network.  Thus you could
> de-anonymise a ninth of the network.  This is why "first" 
> nodes (called guard nodes) are preserved.  People will generally
> continue to use pre-used first nodes.  Thus, new elements of
> the network are less likely to become first nodes and thus
> the scope for correlation attacks is reduced.  Additionally,
> Tor clients routinely change their circuit.  Second and third
> nodes get changed every ten minutes or so.  Thus, a polluted,
> compromised community of relays has less chance to constantly
> watch (de-anonymise) persons.
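
The "third times a third" arithmetic above, as a small sketch.  It
assumes each relay in a circuit is picked uniformly and independently,
which is a simplification for illustration only (real Tor clients
weight relays by bandwidth and keep their guard nodes); the function
name is made up.

```python
from fractions import Fraction


def compromised_circuit_fraction(relay_share: Fraction) -> Fraction:
    """Chance that both the first and third relay belong to the attacker,
    assuming uniform, independent relay selection."""
    return relay_share * relay_share


share = Fraction(1, 3)
print(compromised_circuit_fraction(share))  # 1/9
```

Under the same assumption, an attacker controlling a tenth of the
relays would see both ends of only one circuit in a hundred.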
> 
> The last attack I will consider is the funding attack.  Tor
> was originally fully funded by the US Dept of Defence (Naval
> Research Labs).  Consider the problem that they were trying
> to solve: A government employee needs to submit data to the 
> government from a foreign location.  The foreign location may
> have no direct access to US government secure networks, but
> the internet is accessible.  How can the internet be used
> to allow a government agent to communicate with the government
> without allowing the foreign organisation to know that they
> are doing so?
> 
> Hence the above described three hop encrypted proxy setup
> that is Tor.  Assuming that the end point employs secure
> cryptography, the foreign organisation will have no idea
> about what is being said, or where it is going.  One can
> assume that if a variant of Tor is still being used by the
> US Dept. of Defence or other agencies, then it is as outlined
> above a high latency network.
> 
> Could the DoD have built in special access tricks?  Yes and
> no.  They may have been there (and may still be there) but
> it is increasingly difficult over time to maintain these 
> when a group of non-US technically skilled privacy activists
> control the project and its source code.
> 
> Can the funding donor prioritise the work that is done on
> the project?  Yes!  The US government continues to be a 
> major sponsor of the project and as such, they can direct
> attention away from areas which they wish to be untouched.
> But, beware this is a double edged sword.  Every vulnerability
> that they maintain is one that can be found by their adversaries.
> The project is public.  Its funding is public.  Its code and
> code reviews are public.  There are plenty of smart people
> out there who can find these "problems".
> 
> As much as the National Security Agency has shown itself to
> be more focussed on the "attack" side of its dual mission,
> the Tor project (from the Naval Research Labs) has placed
> itself in public hands and exposed itself to all of the
> disinfection of sunlight.
> 

> _______________________________________________
> tor-teachers mailing list
> tor-teachers at lists.torproject.org
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-teachers


