Hi Prateek, Yixin, (and please involve your other authors as you like),
(I'm including tor-dev here too so other Tor people can follow along, and maybe even get involved in the research or the discussion.)
I looked through "Counter-RAPTOR: Safeguarding Tor Against Active Routing Attacks": https://arxiv.org/abs/1704.00843
For the tl;dr for others here, the paper: a) comes up with metrics for measuring the resilience of Tor relays to BGP hijacking attacks, and then does the measurements; b) describes a way that clients can choose their guards to be less vulnerable to BGP hijacks, while also considering performance and anonymity loss when guard choice is influenced by client location; and c) builds a monitoring system that takes live BGP feeds and looks for routing table anomalies that could be hijack attempts.
Here are some hopefully useful thoughts:
-----------------------------------------------------------------------
0) Since I opted to write these thoughts in public, I should put a little note here in case any journalists run across it and wonder. Yay research! We love research on Tor -- in fact, research like this is the reason Tor is so strong. For many more details about our perspective on Tor research papers, see https://blog.torproject.org/blog/tor-heart-pets-and-privacy-research-communi...
-----------------------------------------------------------------------
1a) The "live BGP feed anomaly detection" part sounds really interesting, since in theory we could start using it really soon now. Have you continued to run it since you wrote the paper? Have you done any more recent analysis on its false positive rate since then?
I guess one of the real challenges here is that since most of the alerts are false positives, we really need a routing expert to be able to look at each alert and assess whether we should be worried about it. How hard is it to locate such an expert? Is there even such a thing as an expert in all routing tables, or do we need expertise in "what that part of the network is supposed to look like", which doesn't easily scale to the whole Internet?
Or maybe said another way, how much headway can we make on automating the analysis, to make the frequency of alerts manageable?
I ask because it's really easy to write a tool that sends a bunch of warnings, and if some of them are false positives, or heck even if they're not but we don't know how to assess how bad they really are, then all we've done is make yet another automated emailer. (We've made a set of these already, to e.g. notice when relays change their identity key a lot: https://gitweb.torproject.org/doctor.git/tree/ but often nobody can figure out whether such an anomaly is really an attack or what, so it's a constant struggle to keep the volume low enough that people don't just ignore the mails.)
The big picture question is: what steps remain from what you have now to something that we can actually use?
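To make the alert-volume concern concrete, here is a minimal sketch of the kind of filter I have in mind. It is not the paper's detector: the baseline table, the AS numbers (from the private-use range), the documentation prefixes, and the names classify() and filter_alerts() are all made up for illustration. It flags only two textbook signals (a covered prefix announced by an unexpected origin, or a more-specific prefix from an unexpected origin) and suppresses exact repeats so that one noisy route doesn't generate a pile of mails. It says nothing about whether a flagged route is really an attack, which is the hard part.

```python
import ipaddress

# Hypothetical baseline: prefix covering each relay -> origin AS we expect.
EXPECTED_ORIGIN = {
    ipaddress.ip_network("192.0.2.0/24"): 64500,
    ipaddress.ip_network("198.51.100.0/24"): 64501,
}


def classify(announced_prefix, origin_as):
    """Return an anomaly label for one announcement, or None if it looks normal."""
    announced = ipaddress.ip_network(announced_prefix)
    for known, expected_as in EXPECTED_ORIGIN.items():
        # subnet_of() raises TypeError on mixed IPv4/IPv6, so skip those.
        if announced.version != known.version:
            continue
        if origin_as == expected_as:
            continue
        if announced == known:
            return "origin-change"
        if announced.subnet_of(known):
            return "more-specific"
    return None


def filter_alerts(updates):
    """Keep anomalous announcements, dropping exact repeats."""
    seen = set()
    alerts = []
    for prefix, origin_as in updates:
        kind = classify(prefix, origin_as)
        key = (prefix, origin_as, kind)
        if kind is None or key in seen:
            continue
        seen.add(key)
        alerts.append(key)
    return alerts


# Made-up stream of (prefix, origin AS) pairs standing in for a BGP feed.
UPDATES = [
    ("192.0.2.0/24", 64500),     # expected origin: no alert
    ("192.0.2.0/25", 64666),     # more-specific from an unexpected origin
    ("192.0.2.0/25", 64666),     # repeat of the above: suppressed
    ("198.51.100.0/24", 64666),  # same prefix, unexpected origin
    ("203.0.113.0/24", 64666),   # does not cover any relay: ignored
]

alerts = filter_alerts(UPDATES)
for alert in alerts:
    print(alert)
```

Even in this toy version, an attacker who can originate announcements can trigger both alert types at will, which is the worry in 1c below.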
1b) How does your live-BGP-feed-anomaly-detector compare (either in design, or in closeness to actually being usable ;) to the one Micah Sherr was working on from their PETS 2016 paper? https://security.cs.georgetown.edu/~msherr/reviewed_abstracts.html#tor-datap...
1c) Your paper suggests that an alert from a potential hijack attempt could make clients abandon the guard for a while, to keep clients safe from hijack attempts. What about second-order effects of such a design, where the attacker's *goal* is to get clients to abandon a guard, so they add some sketchy routes somewhere to trigger an alert? Specifically, how much easier is it to add sketchy routes that make it look like somebody is attempting an attack, compared to actually succeeding at hijacking traffic?
I guess a related question (sorry for my BGP naivete) is: if we're worried about false positives in the alerts, how much authentication and/or attribution is there for sketchy routing table entries in general? Can some jerk drive up our false positive rate, by adding scary entries here and there, in a way that's sustainable? Or heck, can some jerk DDoS parts of the Internet in a way that induces routing table changes that we think look sketchy? None of these are reasons to avoid taking the first steps in the arms race, but it's good to know what the later steps might be.
-----------------------------------------------------------------------
2a) Re changing guard selection, you should check out proposal 271, which resulted in the new guard-spec.txt as of Tor 0.3.0.x: https://gitweb.torproject.org/torspec.git/tree/guard-spec.txt I don't fully understand it yet (so many things!), but I bet any future guard selection change proposal should be relative to this design.
2b) Your guard selection algorithm makes the assumption that relays with the Guard flag are the only ones worth choosing from, and then describes a way to choose from among them with different weightings. But you could take a step back, and decide that resilience to BGP hijack should be one of the factors for whether a relay gets the Guard flag in the first place.
It sounded from your analysis like some ASes, like OVH, are simply bad news for (nearly) all Tor clients. Your proposed guard selection strategy reduced, but did not eliminate, the chances that clients would get screwed by picking one of these OVH relays. And the tradeoff was that by only reducing the chances rather than eliminating them, you kept the performance changes less extreme than they otherwise would have been.
How much of the scariness of a relay is a function of the location of the particular client who is considering using it, and how much is a function of the average (expected) locations of clients? That is, can we identify relays that are likely to be bad news for many different clients, and downplay their weights (or withhold the Guard flag) for everybody?
The advantage of making the same decision for all clients is that you can get rid of the "what does guard choice tell you about the client" anonymity question, which is a big win if the rest of the effects aren't too bad.
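Here is a rough sketch of how one might separate "bad for everybody" from "bad depending on where you are". It assumes some resilience score in [0, 1] for each (client AS, guard) pair, where higher is safer. The scores, client shares, bandwidths, the 0.3 floor, and every name below are invented for illustration and are not taken from the paper.

```python
# Hypothetical resilience[client_as][guard] scores in [0, 1]; higher is safer.
RESILIENCE = {
    "AS_A": {"guard1": 0.9, "guard2": 0.2, "guard3": 0.6},
    "AS_B": {"guard1": 0.8, "guard2": 0.3, "guard3": 0.1},
}
# Hypothetical fraction of Tor clients in each client AS (sums to 1).
CLIENT_SHARE = {"AS_A": 0.5, "AS_B": 0.5}
# Hypothetical bandwidth weights for each guard.
BANDWIDTH = {"guard1": 100, "guard2": 300, "guard3": 100}


def expected_resilience(resilience, client_share):
    """Average each guard's resilience over the client population."""
    guards = {g for per_guard in resilience.values() for g in per_guard}
    return {
        g: sum(share * resilience[c][g] for c, share in client_share.items())
        for g in guards
    }


def location_spread(resilience):
    """Max minus min resilience across client ASes, per guard."""
    guards = {g for per_guard in resilience.values() for g in per_guard}
    return {
        g: max(r[g] for r in resilience.values())
        - min(r[g] for r in resilience.values())
        for g in guards
    }


def global_weights(bandwidth, exp_res, floor=0.3):
    """One weight per guard for all clients: bandwidth scaled by expected
    resilience, with guards under the floor dropped entirely."""
    raw = {
        g: bandwidth[g] * exp_res[g] if exp_res[g] >= floor else 0.0
        for g in bandwidth
    }
    total = sum(raw.values())
    return {g: w / total for g, w in raw.items()}


exp_res = expected_resilience(RESILIENCE, CLIENT_SHARE)
spread = location_spread(RESILIENCE)
weights = global_weights(BANDWIDTH, exp_res)

for g in sorted(BANDWIDTH):
    print(g, round(exp_res[g], 2), round(spread[g], 2), round(weights[g], 3))
```

In this toy data guard2 has low expected resilience and low spread, so it is a candidate for a client-independent penalty, while guard3's score depends heavily on the client's location and a global rule handles it poorly. Because the weights are the same for every client, the guard a client picks reveals nothing about where that client is.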
Which leads me to the next topic:
-----------------------------------------------------------------------
3) I think you're right that when analyzing a new path selection strategy, there are three big things to investigate:
a) Does the new behavior adequately accomplish the goal that made you want a new path selection strategy (in this case resilience to BGP attacks)?
b) What does the new behavior do to anonymity, both in terms of the global effect (e.g. by flattening the selection weights or by concentrating traffic in fewer relays or on fewer networks) and on the individual epistemic side (e.g. by leaking information about the user because of behavior that is a function of sensitive user details)?
c) What are the expected changes to performance, and are there particular scenarios (like high load or low load) that have higher or lower impact?
I confess that I don't really buy your analysis for 'b' or 'c' in this paper. Average change in entropy doesn't tell me whether particular user populations are especially impacted, and a tiny Shadow simulation with one particular network load and client behavior doesn't tell me whether things will or won't get much worse under other network loads or other client behavior.
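As a toy illustration of the entropy point (all numbers invented, and Shannon entropy of the guard choice used only as a stand-in for whatever anonymity metric one prefers): suppose four client ASes each start from a uniform choice over four guards, and the new selection strategy leaves three of them alone but pins the fourth to a single guard.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical baseline: every client AS picks uniformly among four guards.
BASELINE = [0.25, 0.25, 0.25, 0.25]

# Hypothetical guard-selection distributions under the new strategy.
NEW_DIST = {
    "AS_A": [0.25, 0.25, 0.25, 0.25],
    "AS_B": [0.25, 0.25, 0.25, 0.25],
    "AS_C": [0.25, 0.25, 0.25, 0.25],
    "AS_D": [1.0, 0.0, 0.0, 0.0],  # always ends up on the same guard
}

baseline_bits = shannon_entropy(BASELINE)
loss = {
    client_as: baseline_bits - shannon_entropy(dist)
    for client_as, dist in NEW_DIST.items()
}

mean_loss = sum(loss.values()) / len(loss)
worst_as = max(loss, key=loss.get)

print("mean entropy loss (bits):", mean_loss)
print("worst-hit client AS:", worst_as, "loss (bits):", loss[worst_as])
```

The average loss here is half a bit, which sounds harmless, yet the clients in AS_D have lost all of their entropy. Reporting the per-population distribution (or at least the worst case) alongside the mean would be more convincing.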
I can't really fault this paper though, because the structure of an academic research paper means you can only do so much in one paper, and you did a bunch of other interesting things instead. We, the Tor research community, really need better tools for reasoning about the interaction between anonymity and performance.
In fact, there sure have been a lot of Tor path selection papers over the past decade, each of which invents its own ad hoc analysis approach for showing that its proposed change doesn't impact anonymity or performance "too much". Is it time for a Systematization of Knowledge paper on this area -- with the goal of coming up with best practices that future papers can use to provide more convincing analysis?
--Roger