Hi All,
I'm just skimming Mahrud's patch at
https://github.com/mahrud/tor/commit/a81eac6d0c0a35adc6036e736565f4a8e2f806f...
...referenced from elsewhere, and also from the blog post:
https://blog.cloudflare.com/cloudflare-onion-service/
Luckily for us, the IPv6 space is so vast that we can encode the Tor circuit number as an IP address in an unused range and use the Proxy Protocol to send it to the server. Here is an example of the header that our Tor daemon would insert in the connection:
...and it makes me wonder how far back up the chain of hops towards the client the circuit ID is visible to a malicious relay. Is it mostly hidden several onion-skins down? I presume it's not trackable all the way from the client's guard?
Am thinking about the necessary scope for a correlation attack.
-a
On 2018-09-22 06:29, Alec Muffett wrote:
...and it makes me wonder how far back up the chain of hops towards the client the circuit ID is visible to a malicious relay. Is it mostly hidden several onion-skins down? I presume it's not trackable all the way from the client's guard?
Hey Alec!
The circID is scoped under a given connection between adjacent nodes.
A relay node maintains a mapping of circIDs for a circuit - mapping the forward and backward circID - for traffic it is relaying.
So for a circuit ... client <-ID_a-> guard <-ID_b-> middle <-ID_c-> exit
... each of the ID_*s is independent, and any node only knows the IDs immediately "adjacent" to it. Each connection (e.g. each client to that guard) has an independent enumeration/allocation of IDs.
Hope that helps! Dave
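A minimal sketch of the per-connection scoping described above, assuming a toy relay that keeps its circuit table in a dictionary. The class name, connection labels, and ID values are hypothetical and are not taken from the tor source; the point is only that a relay can look up the circID pair it handles itself and nothing further along the path.

```python
class Relay:
    """Toy relay: maps (connection, circID) on one side to the other side."""

    def __init__(self, name):
        self.name = name
        self.table = {}

    def relay_circuit(self, inbound, outbound):
        # Store both directions so cells can be forwarded either way.
        self.table[inbound] = outbound
        self.table[outbound] = inbound

    def lookup(self, conn, circ_id):
        # Returns the linked (connection, circID), or None if unknown here.
        return self.table.get((conn, circ_id))


# client <-ID_a-> guard <-ID_b-> middle <-ID_c-> exit
# ID_a = 7, ID_b = 41, ID_c = 7. The repeated 7 is deliberate: the same
# number on different connections refers to unrelated circuits.
guard = Relay("guard")
middle = Relay("middle")
guard.relay_circuit(("client-guard", 7), ("guard-middle", 41))
middle.relay_circuit(("guard-middle", 41), ("middle-exit", 7))

# The middle can link the two IDs adjacent to it...
print(middle.lookup("middle-exit", 7))
# ...but has no entry for the client<->guard connection at all.
print(middle.lookup("client-guard", 7))
```

In this toy model, an observer who learns ID_c can only walk back one hop, and only with the cooperation of the relay that holds the matching table entry.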
On Sat, 22 Sep 2018 at 19:28, Dave Rolek dmr-x@riseup.net wrote:
The circID is scoped under a given connection between adjacent nodes.
A relay node maintains a mapping of circIDs for a circuit - mapping the forward and backward circID - for traffic it is relaying.
So for a circuit ... client <-ID_a-> guard <-ID_b-> middle <-ID_c-> exit
... each of the ID_*s is independent, and any node only knows the IDs immediately "adjacent" to it. Each connection (e.g. each client to that guard) has an independent enumeration/allocation of IDs.
That is an awesome explanation, thank you ever so much.
If I read that right, the most that an attacker with observability of the Cloudflare IP addresses could get is either ...
( using the nomenclature from the diagram at https://twitter.com/AlecMuffett/status/926032680055201792 )
1) correlation backwards to "Server Side Middle 1" for browsing a normal onion over Tor; or...
2) correlation backwards to "Client Side Middle" for browsing a single-hop onion over Tor
Am I correct? The latter seems not very much worse than the information which a compromised exit node would be able to obtain ("Browsing Normal Web over Tor"), although it would be a lot more available if the circID is presented to any backbone observer who can sniff IPv6?
-a
On 23 Sep 2018, at 04:50, Alec Muffett alec.muffett@gmail.com wrote:
On Sat, 22 Sep 2018 at 19:28, Dave Rolek dmr-x@riseup.net wrote:
The circID is scoped under a given connection between adjacent nodes.
A relay node maintains a mapping of circIDs for a circuit - mapping the forward and backward circID - for traffic it is relaying.
So for a circuit ... client <-ID_a-> guard <-ID_b-> middle <-ID_c-> exit
... each of the ID_*s is independent, and any node only knows the IDs immediately "adjacent" to it. Each connection (e.g. each client to that guard) has an independent enumeration/allocation of IDs.
That is an awesome explanation, thank you ever so much.
If I read that right, the most that an attacker with observability of the Cloudflare IP addresses could get is either ...
( using the nomenclature from the diagram at https://twitter.com/AlecMuffett/status/926032680055201792 )
correlation backwards to "Server Side Middle 1" for browsing a normal onion over Tor; or...
correlation backwards to "Client Side Middle" for browsing a single-hop onion over Tor
Am I correct?
I'm not sure what you mean by "correlation backwards".
The Onion Service and the Onion Service Guard (or Single Onion Service Rendezvous Point) both know the circuit id sent from the Onion Service to the proxy. If an attacker controls the Onion Service Guard (or Single Onion Service Rendezvous Point), then they can correlate backwards to the Server Side Middle 1 (or Client Side Middle) by looking up linked circuit ids on the node they control.
The Rendezvous Point is chosen by the client, so it is just as likely to be malicious as any other node.
The latter seems not very much worse than the information which a compromised exit node would be able to obtain ("Browsing Normal Web over Tor"), although it would be a lot more available if the circID is presented to any backbone observer who can sniff IPv6?
This IPv6 address isn't in the IP header of the packets between Cloudflare's onion service and Cloudflare's proxy.
It's sent inside the TCP (or TLS?) connection between the Tor onion service and the proxy instance, as a text header before any other inner TCP or TLS: https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
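To make the distinction concrete, here is a rough sketch of how a receiving proxy might split a version 1 (text) PROXY header off the front of a TCP payload. The function name and the sample addresses (from the 2001:db8::/32 documentation range) are assumptions for illustration only; this is not Cloudflare's code, and it ignores the "PROXY UNKNOWN" form and the binary version 2 format.

```python
import ipaddress


def split_proxy_v1(payload: bytes):
    """Split a PROXY protocol v1 line from the application data behind it."""
    line, sep, rest = payload.partition(b"\r\n")
    if not sep or not line.startswith(b"PROXY "):
        raise ValueError("payload does not start with a PROXY v1 header")
    fields = line.decode("ascii").split(" ")
    if len(fields) != 6:
        raise ValueError("expected: PROXY <proto> <src> <dst> <sport> <dport>")
    _, proto, src, dst, sport, dport = fields
    if proto not in ("TCP4", "TCP6"):
        raise ValueError("unsupported protocol field: " + proto)
    header = {
        "proto": proto,
        "src": ipaddress.ip_address(src),   # would carry the encoded circuit id
        "dst": ipaddress.ip_address(dst),
        "src_port": int(sport),
        "dst_port": int(dport),
    }
    return header, rest


# Hypothetical payload: the header is ordinary bytes at the start of the
# stream, followed by whatever the client actually sent.
sample = b"PROXY TCP6 2001:db8::1234:5678 ::1 22136 80\r\nGET / HTTP/1.1\r\n\r\n"
header, rest = split_proxy_v1(sample)
print(header["src"], rest[:14])
```

The address only ever appears as text inside the stream, so anyone who cannot read the stream contents (for example because the hop is encrypted or local) does not see it.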
If Cloudflare encrypts their onion service to proxy connections (and they should), the circuit id will only be known to the onion service and its guard (or rendezvous point, for a single-hop onion service connection).
Alternately, if Cloudflare hosts its onions in the same data centre as the proxies they talk to, then the risk of interception is low.
Then, if the proxy strips out this header before sending the request to the origin site, or connects to the origin site using TLS, then this IP address shouldn't be visible on the backbone.
Note: some origin sites still use HTTP to talk to CloudFlare: https://www.cloudflare.com/ssl/
Also note: the CloudFlare dashboard shows the circuit id to site owners: https://blog.cloudflare.com/cloudflare-onion-service/
I can't see how having the actual circuit id is useful to site owners. They can't block it effectively, because it's transient. (And the same circuit id can be re-used by independent connections.)
These are good questions for Mahrud, who I've CC'd.
T
Hi, In short, yes. I think everything mentioned above is correct, and I'm not sure what else to add.
Oh, I guess I should ask people NOT to use the "tweak" commit on my repository, which is also linked in the first email in this thread, as it actually has a bug (puts a 32bit hex in a 16bit IPv6 section ... smh). Instead use this: https://github.com/torproject/tor/pull/343 which is already merged and includes a couple of neat features that I couldn't add myself, thanks ahf!
ps: Thanks for cc'ing me. I'm trying to limit the number of tor-* mailing lists that I'm on, but feel free to cc me if needed.
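The bug mentioned above (a 32-bit hex value written into a single 16-bit IPv6 group) can be illustrated with a short sketch. It assumes the circuit id is a 32-bit unsigned integer and uses a made-up prefix from the 2001:db8::/32 documentation range; the prefix, function names, and layout are hypothetical and are not claimed to match the merged tor code.

```python
import ipaddress

# Hypothetical 96-bit prefix written as six 16-bit groups (assumption).
PREFIX = "2001:db8:0:0:0:0"


def circuit_id_to_ipv6(circ_id: int) -> ipaddress.IPv6Address:
    """Encode a 32-bit id as the last two 16-bit groups of an IPv6 address."""
    if not 0 <= circ_id <= 0xFFFFFFFF:
        raise ValueError("circuit id must fit in 32 bits")
    high = circ_id >> 16      # upper 16 bits -> seventh group
    low = circ_id & 0xFFFF    # lower 16 bits -> eighth group
    return ipaddress.IPv6Address(f"{PREFIX}:{high:x}:{low:x}")


def ipv6_to_circuit_id(addr: ipaddress.IPv6Address) -> int:
    """Recover the id from the low 32 bits of the address."""
    return int(addr) & 0xFFFFFFFF


addr = circuit_id_to_ipv6(0x12345678)
print(addr, hex(ipv6_to_circuit_id(addr)))
```

Writing the whole value into one group, as in `f"{PREFIX}:0:{circ_id:x}"`, yields a group longer than four hex digits for any id above 0xffff, which Python's `ipaddress` module rejects as invalid.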
Hi Mahrud,
On 23 Sep 2018, at 12:10, Mahrud S dinovirus@gmail.com wrote:
In short, yes. I think everything mentioned above is correct, and I'm not sure what else to add.
I'm still not quite clear on some of the details:
On Sat, Sep 22, 2018 at 9:09 PM teor teor@riseup.net wrote:
On 23 Sep 2018, at 04:50, Alec Muffett alec.muffett@gmail.com wrote:
The latter seems not very much worse than the information which a compromised exit node would be able to obtain ("Browsing Normal Web over Tor"), although it would be a lot more available if the circID is presented to any backbone observer who can sniff IPv6?
This IPv6 address isn't in the IP header of the packets between Cloudflare's onion service and Cloudflare's proxy.
It's sent inside the TCP (or TLS?) connection between the Tor onion service and the proxy instance, as a text header before any other inner TCP or TLS: https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
If Cloudflare encrypts their onion service to proxy connections (and they should), the circuit id will only be known to the onion service and its guard (or rendezvous point, for a single-hop onion service connection).
Are the connections between Cloudflare's Tor onion service and Cloudflare's proxy instance encrypted?
Alternately, if Cloudflare hosts its onions in the same data centre as the proxies they talk to, then the risk of interception is low.
Does Cloudflare host its onion services in the same data centre as the proxies they talk to?
Then, if the proxy strips out this header before sending the request to the origin site, or connects to the origin site using TLS, then this IP address shouldn't be visible on the backbone.
Does the Cloudflare proxy strip out the PROXY header? Or does it get transformed into X-Forwarded-For? (Or something similar?)
Also note: the CloudFlare dashboard shows the circuit id to site owners: https://blog.cloudflare.com/cloudflare-onion-service/
I can't see how having the actual circuit id is useful to site owners. They can't block it effectively, because it's transient. (And the same circuit id can be re-used by independent connections.)
Why does the Cloudflare dashboard show the circuit id to site owners? They can't effectively block a circuit id; if they try, there may be collateral damage to unrelated users; and it is an information leak.
That said, it's no worse than any other onion site operator using the circuit id feature, except that Cloudflare could collect and store a significant number of circuit ids.
How long does Cloudflare retain these circuit ids?
T
tor-onions@lists.torproject.org