Hey y'all,
For the past little while I've been working on a technical overview doc for #3600 (Prevent redirects from transmitting+storing cookies+identifiers) detailing the problems, scenarios and possible solutions. Please take a look and feel free to comment, edit or add!
Link: https://storm.torproject.org/grain/X4nhdNqR9fGRc7sgTefFkg/
best, -Richard
And here's a link that actually works: https://storm.torproject.org/shared/Kw99Ow0ExZFFC6FKD5CeryfVFAoAL9Z_iEVlflI0...
On 10/26/18 1:34 PM, Richard Pospesel wrote:
_______________________________________________
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
I spent some time reading through the Mix and Match proposal. I'm not sure I understand it.
In particular, I am confused about:
The proposal seems to focus heavily on what we do with state we receive as part of the redirect: do we promote it, or do we leave it double-keyed? It doesn't seem to explain how we choose what state to _send_. For example:
For instance, in a redirect chain from foo.com -> tracker.com -> bar.com, the tracker.com cookies will be double keyed foo.com|tracker.com, while the bar.com cookies will be double keyed foo.com|bar.com. However, after the user begins to interact with bar.com, bar.com is promoted to be the First Party Domain, and Cookies set on the initial redirect need to be moved under the bar.com key.
When we send a request to foo.com, I assume we will send any current cookies we have keyed under foo.com|foo.com[0]. When we receive a redirect to tracker.com, how do we choose what state to send? We don't know ahead of time whether it will give us a redirect or not, so are we sending it any state we have under tracker.com|tracker.com (treating it as a first party), or are we sending it any state we have under foo.com|tracker.com?
The latter is better for privacy, but it would require you to re-sign-in via OAuth a lot (pretend tracker.com is oauth.com), and I'm nervous it would break login flows. Especially if you interact with oauth.com, which seems to promote it into oauth.com|oauth.com, and then you later go through foo.com|oauth.com and there's no state there...
[0] I'm pretty sure that we use the First Party Domain as both the primary and secondary key for state under the first party; right? In any event, when I say foo.com|foo.com I mean data keyed under the foo.com first party.
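To make the two options concrete, here is a minimal Python sketch. The `jar` dictionary, the `state_to_send` helper, and the two policy names are hypothetical and come from neither the proposal nor Tor Browser. Keys are written as (first party, domain) pairs, so foo.com|oauth.com becomes `("foo.com", "oauth.com")`.

```python
def state_to_send(jar, origin, target, policy):
    """Pick the bucket whose cookies accompany a redirected request to `target`.

    "first-party":  treat the redirect target as its own first party.
    "double-keyed": key the target under the domain the navigation started at.
    Both policy names are made up for this sketch.
    """
    if policy == "first-party":
        key = (target, target)
    elif policy == "double-keyed":
        key = (origin, target)
    else:
        raise ValueError(f"unknown policy: {policy}")
    return dict(jar.get(key, {}))


# The user previously logged in at oauth.com directly.
jar = {("oauth.com", "oauth.com"): {"session": "logged-in"}}

# foo.com now redirects to oauth.com.
as_first_party = state_to_send(jar, "foo.com", "oauth.com", "first-party")
as_double_keyed = state_to_send(jar, "foo.com", "oauth.com", "double-keyed")

print(as_first_party)   # the existing login would be sent
print(as_double_keyed)  # no state is found, so the user would have to sign in again
```

The second call is the login-flow breakage I am nervous about: the state exists, but only under a key the redirected request never looks at.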
I'm also a bit confused about the difference between different targets of redirects. It seems like:
- If the target is example.com: we don't double-key or need to promote upon interaction.
- If the target is example.com?lang=en: we do double-key any state set, and upon user interaction promote the state to first party.
- If the target is example.com/foo/bar.html: we do double-key any state set, and upon user interaction promote the state to first party.

Finally, in a multi-redirect scenario like a.com -> b.com -> c.com, I'm unsure if there is a difference in how we handle state we receive for b.com if:
- The target is b.com
- The target is b.com?lang=en
- The target is b.com/foo/bar.html
I started drawing out a matrix of what happens when. I came up with the following. I don't think I understand the proposal well enough to fill it out. I'm hoping I will be able to do so though! I'm going to paste it in its entirety:
---------- Single-Redirect, Before User Interaction
Click a link for aaa.com/foo/blah.html and the response redirects to ccc.com (before any user interaction):
- To aaa.com you send state keyed under aaa.com|aaa.com
- To ccc.com you send state keyed under ccc.com|ccc.com
- The browser deposits you at ccc.com
- Any cookies or other state set by aaa.com is set normally according to FPI rules, so will be keyed under aaa.com|aaa.com
- Any cookies or other state set by ccc.com is set normally according to FPI rules, so will be keyed under ccc.com|ccc.com

Click a link for aaa.com/foo/blah.html and the response redirects to ccc.com?lang=en (before any user interaction):
- To aaa.com you send state keyed under ???
- To ccc.com you send state keyed under ???
- The browser deposits you at ???
- Any cookies or other state set by aaa.com is keyed under ???
- Any cookies or other state set by ccc.com is keyed under ???

Click a link for aaa.com/foo/blah.html and the response redirects to ccc.com/new-foo/blah.html (before any user interaction):
- To aaa.com you send state keyed under ???
- To ccc.com you send state keyed under ???
- The browser deposits you at ???
- Any cookies or other state set by aaa.com is keyed under ???
- Any cookies or other state set by ccc.com is keyed under ???
---------- Single-Redirect, After User Interaction
Perhaps you scroll the page at ccc.com or perhaps click a link or highlight some text.

Click a link for aaa.com/foo/blah.html and the response redirects to ccc.com, and then you interact:
- To aaa.com you send state keyed under aaa.com|aaa.com
- To ccc.com you send state keyed under ccc.com|ccc.com
- The browser deposits you at ccc.com
- There is no change to state for aaa.com, as it is already stored under aaa.com|aaa.com
- There is no change to state for ccc.com, as it is already stored under ccc.com|ccc.com

Click a link for aaa.com/foo/blah.html and the response redirects to ccc.com?lang=en, and then you interact:
- To aaa.com you send state keyed under ???
- To ccc.com you send state keyed under ???
- The browser deposits you at ???
- Any cookies or other state set by aaa.com is migrated(?) and now keyed under ???
- Any cookies or other state set by ccc.com is migrated(?) and now keyed under ???

Click a link for aaa.com/foo/blah.html and the response redirects to ccc.com/new-foo/blah.html, and then you interact:
- To aaa.com you send state keyed under ???
- To ccc.com you send state keyed under ???
- The browser deposits you at ???
- Any cookies or other state set by aaa.com is migrated(?) and now keyed under ???
- Any cookies or other state set by ccc.com is migrated(?) and now keyed under ???
---------- Multi-Redirect, Before User Interaction
Click a link for aaa.com/foo/blah.html and the response redirects to bbb.com and the bbb.com response then redirects to ccc.com (before any user interaction):
- To aaa.com you send state keyed under ???
- To bbb.com you send state keyed under ???
- To ccc.com you send state keyed under ???
- The browser deposits you at ccc.com
- Any cookies or other state set by aaa.com is ???
- Any cookies or other state set by bbb.com is ???
- Any cookies or other state set by ccc.com is ???

Click a link for aaa.com/foo/blah.html and the response redirects to bbb.com and the bbb.com response then redirects you to ccc.com?lang=en (before any user interaction):
- To aaa.com you send state keyed under ???
- To bbb.com you send state keyed under ???
- To ccc.com you send state keyed under ???
- The browser deposits you at ???
- Any cookies or other state set by aaa.com is ???
- Any cookies or other state set by bbb.com is ???
- Any cookies or other state set by ccc.com is ???

Click a link for aaa.com/foo/blah.html and the response redirects to bbb.com and the bbb.com response then redirects you to ccc.com/new-foo/blah.html (before any user interaction):
- To aaa.com you send state keyed under ???
- To bbb.com you send state keyed under ???
- To ccc.com you send state keyed under ???
- The browser deposits you at ???
- Any cookies or other state set by aaa.com is ???
- Any cookies or other state set by bbb.com is ???
- Any cookies or other state set by ccc.com is ???

*** Is there any behavior change between a middle redirect that goes to bbb.com vs. bbb.com/?querystring or bbb.com/foo/bar.html? ***
---------- Multi-Redirect, After User Interaction
Perhaps you scroll the page at ccc.com or perhaps click a link or highlight some text.

Click a link for aaa.com/foo/blah.html and the response redirects to bbb.com and the bbb.com response then redirects to ccc.com, and then you interact:
- To aaa.com you send state keyed under ???
- To bbb.com you send state keyed under ???
- To ccc.com you send state keyed under ???
- The browser deposits you at ccc.com
- Any cookies or other state set by aaa.com is ???
- Any cookies or other state set by bbb.com is ???
- Any cookies or other state set by ccc.com is ???

Click a link for aaa.com/foo/blah.html and the response redirects to bbb.com and the bbb.com response then redirects you to ccc.com?lang=en, and then you interact:
- To aaa.com you send state keyed under ???
- To bbb.com you send state keyed under ???
- To ccc.com you send state keyed under ???
- The browser deposits you at ???
- Any cookies or other state set by aaa.com is ???
- Any cookies or other state set by bbb.com is ???
- Any cookies or other state set by ccc.com is ???

Click a link for aaa.com/foo/blah.html and the response redirects to bbb.com and the bbb.com response then redirects you to ccc.com/new-foo/blah.html, and then you interact:
- To aaa.com you send state keyed under ???
- To bbb.com you send state keyed under ???
- To ccc.com you send state keyed under ???
- The browser deposits you at ???
- Any cookies or other state set by aaa.com is ???
- Any cookies or other state set by bbb.com is ???
- Any cookies or other state set by ccc.com is ???

*** Is there any behavior change between a middle redirect that goes to bbb.com vs. bbb.com/?querystring or bbb.com/foo/bar.html? ***
Richard Pospesel:
And here's a link that actually works: https://storm.torproject.org/shared/Kw99Ow0ExZFFC6FKD5CeryfVFAoAL9Z_iEVlflI0...
Thanks for collecting and sharing all the possible ideas here. Some comments come to mind after thinking a bit about it.
1) We probably won't get that feature right in our first attempt (let's assume there is something like "right" here at all), so I would not want to spend too much time trying to fix all the rabbit holes we find while thinking about and implementing fixes. In particular, I'd suggest we try to ignore the scenario that identifiers, cookies etc. get somehow passed on in the URL bar over redirects for now. Dealing with tracking information in URLs is a tricky topic of its own and somewhat orthogonal to redirects.
2) For Tor Browser I think I am currently most interested in the "Expand First Party Double-Keying Scheme to Redirected Content" scenario, thus I'd like to look a bit closer at it. Looking over the Cons I don't see OAuth and similar authentication mechanisms being broken, is that correct? If so, great, and certainly a plus.
I think I don't understand the scenario in Con 1, that is, how a user can effectively end up with two simultaneous identities depending on whether they came from https://gogle.com/ or https://google.com/. For instance, if I enter https://gogle.com, why should I end up with a different identity than coming from https://google.com? https://gogle.com is not even setting cookies, but even if it were, the final response from google.com is a 200 with a Set-Cookie header (among other things). That cookie would be sent back regardless once I decide I want to log in. The same happens, I think, in the scenario where I had already been logged into Google before.
3) I am not sure about Con 2 yet, but another thing we can keep in mind is that we have the New Identity feature against powerful trackers/long-term tracking. If we don't find a solution to Con 2, I think pointing to that defense as a stopgap is not the worst idea. At any rate, I feel that not having a solution to that one right now should not stop us from experimenting.
Georg
For background: Currently, with first-party isolation enabled, if foo.com embeds content from bar.com, the cookies we send to bar.com come from the foo.com|bar.com double-keyed bucket, whereas if we visit bar.com directly, the cookies used come from the bar.com bucket alone. This way, embedded bar.com content requests don't get correlated with first-party bar.com sessions (and vice versa).
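As a rough illustration of that background, the sketch below models the cookie store as a dictionary keyed by (first-party domain, request domain). The `buckets` store and the `bucket_key` helper are invented for this example and are not Tor Browser code. Writing the plain bar.com bucket as `("bar.com", "bar.com")` is also just a convention chosen here.

```python
from collections import defaultdict

# Toy cookie store: (first-party domain, request domain) -> {name: value}.
buckets = defaultdict(dict)


def bucket_key(url_bar_domain, request_domain):
    """Under first-party isolation the URL bar domain is always the first key."""
    return (url_bar_domain, request_domain)


# Visiting bar.com directly: bar.com is its own first party.
buckets[bucket_key("bar.com", "bar.com")]["id"] = "direct"
# foo.com embeds content from bar.com: the first party is foo.com.
buckets[bucket_key("foo.com", "bar.com")]["id"] = "embedded"

direct_id = buckets[("bar.com", "bar.com")]["id"]
embedded_id = buckets[("foo.com", "bar.com")]["id"]

# bar.com sees a different identifier in each context, so it cannot
# link the embedded requests to the first-party session.
print(direct_id, embedded_id)
```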
So the idea behind the proposed 'Expand First Party Double-Keying Scheme to Redirected Content' is that we treat redirected content the same way we currently treat embedded content: by double-keying. If foo.com redirects to bar.com, we treat the originating foo.com as the first-party domain, and subsequent cookies saved by or sent to bar.com would use the foo.com|bar.com bucket, as if bar.com were embedded content in foo.com.
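A small sketch of that keying rule, assuming the domain the navigation started at stays the first party for every later hop. The function name is hypothetical, and applying the same rule to chains longer than one redirect is an assumption of this sketch.

```python
def redirect_chain_keys(chain):
    """Return the (first party, domain) bucket key for each hop of a redirect chain.

    Assumes the originating domain remains the first party for all later
    hops, just as it would for embedded content.
    """
    if not chain:
        return []
    origin = chain[0]
    return [(origin, hop) for hop in chain]


keys = redirect_chain_keys(["foo.com", "bar.com"])
for first_party, domain in keys:
    print(f"{domain} uses the {first_party}|{domain} bucket")
```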
Con 1 outlines the common scenario where websites squat on similar or old domains that redirect to the correct one (e.g., gogle.com -> google.com or wachovia.com -> wellsfargo.com). So, suppose foo.com redirects to bar.com and the user then logs in to their bar.com account. Due to the redirect, the session cookie that bar.com sets after a successful login will be saved in the double-keyed foo.com|bar.com bucket. This is fine so long as the user always gets to bar.com via the redirect, but if the user learns the new URL and navigates to bar.com directly (rather than via the foo.com redirect), they would be pulling cookies from the bar.com bucket, since there was no redirect and bar.com would be the first party. Thus, the user would have two concurrent session cookies, one in the bar.com bucket and one in the foo.com|bar.com bucket. If the user is trying to juggle multiple identities, then accidental use of the wrong account is one typo away.
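The two-session outcome can be reproduced with a toy store. Everything here (the `jar` dictionary, the `login` helper, the account names) is made up for illustration, and the first-party bar.com bucket is written as `("bar.com", "bar.com")`.

```python
jar = {}  # (first party, domain) -> {cookie name: value}


def login(jar, first_party, domain, account):
    """Store the session cookie the site sets after a successful login."""
    jar.setdefault((first_party, domain), {})["session"] = account


# Visit 1: the user enters foo.com, is redirected to bar.com, and logs in.
# Under the redirect scheme the first party is still foo.com.
login(jar, "foo.com", "bar.com", "account-A")

# Visit 2: the user navigates to bar.com directly and logs in with another
# account. There is no redirect, so bar.com is the first party.
login(jar, "bar.com", "bar.com", "account-B")

sessions = {key: cookies["session"] for key, cookies in jar.items()}
print(sessions)  # two live sessions for bar.com, selected by how the user arrived
```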
The Double-Keyed Redirect Cookies + 'Domain Promotion' tries to fix this multiple/hidden session problem by promoting the cookies of double-keyed websites to first-party status in the case where the originating domain is positively identified as solely a redirect. In the gogle.com -> google.com scenario, if Tor Browser could identify that gogle.com is used solely to redirect to google.com, then we could take the double-keyed gogle.com|google.com cookies and move them into the google.com bucket and eliminate the double session.
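A minimal sketch of what such a promotion step might look like on a toy cookie store. The `promote` function and its tie-break rule are assumptions made for this example, and how the redirector is "positively identified" is left out entirely.

```python
def promote(jar, redirector, target):
    """Move cookies from the redirector|target bucket into target's own bucket.

    On a name clash the cookie already in the first-party bucket wins;
    that tie-break is an arbitrary choice made for this sketch.
    """
    moved = jar.pop((redirector, target), {})
    first_party = jar.setdefault((target, target), {})
    for name, value in moved.items():
        first_party.setdefault(name, value)
    return jar


jar = {("gogle.com", "google.com"): {"session": "abc"}}
promote(jar, "gogle.com", "google.com")
print(jar)  # only the google.com first-party bucket remains
```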
I hope that clears things up.
best, -Richard
On 1/11/19 9:05 AM, Georg Koppen wrote:
On Fri, 18 Jan 2019 at 21:00, Richard Pospesel richard@torproject.org wrote:
The Double-Keyed Redirect Cookies + 'Domain Promotion' tries to fix this multiple/hidden session problem by promoting the cookies of double-keyed websites to first-party status in the case where the originating domain is positively identified as solely a redirect. In the gogle.com -> google.com scenario, if Tor Browser could identify that gogle.com is used solely to redirect to google.com, then we could take the double-keyed gogle.com|google.com cookies and move them into the google.com bucket and eliminate the double session.
How would we detect this?
Let's say hypothetically (I haven't checked) gogle.com does not set any cookies and just sends a 301 permanent redirect. We then perform the upgrade from gogle.com|google.com to google.com.
If we turn it on its head: google.com decides to redirect you to tracker342451345.google.com with a 301 (setting no cookies). We upgrade google.com|tracker342451345.google.com to tracker342451345.google.com and do so for as long as your session is open. Does this enable a tracking vector? I don't think so; I couldn't identify one, but it feels like there might be something here...
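To pin down the hypothetical rule in these two scenarios (a 301 that sets no cookies), here is a sketch of it as a predicate. The function name and the header handling are invented for illustration; this is not a proposed detection method.

```python
def looks_like_pure_redirect(status, headers):
    """Guess whether a response is 'solely a redirect'.

    Hypothetical heuristic: a 301 with a Location header and no Set-Cookie.
    `headers` is a plain dict of header name -> value.
    """
    names = {name.lower() for name in headers}
    return status == 301 and "location" in names and "set-cookie" not in names


typo_domain = looks_like_pure_redirect(301, {"Location": "https://google.com/"})
subdomain_hop = looks_like_pure_redirect(
    301, {"Location": "https://tracker342451345.google.com/"}
)
sets_cookie = looks_like_pure_redirect(
    301, {"Location": "https://google.com/", "Set-Cookie": "id=1"}
)
print(typo_domain, subdomain_hop, sets_cookie)
```

Both the gogle.com case and the tracker subdomain case satisfy the predicate, so a rule of this shape cannot tell them apart on its own.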
-tom
New development: https://webkit.org/blog/8613/intelligent-tracking-prevention-2-1/
In particular:
--------- WebKit implemented partitioned caches more than five years ago. A partitioned cache means cache entries for third-party resources are double-keyed to their origin and the first-party eTLD+1. This prohibits cross-site trackers from using the cache to track users. Even so, our research has shown that trackers, in order to keep their practices alive under ITP, have resorted to partitioned cache abuse. Therefore, we have developed the verified partitioned cache.
When a partitioned cache entry is created for a domain that’s classified by ITP as having cross-site tracking capabilities, the entry gets flagged for verification. After seven days, if there’s a cache hit for such a flagged entry, WebKit will act as if it has never seen this resource and load it again. The new response is then compared to the cached response and if they match in the ways we care about for privacy reasons, the verification flag is cleared and the cache entry is from that point considered legitimate. However, if the new response does not match the cache entry, the old entry is discarded, and a new one is created with the verification flag set, and the verification process starts over.
ITP currently does this verification for permanent redirects since that’s where we see abuse today. ----------
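My reading of that verification loop, as a toy Python model. This is a sketch of the quoted description only, not WebKit's implementation: the class, its method names, and the use of plain equality for "match in the ways we care about" are all assumptions.

```python
SEVEN_DAYS = 7 * 24 * 60 * 60  # seconds


class VerifiedCache:
    """Toy model of the verification step described in the quoted post."""

    def __init__(self, fetch, now):
        self._fetch = fetch   # callable: key -> fresh response
        self._now = now       # callable: () -> current time in seconds
        self._entries = {}    # key -> [response, flagged, created_at]

    def store(self, key, response, flagged):
        self._entries[key] = [response, flagged, self._now()]

    def is_flagged(self, key):
        return self._entries[key][1]

    def load(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None  # cache miss; the caller fetches and stores
        response, flagged, created_at = entry
        if not flagged or self._now() - created_at < SEVEN_DAYS:
            return response
        # Flagged entry hit after seven days: load again and compare.
        fresh = self._fetch(key)
        if fresh == response:
            entry[1] = False                      # verified, flag cleared
        else:
            self.store(key, fresh, flagged=True)  # discard and start over
        return fresh


clock = [0]
server = {"redirect": "https://example.com/a"}
cache = VerifiedCache(fetch=lambda key: server["redirect"], now=lambda: clock[0])
key = ("first-party.example", "https://tracker.example/r")
cache.store(key, "https://example.com/a", flagged=True)

clock[0] = SEVEN_DAYS  # seven days later
cache.load(key)        # reloaded from the network; it matches, so the flag is cleared
print(cache.is_flagged(key))
```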
It's not clear to me if the permanent redirects are in a partitioned cache though. Either way, this doesn't affect Tor too much given that we don't save history.
Although it does bring up a simple case that we could implement with no problem: never remember a permanent redirect.
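Roughly, as a sketch: the `store_response` helper and the dictionary cache below are hypothetical, and treating every other response as cacheable (ignoring Cache-Control and friends) is a simplification made only for this example.

```python
PERMANENT_REDIRECTS = {301, 308}  # Moved Permanently, Permanent Redirect


def store_response(cache, url, status, body):
    """Cache a response unless it is a permanent redirect.

    Returns True if the response was stored. A skipped redirect is still
    followed once; it is simply looked up on the network again next time.
    """
    if status in PERMANENT_REDIRECTS:
        return False
    cache[url] = (status, body)
    return True


cache = {}
stored_page = store_response(cache, "https://example.com/", 200, "<html></html>")
stored_redirect = store_response(cache, "https://gogle.com/", 301, "https://google.com/")
print(stored_page, stored_redirect)
```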
-tom