On 5/19/20 6:02 AM, Matthew Finkel wrote:
On Wed, Apr 29, 2020 at 01:07:40PM +0200, Alex Catarineu wrote:
With respect to 2), I think it's interesting, but I also don't know whether it's feasible in practice. Specifically, I was thinking of Gijs's idea of trying to keep state about whether the canvas is safe to read or not, fingerprinting-wise. I assume that there is a (non-empty) subset of canvas write operations that are "fingerprinting-safe". Probably a bit naively, I'd like to think that `canvas.drawImage` is "fp-safe" (irrespective of the image source). But even if we have to check the image source, I think implementing this could potentially unbreak some of these common legit canvas use cases.
For example, in the WhatsApp case mentioned above, I'm quite sure it's just used for image format conversion, since the bug does not occur when uploading "jpeg" images. So, that would be something like `canvas.drawImage(pngImage, 0, 0);` plus `canvas.toDataURL('image/jpeg');`, which should be covered if we implement the `canvas.drawImage` exemption when the image was uploaded by the user. This "fingerprinting-tainting" canvas logic might start with just the `drawImage` case, but perhaps it would be possible to extend little by little, if we know that some canvas write operation is safe and can help fixing breakage for legit use cases.
I generally agree with your message, but I am curious about this idea. Are you saying that ctx.drawImage() is fingerprinting-safe, or are you saying that any "canvas extraction" from a canvas element initialized by ctx.drawImage is fingerprinting-safe? As far as I'm aware, drawImage() is not protected by the Canvas prompt (so that should never be a problem). If your comment was about "subsequent canvas extraction", then that is worth investigating.
Yes, by fingerprinting-safe I meant the subsequent canvas extraction after a `drawImage`. And by checking the image source I meant that we might consider a `drawImage` fp-safe if we know the input is an image uploaded by the user, even if `drawImage` was not "fingerprinting-safe" in general (with the idea that canvas extraction might not result in useful fingerprinting in that case).
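The "fingerprinting-tainting" idea described above can be sketched as a small piece of state attached to each canvas. This is only an illustration of the logic, not how Firefox or Tor Browser implements anything: `CanvasTaintTracker`, `recordWrite`, `canExtractWithoutPrompt`, and the `userUploaded` flag are all hypothetical names, and treating a blank canvas as safe is an assumption the thread has not settled.

```javascript
// Hypothetical sketch: none of these names exist in Firefox or Tor Browser.
class CanvasTaintTracker {
  constructor() {
    // Assumption: a freshly created canvas starts out fp-safe.
    this.fpSafe = true;
  }

  // Record a canvas write. `op` is the name of the write operation and
  // `source` describes where a drawn image came from (null if not an image).
  recordWrite(op, source = null) {
    const safe =
      op === "drawImage" && source !== null && source.userUploaded === true;
    if (!safe) {
      // Once any non-exempt write happens, the canvas stays tainted.
      this.fpSafe = false;
    }
  }

  // Extraction (toDataURL, toBlob, getImageData) would skip the prompt
  // only while every write so far has been fp-safe.
  canExtractWithoutPrompt() {
    return this.fpSafe;
  }
}

// Image format conversion of a user upload: stays untainted.
const upload = new CanvasTaintTracker();
upload.recordWrite("drawImage", { userUploaded: true });
console.log(upload.canExtractWithoutPrompt()); // true

// Any other write (here, text rendering) taints the canvas for good.
const mixed = new CanvasTaintTracker();
mixed.recordWrite("drawImage", { userUploaded: true });
mixed.recordWrite("fillText");
console.log(mixed.canExtractWithoutPrompt()); // false
```

Starting with `drawImage` as the only exempt operation matches the "extend little by little" approach: further operations would be added to the safe set only once they are known not to leak entropy.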
Are any of the conversions passed onto the GPU? Do we know if format conversion is deterministic?
True, I did not consider that the extraction (e.g. `toDataURL('image/jpeg')`) might add some entropy by itself. Good questions, we would need to investigate if this approach is going to be pursued. And I agree with Tom, it would be good first to investigate what these sites are doing exactly with the canvas to evaluate what would be the best approach.
Hello,
I did a little bit of digging and it seems like `toDataURL('image/xxx')` calls one of the encoders listed here https://searchfox.org/mozilla-central/search?q=symbol:_ZN11imgIEncoder12InitFromDataEPKhjjjjjRK12nsTSubstringIDsE&redirect=false. In particular, the JPEG Encoder https://searchfox.org/mozilla-central/source/image/encoders/jpeg/nsJPEGEncoder.cpp#94 seems to be doing numerical math to compress the image. Indeed, it seems like this observation was already made by
Hiding in the Crowd: an Analysis of the Effectiveness of Browser Fingerprinting at Large Scale Gómez-Boix, Laperdrix, and Baudry WWW '18 https://doi.org/10.1145/3178876.3186097
where on page 314, they say
We also tested the impact of compressing a canvas rendering to the JPEG format. It should be noted that the JPEG compression comes directly from the Canvas API and is not applied after collection. Due to the lossy compression, it should come as no surprise that the entropy from JPEG images is lower than the PNG one usually used by canvas fingerprinting tests (from 0.407 to 0.391)
Is there an easy way for us to do a study on this? Specifically, fix a random image, and then on a bunch of different computers, read the image and then do a `toDataURL('image/xxx')` for each of the formats.
Best, Sanketh
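For anyone comparing results from such a study against the figures quoted from the paper, here is a minimal sketch of how a normalized Shannon entropy could be computed from a list of collected outputs. It assumes the 0.407 and 0.391 values are normalized entropies (Shannon entropy divided by log2 of the number of observations); `normalizedEntropy` is an illustrative name, not something taken from the paper or from Firefox.

```javascript
// Normalized Shannon entropy of a list of observed fingerprints.
// Returns a value in [0, 1]: 0 when all observations are identical,
// 1 when every observation is unique.
function normalizedEntropy(observations) {
  const n = observations.length;
  if (n < 2) return 0; // entropy is not meaningful for fewer than 2 samples

  // Count how often each distinct fingerprint occurs.
  const counts = new Map();
  for (const o of observations) {
    counts.set(o, (counts.get(o) || 0) + 1);
  }

  // H = -sum(p * log2(p)) over the distinct fingerprints.
  let h = 0;
  for (const c of counts.values()) {
    const p = c / n;
    h -= p * Math.log2(p);
  }

  // Divide by the maximum possible entropy for n observations.
  return h / Math.log2(n);
}
```

Feeding it one string per machine (for example the `toDataURL` output or a hash of it) gives a single number per format that can be compared across formats.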
On Tue, 2 Jun 2020 at 02:20, Sanketh Menda sgmenda@uwaterloo.ca wrote:
Is there an easy way for us to do a study on this? Specifically, fix a random image, and then on a bunch of different computers, read the image and then do a `toDataURL('image/xxx')` for each of the formats.
That's typically the first tack to take to test fingerprinting; but it can only prove a positive difference, it can't prove a negative.
From your description, if JPEG compression is deterministic then toDataURL is deterministic. I bountied https://stackoverflow.com/questions/25303201/does-lossy-decompression-always... to try to figure out.
Also, FWIW I think we are at the point where if we aren't ready to implement, we should document this idea in a tbb-spec so it doesn't get lost.
-tom
On Fri, Jun 12, 2020 at 06:56:22PM +0000, Tom Ritter wrote:
That's typically the first tack to take to test fingerprinting; but it can only prove a positive difference, it can't prove a negative.
From your description, if JPEG compression is deterministic then toDataURL is deterministic. I bountied https://stackoverflow.com/questions/25303201/does-lossy-decompression-always... to try to figure out.
With toDataURL("image/jpeg"), you can also specify the quality level. Different quality levels do result in different compressed outputs, even with a blank canvas. I did a quick test of this in 2018 (probably using whatever Firefox ESR was at the time) and saved partial results:

    >> c = document.createElement("canvas")
    >> c.width = 100
    >> c.height = 100
    >> urls = [0.0, 0.25, 0.5, 0.75, 1.0].map(q => c.toDataURL("image/jpeg", q))
    >> urls.map(x => x.length)
    Array [ 1119, 1119, 1123, 1123, 1255 ]
    >> urls.map(x => x.substr(-10))
    Array [ "ACiiigD//Z", "oAKKKKAP/Z", "iigD//2Q==", "oooA//2Q==", "CgAoAKAP/Z" ]
Running the same console commands today in Firefox 68.9.0esr produces the same output.
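The same check could be generalized into a small collection helper for the cross-machine study proposed earlier in the thread. The following is a rough sketch under stated assumptions: `collectEncodings` and `fnv1a` are names made up for this example, the hash is only there to make long data URLs easy to compare by eye (it is not a security primitive), and the default format list is just PNG and JPEG. The only browser API used is `toDataURL(type, quality)`.

```javascript
// FNV-1a (32-bit) over the UTF-16 code units of a string. Data URLs are
// ASCII, so this matches the usual byte-wise FNV-1a for them.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Encode `canvas` in each format (and quality level, where it applies) and
// return one record per combination. `canvas` only needs a toDataURL method.
function collectEncodings(
  canvas,
  formats = ["image/png", "image/jpeg"],
  qualities = [0.0, 0.25, 0.5, 0.75, 1.0]
) {
  const results = [];
  for (const format of formats) {
    // PNG is lossless and ignores the quality argument, so encode it once.
    const levels = format === "image/png" ? [undefined] : qualities;
    for (const quality of levels) {
      const url = canvas.toDataURL(format, quality);
      results.push({ format, quality, length: url.length, hash: fnv1a(url) });
    }
  }
  return results;
}

// In a browser console, after drawing the fixed test image onto `c`:
//   console.table(collectEncodings(c));
```

Comparing the resulting tables across machines would show whether any format or quality level produces differing output for the same input image; as noted above, identical results would not prove that no difference exists.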
On Fri, Jun 12, 2020 at 06:56:22PM +0000, Tom Ritter wrote:
That's typically the first tack to take to test fingerprinting; but it can only prove a positive difference, it can't prove a negative.
Yes, it wouldn't be a comprehensive test but would add some weight to our theories. Also, as David pointed out, we should probably include quality levels in our test.
From your description, if JPEG compression is deterministic then toDataURL is deterministic. I bountied https://stackoverflow.com/questions/25303201/does-lossy-decompression-always-generate-same-output to try to figure out.
Yup, it seems so, but toDataURL doesn't necessarily need to be deterministic for us to implement this feature since we will only allow toDataURL calls if the canvas was "untainted"; that is, it only has user-uploaded data. Thus, unless the attacker can get everyone to upload the same image, they will not be able to distinguish between users via subtle differences.
Also, FWIW I think we are at the point where if we aren't ready to implement, we should document this idea in a tbb-spec so it doesn't get lost.
I vote draft tbb-spec. There still seems to be a lot of wiggle room (for instance, what operations are "tainting") in designing this feature and getting the design all fleshed out before writing code might help save time and prevent confusion in the future.
Best, Sanketh
-----Original Message----- From: tbb-dev tbb-dev-bounces@lists.torproject.org On Behalf Of David Fifield Sent: June 12, 2020 3:36 PM To: tbb-dev@lists.torproject.org Subject: Re: [tbb-dev] Canvas Breakage Ideas
With toDataURL("image/jpeg"), you can also specify the quality level. Different quality levels do result in different compressed outputs, even with a blank canvas. I did a quick test of this in 2018 (probably using whatever Firefox ESR was at the time) and saved partial results:

    >> c = document.createElement("canvas")
    >> c.width = 100
    >> c.height = 100
    >> urls = [0.0, 0.25, 0.5, 0.75, 1.0].map(q => c.toDataURL("image/jpeg", q))
    >> urls.map(x => x.length)
    Array [ 1119, 1119, 1123, 1123, 1255 ]
    >> urls.map(x => x.substr(-10))
    Array [ "ACiiigD//Z", "oAKKKKAP/Z", "iigD//2Q==", "oooA//2Q==", "CgAoAKAP/Z" ]
Running the same console commands today in Firefox 68.9.0esr produces the same output. _______________________________________________ tbb-dev mailing list tbb-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tbb-dev
I was wondering whether color management affects canvas rendering. Like, does `rgb(100, 120, 140)` always represent the same sRGB-encoded value when written to a PNG, or could it depend on monitor and OS color management settings?
What about the new wide-gamut colors available in CSS? Can those be used in canvas? What happens when they are written to a PNG? If the writing process maps them back into sRGB space, is that process deterministic? https://lea.verou.me/2020/04/lch-colors-in-css-what-why-and-how/ https://graphicdon.com/2020/05/27/the-expanding-gamut-of-color-on-the-web/
On Tue, Jun 30, 2020 at 12:42:57PM -0600, David Fifield wrote:
I was wondering whether color management affects canvas rendering. Like, does `rgb(100, 120, 140)` always represent the same sRGB-encoded value when written to a PNG, or could it depend on monitor and OS color management settings?
What about the new wide-gamut colors available in CSS? Can those be used in canvas? What happens when they are written to a PNG? If the writing process maps them back into sRGB space, is that process deterministic? https://lea.verou.me/2020/04/lch-colors-in-css-what-why-and-how/ https://graphicdon.com/2020/05/27/the-expanding-gamut-of-color-on-the-web/
Sorry, just found one more relevant link. https://webkit.org/blog/6682/improving-color-on-the-web/
# Wide-gamut colors in HTML
While CSS handles most of the presentation of an HTML document, there is still one important area which is outside its scope: the `canvas` element. Both 2D and WebGL canvases assume they operate within the sRGB color space. This means that even on wide-gamut displays, you won’t be able to create a canvas that exercises the full range of color.
The proposed solution is to add an optional flag to the `getContext` function, specifying the color space the canvas should be color matched to. For example:
```
// NOTE: Proposed syntax. Not yet implemented.
canvas.getContext("2d", { colorSpace: "p3" });
```
I created a draft tbb-spec proposal: https://gitlab.torproject.org/tpo/applications/tor-browser-spec/-/merge_requ...
It is nowhere close to ready but I think it has the key ideas and could help us move the ball forward.
Best, Sanketh
The draft has been merged into `proposals/ideas` (https://gitlab.torproject.org/tpo/applications/tor-browser-spec/-/commit/620...) and I think this achieves the original goal of documenting these ideas. Yay!
I think the final, merged draft looks good, I'd appreciate any suggestions to improve it, and if you think it looks good (and have commit access) could you move it to the `proposals` directory?
I guess once this becomes a proposal, we can start planning experiments.
Best, Sanketh
Sanketh Menda:
The draft has been merged into `proposals/ideas` (https://gitlab.torproject.org/tpo/applications/tor-browser-spec/-/commit/620...) and I think this achieves the original goal of documenting these ideas. Yay!
I think the final, merged draft looks good, I'd appreciate any suggestions to improve it, and if you think it looks good (and have commit access) could you move it to the `proposals` directory?
I'll wait like 1 or 2 weeks for folks giving input and then move it if everything still looks fine.
I guess once this becomes a proposal, we can start planning experiments.
Sounds good, thanks!
Georg
Georg Koppen:
Sanketh Menda:
The draft has been merged into `proposals/ideas` (https://gitlab.torproject.org/tpo/applications/tor-browser-spec/-/commit/620...) and I think this achieves the original goal of documenting these ideas. Yay!
I think the final, merged draft looks good, I'd appreciate any suggestions to improve it, and if you think it looks good (and have commit access) could you move it to the `proposals` directory?
I'll wait like 1 or 2 weeks for folks giving input and then move it if everything still looks fine.
I moved ahead and added the draft as proposal 105 (commit 7835e2b2d6f7c9e79330171f5fcab0c9e9ae7977). I've not opened a ticket yet. We could do so now or once someone is actually working on it. I am fine either way.
Georg