Hello,
On 5/19/20 6:02 AM, Matthew Finkel wrote:
On Wed, Apr 29, 2020 at 01:07:40PM +0200, Alex Catarineu wrote:
Are any of the conversions passed on to the GPU? Do we know if format conversion is deterministic?
True, I did not consider that the extraction itself (e.g. `toDataURL('image/jpeg')`) might add some entropy. Good questions; we would need to investigate them if this approach is going to be pursued. And I agree with Tom: it would be good to first investigate what exactly these sites are doing with the canvas, to evaluate what the best approach would be.
I did a little bit of digging, and it seems that `toDataURL('image/xxx')` calls one of the encoders listed here:
https://searchfox.org/mozilla-central/search?q=symbol:_ZN11imgIEncoder12InitFromDataEPKhjjjjjRK12nsTSubstringIDsE&redirect=false
In particular, the JPEG encoder
https://searchfox.org/mozilla-central/source/image/encoders/jpeg/nsJPEGEncoder.cpp#94
seems to do numerical computation to compress the image, which could make its output vary across machines. Indeed, it seems this observation was already made in
Hiding in the Crowd: an Analysis of the Effectiveness of Browser Fingerprinting at Large Scale
Gómez-Boix, Laperdrix, and Baudry
WWW '18
https://doi.org/10.1145/3178876.3186097
where, on page 314, they say:
We also tested the impact of compressing a canvas rendering to the JPEG format. It should be noted that the JPEG compression comes directly from the Canvas API and is not applied after collection. Due to the lossy compression, it should come as no surprise that the entropy from JPEG images is lower than the PNG one usually used by canvas fingerprinting tests (from 0.407 to 0.391)
Is there an easy way for us to run a study on this? Specifically, we could fix an arbitrary image and then, on a bunch of different computers, load it into a canvas, call `toDataURL('image/xxx')` for each of the formats, and compare the outputs.
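To make the idea concrete, here is a rough sketch of what the per-machine collection and the cross-machine comparison could look like. Everything named here (`FORMATS`, `fnv1a`, `collectEncodings`, `countDistinct`) is hypothetical and only for illustration, the list of formats is an assumption, and the hash is a simple non-cryptographic one used only to compare outputs.

```javascript
// Sketch only: all names below are hypothetical, not part of any browser API.
// Assumed list of formats to test; adjust as needed.
const FORMATS = ['image/png', 'image/jpeg', 'image/webp'];

// 32-bit FNV-1a over the string's code units (a data URL is ASCII, so these
// are bytes). Non-cryptographic; used only to compare outputs across machines.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0).toString(16).padStart(8, '0');
}

// Encode the canvas once per format and keep a short summary of each result.
// `canvas` is anything with a toDataURL(type) method, e.g. an HTMLCanvasElement.
function collectEncodings(canvas, formats = FORMATS) {
  const result = {};
  for (const format of formats) {
    const url = canvas.toDataURL(format);
    // toDataURL falls back to PNG for unsupported types, so record the
    // type that was actually returned ("data:<type>;base64,...").
    const actualType = url.slice(5, url.indexOf(';'));
    result[format] = { actualType, length: url.length, digest: fnv1a(url) };
  }
  return result;
}

// Given one collectEncodings() result per machine, count how many distinct
// outputs were seen for each format.
function countDistinct(reports, formats = FORMATS) {
  const counts = {};
  for (const format of formats) {
    counts[format] = new Set(reports.map((r) => r[format].digest)).size;
  }
  return counts;
}
```

On each machine one would draw the fixed image onto a canvas (e.g. with `drawImage`), call `collectEncodings(canvas)`, and send the result back. A format for which `countDistinct` reports more than one value would be a candidate source of entropy, though decoding and drawing the image may contribute differences too, so the PNG result is probably a useful baseline.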
Best,
Sanketh