[anti-censorship-team] "Snowflake Anonymous Network Traffic Identification", November 2023

Fri Apr 19 01:41:31 UTC 2024

Snowflake Anonymous Network Traffic Identification
Yuying Wang, Guilong Yang, Dawei Xu, Cheng Dai, Tianxin Chen, Yunfan Yang 
https://link.springer.com/chapter/10.1007/978-981-99-9247-8_40

This paper tries to detect Snowflake by focusing on the DTLS transfer.
Unlike the F-ACCUMUL paper (https://www.mdpi.com/2076-3417/13/1/622),
however, it doesn't use DTLS handshake features, but rather traffic
analysis features such as packet lengths and interarrival times.
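
To make "traffic analysis features" concrete, here is a minimal sketch of
computing packet length and interarrival time statistics for one flow. It is
not the authors' code. The function name, the input format (a list of
(timestamp, length) pairs), and the choice of summary statistics are all
assumptions for illustration; the paper's actual feature set may differ.

```python
from statistics import mean, pstdev

def flow_features(packets):
    """Summarize one flow.

    packets: list of (timestamp_seconds, length_bytes) tuples in arrival
    order. This input format is hypothetical, chosen for illustration.
    """
    if not packets:
        raise ValueError("empty flow")
    times = [t for t, _ in packets]
    lengths = [n for _, n in packets]
    # Interarrival times are the gaps between consecutive packets.
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "n_packets": len(packets),
        "len_mean": mean(lengths),
        "len_std": pstdev(lengths),
        # A single-packet flow has no gaps; report 0.0 in that case.
        "iat_mean": mean(gaps) if gaps else 0.0,
        "iat_std": pstdev(gaps) if gaps else 0.0,
    }

if __name__ == "__main__":
    print(flow_features([(0.0, 100), (0.5, 200), (1.5, 300)]))
```

Note that nothing here looks inside the DTLS handshake, which is the
contrast with the F-ACCUMUL approach.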

Their classifier is a multilayer perceptron, though they also try SVM,
random forest, and naive Bayes. The experiment is a closed world
consisting of Snowflake, Facebook Messenger, Google Hangouts, and
Discord. They don't evaluate realistic base rates: the most imbalanced
evaluation they try is 40:1 (Figure 5). In short, it doesn't look like
there is much to learn from this paper.
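
To illustrate why base rates matter, here is a small sketch of how precision
falls as the class imbalance grows. The true positive and false positive
rates below are made-up placeholders, not numbers from the paper, and I am
assuming the ratio means non-Snowflake flows per Snowflake flow.

```python
def precision(tpr, fpr, ratio):
    """Fraction of flows flagged as Snowflake that really are Snowflake.

    tpr:   true positive rate of the classifier (hypothetical value)
    fpr:   false positive rate of the classifier (hypothetical value)
    ratio: number of non-Snowflake flows per Snowflake flow
    """
    true_pos = tpr * 1.0      # per one Snowflake flow
    false_pos = fpr * ratio   # per `ratio` non-Snowflake flows
    return true_pos / (true_pos + false_pos)

if __name__ == "__main__":
    # 40 is the most imbalanced ratio in the paper; the larger ratios are
    # illustrative guesses at what a censor might face in practice.
    for ratio in (1, 40, 1000, 100000):
        print(f"{ratio}:1  precision = {precision(0.99, 0.01, ratio):.4f}")
```

With these placeholder rates, precision is about 71% at 40:1 but only
about 9% at 1000:1, so a result at 40:1 says little about how the
classifier would perform at realistic base rates.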

> Traffic identification is a dynamic process, and as we identify
> traffic features, Snowflake developers may modify these features to
> render our model ineffective. This could lead to an ongoing
> cat-and-mouse situation. It is necessary to continually collect
> traffic data to maintain a high level of accuracy. In the future, we
> hope to automate this process to adapt to updates and changes in
> Snowflake versions.

Unusually, the authors have published the Docker image they used to
generate training and test data:
https://hub.docker.com/layers/xinbigworld/ubuntu/1.2/images/sha256-3213fded0606c0e59b4b31845910a020d2a340c9dc4e810c2e7381ed02d3b22e