Snowflake Anonymous Network Traffic Identification
Yuying Wang, Guilong Yang, Dawei Xu, Cheng Dai, Tianxin Chen, Yunfan Yang
https://link.springer.com/chapter/10.1007/978-981-99-9247-8_40
This paper tries to detect Snowflake by focusing on the DTLS transfer. Unlike the F-ACCUMUL paper (https://www.mdpi.com/2076-3417/13/1/622), however, it doesn't use DTLS handshake features, but rather traffic analysis features such as packet lengths and interarrival times.
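For concreteness, here is a minimal sketch of what "traffic analysis features" can look like for a single flow. The function name `flow_features`, the input format (a list of timestamp and length pairs), and the particular summary statistics are my own assumptions for illustration; this is not the paper's feature set.

```python
import statistics

def flow_features(packets):
    """Summarize a flow given as [(timestamp_seconds, length_bytes), ...].

    Hypothetical helper: the statistics chosen here are illustrative,
    not the features used in the paper.
    """
    if len(packets) < 2:
        raise ValueError("need at least two packets to compute interarrival times")
    packets = sorted(packets)  # order by timestamp
    times = [t for t, _ in packets]
    lengths = [n for _, n in packets]
    # Interarrival times: gaps between consecutive packet timestamps.
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "pkt_count": len(packets),
        "len_mean": statistics.mean(lengths),
        "len_stdev": statistics.pstdev(lengths),
        "len_max": max(lengths),
        "iat_mean": statistics.mean(gaps),
        "iat_stdev": statistics.pstdev(gaps),
    }

# Made-up three-packet flow, only to show the call shape.
example = [(0.00, 100), (0.25, 300), (0.50, 200)]
features = flow_features(example)
```

A vector like this, one per flow, is the sort of input a classifier would be trained on. Note that nothing in it depends on the contents of the DTLS handshake.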
Their classifier is a multilayer perceptron, though they also try SVM, random forest, and naive Bayes. The experiment is a closed world consisting of Snowflake, Facebook Messenger, Google Hangouts, and Discord. They don't evaluate realistic base rates: the most imbalanced evaluation they try is 40:1 (Figure 5). In short, it doesn't look like there is much to learn from this paper.
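To illustrate why the base rate matters, here is a back-of-the-envelope precision calculation. The true and false positive rates are made-up round numbers, not results from the paper; the 10,000:1 ratio is only an illustrative guess at a more lopsided traffic mix; and I am assuming the ratio counts other traffic against Snowflake.

```python
def precision(tpr, fpr, negatives_per_positive):
    """Fraction of flagged flows that are truly positive (Bayes' rule).

    tpr, fpr: assumed true/false positive rates (hypothetical values).
    negatives_per_positive: e.g. 40 for a 40:1 ratio of other traffic
    to Snowflake traffic.
    """
    true_pos = tpr  # expected true positives per one positive flow
    false_pos = fpr * negatives_per_positive  # expected false positives
    return true_pos / (true_pos + false_pos)

for ratio in (1, 40, 10_000):
    print(f"{ratio}:1 -> precision {precision(0.99, 0.01, ratio):.3f}")
```

With these assumed rates, precision is about 0.71 at 40:1 but falls to about 0.01 at 10,000:1, meaning nearly every flagged flow would be a false positive. An evaluation that stops at 40:1 cannot show how the classifier behaves in that regime.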
The authors write: "Traffic identification is a dynamic process, and as we identify traffic features, Snowflake developers may modify these features to render our model ineffective. This could lead to an ongoing cat-and-mouse situation. It is necessary to continually collect traffic data to maintain a high level of accuracy. In the future, we hope to automate this process to adapt to updates and changes in Snowflake versions."
Unusually, the authors have published the Docker image they used to generate training and test data: https://hub.docker.com/layers/xinbigworld/ubuntu/1.2/images/sha256-3213fded0...