March 29, 2023

From Emulation to Mathematical: A More General Traffic Obfuscation Approach To Encounter Feature based Mobile App traffic Classification. (arXiv:2302.03118v1 [cs.CR])

Mobile apps are ubiquitous in this digital era. As large volumes
of data are generated daily, user privacy and security have become an important
concern. Many techniques, such as machine learning and deep learning
traffic classifiers, have been applied to analyze users' app traffic. These
techniques allow a monitor to fingerprint the apps in use even while the
user traffic is encrypted, which raises a severe privacy issue. In order
to counter this type of traffic analysis, researchers have been developing
obfuscation algorithms that confuse feature-based machine learning classifiers
by camouflaging traffic through modification of the packet length distribution.
Existing works achieve this goal by remapping the packet length
distribution of the source app onto that of a fake camouflage app. However, this
solution lacks scalability and flexibility in practical
applications, since the method must pre-sample the target fake app's traffic
before traffic camouflage can be used. In this paper, we propose a practical
solution that uses a mathematical model to calculate the target distribution,
while still achieving an accuracy drop of at least 50 percent for the
AppScanner mobile traffic classifier at roughly 20 percent overhead from
packet modification.
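
The idea of remapping a packet length distribution onto a computed target, without pre-sampling a camouflage app, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not describe the mathematical model, so the uniform target on [600, 1500] bytes, the 1500-byte MTU, the padding-only rule, and all function names here are hypothetical placeholders.

```python
import bisect
import random

MTU = 1500  # assumed maximum packet length in bytes


def target_quantile(u, low=600, high=MTU):
    """Inverse CDF of a hypothetical target model: uniform on [low, high]."""
    return int(round(low + u * (high - low)))


def empirical_cdf(sorted_lengths, x):
    """Fraction of observed packet lengths that are <= x."""
    return bisect.bisect_right(sorted_lengths, x) / len(sorted_lengths)


def obfuscate(lengths):
    """Map each packet length toward the target model by quantile matching.

    Packets are only ever padded, never shrunk, so the payload stays intact.
    """
    ordered = sorted(lengths)
    out = []
    for x in lengths:
        u = empirical_cdf(ordered, x)
        out.append(min(MTU, max(x, target_quantile(u))))
    return out


def overhead(original, padded):
    """Extra bytes sent, as a fraction of the original byte count."""
    return (sum(padded) - sum(original)) / sum(original)


# Synthetic source traffic (made-up lengths, for illustration only).
random.seed(0)
source = [random.choice([60, 120, 400, 900, 1400]) for _ in range(1000)]
padded = obfuscate(source)
print(f"overhead: {overhead(source, padded):.1%}")
```

Because the target is given in closed form, no traffic from a camouflage app has to be collected beforehand; the trade-off is the padding overhead that the last line reports for the synthetic input.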