May 8, 2021

Analysis of Machine Learning Approaches to Packing Detection. (arXiv:2105.00473v1 [cs.CR])

Packing is an obfuscation technique widely used by malware to hide the
content and behavior of a program. Much prior research has explored how to
detect whether a program is packed. This research includes a broad variety of
approaches such as entropy analysis, syntactic signatures, and, more recently,
machine learning classifiers using various features. However, no robust results
have indicated which algorithms perform best, or which features are most
significant. The question is further complicated by the choice of evaluation
criteria, since accuracy, cost, generalization capability, and other measures
are all reasonable. This work explores eleven different machine learning
approaches
using 119 features to understand: which features are most significant for
packing detection; which algorithms offer the best performance; and which
algorithms are most economical.
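
To make the entropy analysis mentioned above concrete, here is a minimal sketch of a byte-entropy check. Packed or compressed data tends to have byte entropy close to the 8 bits per byte maximum, so a simple detector can flag high-entropy content. The function names and the 7.0 cutoff are assumptions chosen for illustration; they are not taken from the paper, and this is not one of its eleven classifiers or 119 features.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical cutoff for illustration only; not a value from the paper.
ENTROPY_THRESHOLD = 7.0


def looks_packed(data: bytes, threshold: float = ENTROPY_THRESHOLD) -> bool:
    """Flag data whose byte entropy meets or exceeds the threshold."""
    return shannon_entropy(data) >= threshold
```

In practice such a check would be applied per section of an executable rather than to the whole file, and a single threshold can misclassify both low-entropy packers and legitimately compressed resources, which is part of the motivation for combining many features in a learned classifier.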