A Novel Approach to Detect Malware Variants Based on Classified Behaviors

Application programming interface (API) calls are effective features for malware analysis because they are the procedure call interface through which a program accesses operating system resources. Behavior features based on API calls therefore play an important role in analyzing malware variants. However, existing malware detection approaches rely on many complex operations to construct and match these features. Graph matching, for example, is an NP-complete problem and is time-consuming because of its computational complexity. To address these issues, we propose an approach that constructs classified behavior features from different malware families. In the proposed approach, a classified behavior feature consists of a kernel object (an API call parameter) and a series of operations on it (an API trace). In addition, each classified behavior graph (CBG) is represented as a number by hashing, which reduces the workload and the matching time. Multiple machine learning classifiers are then used for classification. To verify the efficiency of the approach, we perform a series of experiments with different malware families. Experiments on 1220 malware samples show that the support vector machine (SVM) achieves a true positive rate of up to 88.3% while keeping the false positive rate within 3.9%.
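The idea of a classified behavior feature, a kernel object paired with the ordered operations performed on it and then reduced to a number by hashing, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the trace format (pairs of API name and kernel object), the function names, the sample traces, and the choice of a truncated SHA-256 digest are all assumptions made for illustration.

```python
import hashlib
from collections import OrderedDict


def classify_behaviors(trace):
    """Group an API trace by kernel object, keeping the call order.

    `trace` is a list of (api_name, kernel_object) pairs; this input
    format is an assumption, not taken from the paper.
    """
    groups = OrderedDict()
    for api_name, kernel_object in trace:
        groups.setdefault(kernel_object, []).append(api_name)
    return groups


def hash_behavior(kernel_object, operations):
    """Reduce one behavior (object + ordered operations) to a 64-bit integer.

    Truncated SHA-256 is a hypothetical choice of hash function.
    """
    text = kernel_object + "|" + ">".join(operations)
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def behavior_signature(trace):
    """Return the set of behavior hashes observed in one sample."""
    groups = classify_behaviors(trace)
    return {hash_behavior(obj, ops) for obj, ops in groups.items()}


def shared_behaviors(trace_a, trace_b):
    """Count behaviors common to two samples by comparing integers only."""
    return len(behavior_signature(trace_a) & behavior_signature(trace_b))


# Two made-up traces: same registry behavior, different file behavior.
sample_a = [
    ("CreateFileW", "C:\\tmp\\a.dat"),
    ("RegOpenKeyExW", "HKLM\\Software\\Run"),
    ("WriteFile", "C:\\tmp\\a.dat"),
    ("RegSetValueExW", "HKLM\\Software\\Run"),
    ("CloseHandle", "C:\\tmp\\a.dat"),
]
sample_b = [
    ("RegOpenKeyExW", "HKLM\\Software\\Run"),
    ("CreateFileW", "C:\\tmp\\a.dat"),
    ("RegSetValueExW", "HKLM\\Software\\Run"),
    ("ReadFile", "C:\\tmp\\a.dat"),
    ("CloseHandle", "C:\\tmp\\a.dat"),
]

print(shared_behaviors(sample_a, sample_b))
```

Because each behavior is compared as a single integer, matching two samples becomes a set intersection rather than a graph comparison, which is the kind of saving the abstract attributes to hashing. The resulting hash sets could then be turned into feature vectors for a classifier such as an SVM.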
