dCollection Digital Academic Information Distribution System

Investigating the Effect of Traffic Sampling on Machine Learning-Based Network Intrusion Detection Approaches

Subject (keywords) Network intrusion detection, Monitoring, Deep learning, Convolutional neural networks, Computer science, Systematics, Switches, Flow information export, network traffic sampling, intrusion detection, machine learning, deep learning, CNN
Subject (other) Computer Science, Information Systems
Subject (other) Engineering, Electrical & Electronic
Subject (other) Telecommunications
Description (general) [Alikhanov, Jumabek; Noh, Youngtae] Inha Univ, Dept Comp Sci & Informat Engn, Incheon 402751, South Korea; [Jang, Rhongho] Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA; [Abuhamad, Mohammed] Loyola Univ Chicago, Dept Comp Sci, Chicago, IL 60626 USA; [Mohaisen, David] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA; [Nyang, Daehun] Ewha Womans Univ, Dept Cyber Secur, Seoul 03760, South Korea
Administrative/technical information faculty
Indexed in SCIE, SCOPUS
OA type gold
Publisher IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication year 2022
Subtype Article
URI http://www.dcollection.net/handler/ewha/000000190295
Text language English
Published As https://doi.org/10.1109/ACCESS.2021.3137318

Abstract/Summary

Machine learning (ML) based network intrusion detection systems (NIDSs) operate on flow features obtained from flow-exporting protocols (e.g., NetFlow). Recent successful ML and deep learning (DL) based NIDS solutions assume that such flow information (e.g., average packet size) is obtained from all packets of the flow. In practice, however, the flow exporter is often deployed on commodity devices where packet sampling is inevitable. As a result, the applicability of such ML-based NIDS solutions in the presence of sampling (i.e., when flow information is obtained from a sampled set of packets instead of the full traffic) is an open question. In this study, we explore the impact of packet sampling on the performance and efficiency of ML-based NIDSs. Unlike previous work, our proposed evaluation procedure is immune to different settings of the flow export stage. Hence, it can provide a robust evaluation of NIDSs even in the presence of sampling. Through sampling experiments, we established that malicious flows of shorter size (i.e., number of packets) are likely to go unnoticed even at mild sampling rates such as 1/10 and 1/100. Next, using the proposed evaluation procedure, we investigated the impact of various sampling techniques on NIDS detection rate and false alarm rate. Detection rate and false alarm rate are computed for three sampling rates (i.e., 1/10, 1/100, 1/1000), four sampling techniques, and three classifiers (two tree-based, one deep learning based). Experimental results show that the systematic linear sampler SketchFlow performs better than non-linear samplers such as Sketch Guided Sampling and Fast Filtered Sampling. We also found that the random forest classifier with SketchFlow sampling was a better combination. This combination showed a higher detection rate and a lower false alarm rate across multiple sampling rates than other sampler-classifier combinations.
Our results are consistent across multiple sampling rates; the exception is Sketch Guided Sampling (SGS), which caused a drastic performance drop when the sampling rate was changed from 1/100 to 1/1000. Our results provide valuable insights for network practitioners and researchers into how packet sampling affects ML-based NIDS performance. To this end, the full source code for the sampling and ML experiments has been released: github.com/Jumabek/sampledFlowMeter and github.com/Jumabek/nids-with-sampling
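
The observation that short flows tend to go unnoticed under sampling can be illustrated with a small calculation. The sketch below is not taken from the paper or its released code. It assumes the simplest possible model, in which each packet is kept independently with probability equal to the sampling rate, and it does not model SketchFlow, SGS, or Fast Filtered Sampling. The function name miss_probability is hypothetical.

```python
def miss_probability(flow_packets: int, sampling_rate: float) -> float:
    """Probability that no packet of a flow survives sampling.

    Assumption (not from the paper): every packet is kept independently
    with probability `sampling_rate`, i.e. plain random sampling.
    """
    if flow_packets < 0:
        raise ValueError("flow_packets must be non-negative")
    if not 0.0 <= sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be in [0, 1]")
    # Each packet is dropped with probability (1 - rate); the flow is
    # missed entirely only if all of its packets are dropped.
    return (1.0 - sampling_rate) ** flow_packets


if __name__ == "__main__":
    # Sampling rates named in the abstract; the flow sizes are illustrative.
    for rate in (1 / 10, 1 / 100, 1 / 1000):
        for size in (5, 50, 500):
            p = miss_probability(size, rate)
            print(f"rate=1/{round(1 / rate)} size={size} P(miss)={p:.4f}")
```

Under this simplified model, the chance of missing a flow entirely grows quickly as the flow gets shorter or the sampling rate gets lower, which is in line with the short-flow finding stated in the abstract.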
