TSFN: A Novel Malicious Traffic Classification Method Using BERT and LSTM.

Entropy (Basel)

College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China.

Published: May 2023

AI Article Synopsis

  • Traffic classification is crucial for detecting network anomalies and enhancing security, but current methods face challenges with feature design and data set limitations.
  • The proposed BERT-based Time-Series Feature Network (TSFN) model incorporates both global and time-series features by using a BERT packet encoder and an LSTM module for improved accuracy.
  • On the USTC-TFC dataset, TSFN reached an F1 score of 99.50%, which suggests that time-series features help in classifying malicious traffic.
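
For readers unfamiliar with the metric, the sketch below shows one way an F1 score over several traffic classes can be computed. It assumes macro-averaging (the unweighted mean of per-class F1); the abstract does not say which averaging the authors used. The function name `macro_f1` and the labels are illustrative and do not come from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only.
truth = ["benign", "benign", "malware", "malware"]
preds = ["benign", "malware", "malware", "malware"]
print(round(macro_f1(truth, preds), 4))
```

With a multi-class dataset, the same function applies unchanged, because it derives the label set from the inputs.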

Article Abstract

Traffic classification is the first step in network anomaly detection and is essential to network security. However, existing malicious traffic classification methods have several limitations: statistics-based methods depend heavily on hand-designed features, and deep learning-based methods are sensitive to the balance and adequacy of the data sets. In addition, existing BERT-based malicious traffic classification methods focus only on the global features of traffic and ignore its time-series features. To address these problems, we propose a BERT-based Time-Series Feature Network (TSFN) model, which consists of two modules. The first is a packet encoder module built on the BERT model, which captures the global features of the traffic using the attention mechanism. The second is a temporal feature extraction module built on the LSTM model, which captures the time-series features of the traffic. The global and time-series features of the malicious traffic are then combined into the final feature representation, which better represents the malicious traffic. Experimental results on the publicly available USTC-TFC dataset show that the proposed approach effectively improves the accuracy of malicious traffic classification, reaching an F1 value of 99.50%. This indicates that the time-series features of malicious traffic help improve classification accuracy.
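
The two-module design described in the abstract can be illustrated with a rough PyTorch sketch. This is not the authors' implementation. It assumes a small `nn.TransformerEncoder` as a stand-in for the pretrained BERT packet encoder, mean pooling over packets as the global feature, the last LSTM hidden state as the time-series feature, and concatenation as the fusion step. It also omits positional embeddings for brevity. The class name `TSFNSketch` and every hyperparameter are hypothetical.

```python
import torch
from torch import nn


class TSFNSketch(nn.Module):
    """Illustrative only: attention-based packet encoder plus LSTM over packets."""

    def __init__(self, vocab_size=260, d_model=64, n_heads=4,
                 n_layers=2, lstm_hidden=64, n_classes=10):
        super().__init__()
        # Stand-in for BERT: token embedding followed by self-attention layers.
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads,
            dim_feedforward=4 * d_model, batch_first=True)
        self.packet_encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Temporal module: LSTM run over the sequence of packet vectors.
        self.lstm = nn.LSTM(d_model, lstm_hidden, batch_first=True)
        self.classifier = nn.Linear(d_model + lstm_hidden, n_classes)

    def forward(self, tokens):
        # tokens: (batch, packets, tokens_per_packet) integer ids
        b, p, t = tokens.shape
        x = self.embed(tokens.reshape(b * p, t))      # (b*p, t, d_model)
        x = self.packet_encoder(x)                    # (b*p, t, d_model)
        # Use the first token of each packet as its summary vector.
        packet_vecs = x[:, 0, :].reshape(b, p, -1)    # (b, p, d_model)
        global_feat = packet_vecs.mean(dim=1)         # assumed global feature
        _, (h_n, _) = self.lstm(packet_vecs)          # h_n: (1, b, lstm_hidden)
        temporal_feat = h_n[-1]                       # assumed time-series feature
        fused = torch.cat([global_feat, temporal_feat], dim=-1)
        return self.classifier(fused)                 # (b, n_classes) logits


model = TSFNSketch()
flows = torch.randint(0, 260, (2, 5, 16))  # 2 flows, 5 packets, 16 tokens each
print(model(flows).shape)
```

Concatenation is the simplest way to combine the two feature vectors; the abstract says only that the features are incorporated together, so the actual fusion in the paper may differ.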

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10216927
DOI: http://dx.doi.org/10.3390/e25050821

Publication Analysis

Top Keywords (with occurrence counts)

  • malicious traffic: 32
  • traffic classification: 24
  • features traffic: 16
  • time-series features: 16
  • traffic: 13
  • malicious: 8
  • classification methods: 8
  • methods vulnerable: 8
  • global features: 8
  • module built: 8
