---
title: Network Intrusion Detection Through Packet Image Classifications
date: 2024-01-08T12:50:00+01:00
tags:
  - programming
  - fpga
  - machine learning
---

A few months ago, Laurens, a colleague from another department, successfully defended his PhD thesis titled Machine Learning for Network Intrusion Detection on FPGA. Laurens is a hardware engineer and spent the last four years finding ways to speed up network intrusion detection using machine learning on programmable hardware, or FPGAs (field-programmable gate arrays). I know little about the topic, but Laurens' presentation was captivating, especially the image classification part.

The problem is this: how do we detect malicious network packets at scale? A network packet is a unit of data in the stream that passes through your network as you browse the internet. At byte level, one might look like this:

f80ff967e43300f48d
6b6d0d080045000096
2aff400080060ef1c0
a81f9dc0a81f84d0a5
1f4939c8790c4bb70c
20501801ff44830000
17030300695a7e87ae
5a26d1a8e6af38cc0c
e02b9f4aa1cd153785
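
Read as an Ethernet frame carrying IPv4 and TCP, the first 54 bytes of this dump decode into familiar header fields. That reading is an assumption, though one the bytes support: the EtherType is 0x0800 and the IP protocol field is 6. The sketch below uses only Python's standard library. `parse_headers` and its dictionary keys are names made up for illustration, and it assumes there is no VLAN tag in front of the IP header.

```python
import socket
import struct

# The hex dump from above, joined into one string (the capture is truncated).
PACKET_HEX = (
    "f80ff967e43300f48d"
    "6b6d0d080045000096"
    "2aff400080060ef1c0"
    "a81f9dc0a81f84d0a5"
    "1f4939c8790c4bb70c"
    "20501801ff44830000"
    "17030300695a7e87ae"
    "5a26d1a8e6af38cc0c"
    "e02b9f4aa1cd153785"
)


def parse_headers(frame: bytes) -> dict:
    """Decode a few Ethernet, IPv4 and TCP header fields from a raw frame."""
    # Ethernet header: destination MAC, source MAC, EtherType (14 bytes).
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != 0x0800:
        raise ValueError("not an IPv4 frame")

    # IPv4 header starts at offset 14; the low nibble is its length in words.
    ip_header_len = (frame[14] & 0x0F) * 4
    (total_length,) = struct.unpack("!H", frame[16:18])
    ttl, protocol = frame[22], frame[23]
    if protocol != 6:
        raise ValueError("not a TCP segment")

    # TCP header follows the IPv4 header; the ports are its first four bytes.
    tcp_start = 14 + ip_header_len
    src_port, dst_port = struct.unpack("!HH", frame[tcp_start:tcp_start + 4])

    return {
        "dst_mac": dst_mac.hex(":"),
        "src_mac": src_mac.hex(":"),
        "total_length": total_length,
        "ttl": ttl,
        "src_ip": socket.inet_ntoa(frame[26:30]),
        "dst_ip": socket.inet_ntoa(frame[30:34]),
        "src_port": src_port,
        "dst_port": dst_port,
    }


fields = parse_headers(bytes.fromhex(PACKET_HEX))
for name, value in fields.items():
    print(f"{name}: {value}")
```

For this dump it reports a TCP segment from 192.168.31.157, port 53413, to 192.168.31.132, port 8009.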

The difficulty is that trillions of packets pass through the wires every few seconds. Standards do define pieces of information within each packet (IP address, header information, payload, ...), but these can all be faked, so filtering packets based on IP is just one step and won't be enough. Laurens instead based his approach on the work of Wang et al. who, in 2017, applied a very neat trick: why not convert the packet to grayscale pixels and then use image recognition neural networks to quickly classify each packet? That means the above blob of hex data could be converted into a 28x28 pixel block image.
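
A minimal sketch of such a conversion, assuming each byte becomes one grayscale pixel (0 is black, 255 is white) and the input is truncated or zero-padded to 784 bytes, which is 28 × 28. `bytes_to_image` and `to_pgm` are hypothetical helper names, and the plain-text PGM format is only a convenient way to look at the result, not something taken from the thesis.

```python
SIDE = 28  # image width and height in pixels

# The example packet from above, joined into one hex string.
PACKET_HEX = (
    "f80ff967e43300f48d"
    "6b6d0d080045000096"
    "2aff400080060ef1c0"
    "a81f9dc0a81f84d0a5"
    "1f4939c8790c4bb70c"
    "20501801ff44830000"
    "17030300695a7e87ae"
    "5a26d1a8e6af38cc0c"
    "e02b9f4aa1cd153785"
)


def bytes_to_image(data: bytes, side: int = SIDE) -> list[list[int]]:
    """Map raw bytes to a side x side grid of grayscale values (0-255)."""
    size = side * side
    # Keep at most `size` bytes, then fill the remainder with zero bytes.
    padded = data[:size].ljust(size, b"\x00")
    return [list(padded[row * side:(row + 1) * side]) for row in range(side)]


def to_pgm(image: list[list[int]]) -> str:
    """Serialize the grid as a plain-text PGM (P2) grayscale image."""
    header = f"P2\n{len(image[0])} {len(image)}\n255\n"
    rows = "\n".join(" ".join(str(pixel) for pixel in row) for row in image)
    return header + rows + "\n"


image = bytes_to_image(bytes.fromhex(PACKET_HEX))
print(image[0])  # first row of pixels: the first 28 bytes of the packet
```

Writing the string returned by `to_pgm(image)` to a `.pgm` file gives an image that many viewers can open. Because the example dump is only 81 bytes long, everything after the third row is zero padding.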

We humans are gifted with the ability to quickly spot visual patterns. That's exactly what a convolutional neural network does to segment and classify images: you feed it an image, and it'll say it's a car, a dog, or a tree, with surprisingly high accuracy, provided the initial training set is big enough. That works because one car looks a lot like another car, and the same goes for dogs and trees. Our daughter is nine months old, but she too will soon learn that our cats look a lot like our neighbor's cat.
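
To make "spotting visual patterns" slightly more concrete, here is a toy version of the convolution step at the heart of such a network. It is a sketch, not a trained model: the `conv2d` helper, the 4x4 image, and the hand-picked edge kernel are all made up for illustration, whereas a real network learns its kernels from training data.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    return [
        [
            sum(
                image[row + i][col + j] * kernel[i][j]
                for i in range(kernel_h)
                for j in range(kernel_w)
            )
            for col in range(out_w)
        ]
        for row in range(out_h)
    ]


# Toy grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
for row in feature_map:
    print(row)
```

The output is large only in the middle column, where the edge sits. Stacking many such filters, with learned rather than hand-picked values, is what lets a network respond to the shapes that make up a cat, or a packet.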

But how can we apply that to network packets? Simple: if you map a packet to a pixelated version, you'll immediately see that packets, too, show striking visual resemblances. For example, take a look at the visualization by Wang's team of different network-related sessions, from checking mail in Outlook to World of Warcraft gaming or FTP server browsing:

Some images¹, like FTP and SMB, are very similar, but can be broken down further in case the image recognition system has trouble telling the two apart. Overall, all other images are remarkably different. The question then becomes: are these patterns consistent? If a World of Warcraft packet looks like x, but another one from the same game looks like y, the neural network (and our own eyes) will have trouble correctly identifying it. The answer is a surprising yes:

Of course, hackers are usually smart enough to conceal parts of the data, meaning their malicious network packets might show a bit more variation. Laurens investigated many different intrusion detection model architectures and categorized them according to how well attacks are detected (Detection Score) and how well attacks are identified (Identification Score). For existing architectures, these scores differed based on the training sample size. This means that we first need to know what a malicious packet looks like before we can keep it from intruding.
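
The difference between detecting and identifying an attack can be illustrated with a small sketch. The definitions here are an assumption for illustration and not necessarily the formulas used in the thesis: detection counts an attack as caught when it is flagged as any attack class, identification only when the predicted class is the right one. The labels are made-up toy values, not measured results.

```python
BENIGN = "benign"


def _attack_pairs(y_true, y_pred):
    """Keep only the (truth, prediction) pairs where the truth is an attack."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != BENIGN]
    if not pairs:
        raise ValueError("no attack samples to score")
    return pairs


def detection_score(y_true, y_pred):
    """Fraction of attack samples flagged as some attack, whatever the class."""
    pairs = _attack_pairs(y_true, y_pred)
    return sum(p != BENIGN for _, p in pairs) / len(pairs)


def identification_score(y_true, y_pred):
    """Fraction of attack samples assigned to the correct attack class."""
    pairs = _attack_pairs(y_true, y_pred)
    return sum(p == t for t, p in pairs) / len(pairs)


# Hypothetical ground truth and model predictions for six packets.
y_true = ["benign", "neris", "neris", "geodo", "geodo", "benign"]
y_pred = ["benign", "neris", "geodo", "geodo", "benign", "benign"]

detection = detection_score(y_true, y_pred)
identification = identification_score(y_true, y_pred)
print(f"detection: {detection:.2f}, identification: {identification:.2f}")
```

In this toy run three of the four attacks are flagged, but only two get the right label, so a model can look good at detection while still being mediocre at identification.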

For unknown attack detection, things are a bit more complicated, involving specific attack classes and shortcut learning. Furthermore, contamination robustness can be an issue: only a small part of the current network packet might contain attack traffic (say, the last 10 bits), which then continues in the first 20 bits of the next packet.

In addition, because of the sheer volume of network packets, identification needs to be done quickly, hence the idea to use FPGAs to process packets closer to the bare metal. For that, Laurens relied on Brevitas, PyTorch, and FINN, which exports his software model to Vivado, which in turn can burn it into an FPGA. FINN is an experimental framework from Xilinx built specifically for running neural networks on FPGAs, and it comes with example datasets.

Again, I know little about neural networks and even less about FPGAs, but the idea to approach a network packet byte block as a visual image is really ingenious, a true example of creativity! I wonder how researchers came up with that idea. Perhaps by looking at how neural networks classify other data? Would visually looking at portions of sound waves also help in speech recognition? (The answer is yes, but there are other ways to do this.)

It's stunning to see how much information can be extracted by literally looking at a network packet. Who knew that FaceTime and Skype packets looked totally different, and that World of Warcraft data packets are that consistent? I wonder what happens when a new expansion is released, or when a major network code refactor lands. As long as neural networks are capable of grouping and then classifying unknown groups, that shouldn't be a problem either.

If you're interested in reading more about network intrusion detection, take a look at Laurens' publications.

¹ In case you are wondering: Weibo is a Chinese Facebook-like blogging website, and the researchers are Chinese. Neris and Geodo are packets from malware.
