network intrusion detection Laurens

Wouter Groeneveld 2024-01-08 13:48:22 +01:00
parent 86ab2436b2
commit 58ecb76517
6 changed files with 59 additions and 1 deletions

@ -31,7 +31,7 @@ The _Malloreon_ chronicles by [David Eddings](https://www.eddingschronicles.com/
After finishing _Super Mario Bros. Wonder_, I continued my 2D Mario streak with the _Mario Land_ Game Boy series that evolved into _Wario Land_ and _Wario Land II_. The best is yet to come, though, as _Wario Land 3_ is on my [25 Best Games of All Time](/post/2023/10/top-25-best-games-of-all-time) list! I know them all by heart but still breeze through these platformers once every few years.
My wife discovered a cheap Switch eShop code for _Mario + Rabids: Kingdom Battle_ and since I love tactical turn-based games, I eagerly dove in. It's okay so far: the core gameplay is very solid, but everything slathered on top is not my cup of tea. I'm nearing the end and will have a review up shortly. Meanwhile, Kristien insists on playing [Railbound](https://store.steampowered.com/app/1967510/Railbound/), a cosy railway puzzle game with sometimes devilishly difficult levels! Most of the time, I just don't "see it", but we're having fun together nonetheless.
My wife discovered a cheap Switch eShop code for _Mario + Rabbids: Kingdom Battle_ and since I love tactical turn-based games, I eagerly dove in. It's okay so far: the core gameplay is very solid, but everything slathered on top is not my cup of tea. I'm nearing the end and will have a review up shortly. Meanwhile, Kristien insists on playing [Railbound](https://store.steampowered.com/app/1967510/Railbound/), a cosy railway puzzle game with sometimes devilishly difficult levels! Most of the time, I just don't "see it", but we're having fun together nonetheless.
## Selected (blog) posts

@ -93,3 +93,7 @@ tags:
```
The tags/genre keys are still ambiguous, but that's for another time.
---
**Addendum** 05th Jan.: It seems that Hugo recently gained an improved feature set for finding and [indexing related content](https://gohugo.io/content-management/related/)! Page content itself isn't indexed though, so my backlinks/forwardlinks idea can't be done with that.

@ -0,0 +1,53 @@
---
title: Network Intrusion Detection Through Packet Image Classifications
date: 2024-01-08T12:50:00+01:00
categories:
- programming
tags:
- fpga
- machine learning
---
A few months ago, Laurens, a colleague from another department, successfully defended his PhD thesis titled [Machine Learning for Network Intrusion Detection on FPGA](https://lirias.kuleuven.be/4120937). Laurens is a hardware engineer and spent the last four years finding ways to speed up network intrusion detection using machine learning on programmable hardware or FPGAs. I know little about the topic, but Laurens' presentation was captivating, especially the image classification part.
The problem is this: _how do we detect malicious network packets at scale?_ A network packet, a unit of data in the stream that passes through your network as you browse the internet, might look like this at byte level:
```
f80ff967e43300f48d
6b6d0d080045000096
2aff400080060ef1c0
a81f9dc0a81f84d0a5
1f4939c8790c4bb70c
20501801ff44830000
17030300695a7e87ae
5a26d1a8e6af38cc0c
e02b9f4aa1cd153785
```
The catch is that there are trillions of packets passing through the wires every few seconds, and even though standards define pieces of information within each packet (IP address, header information, payload, ...), these can all be faked, so filtering packets based on IP is just one step and won't be enough. Laurens instead based his approach [on the work of Wang et al.](https://ieeexplore.ieee.org/document/7899588) who, in 2017, applied a very neat trick: why not convert the packet to gray-scale pixels, one pixel per byte, and then use image recognition neural networks to quickly classify each packet? That means the above blob of hex data could be converted into a 28x28 pixel block image.
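To make that conversion concrete, here is a minimal Python sketch. It is not Wang's or Laurens' actual code: the function name `packet_to_image` is made up, and zero-padding the 81 example bytes up to the 784 needed for a 28x28 grid is my own assumption about how to fill the remaining pixels.

```python
# The example packet from above, joined into one hex string.
PACKET_HEX = (
    "f80ff967e43300f48d"
    "6b6d0d080045000096"
    "2aff400080060ef1c0"
    "a81f9dc0a81f84d0a5"
    "1f4939c8790c4bb70c"
    "20501801ff44830000"
    "17030300695a7e87ae"
    "5a26d1a8e6af38cc0c"
    "e02b9f4aa1cd153785"
)

SIDE = 28  # 28x28 pixels, the image size mentioned in the post


def packet_to_image(hex_blob, side=SIDE):
    """Map raw bytes to a side x side grid of gray values (0 = black, 255 = white)."""
    data = bytes.fromhex(hex_blob)
    size = side * side
    # Assumption: cut off anything beyond the grid and zero-pad short inputs.
    padded = data[:size].ljust(size, b"\x00")
    # Slice the flat byte string into rows; each byte becomes one pixel.
    return [list(padded[row * side:(row + 1) * side]) for row in range(side)]


image = packet_to_image(PACKET_HEX)
print(len(image), "rows of", len(image[0]), "pixels")
```

Each entry in `image` is a plain integer between 0 and 255, which is exactly what a gray-scale image viewer or an image classifier expects as input.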
We humans are gifted with the ability to quickly spot visual patterns. That's exactly what a [convolutional neural network](https://en.wikipedia.org/wiki/Convolutional_neural_network) does to segment and classify images: you feed it an image, and it'll say it's a car, a dog, or a tree, with surprisingly high accuracy, provided the initial training set is big enough. That works because cars look like other cars, dogs like other dogs, and trees like other trees. Our daughter is nine months old, but she too will soon learn that our cats look a lot like our neighbor's cat.
But how can we apply that to network packets? Simple: if you map a packet to a pixelated version, you'll immediately see that they too show striking visual resemblances. For example, take a look at the visualization analysis of Wang's team of different network-related sessions, from mail checking using Outlook to _World of Warcraft_ gaming or FTP server browsing:
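The "convolutional" part can be sketched in a few lines of plain Python. This is a toy illustration under my own assumptions (a hand-written 2x2 filter and a made-up 4x4 image), not a trained network: a real CNN learns its filter values from the training set and stacks many such layers.

```python
def convolve2d(image, kernel):
    """Slide a square kernel over a square image (no padding, stride 1).

    Strictly speaking this is a cross-correlation, which is what CNN
    layers compute in practice.
    """
    k = len(kernel)
    out_size = len(image) - k + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]


# Toy image: dark left half, bright right half.
toy = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

# Hand-written filter that responds to dark-to-bright vertical edges.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(toy, edge_kernel)
for row in feature_map:
    print(row)
```

The filter only produces a large value in the middle column, where the dark half meets the bright half. Spotting local patterns like that, wherever they occur in the image, is the building block the classification rests on.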
![](../cnn-visualisation.jpg "Different networked sessions lead to different visualizations of packets!")
Some images[^we], like FTP and SMB, are very similar, but can be broken down further in case the image recognition system has trouble telling them apart. Overall, all other images are remarkably different. The question then becomes: are these patterns consistent? If a _World of Warcraft_ packet looks like `x`, but another one from the same game looks like `y`, the neural network (and our own eyes) will have trouble correctly identifying it. The answer is a surprising yes:
[^we]: In case you are wondering: Weibo is a Chinese Facebook-like blogging website, and the researchers are Chinese. Neris and Geodo are malware traffic.
![](../cnn-consistency.jpg "Consistent packet pixelated images for a few example sessions.")
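One crude way to put a number on "consistent" is to compare two packet images pixel by pixel. The sketch below is my own illustration, not a metric from Wang's paper or Laurens' thesis; the function name `mean_abs_diff` and the tiny 2x2 "images" are made up.

```python
def mean_abs_diff(img_a, img_b):
    """Average per-pixel gray-value difference between two equally sized images.

    0 means identical; 255 is the largest possible difference.
    """
    pairs = [
        (a, b)
        for row_a, row_b in zip(img_a, img_b)
        for a, b in zip(row_a, row_b)
    ]
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


# Made-up 2x2 images, purely for illustration.
session_1 = [[10, 200], [30, 0]]
session_2 = [[12, 198], [30, 0]]    # nearly the same pattern
other_app = [[250, 5], [90, 140]]   # a clearly different pattern

same = mean_abs_diff(session_1, session_2)
different = mean_abs_diff(session_1, other_app)
print(same, different)
```

A low value between images of the same application and a high value between different applications is the kind of separation that makes classification feasible.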
Of course, hackers are usually smart enough to conceal parts of the data, meaning their malicious network packets might show a bit more variation. Laurens investigated many different intrusion detection model architectures and categorized them according to how well attacks are detected (_Detection Score_) and how well attacks are identified (_Identification Score_). For existing architectures, this differed based on the trained sample size. This means that we first need to know what a malicious packet looks like before we can keep it from intruding.
For unknown attack detection, things are a bit more complicated, involving specific attack classes and shortcut learning. Furthermore, contamination robustness can be an issue: only a small part of the current network packet might contain attack traffic (say, the last 10 bits), which then continues in the first 20 bits of the next packet.
In addition, given the sheer volume of network packets, identification needs to be done quickly, hence the idea to use FPGAs to process this closer to the bare metal. For that, Laurens relied on [Brevitas](https://xilinx.github.io/brevitas/), a PyTorch-based library, and FINN, which exports his software model to Vivado, which in turn can burn it into an FPGA. FINN is an experimental framework from Xilinx built specifically for experimenting with neural networks on FPGAs, and comes with [examples](https://github.com/Xilinx/finn-examples).
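The difference between detecting and identifying an attack can be sketched as follows. Note that these are my own simplified definitions, which may well differ from how the thesis computes its scores; the label `"benign"` and both function names are hypothetical.

```python
BENIGN = "benign"  # hypothetical label for normal traffic


def _attack_pairs(truth, preds):
    """Keep only the (true, predicted) pairs where the sample really is an attack."""
    pairs = [(t, p) for t, p in zip(truth, preds) if t != BENIGN]
    if not pairs:
        raise ValueError("no attack samples to score")
    return pairs


def detection_score(truth, preds):
    """Fraction of attacks flagged as *some* attack, whichever class was predicted."""
    pairs = _attack_pairs(truth, preds)
    return sum(p != BENIGN for _, p in pairs) / len(pairs)


def identification_score(truth, preds):
    """Fraction of attacks assigned the *correct* attack class."""
    pairs = _attack_pairs(truth, preds)
    return sum(p == t for t, p in pairs) / len(pairs)


# Made-up labels, reusing the malware names from the figure.
truth = ["benign", "neris", "neris", "geodo", "geodo"]
preds = ["benign", "neris", "geodo", "geodo", "benign"]

print(detection_score(truth, preds), identification_score(truth, preds))
```

In this toy run, three of the four attacks raise an alarm, but only two get the right name, so a model can be decent at detection while being worse at identification.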
---
Again, I know little about neural networks and even less about FPGAs, but the idea to approach a network packet byte block as a visual image is really ingenious: a true example of creativity! I wonder how researchers came up with that idea. Perhaps by looking at how neural networks classify other data? Would visually looking at portions of sound waves also help in speech recognition? (The answer is yes, but there are other ways to do this.)
It's stunning to see how much information can be extracted by literally _looking_ at a network packet. Who knew that FaceTime and Skype packets looked totally different, and that _World of Warcraft_ data packets are that consistent? I wonder what happens when a new expansion is released, or after a major network code refactor. As long as neural networks are capable of grouping and then classifying unknown groups, that's also not a problem.
If you're interested in reading more about network intrusion detection, [take a look at Laurens' publications](http://lirias.kuleuven.be/cv?Username=u0132357).

@ -35,6 +35,7 @@ I also write about retro PC/Handheld gaming and actual _bread baking_ on sister
## By Year
- [2024](/post/2024) ... when I don't yet know what the year will bring
- [2023](/post/2023) ... when I intend to publish my PhD and be finally done with it
- [2022](/post/2022) ... when working from home was still a thing
- [2021](/post/2021) ... when I got back into both retro (80486) and modern (M1) hardware

Binary file not shown (76 KiB).

Binary file not shown (51 KiB).