An automated tool for real-time feature engineering on network traffic data, optimized for intrusion detection using the NSL-KDD dataset. This tool processes live network traffic, extracts relevant features, and prepares data for use in machine learning models.
The NSL-KDD Real-Time Feature Engineering Tool captures live network traffic, extracts features as the data flows in, and prepares the result for machine learning models used in intrusion detection. Extraction runs continuously on batches of traffic rather than on a stored capture, and the tool is designed to work with the NSL-KDD dataset.
- Real-Time Processing: Capture and process live network traffic data.
- Feature Extraction: Automatically extract key features from the raw network data.
- NSL-KDD Optimization: Specifically designed to work with and enhance the NSL-KDD dataset for intrusion detection tasks.
- Modular Design: Easy to extend and customize for specific use cases or additional features.
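As an illustration of the kind of feature extraction described above, here is a minimal sketch of two time-window features from the NSL-KDD feature set, `count` and `srv_count`, computed over a sliding two-second window. The `Connection` record and `WindowFeatures` class are hypothetical names chosen for this example and are not taken from feature_engineering.py. The sketch uses only the standard library, works on already-parsed connection records rather than live packets, and assumes timestamps arrive in non-decreasing order.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Connection:
    """Hypothetical, already-parsed connection record (not the tool's own type)."""
    timestamp: float  # seconds
    dst_host: str
    service: str      # e.g. "http", "ftp"


class WindowFeatures:
    """Tracks the connections seen in the last `window` seconds."""

    def __init__(self, window: float = 2.0):
        self.window = window
        self.recent = deque()

    def update(self, conn: Connection) -> dict:
        # Drop connections that have fallen outside the time window.
        while self.recent and conn.timestamp - self.recent[0].timestamp > self.window:
            self.recent.popleft()
        self.recent.append(conn)
        # count: connections to the same destination host within the window.
        # srv_count: connections to the same service within the window.
        # Both counts in this sketch include the current connection itself.
        count = sum(1 for c in self.recent if c.dst_host == conn.dst_host)
        srv_count = sum(1 for c in self.recent if c.service == conn.service)
        return {"count": count, "srv_count": srv_count}
```

Calling `update` once per connection returns the two features for that connection, so the same object can be fed from a live capture loop or from a replayed file.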
Ensure you have the following installed:
- Python 3.7+
- scapy for packet capturing
- pandas, numpy for data processing
- scikit-learn for machine learning tasks
You can install the required Python packages using:
pip install -r requirements.txt
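After installing, a quick way to confirm the environment is complete is to check that each package can be found by the interpreter. The sketch below is not part of the tool; `missing_modules` is a hypothetical helper, and the module list is simply the import names of the packages listed above (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the packages listed above; note that scikit-learn
# is imported as "sklearn".
REQUIRED_MODULES = ["scapy", "pandas", "numpy", "sklearn"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```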
Clone the Repository:

    git clone https://github.com/your-username/nsl-kdd-realtime-features.git
    cd nsl-kdd-realtime-features

Prepare the NSL-KDD Dataset:
Download the NSL-KDD dataset. Place the dataset files into the data/ directory.

Run the Tool:

    python feature_engineering.py

This will start capturing network traffic and performing real-time feature extraction.
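For the dataset preparation step, it can help to check that the files in data/ parse as expected before starting a capture. The sketch below is an assumption about the file layout, not the tool's own loader: it expects the commonly distributed NSL-KDD text format, where each row holds 41 comma-separated features followed by a label and a difficulty score, with no header row. `read_records` and `LEADING_COLUMNS` are hypothetical names.

```python
import csv

# First six column names of the NSL-KDD feature set; the remaining
# feature columns are left unnamed in this sketch.
LEADING_COLUMNS = ["duration", "protocol_type", "service", "flag", "src_bytes", "dst_bytes"]


def read_records(lines):
    """Yield (leading_features, label) pairs from NSL-KDD style CSV lines.

    Assumes each row has 41 features followed by a label and a
    difficulty score, with no header row.
    """
    for row in csv.reader(lines):
        if len(row) != 43:
            continue  # skip blank or malformed rows
        # zip stops at the shorter sequence, so only the named columns are kept.
        features = dict(zip(LEADING_COLUMNS, row))
        yield features, row[41]
```

`read_records` accepts any iterable of lines, so it can be given an open file handle for whichever dataset file was placed in data/ (opened with `newline=""`, as the `csv` module recommends). All values are returned as strings; converting numeric columns is left to the caller.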
Command-Line Interface

You can customize the behavior of the tool using various command-line arguments:
- --interface: Specify the network interface to capture traffic from.
- --output: Set the output directory for the feature-engineered data.
- --interval: Define the interval (in seconds) for processing batches of traffic.

Example:

    python feature_engineering.py --interface eth0 --output ./output/ --interval 5

Extending the Tool

The tool's modular design allows you to easily add new feature extraction methods or integrate it with other datasets or machine learning models. Simply modify or extend the feature_engineering.py file as needed.
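One way to add feature extraction methods without touching existing code is a small registry of feature functions. The sketch below shows that pattern only as an illustration; it does not describe how feature_engineering.py is actually organized. `FEATURE_EXTRACTORS`, `feature`, `extract_all`, and the two example features are all hypothetical, and records are assumed to be plain dicts with numeric `src_bytes` and `dst_bytes` keys.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a feature name to the function that computes it.
FEATURE_EXTRACTORS: Dict[str, Callable[[dict], float]] = {}


def feature(name: str):
    """Decorator that registers a feature function under `name`."""
    def register(func):
        FEATURE_EXTRACTORS[name] = func
        return func
    return register


@feature("total_bytes")
def total_bytes(record: dict) -> float:
    # Bytes sent in both directions for one connection record.
    return record["src_bytes"] + record["dst_bytes"]


@feature("byte_ratio")
def byte_ratio(record: dict) -> float:
    # Ratio of source to destination bytes; 0.0 when nothing came back.
    return record["src_bytes"] / record["dst_bytes"] if record["dst_bytes"] else 0.0


def extract_all(record: dict) -> dict:
    """Run every registered extractor on one record."""
    return {name: func(record) for name, func in FEATURE_EXTRACTORS.items()}
```

With this layout, adding a feature means writing one decorated function; the code that calls `extract_all` does not change.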
Contributions are welcome! Please open an issue or submit a pull request with any improvements or new features.
This project is licensed under the MIT License - see the LICENSE file for details.
The NSL-KDD dataset is provided by the Canadian Institute for Cybersecurity. Special thanks to the open-source community for providing the tools and libraries that made this project possible.