Explainable Intrusion Detection System (X-IDS)

ICCN-INE2 Deep Learning Project - Project 5: Explainable IDS

Project Overview

This project builds an Intrusion Detection System using deep learning on the NSL-KDD dataset, then applies post-hoc explainability methods (SHAP, LIME) to make its decisions interpretable. We evaluate the stability of the resulting explanations and analyze the security implications of exposing them.

Core Research Question

Can we make IDS decisions interpretable without compromising detection performance, and are these explanations stable enough to be trusted in security-critical settings?

Repository Structure

.
├── README.md                          # This file
├── docs/
│   ├── project_plan.md                # Detailed project plan & methodology
│   ├── threat_model.md                # Threat model document
│   └── architecture.md                # Model architecture & design choices
├── data/
│   └── preprocess.py                  # Data loading & preprocessing pipeline
├── models/
│   ├── mlp_baseline.py                # MLP baseline model
│   ├── lstm_model.py                  # LSTM variant
│   └── cnn1d_model.py                 # 1D-CNN variant
├── explainability/
│   ├── shap_analysis.py               # SHAP explanations
│   ├── lime_analysis.py               # LIME explanations
│   └── stability_eval.py              # Explanation stability evaluation
├── experiments/
│   ├── train_baseline.py              # Training script
│   ├── run_explainability.py          # Run all XAI methods
│   └── run_stability.py               # Stability evaluation experiments
├── results/                           # Generated results (figures, metrics)
├── requirements.txt                   # Dependencies
└── reproduce.sh                       # One-command reproducibility script

Quick Start

# Install dependencies
pip install -r requirements.txt

# Reproduce all experiments
bash reproduce.sh

# Or run step by step:
python data/preprocess.py              # Download & preprocess NSL-KDD
python experiments/train_baseline.py   # Train 3 models (MLP, LSTM, CNN)
python experiments/run_explainability.py # SHAP + LIME analysis
python experiments/run_stability.py    # Stability evaluation

Dataset

NSL-KDD (Network Security Laboratory - KDD): an improved version of KDD Cup 99 that removes the redundant records of the original data set (Tavallaee et al., 2009).

Models

Model     Architecture                                   Parameters
MLP       41→256→128→64→2 with BatchNorm + Dropout       ~50K
LSTM      41-step sequence → 2-layer LSTM(64) → FC(2)    ~35K
1D-CNN    Conv1d(64)→Conv1d(128)→AvgPool→FC(2)           ~45K
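
As a rough sanity check on the MLP figure above, the parameter count can be reproduced by hand. The sketch below assumes every Linear layer has a bias and that BatchNorm (one learnable scale and one shift per unit) follows each hidden layer. The helper name `mlp_param_count` is illustrative and is not part of this repository; Dropout adds no parameters.

```python
def mlp_param_count(sizes, batchnorm=True):
    """Count learnable parameters of a fully connected network.

    sizes: layer widths from input to output, e.g. [41, 256, 128, 64, 2].
    Assumption: biases on every Linear layer, BatchNorm on hidden layers only.
    """
    total = 0
    for i, (fan_in, fan_out) in enumerate(zip(sizes, sizes[1:])):
        total += fan_in * fan_out + fan_out      # weights + biases
        is_hidden = i < len(sizes) - 2           # the last layer is the output
        if batchnorm and is_hidden:
            total += 2 * fan_out                 # BatchNorm scale and shift
    return total


print(mlp_param_count([41, 256, 128, 64, 2]))                   # 52930
print(mlp_param_count([41, 256, 128, 64, 2], batchnorm=False))  # 52034
```

Both results are in line with the ~50K listed for the MLP.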

Explainability Methods

  • SHAP (SHapley Additive exPlanations): KernelExplainer (model-agnostic)
  • LIME (Local Interpretable Model-agnostic Explanations): Tabular explainer with perturbation sampling
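
For intuition about what SHAP estimates: KernelExplainer approximates Shapley values by sampling feature coalitions, because the exact computation grows exponentially with the number of features. The sketch below computes exact Shapley values by brute force for a toy three-feature scoring function. The names `shapley_values` and `toy_score` and the all-zero baseline are illustrative assumptions; they come neither from this repository nor from the shap library.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, with absent features set to baseline."""
    n = len(x)

    def value(subset):
        # Features in the coalition keep their value; the rest use the baseline.
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_score(z):
    # Hypothetical score: one additive term and one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
print(phi)  # approximately [2.0, 0.5, 0.5]
```

The attributions sum to `toy_score(x) - toy_score(baseline)`, and the interaction term is split evenly between the two features that share it.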

Evaluation Metrics

  • Classification: Precision, Recall, F1-Score (per-class + weighted), PR-AUC, ROC-AUC
  • Explanation Quality: Faithfulness (feature masking), Sensitivity (max-sensitivity, SENS_MAX), Stability (Pearson correlation coefficient, PCC, between attributions across input perturbations)
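
The explanation-quality metrics can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in stability_eval.py: perturbations are uniform noise within a fixed radius, sensitivity is estimated as the largest L2 change in attributions over the sampled perturbations (a lower bound on the true maximum), and faithfulness is the score drop after masking the top-k attributed features. All function names, `WEIGHTS`, and the toy linear model are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined for constant vectors
    return cov / math.sqrt(var_a * var_b)


def stability_and_sensitivity(explain, x, radius=0.05, n_trials=20, seed=42):
    """Return (mean PCC, max L2 change) of attributions under small input noise."""
    rng = random.Random(seed)
    base = explain(x)
    pccs, changes = [], []
    for _ in range(n_trials):
        x_pert = [v + rng.uniform(-radius, radius) for v in x]
        attr = explain(x_pert)
        pccs.append(pearson(base, attr))
        changes.append(math.dist(base, attr))
    return sum(pccs) / n_trials, max(changes)


def faithfulness_drop(score, attributions, x, baseline, k=1):
    """Score drop after replacing the k most-attributed features with baseline."""
    order = sorted(range(len(x)), key=lambda i: abs(attributions[i]), reverse=True)
    top = set(order[:k])
    masked = [baseline[i] if i in top else x[i] for i in range(len(x))]
    return score(x) - score(masked)


# Toy stand-ins: a linear score and its weight-times-input attribution.
WEIGHTS = [0.5, -1.0, 2.0]


def toy_score(z):
    return sum(w * v for w, v in zip(WEIGHTS, z))


def toy_explain(z):
    return [w * v for w, v in zip(WEIGHTS, z)]


x = [1.0, 2.0, 3.0]
mean_pcc, max_change = stability_and_sensitivity(toy_explain, x)
drop = faithfulness_drop(toy_score, toy_explain(x), x, baseline=[0.0, 0.0, 0.0])
print(mean_pcc, max_change, drop)
```

A mean PCC close to 1 and a small maximum change indicate stable explanations; a large score drop after masking indicates that the highlighted features actually drive the prediction.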

Reproducibility

  • Random seed: 42 (fixed across all experiments)
  • Python 3.10+ | PyTorch 2.x | scikit-learn 1.x
  • All preprocessing steps documented
  • Commands in reproduce.sh
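
One way the fixed seed might be applied is a small helper called at the start of every script. The sketch below is an assumption about how this could look, not a description of what reproduce.sh or the repository scripts actually do; the name `set_seed` is hypothetical, and NumPy and PyTorch are seeded only if they are installed.

```python
import random


def set_seed(seed=42):
    """Seed the random number generators this project is likely to use."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)           # ignored when CUDA is absent
        torch.backends.cudnn.deterministic = True  # prefer deterministic kernels
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


set_seed(42)
first = random.random()
set_seed(42)
second = random.random()
print(first == second)  # True
```

Seeding alone does not guarantee bit-identical results across hardware or library versions, so the pinned versions in requirements.txt matter as well.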

References

  1. Tavallaee et al. (2009). A Detailed Analysis of the KDD CUP 99 Data Set. IEEE Symposium on CISDA.
  2. Lundberg & Lee (2017). A Unified Approach to Interpreting Model Predictions. NeurIPS.
  3. Ribeiro et al. (2016). "Why Should I Trust You?": Explaining the Predictions of Any Classifier. KDD.
  4. Huang et al. (2022). SAFARI: Versatile and Efficient Evaluations for Robustness of Interpretability. ICCV.

Author

ICCN-INE2 Student Project

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = 'cathrica/deep-learning-project'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

For non-causal architectures, replace AutoModelForCausalLM with the appropriate AutoModel class.
