Related Work

Name	Content	Relevance	Journal/Conference	Publisher	Personal Rating	Link to annotated paper	Year	Cites
An Unsupervised Deep Learning Model for Early Network Traffic Anomaly Detection	D-PACK: A deep-learning-based anomaly detection system for IoT traffic. Combines a Convolutional Neural Network (CNN) with an unsupervised Autoencoder to automatically learn traffic patterns and detect malicious flows. Instead of relying on manual, flow-level features or signatures, D-PACK analyzes only the first few bytes of the first two packets of each flow, enabling ultra-early detection. Demonstrates near 100% detection accuracy with a very low false-positive rate (0.83%) in large-scale DDoS scenarios like Mirai-based attacks. Designed for real-time, resource-efficient mitigation of abnormal traffic.	This approach could be helpful, as it utilizes the first bytes of the network packet payload. It might be worth exploring whether I can integrate this method and combine it with other techniques to enhance detection accuracy and keep the model slim.	IEEE Access	IEEE	⭐⭐⭐	Click Me	2020	159
Anomaly Detection From Log Files Using Unsupervised Deep Learning	Unsupervised LSTM Autoencoder model that processes raw log text without preprocessing or handcrafted features. It outputs an anomaly score per log entry, reflecting content and temporal rarity. Trained on 1M HDFS log lines and tested on 1M lines. Acts as a coarse filter for anomaly detection when labeled data is unavailable.	The approach has not been applied to security-related data. However, it aligns well with my concept, as it combines temporal context modeling through LSTMs with an autoencoder. Additionally, it may offer valuable insights, given that it operates on raw, unlabeled data without requiring preprocessing — matching the conditions of my use case.	Formal Methods. FM 2019 International Workshops	Springer	⭐⭐⭐⭐	Click Me	2020
DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning	DeepLog: Uses an LSTM-based deep learning model to treat system logs as language sequences. Automatically learns normal log patterns and detects anomalies when deviations occur. Supports incremental online updates to adapt to new patterns over time. Builds workflows from logs to aid in root cause analysis. Outperforms traditional log-based anomaly detection methods in large-scale experiments.	While DeepLog presents an elegant approach by modeling system logs as sequences using LSTM-based architectures to learn normal patterns, it also has significant limitations in the context of this work. The entire approach is focused on structured system logs and does not extend to network traffic, interactive attacker behavior, or unstructured data streams. This log-centric design makes it far from suitable for honeypot or intrusion detection scenarios, where anomalies frequently occur in network flows, command sequences, or multi-modal attacker interactions. Additionally, the model is not designed to handle complex, heterogeneous data sources typical in security monitoring setups, and it lacks flexibility for incorporating contextual or behavioral information beyond static logs. Therefore, despite being a prominent example of sequence-based anomaly detection, its applicability in honeypot-based or network-level anomaly detection is limited.	CCS '17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security	ACM	⭐⭐	Click Me	2017	1055
Unsupervised Machine Learning Techniques for Network Intrusion Detection on Modern Data	The paper compares four unsupervised models (PCA, Isolation Forest, One-Class SVM, and Autoencoder) for intrusion detection on the CIC-IDS-2017 dataset. All models are trained on benign traffic only, focusing on zero-day attack detection. The autoencoder outperforms others with an AUROC of 0.9775 and an F1 score of 0.9616, combining high recall and precision. The OC-SVM is best when low false positives are critical. The study highlights the trade-off between detection performance and computational efficiency, with all models being fast and suitable for real-time deployment. Autoencoders and OC-SVMs are recommended for robust, adaptive NIDS.	This paper is highly relevant to my thesis as it provides a structured comparison between several unsupervised learning algorithms for intrusion detection, directly addressing the challenge of detecting unknown, zero-day attacks. The evaluation on CIC-IDS-2017 aligns with my focus on modern, realistic datasets. In particular, the paper's demonstration of autoencoders' ability to capture complex attack patterns without manual feature engineering supports my approach of using deep learning for feature extraction in honeypot-based environments. Moreover, the discussion on computational overhead, model optimization, and threshold selection provides valuable insights for practical system design. The results showing the superior balance between detection accuracy and runtime efficiency of autoencoders can guide my architecture choice. The OC-SVM’s performance in minimizing false positives also offers potential for integration as a fallback or secondary detection stage in my work.	2020 4th Cyber Security in Networking Conference (CSNet)	IEEE	⭐⭐⭐⭐	Click Me	2020	26
A Deep Learning Approach for Intrusion Detection Using Recurrent Neural Networks	The authors test their RNN-based IDS (RNN-IDS) on the NSL-KDD dataset for both binary (normal vs. anomaly) and multiclass classification (normal, DoS, R2L, U2R, Probe). They preprocess the dataset with feature encoding and normalization and use a fully connected RNN model. The paper compares RNN-IDS against traditional machine learning algorithms (J48, SVM, Random Forest, ANN) and a reduced-size RNN approach from prior work. Results show that RNN-IDS outperforms classical methods in accuracy, detection rate, and false-positive rate, although training times are longer (not GPU-enabled). The authors suggest that with GPU acceleration and advanced architectures (LSTM, BiRNNs), even better results are achievable.	This paper is helpful for my thesis in multiple ways. First, it demonstrates that even relatively simple RNN architectures can outperform classical machine learning models on benchmark datasets like NSL-KDD — without GPU acceleration and using fairly basic setups. This reassures me that more modern approaches (GPU training, LSTMs, attention mechanisms) will easily surpass those results. Second, the paper is very technical and methodical; it explains each step, including dataset structure, feature preprocessing, and hyperparameter tuning. This shows me how straightforward it is to fill out these sections in my own thesis. Finally, by showing performance analysis in both binary and multiclass classification with confusion matrices, training times, and metrics like accuracy, TPR, and FPR, it sets a clear template for how I can structure and present my own results in a way that will be accepted as thorough and rigorous.	IEEE Access	IEEE	⭐⭐	Click Me	2017	1228
AutoLog: Anomaly detection by deep autoencoding of system logs	AutoLog proposes a semi-supervised deep autoencoder model that uses entropy-based scoring on log chunks from heterogeneous systems. Scores from normal operations train the model; deviations are detected via reconstruction error. It does not rely on log structure or templates and works across distributed systems. Evaluated on industrial, microservices, BG/L supercomputer, and Hadoop logs, AutoLog achieved recall between 0.96 and 0.99 and precision between 0.93 and 0.98, outperforming isolation forest, one-class SVM, decision trees, and variational autoencoders.	This work is relevant due to its template-independent design and ability to handle heterogeneous, unstructured log data without prior feature engineering. The entropy-based scoring combined with deep autoencoding offers a robust method for anomaly detection in noisy, multi-source environments. This aligns with my approach for adaptive anomaly detection in dynamic, complex honeypot setups, where structured features are not always available.	Expert Systems with Applications	Elsevier	⭐⭐⭐	Click Me	2021
An anomaly detection method to detect web attacks using Stacked Auto-Encoder	Proposes an anomaly detection method for web attacks using a stacked autoencoder (SAE) for feature extraction and isolation forest as a one-class classifier. Uses character-level n-gram models for feature construction (mostly unigram and bigram), which suffers from high dimensionality. SAE architecture consists of layers with 1000, 400, and 100 hidden neurons, quadratic loss function, and experiments with different optimizers (Adam, RMSProp, etc.). Results show improvement over simple n-gram models, but the paper is written in poor English, uses outdated hardware, and lacks depth.	Only marginally relevant. The approach (SAE + isolation forest) is interesting but not innovative. Mostly helpful for fine-tuning my research gap and showing how not to design evaluation setups. Demonstrates the difference between short conference papers and more thorough research.	2018 6th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS)	IEEE	⭐	Click Me	2018	73
Anomaly Detection for HTTP Using Convolutional Autoencoders	Proposes a novel anomaly detection approach for HTTP traffic using a convolutional autoencoder (CAE) combined with character-level binary image transformation. Instead of manual feature engineering, HTTP messages are converted into binary "images" representing characters, which are then processed by a modified Inception-ResNet-v2 CAE. Detection is based on reconstruction errors (BCE) and a novel decision metric, binary cross varentropy (BCV), which captures variance properties. The model is trained unsupervised on normal HTTP traffic and achieves superior performance compared to one-class SVMs, Isolation Forest, and a shallower CREPE-based CAE. The paper also evaluates character embedding but finds minimal performance gains. Trained on ~129,000 HTTP messages and evaluated on ~26,000 messages. Demonstrates strong results with low false-positive rates (~4% at TPR 0.99) and high MCC/F1.	Very interesting due to the use of CAEs with character-level transformation, showing that feature engineering can be avoided. The use of BCV as a decision metric is novel and may offer inspiration for alternative scoring metrics in my honeypot setup. However, the approach is tailored to structured HTTP data and image-like transformations; it’s unlikely that this technique would generalize to packet-level byte streams or more complex, mixed traffic. Additionally, their infrastructure relies on large GPU resources (4× Tesla P100), which might not scale well for real-time inference without significant optimization. Overall, the paper helps to sharpen my research gap: my focus is on packet-level raw byte processing, multi-modal attacker behavior, and resource-efficient, real-time detection rather than protocol-specific image encodings.	IEEE Access	IEEE	⭐⭐⭐⭐	Click Me	2018	33
A Deep Learning Approach to Network Intrusion Detection	This paper presents a novel intrusion detection approach using multi-layer non-symmetric deep auto-encoders (NDAEs) for unsupervised feature learning, followed by a shallow learning classifier based on Random Forest. The architecture consists of multiple stacked NDAEs for deep feature extraction, after which the learned representations are classified using a Random Forest model. The system is implemented in GPU-enabled TensorFlow and evaluated on the KDD Cup '99 and NSL-KDD datasets. Results demonstrate significant improvements in detection accuracy and false positive rates compared to traditional machine learning approaches, showcasing its potential for application in modern network intrusion detection systems.	This paper is highly relevant, as it presents a deep feature extraction approach combined with a lightweight classifier (Random Forest). The concept of using unsupervised representation learning and passing the results to a separate shallow classifier aligns well with my idea of balancing deep learning and efficiency. The methodology could serve as a foundation for combining unsupervised feature learning with scalable classifiers in modern honeypot-based anomaly detection. The use of the KDD datasets and the references to other papers that used them could be helpful for my evaluation. The overall structure could be helpful for my paper as well.	IEEE Transactions on Emerging Topics in Computational Intelligence	IEEE	⭐⭐⭐⭐	Click Me	2017	1039
An LSTM-Based Deep Learning Approach for Classifying Malicious Traffic at the Packet Level	The paper presents a packet-level intrusion detection model using word embeddings and a three-layer LSTM architecture. Each packet is parsed into a fixed 54-byte representation, converted into "sentences" of header fields, and embedded for input into the LSTM. The model is trained on large datasets (ISCX2012, USTC-TFC2016, Mirai-RGU, and self-collected Mirai-CCU), demonstrating extremely high accuracy (near 100%) on both training and validation sets. The focus on real-time detection without flow aggregation is innovative. However, the model's practicality is questionable due to heavy GPU requirements (Tesla K80) and large model complexity (>4 million parameters). Evaluation lacks discussion on handling encrypted payloads or highly variable traffic, and no adversarial robustness evaluation is provided.	Relevant as it shows packet-level detection without requiring flow reconstruction, using LSTMs on raw headers. However, the paper mainly focuses on syntactic structure and known datasets. Their embedding strategy is interesting but heavily handcrafted and limited to static header structures. It does not generalize well to dynamic payload-based anomalies or encrypted/obfuscated traffic. This reinforces my research gap: going beyond static header field embeddings towards dynamic, payload-level learning and multi-modal behavior modeling.	Applied Sciences	MDPI	⭐⭐⭐	Click Me	2020
A Machine Learning Approach to Classify Network Traffic	The paper focuses on classifying benign vs. darknet traffic using the CIC-Darknet 2020 dataset. The authors apply preprocessing (PCA for dimensionality reduction), balance the dataset using SMOTE, and compare various classical machine learning algorithms (e.g., Random Forest, Decision Tree, Extra Trees, AdaBoost). The best results are achieved by Decision Tree and Extra Trees classifiers, with near 100% accuracy and MCC. The paper lacks any deep learning approaches, does not evaluate real-time performance, and relies on feature-based manual preprocessing rather than automated feature learning.	Only slightly relevant. It provides a good overview of classical machine learning baselines for traffic classification, which can serve as a comparison point for deep learning-based approaches. However, the approach depends heavily on manual feature extraction (via PCA and dataset features) and lacks discussion of real-time applicability, adversarial robustness, and packet-level raw data processing. The work confirms that classical methods are useful but also highlights their limitations for dynamic or unknown attack patterns.	13th International Conference on ELECTRONICS, COMPUTERS and ARTIFICIAL INTELLIGENCE – ECAI-2021	International Conference on ELECTRONICS, COMPUTERS and ARTIFICIAL INTELLIGENCE – ECAI	⭐	Click Me	2021	7
Raw Packet Data Ingestion with Transformers for Malicious Activity Classifications	This paper proposes using the ByT5 transformer — a token-free, byte-level NLP model — for direct classification of raw packet data as malicious or benign without manual feature engineering or tokenization. The model is fine-tuned on the ISOT dataset and evaluated across multiple days with different malicious traffic compositions. Achieves a maximum recall of 0.834 and F1 score of 0.693. Shows that short truncated packets (100 bytes) yield better results than larger truncations. The paper emphasizes the advantage of direct byte ingestion but also notes large resource requirements (300M parameter model, Tesla V100 GPUs), long training times (37 hours per epoch), and convergence issues (vanishing gradients).	Highly relevant, as this paper is among the first to apply large transformer models directly on raw packet data — very close to my goal. It demonstrates that direct byte-level learning is feasible but computationally heavy. The results are not yet production-ready (F1 = 0.693), and real-time deployment issues are not addressed. This reinforces my gap: developing more lightweight architectures or hybrid models that can achieve comparable detection results without requiring massive infrastructure. The discussion around training complexities, truncation, and input sequence lengths is very helpful.	2023 International Conference on Machine Learning and Applications (ICMLA)	International Conference on Machine Learning and Applications (ICMLA)	⭐⭐⭐⭐	Click Me	2023	0
Hybrid System Between Anomaly Based Detection System and Honeypot to Detect Zero Day Attack	The paper discusses the limitations and strengths of anomaly-based detection systems and honeypots and proposes a hybrid model to detect zero-day attacks more effectively. The proposed system uses honeypots to lure attackers and gather behavior data, while anomaly-based systems detect deviations from learned normal behavior. They suggest feeding honeypot observations back into the anomaly detection system to improve accuracy and reduce false positives. However, the paper remains conceptual without presenting concrete experimental validation or implementation details.	Conceptually interesting, as it aligns with the idea of dynamic, feedback-based anomaly detection. But the paper is weak in terms of experimental contribution and lacks any implementation or performance evaluation. It can serve as theoretical support for integrating honeypot data into anomaly detection models, but adds little technical depth. It reinforces my focus on actually implementing and validating such hybrid systems with deep learning components.	2018 21st Saudi Computer Society National Computer Conference (NCC)	IEEE	⭐	Click Me	2018	11
A Near Real-Time Algorithm for Autonomous Identification and Characterization of Honeypot Attacks	The paper presents UNADA, an unsupervised anomaly detection and characterization algorithm designed for honeypot traffic. It uses sub-space clustering, evidence accumulation, and inter-cluster correlation to identify and characterize attacks from unlabeled honeypot traffic in near real-time. Signatures are automatically generated and can be used to configure firewalls or routers autonomously. The algorithm is evaluated on real-world data from the University of Maryland, showing high detection accuracy and efficient parallelization for scalability. The work addresses both classification and risk-based prioritization of detected anomalies.	Highly relevant. It demonstrates an unsupervised clustering-based approach for automatically identifying and characterizing honeypot-based attacks. The combination of clustering ensemble methods and signature generation is interesting and confirms the value of using honeypot data in combination with unsupervised ML techniques. However, the paper focuses on flow-level NetFlow data and classical clustering techniques rather than deep learning. This highlights a gap that my research addresses: leveraging packet-level raw data and neural architectures for autonomous anomaly detection, rather than relying on flow aggregation and handcrafted features.	ASIA CCS '15: Proceedings of the 10th ACM Symposium on Information, Computer and Communications Security	ACM	⭐⭐⭐⭐⭐	Click Me	2015
A Comparative Study of Unsupervised Anomaly Detection Techniques Using Honeypot Data	The paper presents an extensive comparative analysis of eight unsupervised anomaly detection techniques (three outlier detection methods, four clustering approaches, and one-class SVM) using real-world honeypot traffic data from Kyoto University. The study evaluates detection accuracy, false positive rates, robustness to noise, training data size impact, detection of unknown attacks, and time complexity. Key findings: clustering-based methods outperform outlier detection for IDS purposes; DBSCAN and LOF are computationally heavy; Chebyshev and Euclidean distances are the most suitable similarity metrics; and one-class SVM performs in between clustering and outlier detection methods. While the methodology is solid and the use of real honeypot data was ahead of its time, the paper is now over 15 years old and focused entirely on classical machine learning methods without any consideration of deep learning, modern high-throughput environments, or encrypted traffic. This strongly emphasizes the need for updated approaches that leverage deep neural models and raw packet data to handle current attack vectors.	Still relevant for understanding the evolution of anomaly detection and as a reference point for older methods. However, its age and focus on classical methods make it insufficient for addressing modern, dynamic attack landscapes. It reinforces my research gap: integrating deep learning-based packet-level detection in honeypot scenarios, moving beyond handcrafted features and clustering methods.	IEICE Transactions on Information and Systems VOL.E93–D, NO.9 SEPTEMBER 2010	IEICE	⭐⭐⭐⭐⭐	Click Me	2010	2
FedNIDS: A Federated Learning Framework for Packet-Based Network Intrusion Detection System	FedNIDS proposes a two-stage federated learning framework combining decentralized training on packet-based data (using a DNN) and subsequent fine-tuning for novel attack detection. It addresses challenges with non-IID data distributions and adapts rapidly (in ~4 rounds) to zero-day attacks. The paper demonstrates high accuracy (F1 = 0.97) on CIC-IDS2017/2018 datasets and robust defense against adversarial attacks after fine-tuning. It uses raw packet features (normalized byte values) instead of handcrafted flow features. While impressive, it still requires substantial infrastructure and focuses on supervised DNN classification with federated aggregation, rather than unsupervised or self-supervised methods for anomaly detection.	Highly relevant. It shows current progress in federated NIDS on packet-level data, emphasizing scalability and adaptability. However, the focus is on federated supervised classification, not unsupervised detection or lightweight models. My research gap lies in creating real-time-capable, unsupervised, or self-supervised transformer/autoencoder hybrids for honeypot environments, where labeled attack data is not guaranteed, and lightweight inference is critical. FedNIDS also does not address encrypted payload handling or multi-modal attacker behavior detection.	Digital Threats: Research and Practice, Vol. 6, No. 1, Article 4 (February 2025)	ACM	⭐⭐⭐⭐	Click Me	2025
Improving Adaptive Honeypot Functionality with Efficient Reinforcement Learning Parameters for Automated Malware	Proposes parameter tuning for RL agents in adaptive honeypots. Explores discount factor and learning rate settings in Q-learning setups to improve response behavior against malware. Evaluates against simulated attacks.	Minor technical relevance but helpful for parameter optimization. Offers hints on tuning Q-table based agents for better learning speed and stability.	Journal of Cyber Security Technology	Taylor & Francis	⭐⭐	Click Me	2018	—
Using Reinforcement Learning to Conceal Honeypot Functionality	Uses Q-learning to tune honeypot responses and delay detection by attackers. Focuses on balancing stealth (concealment) with engagement. Evaluates the timing and response manipulation to reduce detectability.	Relevant for RL-based deception tactics. Complements my goal of increasing attacker trust in the honeypot through learned response realism.	AIxIA 2018, Italian Conference on Artificial Intelligence	Springer	⭐⭐⭐	Click Me	2018	—
New Framework for Adaptive and Agile Honeypots	Proposes HARM: honeypots using reinforcement learning to handle repetitive malware like worms. Introduces agile policy updates, automated redeployment, and captures attacker interaction via Q-learning/SARSA. Evaluates state-action space for malware behavior.	Supports dynamic policy learning and system redeployment, relevant for container-based honeypots. Shows benefits of agility and adaptability in active threat environments.	ETRI Journal	Wiley	⭐⭐⭐⭐	Click Me	2020	—
A Comparison of an Adaptive Self-Guarded Honeypot with Conventional Honeypots	Compares Asgard and Midgard (adaptive SSH honeypots with Q-learning) to Cowrie and real Linux. Focuses on trade-off between attacker engagement and containment. Asgard uses state-action dependent rewards for fine-grained policy learning.	Directly aligns with my goals: adaptive interaction, attacker deception, and safe containment. Highlights design strategies for maximizing data collection while preventing full system compromise.	Applied Sciences	MDPI	⭐⭐⭐⭐	Click Me	2022	—
Evaluation of Reinforcement Learning Algorithm on SSH Honeypot	The paper evaluates the impact of reinforcement learning on SSH honeypots using Cowrie. It aims to increase the duration and depth of attacker interaction by learning behavioral sequences before certain commands (e.g. download). It explores how RL can guide interaction policies and discusses the reward function design for deeper attacker engagement.	Helpful as a basic empirical test of RL in honeypots, especially focused on SSH. Offers insight into sequence-based reward strategies and attacker behavior profiling. Can inform RL reward tuning in my framework.	2022 6th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE)	IEEE	⭐⭐⭐	Click Me	2022	—
Asguard: Adaptive Self-guarded Honeypot	opinion	journal	rating	annotated link	year
Deep Reinforcement Learning for Building Honeypots Against Runtime DoS Attack	Introduces DARLH, a system combining deep reinforcement learning with IDS agents to respond to DoS attacks. Uses Deep RNNs and event tracking on datasets like UNSW-NB20 and Bot-IoT. Evaluates DARLH against prior methods like Naïve Bayes Honeypot and Blockchain-based systems, showing 5–10% performance gains.	Valuable for the use of DRL in real-time detection with structured datasets. Though more focused on DoS, it informs model architectures and agent behaviors in honeypot-based defense. Could inspire hybrid detection components in adaptive honeypots.	Int. J. of Intelligent Systems	Wiley	⭐⭐⭐⭐	Click Me	2021	—
RASSH – Reinforced Adaptive SSH Honeypot	Proposes RASSH, a medium-interaction SSH honeypot using SARSA and Markov models to adapt based on attacker commands. Implemented in Python using PyBrain and Kippo as base. The system learns to keep attackers engaged while detecting typical behaviors.	Foundational for RL-based honeypots. Demonstrates early, working application of SARSA to attacker behavior. Can guide state-action modeling and inspire comparisons with modern RL techniques.	2014 International Conference on Communications (COMM)	IEEE	⭐⭐⭐⭐	Click Me	2014	—
Adaptive and Self-Configurable Honeypots	One of the first high-interaction honeypots using reinforcement learning. Based on UML and a probabilistic automaton of attacker behavior. Adapts program responses (allow/block/modify) based on attack state. Focuses on self-management and attacker deception.	Seminal work in adaptive honeypots. Introduces attacker-driven feedback loop via RL and explores adversarial learning. Supports the case for dynamic honeypot control policies.	12th IFIP/IEEE International Symposium on Integrated Network Management	IEEE	⭐⭐⭐⭐⭐	Click Me	2011	—
HoneyIoT: Adaptive High-Interaction Honeypot for IoT Devices Through Reinforcement Learning	HoneyIoT uses MDP modeling and reinforcement learning to mimic IoT devices and evade detection. Learns optimal attacker engagement via attack trace replay and differential response mutation. Remains covert against honeypot detection tools and was deployed publicly.	Very relevant due to adaptive interaction design. Shows strong RL use in realistic honeypot deployment. Supports my goal of making attack sessions longer and more informative.	WiSec '23: ACM Conference on Security and Privacy in Wireless and Mobile Networks	ACM	⭐⭐⭐⭐⭐	Click Me	2023	—
Adaptive Honeypot Engagement through Reinforcement Learning of Semi-Markov Decision Processes	opinion	journal	rating	annotated link	year
Reinforcement Learning-assisted Threshold Optimization for Dynamic Honeypot Adaptation to Enhance IoBT Networks Security	opinion	journal	rating	annotated link	year
QRASSH - A Self-Adaptive SSH Honeypot Driven by Q-Learning	opinion	journal	rating	annotated link	year
On the Rewards of Self-Adaptive IoT Honeypots	Describes IRASSH-T, a self-adaptive honeypot using Inverse Reinforcement Learning (IRL) to model attacker behavior and derive optimal reward functions. Applied to SSH and Telnet-based IoT honeypots. Focuses on learning behavior patterns like Mirai botnet actions.	Important for reward design and IRL perspective. Shows how real attacker behavior can be modeled and used to train RL agents effectively, even without explicit labels.	Annals of Telecommunications	Springer	⭐⭐⭐⭐	Click Me	2019	—
Generative AI SSH Honeypot With Reinforcement Learning	opinion	14th IEEE International Conference on Communication Systems and Network Technologies 2025	rating	annotated link	2025
Playing Atari with Deep Reinforcement Learning	DeepMind's first DQN paper	journal	rating	annotated link	2013
Q-learning	First Q-learning paper	Springer Nature, Machine Learning	rating	annotated link	1992