Secure Routing in Software-Defined Networks via Proximal Policy Optimization-Based Deep Reinforcement Learning

Artur Kaczmarek

Abstract

By separating the control plane from the data plane, SDN enhances network programmability, but this architecture also introduces new security challenges in large-scale, dynamic routing environments. This study proposes a security-adaptive routing framework based on Proximal Policy Optimization (PPO) deep reinforcement learning to address continuously changing network performance and security risks. The proposed method formulates secure routing as a Markov decision process, incorporating security alerts, network topology, and traffic characteristics into the agent's state space. The reward function combines throughput, latency, and a quantified security risk, with regularization and penalty terms added to stabilize the system. Performance is evaluated on an SDN simulation platform under both attack and normal operating conditions in enterprise mesh and data center scenarios. The results show that the PPO-based agent achieves higher throughput, lower latency, and fewer security incidents than static and deep Q-learning baselines. Ablation studies indicate that security-aware features, regularization, and penalty mechanisms are essential components of a robust network controller. The proposed approach also scales and generalizes well as network size and attack diversity grow. These findings suggest that advanced reinforcement learning techniques are a promising foundation for next-generation secure SDN routing.
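To make the reward design described in the abstract concrete, the sketch below shows one plausible way to combine throughput, latency, and a quantified security risk into a single per-step reward with a penalty term for risky links. The weights (`w_t`, `w_l`, `w_s`) and the `alert_penalty` value are hypothetical placeholders, not the paper's actual formulation.

```python
def routing_reward(throughput_mbps, latency_ms, risk_score,
                   alert_on_path=False,
                   w_t=1.0, w_l=0.5, w_s=2.0, alert_penalty=5.0):
    """Illustrative scalar reward for one routing decision.

    throughput_mbps: achieved throughput on the chosen path
    latency_ms:      end-to-end latency of the chosen path
    risk_score:      quantified security risk in [0, 1]
    alert_on_path:   True if an active security alert covers any path link
    """
    # Positive term for performance, negative terms for delay and risk.
    reward = w_t * throughput_mbps - w_l * latency_ms - w_s * risk_score
    if alert_on_path:
        # Penalty term discouraging routes through flagged links,
        # analogous to the penalty mechanism examined in the ablation study.
        reward -= alert_penalty
    return reward
```

In a PPO training loop, this scalar would be returned by the environment after each routing action; the regularization mentioned in the abstract (e.g., an entropy bonus or KL constraint) would live in the PPO loss rather than in this reward.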

Article Details

How to Cite
Kaczmarek, A. (2026). Secure Routing in Software-Defined Networks via Proximal Policy Optimization-Based Deep Reinforcement Learning. Journal of Intelligent Information and Communication, 4, 1–13. https://doi.org/10.64972/jiic.2026v4.110p1-13
Section
Articles