Optimization of Real-Time Big Data Stream Processing Systems Based on Deep Reinforcement Learning

Mariusz Ostrowski; Jarosław Bąk; Seweryn Sokołowski

doi:10.64972/dea.2025.v4i1.1939d:113-126

Authors

Mariusz Ostrowski, Faculty of Information Technology, University of Warmia and Mazury, Olsztyn, 10-719, Poland
Jarosław Bąk, Faculty of Computer Science and Information Technology, West Pomeranian University of Technology, Szczecin, 70-310, Poland
Seweryn Sokołowski, Faculty of Computer Science and Information Technology, West Pomeranian University of Technology, Szczecin, 70-310, Poland

DOI:

https://doi.org/10.64972/dea.2025.v4i1.1939d:113-126

Keywords:

Real-Time Stream Processing, Resource Optimization, Distributed Systems, Fault Tolerance, Cluster Scheduling

Abstract

High-speed data sources such as the Industrial Internet of Things, financial services, and online platforms have proliferated, and all of them require real-time processing and analysis. To address the latency, scalability, and adaptability issues of traditional big data architectures under fluctuating workloads, this paper proposes a dynamic optimization framework based on Deep Reinforcement Learning (DRL). A distributed stream processing core and a scalable DRL agent coordinate resource allocation and task scheduling in a closed-loop feedback system. Experiments were conducted on Apache Flink on a small cluster of ten nodes, using the Yahoo! streaming suite and the New York taxi dataset. The DRL-based framework reaches a maximum throughput of 108.2 thousand events per second, which is 18.7% higher than the rule-based and model-driven baseline. It achieves a 97.1% SLA compliance rate, a median event latency of 42 milliseconds, and a peak CPU utilization of approximately 54%, and it recovers from node failures within 28.5 seconds. Ablation and sensitivity analyses were also conducted to examine the impact of reward function design on the selection of key system parameters. The results indicate that DRL can be used for intelligent control in distributed stream processing environments. By avoiding handcrafted methods and fixed rules, DRL can improve the system's real-time responsiveness and resource utilization efficiency.
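
The abstract does not give the reward function, so the following is only an illustrative sketch of how a scalar reward for such a closed-loop controller might combine the reported metrics (throughput, latency against an SLA bound, and CPU utilization). The names `Metrics` and `reward`, the target and SLA values, and all weights are hypothetical assumptions for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """Observations from one monitoring window (hypothetical structure)."""
    throughput_eps: float   # events per second processed in the window
    latency_ms: float       # median event latency in the window
    cpu_utilization: float  # fraction of cluster CPU in use, 0.0 to 1.0


def reward(m: Metrics,
           target_eps: float = 100_000.0,  # assumed throughput target
           sla_latency_ms: float = 100.0,  # assumed SLA latency bound
           w_throughput: float = 1.0,      # assumed weights
           w_latency: float = 1.0,
           w_cpu: float = 0.5) -> float:
    """Favor throughput; penalize SLA violations and CPU consumption."""
    # Throughput term is capped at 1.0 once the target is met.
    throughput_term = min(m.throughput_eps / target_eps, 1.0)
    # Latency penalty is zero within the SLA and grows linearly beyond it.
    latency_penalty = max(m.latency_ms - sla_latency_ms, 0.0) / sla_latency_ms
    # CPU penalty is proportional to utilization.
    cpu_penalty = m.cpu_utilization
    return (w_throughput * throughput_term
            - w_latency * latency_penalty
            - w_cpu * cpu_penalty)


if __name__ == "__main__":
    healthy = Metrics(throughput_eps=100_000.0, latency_ms=50.0, cpu_utilization=0.5)
    overloaded = Metrics(throughput_eps=50_000.0, latency_ms=200.0, cpu_utilization=0.5)
    print(reward(healthy), reward(overloaded))
```

Under these assumed weights, a window that meets the throughput target within the latency bound scores higher than one that falls behind and violates the bound. Changing the weights shifts that trade-off, which is the kind of design choice an ablation on the reward function would probe.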