Coordinated Optimization of Multi-Robot Automated Assembly Lines Based on Deep Q-Learning

Julia Duda; Tomasz Wiśniewski

doi:10.64972/dea.2026.v5i2.1734d:44-57

Authors

Julia Duda, Faculty of Mechanical Engineering, University of Applied Sciences in Nysa, Nysa 48-300, Poland
Tomasz Wiśniewski, Faculty of Mechanical Engineering, University of Applied Sciences in Konin, Konin 62-500, Poland

DOI:

https://doi.org/10.64972/dea.2026.v5i2.1734d:44-57

Keywords:

Deep Reinforcement Learning, Multi-Robot Scheduling, Real-Time Control, Resource Allocation, Robust Optimization

Abstract

Large-scale multi-robot automated assembly lines commonly struggle with real-time coordination and intelligent resource scheduling. This paper proposes a dynamic, scalable manufacturing framework based on deep Q-learning. The method combines deep reinforcement learning with real-time perception, distributed robot control, and fast peer-to-peer communication. An event-driven assembly line simulator was built to simulate multiple robot teams, each with a different workload. The experiments show that deep Q-learning scheduling outperforms traditional optimization and heuristic methods: across different team sizes and task allocations, throughput increased by up to 18.4%, and the median robot idle ratio decreased by 32%. In robustness tests with robot failures and task interruptions, the system's success rate exceeded 85%, with only a slight decrease in performance. Sensitivity analysis indicates that performance remains relatively stable when the reward strategy and learning rate are varied, and ablation experiments show that the multi-head attention architecture is the most sample-efficient for industrial scheduling. These results suggest that learning-driven adaptive control strategies can meet the demands of large-scale, flexible production and can improve the production efficiency and operational stability of real-world industrial automation.
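
To make the scheduling idea concrete, the following is a minimal sketch of the Q-learning update that deep Q-learning builds on. It is not the authors' framework or simulator: it assumes a toy two-robot line, uses a lookup table in place of a deep network, and every name and constant in it (step, q_update, train, greedy, the 2-unit task size, the idle-robot penalty) is a hypothetical choice made for illustration only.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
N_ROBOTS = 2          # toy line with two robots
MAX_LOAD = 4          # queue length cap per robot
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1


def step(state, action):
    """Assign one 2-unit task to robot `action`, then every robot works 1 unit.

    The state is a tuple of remaining work per robot. The reward is minus the
    number of robots left with no work, a simple stand-in for an idle ratio.
    """
    loads = list(state)
    loads[action] = min(loads[action] + 2, MAX_LOAD)
    reward = -sum(1 for load in loads if load == 0)
    next_state = tuple(max(load - 1, 0) for load in loads)
    return next_state, reward


def q_update(q, state, action, reward, next_state):
    """One temporal-difference update: Q <- Q + alpha * (r + gamma * max Q' - Q)."""
    best_next = max(q.get((next_state, a), 0.0) for a in range(N_ROBOTS))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)


def greedy(q, state):
    """Return the action with the highest estimated value in `state`."""
    return max(range(N_ROBOTS), key=lambda a: q.get((state, a), 0.0))


def train(episodes=500, horizon=20, seed=0):
    """Epsilon-greedy Q-learning on the toy environment; returns the Q-table."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = (0,) * N_ROBOTS
        for _ in range(horizon):
            if rng.random() < EPSILON:
                action = rng.randrange(N_ROBOTS)   # explore
            else:
                action = greedy(q, state)          # exploit
            next_state, reward = step(state, action)
            q_update(q, state, action, reward, next_state)
            state = next_state
    return q
```

In this toy setting the learned greedy policy sends each new task to the robot that is about to run out of work, which keeps both robots busy. A deep Q-learning scheduler of the kind the abstract describes would replace the table with a neural network so that it can generalize across many more robots, tasks, and failure states.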