Application of Parallelized Graph Neural Networks in Predicting Molecular Properties from Large-Scale Chemical Databases

Authors

  • Piotr Tomaszewski, Faculty of Electrical and Computer Engineering, Wrocław University of Science and Technology, Wrocław 50-370, Poland
  • Adam Wojcik, Faculty of Electrical Engineering, Automatics, Computer Science and Biomedical Engineering, University of Silesia in Katowice, Katowice 40-007, Poland

DOI:

https://doi.org/10.64972/dea.2025.v3i1.68

Keywords:

Parallel Computing, Graph Neural Networks, Molecular Property Prediction, Chemical Databases

Abstract

We propose a parallel graph neural network (GNN) architecture to address the high computational cost and poor scalability of molecular property prediction on large-scale chemical databases. The approach combines data-level and model-level parallelism with distributed training and memory optimization to learn from graph-structured molecular data. To make GNN training efficient on heterogeneous datasets across multi-node GPU clusters, we optimize graph batching, adaptive sampling, workload balancing, and communication-efficient synchronization. In experiments on standard chemical benchmarks, the parallelized GNN achieves a substantial speedup and lower peak memory usage than the baseline method while maintaining high prediction accuracy. A detailed analysis examines how improved partitioning and sampling techniques affect communication overhead and resource requirements. These results suggest that distributed and parallel GNN methods are well suited to large-scale molecular property prediction in computational chemistry and materials science.
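
As a rough illustration of two of the ingredients named in the abstract, graph batching and workload balancing, the following minimal Python sketch may help. It is not the authors' implementation: the function names `batch_graphs` and `balance_by_size`, the `(num_atoms, bonds)` tuple representation, and the toy molecules are all assumptions made for this example. Batching is shown as the common trick of merging several small graphs into one disjoint graph by offsetting node indices, and balancing as a simple greedy assignment of graphs to workers by atom count.

```python
from typing import List, Tuple

# Hypothetical representation: (number of atoms, list of bonds as index pairs).
Graph = Tuple[int, List[Tuple[int, int]]]


def batch_graphs(graphs: List[Graph]):
    """Merge small graphs into one disjoint graph by offsetting node indices."""
    edges: List[Tuple[int, int]] = []
    graph_id: List[int] = []  # graph_id[v] = index of the molecule that owns node v
    offset = 0
    for gid, (num_atoms, bonds) in enumerate(graphs):
        edges.extend((i + offset, j + offset) for i, j in bonds)
        graph_id.extend([gid] * num_atoms)
        offset += num_atoms
    return offset, edges, graph_id


def balance_by_size(graphs: List[Graph], num_workers: int):
    """Greedy balancing: the largest remaining graph goes to the least-loaded worker."""
    loads = [0] * num_workers
    shards: List[List[int]] = [[] for _ in range(num_workers)]
    order = sorted(range(len(graphs)), key=lambda k: graphs[k][0], reverse=True)
    for k in order:
        w = loads.index(min(loads))  # first worker with the smallest load
        shards[w].append(k)
        loads[w] += graphs[k][0]
    return shards, loads


# Toy molecules (assumed data, chosen only to exercise the functions).
molecules: List[Graph] = [
    (3, [(0, 1), (1, 2)]),
    (2, [(0, 1)]),
    (4, [(0, 1), (1, 2), (2, 3)]),
    (1, []),
]

total_nodes, edges, graph_id = batch_graphs(molecules)
shards, loads = balance_by_size(molecules, num_workers=2)
print(total_nodes, edges)
print(shards, loads)
```

Using atom count as the load estimate is a simplification; a real system might weight by bond count or measured step time instead.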

Published

2025-01-13

How to Cite

Tomaszewski, P., & Wojcik, A. (2025). Application of Parallelized Graph Neural Networks in Predicting Molecular Properties from Large-Scale Chemical Databases. Data Engineering and Applications, 3(1), 54–67. https://doi.org/10.64972/dea.2025.v3i1.68

Section

Articles