Biomedical Article Abstract Generation Based on the BART Seq2Seq Model

Bence Kovács; Zsófia Szabó; Dávid Tóth

doi:10.64972/jiic.2026v4.142p6s:67-79

Authors

Bence Kovács Faculty of Engineering and Information Technology, University of Pécs, 7624 Pécs, Hungary
Zsófia Szabó Faculty of Informatics, Óbuda University, 1034 Budapest, Hungary
Dávid Tóth Faculty of Engineering and Information Technology, University of Pécs, 7624 Pécs, Hungary

DOI:

https://doi.org/10.64972/jiic.2026v4.142p6s:67-79

Keywords:

Computer-Aided Summarization, Sequence-to-Sequence Learning, Biomedical NLP, Domain-Specific Pretraining, Data Augmentation, Automatic Abstract Generation

Abstract

To meet the needs of biomedical text mining and information summarization, this paper introduces a structured solution based on the BART sequence-to-sequence (Seq2Seq) neural architecture. The study systematically examines three obstacles in biomedical literature: specialized vocabulary, factual inconsistencies, and heterogeneous document structures. Three approaches are used to address them: large-scale domain-specific pre-training, targeted model fine-tuning, and robust data augmentation. The experiments use a dataset of 250,000 biomedical document-summary pairs, divided into training, validation, and test sets. Generated abstracts were evaluated with automatic metrics (ROUGE-1, ROUGE-2, BLEU, and BERTScore) and by human experts. ROUGE-1 reached 46.1 and BLEU reached 22.5; the average human consistency score was 4.3 out of 5. Ablation analysis shows that every component of the model (including pre-training strategy, data augmentation, and architecture optimization) contributes to performance, and that together they reduce redundancy and hallucination errors by more than 50%. In practice, the system can reduce manual sorting time by over 35%. It performs well across a variety of biomedical data and text lengths. The model can independently generate scientific narratives with high reliability and accuracy, supporting advanced management and research in the biomedical field.
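
For readers unfamiliar with the ROUGE-1 metric reported above, the following is a minimal sketch of how unigram overlap between a reference abstract and a generated one can be scored. It is an illustration only, not the authors' evaluation code: the function name `rouge1`, the lowercase whitespace tokenization, and the example sentences are assumptions, and published toolkits typically add stemming and other normalization. Reported values such as 46.1 are presumably this kind of score multiplied by 100.

```python
from collections import Counter


def rouge1(reference: str, candidate: str) -> dict:
    """Unigram-overlap precision, recall, and F1 (hypothetical helper).

    Assumption: tokens are lowercase whitespace-separated words; real
    ROUGE implementations usually apply further normalization.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each word counts at most as often as it
    # appears in both texts (multiset intersection).
    overlap = sum((ref_counts & cand_counts).values())

    ref_total = sum(ref_counts.values())
    cand_total = sum(cand_counts.values())

    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up example sentences, not taken from the paper's dataset.
    scores = rouge1(
        "the drug reduced tumor growth in mice",
        "the drug reduced growth",
    )
    print(scores)
```

In this toy example every candidate word also appears in the reference, so precision is perfect while recall is penalized for the three reference words the candidate omits.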
