Biomedical Article Abstract Generation Based on the BART Seq2Seq Model

Authors

  • Bence Kovács, Faculty of Engineering and Information Technology, University of Pécs, 7624 Pécs, Hungary
  • Zsófia Szabó, Faculty of Informatics, Óbuda University, 1034 Budapest, Hungary
  • Dávid Tóth, Faculty of Engineering and Information Technology, University of Pécs, 7624 Pécs, Hungary

DOI:

https://doi.org/10.64972/jiic.2026v4.142p6s:67-79

Keywords:

Computer-Aided Summarization, Sequence-to-Sequence Learning, Biomedical NLP, Domain-Specific Pretraining, Data Augmentation, Automatic Abstract Generation

Abstract

To meet the needs of biomedical text mining and summarization, this paper presents an abstract-generation approach built on the BART sequence-to-sequence (Seq2Seq) architecture. The study systematically examines three challenges in biomedical literature: domain vocabulary, factual inconsistencies, and heterogeneous document structures. To address them, it combines large-scale domain-specific pre-training, targeted fine-tuning, and robust data augmentation. Experiments use a dataset of 250,000 biomedical document-summary pairs, divided into training, validation, and test sets. Generated abstracts were assessed with automatic metrics (ROUGE-1, ROUGE-2, BLEU, and BERTScore) and by human experts. The model reached a ROUGE-1 score of 46.1 and a BLEU score of 22.5, and the average human-rated consistency score was 4.3 out of 5. Ablation analysis shows that each component (pre-training strategy, data augmentation, and architecture optimization, among others) contributes to performance, and that together they reduce redundancy and hallucination errors by more than 50%. In practice, the system can reduce manual sorting time by more than 35%, and it performs well across varied biomedical data and text lengths. The model can independently generate scientific narratives with high reliability and accuracy, supporting advanced management and research in the biomedical field.
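
For readers unfamiliar with the ROUGE-1 metric named in the abstract, the following is a minimal sketch of how unigram overlap between a generated and a reference abstract can be scored. It is an illustration only, not the evaluation code used in the paper: the function name `rouge1`, the lowercase whitespace tokenization, and the example sentences are assumptions, and published ROUGE scores (often reported on a 0 to 100 scale) typically come from a standard toolkit that may apply stemming and other preprocessing.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Illustrative ROUGE-1: clipped unigram overlap between two texts.

    Assumption: tokens are lowercase whitespace-separated words; real
    evaluation toolkits may tokenize and stem differently.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token, so a word
    # repeated in the candidate is not credited more often than it
    # appears in the reference.
    overlap = sum((cand_counts & ref_counts).values())

    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the paper's dataset.
    scores = rouge1(
        "the drug reduced tumor growth",
        "the drug reduced tumor size in mice",
    )
    print(scores)
```

In this hypothetical pair, four of the five candidate words also occur in the seven-word reference, so precision is 4/5 and recall is 4/7. ROUGE-2 follows the same pattern with word pairs in place of single words.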

Published

2026-01-15

How to Cite

Kovács, B., Szabó, Z., & Tóth, D. (2026). Biomedical Article Abstract Generation Based on the BART Seq2Seq Model. Journal of Intelligent Information and Communication, 4, 6s:67–79. https://doi.org/10.64972/jiic.2026v4.142p6s:67-79

Section

Articles