Creation of an automatic summary tool for Arabic texts
Publisher
University of Tlemcen
Abstract
This thesis presents an Arabic text summarization system built by fine-tuning
a T5-base transformer model on the AGS-Corpus, a curated Arabic
summarization dataset. The model effectively handles the language’s rich
morphology, diverse dialects, and complex syntax. Evaluation using BERTScore
and ROUGE shows that the system generates high-quality summaries closely
aligned with human references in content, fluency, and semantic similarity. To
improve accessibility, the system is deployed as a user-friendly web application
using Streamlit, enabling real-time summarization for non-technical users. Future
work includes extending the dataset to cover more dialects and exploring
alternative architectures such as BART or GPT to further improve performance,
especially on complex or lengthy texts.
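
To make the ROUGE evaluation mentioned in the abstract more concrete, the following is a minimal sketch of how ROUGE-N and ROUGE-L F1 scores can be computed for a candidate summary against a human reference. It is not the thesis's evaluation code: the function names are hypothetical, tokenization is assumed to be plain whitespace splitting (real Arabic evaluations typically add normalization of diacritics and letter variants), and the F1 variant of each score is an assumption.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Combine precision and recall into F1, returning 0.0 when undefined."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1 using clipped n-gram overlap (assumption: whitespace tokens)."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand_tokens, ref_tokens = candidate.split(), reference.split()
    return f1(lcs_length(cand_tokens, ref_tokens), len(cand_tokens), len(ref_tokens))


# Illustrative strings only; they are not taken from the AGS-Corpus.
reference = "القط يجلس على السجادة"
candidate = "القط على السجادة"

print(f"ROUGE-1 F1: {rouge_n(candidate, reference, 1):.3f}")
print(f"ROUGE-2 F1: {rouge_n(candidate, reference, 2):.3f}")
print(f"ROUGE-L F1: {rouge_l(candidate, reference):.3f}")
```

ROUGE only rewards exact token overlap, which penalizes valid paraphrases and inflected forms that are common in Arabic. That is one reason to pair it with an embedding-based metric such as BERTScore, as the abstract describes.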