Citation
Hazmi Wahab, Muhammad Hafizul
(2024)
Optimization of extractive Automatic Text Summarization using Decomposition-based Multi-objective Differential Evolution and parallelization.
Doctoral thesis, Universiti Putra Malaysia.
Abstract
A central challenge in Automatic Text Summarization (ATS) is generating text
summaries efficiently through optimization algorithms, a critical component
of systems that process textual information. The current approach is
hampered by long execution times, especially when complex optimization
techniques are combined with a computationally expensive ATS repair operator
that repairs multiple candidate solutions.
While the current approach yields impressive Recall-Oriented Understudy for
Gisting Evaluation (ROUGE) metrics for the generated summary, it struggles
with inefficiencies, mainly attributed to the substantial optimization time
consumed by the ATS repair operator scheme. To address this, a novel
solution called Decomposition-based Multi-objective Differential Evolution
(MODE/D) is proposed. It is built upon the foundation of Differential
Evolution for Multi-objective optimization (DEMO) and the weighted sum
method (WS), coupled with an innovative ATS repair operator scheme.
Through experimentation on Document Understanding Conferences (DUC)
datasets, the novel approach of MODE/D is validated by evaluating the results
using ROUGE metrics. The outcomes are twofold: a marked reduction in
serial execution time and an improvement over existing techniques
in the literature, as evidenced by higher ROUGE-1, ROUGE-2, and
ROUGE-L scores.
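As a rough illustration of the weighted sum (WS) decomposition idea mentioned above, the following Python sketch scalarizes a two-objective score with evenly spaced weight vectors, one per subproblem. The function names, the choice of two objectives, and the maximization convention are assumptions made for illustration; they are not taken from the thesis, and the differential evolution and repair steps are omitted.

```python
from typing import List, Sequence


def weight_vectors(n: int) -> List[List[float]]:
    """Return n evenly spaced 2-D weight vectors whose components sum to 1."""
    if n < 2:
        raise ValueError("need at least two subproblems")
    return [[i / (n - 1), 1.0 - i / (n - 1)] for i in range(n)]


def weighted_sum(objectives: Sequence[float], weights: Sequence[float]) -> float:
    """Scalarize one objective vector with one weight vector."""
    return sum(w * f for w, f in zip(weights, objectives))


def best_per_subproblem(
    candidates: Sequence[Sequence[float]],
    weights_list: Sequence[Sequence[float]],
) -> List[int]:
    """For each weight vector, return the index of the candidate with the
    highest scalarized value (maximization is assumed here)."""
    return [
        max(range(len(candidates)), key=lambda i: weighted_sum(candidates[i], w))
        for w in weights_list
    ]


# Hypothetical objective vectors for three candidate summaries.
candidates = [(0.9, 0.1), (0.6, 0.6), (0.1, 0.9)]
print(best_per_subproblem(candidates, weight_vectors(3)))
```

Each weight vector defines one single-objective subproblem, so different subproblems can favour different candidates along the trade-off between the two objectives.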
The multi-core variant of MODE/D explores an alternative computational
environment; it not only demonstrates stability but also achieves high
efficiency when static loop scheduling is employed. In a multi-core
environment, parallel MODE/D attains a 2x speedup over the serial version
of MODE/D, with efficiency peaking at 86.35% when employing 6 CPU cores.
Additionally, when the input size is tripled, the parallel multi-core MODE/D
achieves a speedup of 7.9 with 98.98% efficiency under static scheduling. This
speedup comes with a slight degradation in ROUGE-2 scores. Nevertheless, the
result underscores the robustness and scalability of the proposed approach,
showing that it can harness the computational power of multiple cores while
maintaining stable summary quality metrics, and yielding 31 words per second
(WPS), a 233.13% increase over its serial counterpart for topic d061j in
DUC2002.
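The speedup, efficiency, and percentage-increase figures quoted above are commonly computed as in the short Python sketch below. It assumes the standard definitions (speedup as serial time over parallel time, efficiency as speedup divided by core count); whether the thesis uses exactly these definitions is an assumption, and the timings in the example are hypothetical, not results from the thesis.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(s: float, cores: int) -> float:
    """Parallel efficiency E = S / p, as a fraction of ideal linear scaling."""
    return s / cores


def percent_increase(old: float, new: float) -> float:
    """Relative increase of new over old, in percent."""
    return (new - old) / old * 100.0


# Hypothetical timings: 120 s serial, 25 s on 6 cores.
s = speedup(120.0, 25.0)
print(f"speedup={s:.2f}, efficiency={efficiency(s, 6):.2%}")

# Hypothetical throughput change from 10 to 25 words per second.
print(f"WPS increase={percent_increase(10.0, 25.0):.2f}%")
```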
Furthermore, two GPU variants of GMODE/D, namely variant I and variant
II, are implemented, with both incorporating unified and non-unified memory
architectures. Variant I performs sentence scoring at the outset of the
accelerator region, while variant II conducts sentence scoring within the
accelerator region. GMODE/D variant I with unified memory achieves a
speedup of 18.17 over the serial variant when a vector size of 256 is
used with an NVIDIA Tesla V100 as the accelerator device, resulting in a
substantial increase in WPS, amounting to 215.517. Despite a slight
reduction in ROUGE scores, it exhibits the most stable CV values among the
serial, multi-core, and many-core variants.
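To make the stability and throughput measures concrete, the Python sketch below computes a coefficient of variation and a words-per-second rate. It assumes that CV stands for the coefficient of variation (sample standard deviation divided by the mean) and that WPS is summary words divided by elapsed seconds; the abstract does not spell out either definition, and the sample values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean (lower means more stable)."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return stdev(values) / m


def words_per_second(word_count: int, seconds: float) -> float:
    """Throughput as words produced per second of run time."""
    return word_count / seconds


# Hypothetical ROUGE-1 scores from repeated runs of one variant.
runs = [0.46, 0.47, 0.45, 0.46]
print(f"CV={coefficient_of_variation(runs):.4f}")

# Hypothetical run: a 200-word summary produced in 8 seconds.
print(f"WPS={words_per_second(200, 8.0):.1f}")
```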
These advancements collectively propel optimization-based ATS approaches
closer to real-time applications where thousands of documents could be
involved, demonstrating the versatility and efficiency of the proposed
MODE/D algorithm across diverse computing architectures, including multi-core
and many-core environments.
Additional Metadata
Item Type: Thesis (Doctoral)
Subject: Optimization algorithms (Computer science)
Subject: Parallel processing (Computer science)
Subject: Natural language processing (Computer science)
Call Number: FSKTM 2024 11
Chairman Supervisor: Associate Professor Nor Asilah Wati binti Abdul Hamid, PhD
Divisions: Faculty of Computer Science and Information Technology
Keywords: Automatic Text Summarization, Document Understanding Conferences, Multi-objective Differential Evolution, Multi-objective Artificial Bee Colony.
Depositing User: Ms. Rohana Alias
Date Deposited: 09 Oct 2025 08:27
Last Modified: 09 Oct 2025 08:27
URI: http://psasir.upm.edu.my/id/eprint/120029