UPM Institutional Repository

Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers


Citation

Palasundram, Kulothunkan (2021) Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers. Doctoral thesis, Universiti Putra Malaysia.

Abstract

Chatbots that can answer user questions have great potential to make humans more productive. Question-answering (QA) chatbots can be implemented using machine learning (ML) or rules. ML chatbots are preferable to rule-based chatbots because they can be extended through continuous training. Since its inception in the machine-translation domain in 2014, the sequence-to-sequence (Seq2Seq) training approach has shown remarkable progress in chatbot development. Nevertheless, Seq2Seq chatbots have a weakness: they tend to produce irrelevant and meaningless responses, which may reduce chatbot acceptance. This flaw is caused by three factors: "Language Model Influence", "Question Encoding Overfitting", and "Answer Generation Overfitting". In addition, many chatbots are developed using the single-task learning (STL) method, which performs only the response generation task. Recent works utilize multi-task learning (MTL) to overcome this weakness, but they still produce generic answers that are not consistent with the questions. Therefore, this research presents "SEQ2SEQ++", a Seq2Seq MTL method comprising four components: a "Multi-Functional Encoder" (MFE), an "Answer Decoder", an "Answer Encoder", and a "Ternary-Classifier" (TC). The model is trained using a "Dynamic Weights" algorithm and a "Comprehensive Attention Mechanism" (CAM); all of these methods and mechanisms are novel approaches proposed in this work. Experiments were conducted on two publicly available academic datasets (SQuAD and NarrativeQA) to measure the performance of the proposed method against two current MTL methods, "MTL-BC" and "MTL-LTS". "MTL-BC" performs response generation and binary question-response classification in parallel, while "MTL-LTS" performs first-word generation followed by response generation in sequential order.
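The thesis does not reproduce the "Dynamic Weights" algorithm here, but the general idea of dynamically weighting multiple task losses during MTL training can be sketched as follows. This is a minimal illustration under an assumed scheme (weights proportional to each task's current loss, so the harder task receives more gradient signal), not the thesis's actual formula; the function names are hypothetical.

```python
# Hypothetical sketch of dynamic task weighting for multi-task training:
# each task's weight is its share of the total current loss, so the weights
# adapt as training progresses and always sum to 1.

def dynamic_weights(losses):
    """Return per-task weights proportional to each task's current loss."""
    total = sum(losses)
    if total == 0:
        return [1.0 / len(losses)] * len(losses)
    return [loss / total for loss in losses]

def combined_loss(losses):
    """Weighted sum of the task losses using the dynamic weights."""
    weights = dynamic_weights(losses)
    return sum(w * loss for w, loss in zip(weights, losses))

# Example: answer-generation loss vs. ternary-classification loss.
gen_loss, cls_loss = 2.0, 0.5
print(dynamic_weights([gen_loss, cls_loss]))  # [0.8, 0.2]
print(combined_loss([gen_loss, cls_loss]))    # 0.8*2.0 + 0.2*0.5 = 1.7
```

In a real training loop these weights would be recomputed each step (or epoch) from the latest task losses before backpropagating the combined loss.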
Experimental results show that "SEQ2SEQ++" outperforms the benchmark works on all evaluation metrics used in this study. On the "BLEU" metric, "SEQ2SEQ++" scored 44.42% better than "MTL-BC" on NarrativeQA and 17.31% better than "MTL-BC" on SQuAD. On "WER", it scored 58.83% better than "MTL-LTS" on NarrativeQA and 37.26% better than "MTL-BC" on SQuAD. On "Distinct-2", it scored 0.73% better than "MTL-BC" on NarrativeQA and 0.21% better than "MTL-LTS" on SQuAD.
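Two of the reported metrics have simple standard definitions that can be illustrated directly: Distinct-2 is the ratio of unique bigrams to total bigrams across generated answers (higher means more diverse output), and WER is the word-level edit distance divided by the reference length (lower is better). The sketch below shows textbook implementations of both, not the thesis's evaluation code.

```python
# Distinct-2: unique word bigrams / total word bigrams over a set of answers.
def distinct_2(sentences):
    bigrams = []
    for s in sentences:
        toks = s.split()
        bigrams.extend(zip(toks, toks[1:]))
    return len(set(bigrams)) / len(bigrams) if bigrams else 0.0

# WER: word-level Levenshtein distance / number of reference words,
# computed with the standard dynamic-programming edit-distance table.
def wer(reference, hypothesis):
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(r)][len(h)] / len(r)

print(distinct_2(["the cat sat", "the cat ran"]))  # 3 unique of 4 bigrams = 0.75
print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1 edit / 6 words
```

BLEU, the third reported metric, additionally combines n-gram precisions with a brevity penalty and is typically computed with a library implementation rather than by hand.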


Download File

FSKTM 2022 11 IR.pdf

Download (1MB)

Additional Metadata

Item Type: Thesis (Doctoral)
Subject: Chatbots
Subject: Computing platforms
Call Number: FSKTM 2022 11
Chairman Supervisor: Nurfadhlina Mohd Sharef, PhD
Divisions: Faculty of Computer Science and Information Technology
Depositing User: Editor
Date Deposited: 07 Jul 2023 02:28
Last Modified: 07 Jul 2023 02:28
URI: http://psasir.upm.edu.my/id/eprint/104064
