Abstract
This research presents a new algorithmic method for moderating emotional content in Chinese dating reality shows based on cross-media analysis, combining text, audio, video, and social media feedback. The model architecture comprises five functional layers: Data Acquisition, Multimodal Preprocessing, Cross-Media Feature Extraction, Emotional Value Detection and Moderation, and Interpretability and Visualization. Raw multimodal data is processed through Automatic Speech Recognition (ASR), facial emotion recognition, and voice analysis in order to align and organize the inputs. Preprocessing removes noise from text data, normalizes sentiment word lists, and applies temporal transformation. In the feature extraction stage, different machine-learning models are applied: Bidirectional Encoder Representations from Transformers (BERT) or Enhanced Representation through Knowledge Integration (ERNIE) for text; Convolutional Recurrent Neural Network (CRNN) and Bidirectional Long Short-Term Memory (Bi-LSTM) for audio; and Residual Neural Network (ResNet50) and Inflated 3D Convolutional Network (I3D) for video. The Multimodal Transformer Fusion (MMTF) model uses cross-modal attention mechanisms to combine these data streams into unified emotional representations. These representations are classified into emotions such as joy, sadness, love, jealousy, conflict, and embarrassment by a Multi-Layer Perceptron (MLP) with softmax activation. To control content dynamically, a Deep Q-Network (DQN)-based Reinforcement Learning (RL) engine moderates scenes according to cultural standards and audience resilience. Interpretability is supported by SHapley Additive exPlanations (SHAP), attention heatmaps, and interactive dashboards.
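The fusion and classification stages described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the embedding dimensions, sequence lengths, random weights, and the choice of text as the attention query are all illustrative assumptions; a real MMTF model would use learned transformer layers over BERT/ERNIE, CRNN/Bi-LSTM, and ResNet50/I3D features.

```python
import numpy as np

# Hypothetical sketch of cross-modal attention fusion followed by an
# MLP-with-softmax emotion classifier. All dimensions and weights are
# toy values, not the paper's configuration.

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query, keys, values):
    """Scaled dot-product attention: one modality attends to another."""
    d = query.shape[-1]
    scores = query @ keys.T / np.sqrt(d)       # (T_q, T_k)
    weights = softmax(scores, axis=-1)
    return weights @ values                    # (T_q, d)

# Toy per-modality sequence features (time steps x embedding dim).
text  = rng.normal(size=(5, 16))   # stand-in for BERT/ERNIE token embeddings
audio = rng.normal(size=(8, 16))   # stand-in for CRNN/Bi-LSTM frame features
video = rng.normal(size=(6, 16))   # stand-in for ResNet50/I3D clip features

# Text attends to audio and to video; mean-pool into one fused vector.
fused = np.concatenate([
    text.mean(axis=0),
    cross_modal_attention(text, audio, audio).mean(axis=0),
    cross_modal_attention(text, video, video).mean(axis=0),
])                                             # shape (48,)

# Simple MLP head with softmax over the six emotion classes.
emotions = ["joy", "sadness", "love", "jealousy", "conflict", "embarrassment"]
W1, b1 = rng.normal(size=(48, 32)) * 0.1, np.zeros(32)
W2, b2 = rng.normal(size=(32, 6)) * 0.1, np.zeros(6)
hidden = np.maximum(fused @ W1 + b1, 0.0)      # ReLU
probs = softmax(hidden @ W2 + b2)              # probabilities sum to 1
print(dict(zip(emotions, probs.round(3))))
```

With random weights the predicted distribution is meaningless; the sketch only shows how cross-modal attention turns three unaligned feature streams into a single vector that a softmax MLP can classify.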
This research holds significant implications for the fields of Communication, Radio, and Television, as it enhances content moderation strategies in emotionally charged programming through intelligent cross-media data fusion.
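The DQN-based moderation engine can be sketched, under heavy simplification, as a tabular Q-learning loop: a value-learning stand-in for the paper's deep network. The states, actions, and reward values below are invented for illustration; the actual system presumably learns from high-dimensional emotional representations and real viewership signals.

```python
import numpy as np

# Hedged toy model: tabular Q-learning as a stand-in for the DQN-based
# RL moderation engine. States, actions, and rewards are illustrative.

rng = np.random.default_rng(1)

states  = ["calm", "conflict", "embarrassment"]   # detected emotional scene
actions = ["air", "blur", "cut"]                  # moderation decisions

# Toy reward model: cultural standards penalize airing conflict;
# audience engagement penalizes cutting calm scenes.
REWARD = {
    ("calm", "air"): 1.0,  ("calm", "blur"): -0.2, ("calm", "cut"): -1.0,
    ("conflict", "air"): -1.0, ("conflict", "blur"): 0.5, ("conflict", "cut"): 0.2,
    ("embarrassment", "air"): -0.5, ("embarrassment", "blur"): 0.8,
    ("embarrassment", "cut"): 0.0,
}

Q = np.zeros((len(states), len(actions)))
alpha, gamma, eps = 0.1, 0.9, 0.1                 # learning rate, discount, exploration

for _ in range(5000):
    s = rng.integers(len(states))
    # Epsilon-greedy action selection.
    a = rng.integers(len(actions)) if rng.random() < eps else int(Q[s].argmax())
    r = REWARD[(states[s], actions[a])]
    s2 = rng.integers(len(states))                # scenes arrive i.i.d. in this toy
    # Standard Q-learning update toward the bootstrapped target.
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])

policy = {states[i]: actions[int(Q[i].argmax())] for i in range(len(states))}
print(policy)
```

Because next scenes arrive independently of the chosen action here, the learned policy simply tracks the immediate reward structure (e.g. calm scenes are aired); in the paper's setting the DQN would instead weigh long-horizon effects of moderation on the audience.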
Full text not available from this repository.
Official URL or Download Paper: https://www.worldscientific.com/doi/10.1142/S17569...
Additional Metadata
| Item Type: | Article |
|---|---|
| Subject: | Modeling and Simulation |
| Subject: | Computer Science Applications |
| Divisions: | Faculty of Modern Language and Communication |
| DOI Number: | https://doi.org/10.1142/S1756973726400123 |
| Publisher: | World Scientific |
| Keywords: | Content moderation; Cross-media data fusion; Emotion moderation; Multimodal learning; Reinforcement learning |
| Depositing User: | Ms. Nur Faseha Mohd Kadim |
| Date Deposited: | 12 Mar 2026 07:25 |
| Last Modified: | 12 Mar 2026 07:25 |
| Altmetrics: | http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1142/S1756973726400123 |
| URI: | http://psasir.upm.edu.my/id/eprint/123552 |