UPM Institutional Repository

Lexical criminal identification for chatting corpus


Citation

Marjuni, Siti Hanom and Mahmod, Ramlan and Abd Ghani, Abdul Azim and Mohd Zain, Abdullah and Mustapha, Aida (2009) Lexical criminal identification for chatting corpus. In: 2009 2nd IEEE International Conference on Computer Science and Information Technology (ICCSIT 2009), 8-11 Aug. 2009, Beijing, China. (pp. 360-364).

Abstract

This paper aims to identify the lexical items that carry criminal elements in a chatting corpus made up of conversation utterances between suspects and victims. Lexical criminal identification requires three processes. The first is tokenization, which automatically assigns a serial number to each lexical item in every suspect and victim utterance. The second is part-of-speech tagging of the lexical items to identify the verbs and nouns in the utterances. The third is identifying and analyzing the interrogative criminal construct to obtain the criminal evidence. The chatting corpus consists of 3,067 suspect and victim utterances with 16,278 words, collected from 9 criminal chatting cases. The results indicate that verbs and nouns are the most important parts of speech for representing the criminal constructs in chat utterances.
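The three processes named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the tokenizer, the tagger, the tag set, or how an interrogative criminal construct is defined. The names `TOY_LEXICON`, `tokenize`, `tag`, and `interrogative_candidates` are hypothetical, the lexicon is a hand-written toy, and the question test (a wh-word or a question mark) is an assumption made only for illustration.

```python
import re

# Hypothetical toy lexicon (assumption): the paper's actual tagger and tag set
# are not described in the abstract.
TOY_LEXICON = {
    "send": "VERB", "meet": "VERB", "pay": "VERB",
    "money": "NOUN", "photo": "NOUN", "hotel": "NOUN",
    "where": "WH", "when": "WH", "what": "WH",
}


def tokenize(utterance):
    """Process 1: split an utterance and number each token from 1."""
    tokens = re.findall(r"[A-Za-z']+|\?", utterance.lower())
    return list(enumerate(tokens, start=1))


def tag(numbered_tokens):
    """Process 2: attach a part-of-speech tag; unknown words get 'OTHER'."""
    return [(n, tok, TOY_LEXICON.get(tok, "OTHER")) for n, tok in numbered_tokens]


def interrogative_candidates(tagged):
    """Process 3 (simplified assumption): if the utterance looks like a
    question, return its verbs and nouns as candidate lexical items."""
    is_question = any(pos == "WH" or tok == "?" for _, tok, pos in tagged)
    if not is_question:
        return []
    return [(n, tok, pos) for n, tok, pos in tagged if pos in ("VERB", "NOUN")]


if __name__ == "__main__":
    utterance = "When can you send the money?"  # invented example, not from the corpus
    print(interrogative_candidates(tag(tokenize(utterance))))
    # prints [(4, 'send', 'VERB'), (6, 'money', 'NOUN')]
```

The serial numbers are kept through all three steps so that each extracted verb or noun can be traced back to its position in the original utterance. A real system would likely replace the toy lexicon with a trained part-of-speech tagger.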


Additional Metadata

Item Type: Conference or Workshop Item (Paper)
Divisions: Faculty of Computer Science and Information Technology
DOI Number: https://doi.org/10.1109/ICCSIT.2009.5234700
Publisher: IEEE
Keywords: Chatting; Lexicon; Part of speech; Tagging; Criminal construct; Criminal evidence
Depositing User: Nabilah Mustapa
Date Deposited: 10 Jun 2019 02:19
Last Modified: 10 Jun 2019 02:19
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1109/ICCSIT.2009.5234700
URI: http://psasir.upm.edu.my/id/eprint/68487
