UPM Institutional Repository

A corpus-based comparative study on lexical richness and syntactic complexity of Chinese English-major students’ EFL writing


Citation

Yang, Yang (2024) A corpus-based comparative study on lexical richness and syntactic complexity of Chinese English-major students’ EFL writing. Doctoral thesis, Universiti Putra Malaysia.

Abstract

In a bid to better understand English writing competence, which is seen as a vital reflection of an English learner’s proficiency, this study zeroes in on the language aspect of Chinese English-major students’ (CEMS) English as a foreign language (EFL) writing, specifically focusing on lexical richness (LR) and syntactic complexity (SC). The research is anchored around four primary objectives, which aim to 1) compare CEMS’ EFL writing to English as a native language (ENL) writing regarding LR and SC, 2) explore differences across four academic years of CEMS’ EFL writing, 3) determine how LR and SC evolve over the CEMS’ academic years, and 4) unearth the types of lexical and syntactic errors CEMS typically make. To this end, 400 CEMS’ EFL writing samples were sourced from the Spoken and Written Corpus of Chinese Learners Version 2.0, with 100 compositions selected for each academic year. Additionally, 200 native English writing samples were extracted from the Louvain Corpus of Native English Essays. For comparative analyses, the Mann-Whitney U test and the Kruskal-Wallis H test were conducted. Dynamic systems theory was applied to interpret the developmental features of CEMS’ EFL writing. Finally, lexical and syntactic errors were identified and investigated with comparative, correlational, and developmental analyses. The results showed that, in terms of LR, while ENL writing showcased superior lexical sophistication and variation, EFL writing closely matched ENL writing in lexical density. The lag in lexical sophistication for CEMS could be linked to China’s pedagogical focus on grammar over vocabulary sophistication. For SC, CEMS’ EFL writing was notably less syntactically complex than ENL writing. This was evident from fewer subordinate and coordinate structures and a lower proportion of complex phrases. The limited use of subordination in EFL writing might stem from the inherent differences between English and Chinese. 
Across the four academic years, lexical density among EFL students exceeded the ENL level by the fourth year, while lexical sophistication remained steady. Regarding SC, CEMS’ EFL writing approached ENL writing on certain indices over the four academic years. However, a consistent gap remained, indicating room for growth. In terms of developmental trajectories, most indices showed non-linear paths, with phenomena like fossilization and the “plateau effect” becoming evident, especially in the final academic year. An exhaustive examination of errors in CEMS’ EFL writing identified 1,572 errors, with six main categories (noun, sentence, verb, form, word, and preposition errors) accounting for over 90% of the total. Common specific errors pertained to articles, noun agreement, punctuation, verb agreement, and transitivity. These errors were attributed to factors like negative language transfer or overgeneralization. A pronounced challenge with prepositions was also noted. This study sheds light on the nuances and intricacies of LR and SC in non-native English writing. It emphasizes the importance of understanding language development as a dynamic and non-linear process. Pedagogically, the findings offer educators insights for refining their instructional techniques and curricula.


Official URL or Download Paper: https://ethesis.upm.edu.my/id/eprint/18735

Additional Metadata

Item Type: Thesis (Doctoral)
Subject: English language - Study and teaching - Foreign speakers
Subject: English language - Grammar - Errors of usage
Call Number: FBMK 2024 39
Chairman Supervisor: Associate Professor Yap Ngee Thai, PhD
Divisions: Faculty of Modern Languages and Communication
Keywords: Lexical richness; Syntactic complexity; Chinese English-major students; EFL writing
Depositing User: Ms. Rohana Alias
Date Deposited: 07 Apr 2026 01:38
Last Modified: 07 Apr 2026 01:38
URI: http://psasir.upm.edu.my/id/eprint/123898