MULTI-LABEL TEXT CLASSIFICATION VIA DOCUMENT ENHANCEMENT AND LABEL CORRELATIONS LEARNING

Chuzhen Li
Mohd Juzaiddin Ab Aziz
Mohd Ridzwan Yaakub

Abstract

Multi-label text classification (MLTC) has become increasingly popular because of its broad applicability and its closer fit to the inherent properties of real-world objects. Numerous approaches have been proposed to capture label correlations. Most of them, however, capture relationships between labels only implicitly and do not distinguish label similarity correlations from label pairing correlations, treating both as a single, unified label correlation. In this paper, we propose an approach that distinguishes and explicitly defines label similarity correlations and label pairing correlations. The approach first acquires text and label representations simultaneously. Next, the document representations are enhanced by concatenating them with their most similar document subsets. Finally, the label similarity correlations and pairing correlations are explicitly learned in the label correlations learning step. The approach outperforms previous competitive models, achieving micro-F1 scores of 75.3% and 89.6% on the AAPD and RCV1-V2 datasets, respectively.
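
To make two of the ideas in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that "enhancement" means concatenating each document vector with the mean of its k most similar neighbours under cosine similarity; the abstract does not specify the similarity measure, the pooling, or k, so the helper names `cosine` and `enhance` and the parameter `k` are hypothetical. The `micro_f1` function follows the standard definition of micro-averaged F1 over all document-label pairs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def enhance(doc_vecs, k=1):
    """Hypothetical enhancement step: concatenate each document vector with
    the mean of its k most similar *other* document vectors.
    Assumption: cosine similarity and mean pooling; the paper may differ."""
    enhanced = []
    for i, vec in enumerate(doc_vecs):
        # Rank all other documents by similarity to the current one.
        others = [j for j in range(len(doc_vecs)) if j != i]
        others.sort(key=lambda j: cosine(vec, doc_vecs[j]), reverse=True)
        top = others[:k]
        if top:
            mean = [sum(doc_vecs[j][d] for j in top) / len(top)
                    for d in range(len(vec))]
        else:
            # No neighbours available: pad with zeros to keep dimensions fixed.
            mean = [0.0] * len(vec)
        enhanced.append(list(vec) + mean)
    return enhanced


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true positives, false positives and false
    negatives over every (document, label) pair, then compute F1 once."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]   # toy document vectors
    print(enhance(docs, k=1)[0])

    y_true = [[1, 0, 1], [0, 1, 0]]               # rows: documents, columns: labels
    y_pred = [[1, 0, 0], [0, 1, 1]]
    print(round(micro_f1(y_true, y_pred), 4))
```

Because micro-F1 pools counts across all labels before computing the score, frequent labels weigh more heavily than rare ones, which is worth keeping in mind when reading the reported AAPD and RCV1-V2 figures.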

Article Details

How to Cite
Li, C., Ab Aziz, M. J., & Yaakub, M. R. (2024). MULTI-LABEL TEXT CLASSIFICATION VIA DOCUMENT ENHANCEMENT AND LABEL CORRELATIONS LEARNING. Malaysian Journal of Computer Science, 38(2). Retrieved from https://samudera.um.edu.my/index.php/MJCS/article/view/54339
Section
Articles