Linguistic Feature Classifying and Tracing

Authors

  • Mohammadreza Moohebat, University of Malaya, Department of Artificial Intelligence, Faculty of Computer Science & Information Technology, Kuala Lumpur, Malaysia
  • Ram Gopal Raj, University of Malaya, Department of Artificial Intelligence, Faculty of Computer Science & Information Technology, Kuala Lumpur, Malaysia
  • Dirk Thorleuchter, Fraunhofer INT, D-53879 Euskirchen, Appelsgarten 2, Germany
  • Sameem Binti Abdul Kareem, University of Malaya, Department of Artificial Intelligence, Faculty of Computer Science & Information

DOI:

https://doi.org/10.22452/mjcs.vol30no2.1

Keywords:

Scientific articles, Linguistic features, Latent semantic indexing, Text Mining

Abstract

We investigate the identification and analysis of linguistic (lexico-grammatical) features that are characteristic of articles from a specific year of publication. Linguistic features differ from shallow features because they represent authors’ lexico-grammatical writing styles and do not rely on the well-known bag-of-words model. Current literature focusses on shallow features rather than on linguistic features, and existing methods for identifying linguistic features use well-known knowledge-structure-based approaches. In contrast, we advance these existing methods by applying semantic clustering instead of knowledge-structure-based approaches. For evaluation purposes, a linguistic feature-based prediction model is built to enable the automated assignment of articles to their years of publication. In a case study, the proposed methodology is applied to articles of the Springer book series 'Communications in Computer and Information Science' published from 2009 to 2013. The case study results show the feasibility of the proposed approach compared to a frequently used baseline.

Published

2017-06-01

How to Cite

Moohebat, M., Raj, R. G., Thorleuchter, D., & Binti Abdul Kareem, S. (2017). Linguistic Feature Classifying and Tracing. Malaysian Journal of Computer Science, 30(2), 77–90. https://doi.org/10.22452/mjcs.vol30no2.1
