IR @ Goa University

Improving word embedding coverage in less-resourced languages through multi-linguality and cross-linguality: A case study with aspect-based sentiment analysis

dc.contributor.author Akhtar, Md.S.
dc.contributor.author Sawant, P.
dc.contributor.author Sen, S.
dc.contributor.author Ekbal, A.
dc.contributor.author Bhattacharyya, P.
dc.date.accessioned 2020-01-06T08:47:06Z
dc.date.available 2020-01-06T08:47:06Z
dc.date.issued 2019
dc.identifier.citation ACM Transactions on Asian and Low-Resource Language Information Processing. 18(2); 2019; ArticleID_3273931. en_US
dc.identifier.uri https://doi.org/10.1145/3273931
dc.identifier.uri http://irgu.unigoa.ac.in/drs/handle/unigoa/5930
dc.description.abstract In the era of deep learning-based systems, efficient input representation is one of the primary requisites for solving various problems in Natural Language Processing (NLP), data mining, text mining, and the like. The absence of an adequate representation for an input introduces the problem of data sparsity, which makes the underlying problem much harder to solve. The problem is intensified in resource-poor languages, which lack a corpus large enough to train a word embedding model. In this work, we propose an effective method to improve word embedding coverage in less-resourced languages by leveraging bilingual word embeddings learned from different corpora. We train and evaluate a deep Long Short-Term Memory (LSTM)-based architecture and show the effectiveness of the proposed approach on two aspect-level sentiment analysis tasks (i.e., aspect term extraction and sentiment classification). The neural network architecture is further assisted by hand-crafted features for prediction. We apply the proposed model in two experimental setups: multi-lingual and cross-lingual. Experimental results show the effectiveness of the proposed approach against state-of-the-art methods. en_US
dc.publisher ACM Digital Library en_US
dc.subject Computer Science and Technology en_US
dc.title Improving word embedding coverage in less-resourced languages through multi-linguality and cross-linguality: A case study with aspect-based sentiment analysis en_US
dc.type Journal article en_US
dc.identifier.impf cs