IR @ Goa University

A framework for learning morphology using suffix association matrix

dc.contributor.author Desai, S.
dc.contributor.author Pawar, J.D.
dc.contributor.author Bhattacharya, P.
dc.date.accessioned 2015-09-22T08:51:04Z
dc.date.available 2015-09-22T08:51:04Z
dc.date.issued 2014
dc.identifier.citation Proc. 5. Workshop on South and Southeast Asian NLP; 25. Int. Conf. on Computational Linguistics, Dublin, Ireland. 2014; 28-36. en_US
dc.identifier.uri http://irgu.unigoa.ac.in/drs/handle/unigoa/3640
dc.description.abstract Unsupervised learning of morphology is used for automatic affix identification, morphological segmentation of words, and generation of paradigms, each of which lists all affixes that can combine with a list of stems. Various unsupervised approaches are used to segment words into stem and suffix. Most unsupervised methods for learning morphology assume that suffixes occur frequently in a corpus. We have observed that for morphologically rich Indian languages such as Konkani, 31 percent of suffixes are not frequent. In this paper we report our framework for an Unsupervised Morphology Learner that works for less frequent suffixes. Less frequent suffixes can be identified using the p-similar technique, which has been used for suffix identification but cannot be used to segment short-stem words. Using the proposed Suffix Association Matrix, our Unsupervised Morphology Learner can also segment short-stem words correctly. We tested our framework by learning derivational morphology for English and two Indian languages, Hindi and Konkani. Compared with other similar segmentation techniques, precision and recall improved.
dc.subject Computer Science and Technology en_US
dc.title A framework for learning morphology using suffix association matrix en_US
dc.type Conference article en_US

