Abstract:
A Morphological Analyzer is a crucial tool for any language. In popular tools used to build morphological analyzers like XFST, HFST and Apertium's lttoolbox, the finite state approach is used to sequence input characters. We have used the finite state approach to sequence morphemes instead of characters. In this paper we present the architecture and implementation details of a Corpus assisted FSA approach for building a Verb Morphological Analyzer. Our main contribution in this paper is the paradigm definition methodology used for the verbs in a morphologically rich Indian Language Konkani. The mapping of citation form of the verbs to paradigms was carried out using an untagged corpus for Konkani. Besides a reduction in human effort required an F-Score of 0.95 was obtained when the mapping was tested on a tagged corpus.