Croatian Morphological Lexicon
The Croatian Morphological Lexicon is a lexical database that covers more than 45,000 general-language lemmas, 15,000 male and female first names, and more than 50,000 surnames registered in the Republic of Croatia. More than 3,900,000 word-forms have been generated from this repository of lemmas. The Lexicon can be useful to students of Croatian (both native speakers and foreigners), to experts, and to systems that deal with document retrieval (Internet and intranet search engines), information extraction, text mining and computational processing of Croatian.
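As a rough illustration of how a lemma-to-word-form resource of this kind can support retrieval, the sketch below inverts a tiny lemma table into a word-form index and looks up the candidate lemmas for an inflected form. The table `TOY_LEXICON`, the helpers `build_form_index` and `lemmatize`, and the in-memory dictionary layout are all assumptions made for this example; they do not reflect the Lexicon's actual storage format or interface.

```python
from collections import defaultdict

# Hypothetical toy entries: lemma -> some of its inflected word-forms.
# The real Lexicon's data format is not described here; this is only a sketch.
TOY_LEXICON = {
    "kuća": ["kuća", "kuće", "kući", "kuću", "kućo", "kućom", "kućama"],
    "grad": ["grad", "grada", "gradu", "grade", "gradom",
             "gradovi", "gradova", "gradovima", "gradove"],
}


def build_form_index(lexicon):
    """Invert lemma -> forms into word-form -> set of candidate lemmas."""
    index = defaultdict(set)
    for lemma, forms in lexicon.items():
        for form in forms:
            index[form.lower()].add(lemma)
    return index


def lemmatize(form, index):
    """Return the sorted candidate lemmas for a word-form, or [] if unknown."""
    return sorted(index.get(form.lower(), ()))


FORM_INDEX = build_form_index(TOY_LEXICON)
print(lemmatize("kućom", FORM_INDEX))
print(lemmatize("Gradovima", FORM_INDEX))
```

A search engine could apply such a lookup to both queries and documents, so that a query for one inflected form also matches texts containing other forms of the same lemma.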
For more detail, see:
- Tadic, Marko & Fulgosi, Sanja (2003). Building the Croatian Morphological Lexicon. In: Proceedings of the EACL2003 Workshop on Morphological Processing of Slavic Languages (Budapest, 2003), ACL, pp. 41-46.
- Tadic, Marko (2006). Croatian Lemmatization Server. In: Vulchanova, Mila Dimitrova; Koeva, Svetla; Krapova, Iliyana; Vulchanov, Valentin (eds.), Formal Approaches to South Slavic and Balkan Languages. Sofia: Bulgarian Academy of Sciences, pp. 140-146.