Lemmatization and Visualization of Tamil Documents

G.T. Prabavathi

Year: 2009
Volume: 1
Issue: 1

Total Page Count: 10
Page Number: 83 to 92

Lecturer (SS) in Computer Science, Department of Computer Science, Gobi Arts & Science College, Gobichettipalayam – 638453. Email: gtpraba@gmail.com

Online published on 11 June, 2014.

Abstract

Powerful methods for interactive exploration and search of collections of textual documents are essential for managing the ever-increasing flood of digital information. This paper deals with lemmatizing Tamil text documents and visualizing the clustered documents to support a faster retrieval system. Tamil, a language belonging to the south-central branch of the Dravidian languages, is highly inflectional, so extracting the correct root word requires extensive lemmatization techniques. The mined documents are automatically clustered onto a map in an unsupervised manner, using statistical information about word contexts and a self-organizing map (SOM), which increases search efficiency in a Tamil digital library collection.

Keywords

Text mining, Stemming, Lemmatization, Self-organizing maps
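
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the documents have already been lemmatized into token lists, it replaces the word-context statistics with plain relative lemma frequencies, and the function names (bag_of_lemmas, train_som, best_matching_unit) and all parameter values are hypothetical choices made for this illustration only.

```python
import math
import random


def bag_of_lemmas(docs):
    """Turn lemmatized documents (non-empty token lists) into frequency vectors.

    Assumption: lemmatization has already been done elsewhere.
    """
    vocab = sorted({token for doc in docs for token in doc})
    vectors = [[doc.count(token) / len(doc) for token in vocab] for doc in docs]
    return vocab, vectors


def best_matching_unit(weights, vector):
    """Return the grid coordinate whose weight vector is closest to `vector`."""
    return min(
        weights,
        key=lambda unit: sum((w - x) ** 2 for w, x in zip(weights[unit], vector)),
    )


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight list}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighbourhood radius shrinks
        for vector in vectors:
            br, bc = best_matching_unit(weights, vector)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (vector[i] - w[i])
    return weights


# Placeholder lemma tokens; real input would be Tamil root words.
docs = [["a", "b", "a"], ["a", "b", "a"], ["c", "d"]]
vocab, vectors = bag_of_lemmas(docs)
som = train_som(vectors, rows=2, cols=2)
placement = [best_matching_unit(som, v) for v in vectors]
```

Each entry of placement is the map cell assigned to the corresponding document. Documents with the same lemma profile always land on the same cell, which is what lets a map view group related items for browsing.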
