Several micro-blogging services are available today, such as Twitter, Tumblr, Jisko, Thimbl, WordPress, and other blogging platforms. Users of these services communicate through posts containing videos, images, text, and links. The resulting volume of raw data can overwhelm users, and one solution to this problem is to classify it. However, short texts such as "tweets" do not provide sufficient word occurrences, which limits traditional classification methods such as Bag of Words, the Naïve Bayes classifier, and frequency distribution. To address this problem, we propose using a small set of domain-specific features extracted from the author's profile and the text. The proposed approach effectively classifies the text into a predefined set of generic classes: Positive, Negative, and Neutral.
Internet, mining, blogging
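
As an illustration only, the idea of classifying a short post from a handful of features drawn from its text and its author's profile can be sketched as follows. The abstract does not list the actual features, so the word lists, the emoticon checks, the `author_is_news` profile feature, and the scoring rule are all hypothetical stand-ins. A simple hand-written rule replaces whatever classifier the proposed approach uses.

```python
import re

# Hypothetical mini-lexicons; the source does not specify its features.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "awful", "hate", "sad", "terrible"}


def extract_features(text, profile):
    """Build a small feature dictionary from the post text and author profile."""
    tokens = re.findall(r"[a-z']+", text.lower())
    description = profile.get("description", "").lower()
    return {
        "pos_words": sum(t in POSITIVE for t in tokens),
        "neg_words": sum(t in NEGATIVE for t in tokens),
        "pos_emoticon": int(":)" in text or ":-)" in text),
        "neg_emoticon": int(":(" in text or ":-(" in text),
        # Assumed profile feature: news-like accounts tend to post neutral text.
        "author_is_news": int("news" in description),
    }


def classify(features):
    """Map the features to Positive, Negative, or Neutral with a simple rule."""
    score = (features["pos_words"] + features["pos_emoticon"]
             - features["neg_words"] - features["neg_emoticon"])
    # Weak signals from news-like authors are treated as Neutral (assumption).
    if features["author_is_news"] and abs(score) <= 1:
        return "Neutral"
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


if __name__ == "__main__":
    post = "I love this, great day :)"
    author = {"description": "coffee fan"}
    print(classify(extract_features(post, author)))
```

The point of the sketch is that each post is reduced to a few fixed features rather than a sparse vector over the whole vocabulary, so the small number of words in a post matters less. In practice the hand-written rule would be replaced by a classifier trained on labeled posts.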