Journal of Innovation in Computer Science and Engineering

  • Year: 2019
  • Volume: 9
  • Issue: 1

Information Extraction For Afaan Oromo News Texts Using Hybrid Approach

  • Author: Obsa Gilo1, Anitha2
  • Total Page Count: 8
  • DOI:
  • Page Number: 1 to 8

1Lecturer, Department of Computer Science and Information Technology, College of Engineering and Technology, Wollega University, Ethiopia, Email: obsigilo5@gmail.com

2Associate Professor, Department of Computer Science and Information Technology, College of Engineering and Technology, Wollega University, Ethiopia, Email: anithapt74@gmail.com

Abstract

The number of Afaan Oromo documents on the web and in other machine-readable forms is increasing over time. As a result of this growth, a huge amount of valuable information that could be used in education, business, security, and many other areas is hidden in the unstructured representation of textual data. Getting the right information for decision making from the abundant unstructured text is therefore a big challenge. A major problem is the lack of tools that extract and exploit valuable information from Afaan Oromo text effectively enough to satisfy users; manually extracting information from a large amount of unstructured text is tiresome and time-consuming. The overall objective of this research is to develop a hybrid information extraction model for Afaan Oromo news texts. The system was developed using Python, General Architecture for Text Engineering (GATE), and Weka for classification, and a rule-based technique was applied to address the problem of automatically deciding the correct candidate texts based on their surrounding context words. Nearly one hundred sports news texts, which contain 550 words, were collected from the Oromia Broadcasting Network (OBN).
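
The rule-based step mentioned above, deciding on a candidate text from its surrounding context words, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the rule table, the labels TEAM and SCORE, the English cue words, the window size, and the function name label_candidates are all hypothetical placeholders, and a real system would use Afaan Oromo cue words derived from the news corpus.

```python
# Hypothetical rule table: each label lists cue words expected shortly
# before or after a candidate token. The English cue words are placeholders;
# a real system would use Afaan Oromo cue words taken from the corpus.
CONTEXT_RULES = {
    "TEAM": {"before": {"team", "club"}, "after": {"won", "lost"}},
    "SCORE": {"before": {"score", "result"}, "after": {"goals"}},
}


def label_candidates(tokens, candidates, window=2):
    """Label each candidate token index using its surrounding context words.

    tokens     -- list of word tokens for one sentence
    candidates -- indices of tokens proposed as extraction candidates
    window     -- number of tokens inspected on each side (an assumption)
    """
    labels = {}
    for i in candidates:
        before = {t.lower() for t in tokens[max(0, i - window):i]}
        after = {t.lower() for t in tokens[i + 1:i + 1 + window]}
        # The first rule whose cue words appear in the context wins.
        for label, rule in CONTEXT_RULES.items():
            if before & rule["before"] or after & rule["after"]:
                labels[i] = label
                break
    return labels


tokens = "The club Adama won the match with score 2 goals".split()
labels = label_candidates(tokens, candidates=[2, 8])
```

Candidates with no matching cue word in the window are left unlabeled, which is one way a hybrid system could hand the undecided cases to a trained classifier.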

Keywords

Afaan Oromo, Information Extraction, Candidate Texts, Information Retrieval, News Texts