An analysis of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics

Ramesh Chandra Tripathi

Year: 2021
Volume: 11
Issue: 11

Author:
Ramesh Chandra Tripathi¹
Total Page Count: 7
DOI: 10.5958/2249-7137.2021.02501.5
Page Number: 565 to 571

¹Professor, Department of Computer Engineering, Teerthanker Mahaveer University, Moradabad, Uttar Pradesh, India, Email id: tripathi.computers@tmu.ac.in

Published online on 13 January 2022.

Abstract

High-performance computing (HPC) has become increasingly essential to bioinformatics data processing as new computational challenges have emerged. Work is usually distributed across a cluster of computers that access a shared file system hosted on a storage area network. Parallelization has been achieved with the Message Passing Interface (MPI) and, more recently, with Hadoop's MapReduce API. Cloud computing is another computing architecture and service model currently under investigation. In a nutshell, cloud computing is HPC with a web interface plus the flexibility to scale up and down quickly for on-demand use. The server side runs on clusters in data centers, and remote clients upload potentially large data sets for analysis in the Hadoop framework or other parallelized environments running there. This paper discusses the current use of Hadoop, a top-level Apache Software Foundation project, and related open-source software projects in bioinformatics. It explains the principles underlying Hadoop and the HBase project and surveys existing bioinformatics software that uses Hadoop. The emphasis is on next-generation sequencing, currently the most popular application area.

Keywords

API, Hadoop, HBase, MapReduce, Pig
