GENOME DATABASES FORMATS AND THEIR TOOLS - THE STATE OF THE ART SURVEY
Keywords:
VCF, BAM, SAMtools, GFF, BEDtool, BigWig, BigBed, Bwtool, Tabix, SAM tools, NCList, Ensembl, Indexing
Abstract
The biological sciences generate large amounts of genomic data, and handling this volume is a challenge for
researchers. Genomic data are commonly represented as tables stored in plain text files, which must be parsed before
analysis, a time-consuming and error-prone process. Indexing facilities provide efficient access to the data, along
with useful methods for summarizing columns. Analysis code can also be substantially simpler and uniform
across different data formats. These benefits of reduced code complexity and greatly increased performance give
users much greater freedom to explore their data.
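
As a minimal illustration of the contrast drawn above between parsing plain text and using an index, the Python sketch below answers the same region query twice: once by scanning every parsed record, and once through a per-chromosome list sorted by start position. The record layout, the sample data, and all function names are hypothetical and chosen only for illustration. The sorted-list lookup is a deliberate simplification and is not the indexing scheme used by Tabix, NCList, or any other tool named in the keywords.

```python
import bisect
from collections import defaultdict

# Hypothetical BED-like records: chromosome, start, end, name (tab-separated).
# Intervals are treated as half-open: [start, end).
BED_TEXT = (
    "chr1\t100\t200\tgeneA\n"
    "chr1\t150\t300\tgeneB\n"
    "chr1\t500\t600\tgeneC\n"
    "chr2\t50\t120\tgeneD\n"
)


def parse(text):
    """Parse tab-separated lines into (chrom, start, end, name) tuples."""
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        chrom, start, end, name = line.split("\t")
        records.append((chrom, int(start), int(end), name))
    return records


def scan_overlaps(records, chrom, start, end):
    """Unindexed query: every record is inspected on every call."""
    return [r[3] for r in records
            if r[0] == chrom and r[1] < end and r[2] > start]


def build_index(records):
    """Group records by chromosome and sort each group by start position."""
    index = defaultdict(list)
    for chrom, start, end, name in records:
        index[chrom].append((start, end, name))
    for intervals in index.values():
        intervals.sort()
    return index


def indexed_overlaps(index, chrom, start, end):
    """Indexed query: binary search discards records starting at or after `end`."""
    intervals = index.get(chrom, [])
    # Position of the first interval whose start is >= end; nothing from
    # there onward can overlap the query.
    cutoff = bisect.bisect_left(intervals, (end,))
    # This simple scheme prunes only the right-hand side; the remaining
    # candidates are still filtered one by one.
    return [name for s, e, name in intervals[:cutoff] if e > start]


records = parse(BED_TEXT)
index = build_index(records)
print(scan_overlaps(records, "chr1", 160, 250))
print(indexed_overlaps(index, "chr1", 160, 250))
```

Both calls return the same features. The difference is that the indexed version restricts the search to one chromosome and stops at a position found by binary search, while the scan touches every record. The index is built once and reused across queries, which is where the performance benefit described above comes from.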