Welcome to genomation

genomation is an R package that contains a collection of tools for visualizing and analyzing genome-wide data sets. The package works with a variety of genomic interval file types and enables easy summarization and annotation of high-throughput data sets with given genomic annotations.

Features

genomation

Documentation

Citation

Akalin A, Franke V, Vlahoviček K, Mason CE, Schübeler D. genomation: a toolkit to summarize, annotate and visualize genomic intervals. Bioinformatics. 2014 Nov 21. pii: btu775

Installation

Install via Bioconductor

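# biocLite() has been retired by Bioconductor; BiocManager is its replacement
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("genomation")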

Install the latest version via devtools::install_github

You can install the development version of genomation with the install_github() function from the devtools package.

# Install dependencies from CRAN and Bioconductor
install.packages(c("data.table","plyr","reshape2","ggplot2","gridBase","devtools"))
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("GenomicRanges","rtracklayer","impute","Rsamtools"))

# install the packages
library(devtools)
install_github("BIMSBbioinfo/genomation",build_vignettes=FALSE)

# install the data package to be able to run examples in the vignette
install_github("frenkiboy/genomationData",build_vignettes=FALSE)

Data import

Functions such as readBed, gff2GRanges, readTranscriptFeatures and readGeneric read several flat-file formats into R as GRanges objects.

library(genomation)

# Read a BED12 file and return a GRangesList object with promoters, exons and introns
my.bed12.file = system.file("extdata/chr21.refseq.hg19.bed", package = "genomation")
my.bed12.file
feats = readTranscriptFeatures(my.bed12.file)

# Read a generic tabular text file containing genomic locations.
# chr, start and end give the column numbers of the coordinates;
# meta.cols maps extra columns to metadata names.
my.file = system.file("extdata", "chr21.refseq.hg19.bed", package = "genomation")
refseq = readGeneric(my.file, chr = 1, start = 2, end = 3, strand = NULL,
                     meta.cols = list(score = 5, name = 4),
                     keep.all.metadata = FALSE, zero.based = TRUE)
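As a quick check on the import step, the sketch below reads the same bundled BED file with readBed and then inspects the feature list returned by readTranscriptFeatures. It assumes genomation and GenomicRanges are installed; the variable names bed.file and refseq.bed are illustrative only.

```r
library(GenomicRanges)
library(genomation)

# BED12 file shipped with genomation (the same file used above)
bed.file = system.file("extdata", "chr21.refseq.hg19.bed", package = "genomation")

# Read the intervals as a single GRanges object
refseq.bed = readBed(bed.file)
length(refseq.bed)        # number of intervals read
head(width(refseq.bed))   # widths of the first few intervals

# Split the transcripts into features and look at what came back
feats = readTranscriptFeatures(bed.file)
names(feats)              # names of the feature types in the list
elementNROWS(feats)       # number of ranges per feature type
```

Checking the element counts right after import is a cheap way to catch a wrong column mapping or a coordinate-offset mistake before any summarization is run.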

Summarize GRanges objects on defined regions such as promoters

You can summarize a GRanges object over a set of promoters and get back a ScoreMatrix object. The object contains a score for each base in each promoter: columns correspond to bases and rows correspond to promoters.

data(cage)
data(promoters)
scores1 = ScoreMatrix(target = cage, windows = promoters, strand.aware = TRUE,
                      weight.col = "tpm")
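To make the row and column layout concrete, here is a minimal sketch that builds the same matrix and averages it column by column, which gives one mean score per base position. It relies only on the cage and promoters data sets used above; sm, m and avg are illustrative names, and the axis labels are assumptions about how you might want to describe the plot.

```r
library(genomation)

data(cage)
data(promoters)

sm = ScoreMatrix(target = cage, windows = promoters,
                 strand.aware = TRUE, weight.col = "tpm")

# Rows are promoters, columns are base positions within each promoter
dim(sm)

# Drop to a plain matrix and average each column across all promoters
m = as(sm, "matrix")
avg = colMeans(m, na.rm = TRUE)

# Simple line plot of the averaged signal along the window
plot(avg, type = "l", xlab = "position in window (bp)", ylab = "mean tpm")
```

Because all windows must have the same width for a per-base matrix, the number of columns equals the promoter width, and the averaged vector has one entry per column.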

Summarize BAM files on pre-defined regions

BAM files can also be passed to the ScoreMatrix() function.

library(GenomicRanges)  # provides GRanges() and IRanges()
bam.file = system.file('tests/test.bam', package = 'genomation')
windows = GRanges(rep(c(1,2), each = 2), IRanges(rep(c(1,2), times = 2), width = 5))
scores3 = ScoreMatrix(target = bam.file, windows = windows, type = 'bam')

Summarize BigWig files on pre-defined regions

You can also use bigWig files in the ScoreMatrix() function.

bw.file = system.file('tests/test.bw', package = 'rtracklayer')
windows = GRanges(rep('chr2', each = 4), IRanges(start = c(250,350,450,550), width = 50))
scores3 = ScoreMatrix(target = bw.file, windows = windows, type = 'bigWig')

Visualize summary matrices as heatmap

ScoreMatrix or ScoreMatrixList objects can be visualized with the heatMatrix, multiHeatMatrix, plotMeta and heatMeta functions.

data(cage)
data(promoters)
scores1=ScoreMatrix(target=cage,windows=promoters,strand.aware=TRUE)

data(cpgi)
scores2=ScoreMatrix(target=cpgi,windows=promoters,strand.aware=TRUE)

sml=new("ScoreMatrixList",list(a=scores1,b=scores2))
multiHeatMatrix(sml,kmeans=TRUE,k=2,matrix.main=c("cage","CpGi"),cex.axis=0.8)
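For a single ScoreMatrix, heatMatrix draws one heatmap. The sketch below is a hedged example of a few commonly used arguments; the xcoords range of -1000 to 1000 assumes the promoter windows span roughly 1 kb on either side of the TSS, and the title is arbitrary.

```r
library(genomation)

data(cage)
data(promoters)
sm = ScoreMatrix(target = cage, windows = promoters, strand.aware = TRUE)

# order = TRUE sorts the rows by their row sums;
# winsorize = c(0, 99) caps values above the 99th percentile so that
# a few extreme bases do not dominate the color scale;
# xcoords labels the first and last column on the x-axis (assumed range)
heatMatrix(sm, xcoords = c(-1000, 1000), order = TRUE,
           winsorize = c(0, 99), main = "CAGE signal at promoters")
```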

Visualize summary matrices as meta-region plots

plotMeta(mat=sml,overlay=TRUE,main="my plotowski")
heatMeta(mat=sml,main="my plotowski")
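The meta-region plot can be labeled more explicitly. This sketch rebuilds the ScoreMatrixList from the previous section so that it runs on its own, then passes profile names and relative coordinates to plotMeta. The -1000 to 1000 range is an assumption about the promoter window size, and the list names and title are illustrative.

```r
library(genomation)

data(cage)
data(cpgi)
data(promoters)

scores1 = ScoreMatrix(target = cage, windows = promoters, strand.aware = TRUE)
scores2 = ScoreMatrix(target = cpgi, windows = promoters, strand.aware = TRUE)
sml = new("ScoreMatrixList", list(cage = scores1, CpGi = scores2))

# One line per matrix on a shared axis; centralTend selects the column
# summary ("mean" here), and xcoords gives the relative positions of the
# first and last column (assumed range)
plotMeta(mat = sml, overlay = TRUE, centralTend = "mean",
         profile.names = c("cage", "CpGi"),
         xcoords = c(-1000, 1000), main = "Promoter meta-profiles")
```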

Authors and Contributors

Vedran Franke (@frenkiboy) and Altuna Akalin (@al2na) initially authored this package. Other contributors are listed on the project's GitHub repository. To contribute, check out the "development" branch, make your changes and submit a pull request.

Support or Contact

Send an e-mail to genomation@googlegroups.com, or post a question through the web interface: https://groups.google.com/forum/#!forum/genomation