TIGRA-SV Support Migrated

Support for TIGRA-SV has migrated to Ken Chen’s laboratory at MD Anderson Cancer Center. Please visit http://bioinformatics.mdanderson.org/main/TIGRA for the latest versions and support information. Full source history is still maintained in the github repository.

You can download this release on Github.

SomaticSniper v1.0.5.0

Major Changes

This release alters how counts and average base qualities are reported in the VCF. Previous versions double counted ambiguous bases and improperly restricted the BCOUNT field to bases called in the tumor and normal genotypes (see #5). With this release, DP4 is now more stringent, only counting a read as a reference base if it matches that base exactly. BCOUNT is similarly more stringent and no longer reports ambiguous bases.

For calculating VAFs from the VCF output, we recommend using the BCOUNT field’s value for your base of interest and dividing by the sum of the values in the BCOUNT field. This will prevent you from including ambiguous bases in your overall depth measure.
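As a rough illustration of the calculation described above, the following Python sketch computes a VAF from a single sample's BCOUNT value. It assumes BCOUNT holds four comma-separated counts in A,C,G,T order; check the header of your own VCF before relying on that. The function name vaf_from_bcount is hypothetical and not part of SomaticSniper.

```python
# Assumed order of the counts in BCOUNT; confirm against your VCF header.
BASES = ("A", "C", "G", "T")


def vaf_from_bcount(bcount, base):
    """Return the fraction of counted bases equal to `base`.

    `bcount` is the raw BCOUNT string for one sample, e.g. "10,0,5,0".
    The denominator is the sum of the BCOUNT values, so ambiguous bases
    are excluded from the depth.
    """
    counts = [int(value) for value in bcount.split(",")]
    if len(counts) != len(BASES):
        raise ValueError("expected four counts, got %d" % len(counts))
    depth = sum(counts)
    if depth == 0:
        return 0.0  # no unambiguous bases were counted at this site
    return counts[BASES.index(base.upper())] / depth


if __name__ == "__main__":
    # 5 reads support G out of 15 unambiguous bases.
    print(vaf_from_bcount("10,0,5,0", "G"))
```

Using the BCOUNT sum as the denominator, rather than a total depth field, is what keeps ambiguous bases out of the depth measure.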

Minor Changes

  • CMAKE_INSTALL_PREFIX is now respected if specified during compilation.
  • A compilation error with unit tests on Mac OS X (and possibly other platforms) was fixed.
  • Numerous documentation updates

You can download this release on Github.

SomaticSniper v1.0.4.2

Major Changes

This is a patch release that fixes an edge case where the program would enter an infinite loop.

Minor Changes

  • The source tree now supports four-digit versions, and some documentation has been updated.

You can download this release on Github.

DGIdb v1.66 Released

DGIdb version 1.66 has been released - this release contains clinical trial data from MyCancerGenome.

You can download this release on Github.

SomaticSniper v1.0.4

Major Changes

This release adds options to filter loss-of-heterozygosity and gain-of-reference allele calls from SomaticSniper output. Loss-of-heterozygosity calls are defined as calls where the tumor genotype is fully a subset of the normal genotype. Gain-of-reference calls are defined as those where the reference allele is not present in the normal genotype call, but is present in the tumor. For example, the following calls would both be suppressed (the first as loss of heterozygosity, the second as gain of reference):

Ref  Tumor  Normal
A    GG     AG
A    AG     GG
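The two definitions can be sketched in Python as follows. This is only an illustration of the rules as worded here, not SomaticSniper's implementation: the function name classify_call and its return labels are hypothetical, genotypes are assumed to be two-letter strings such as "AG", and "fully a subset" is read as a proper subset.

```python
def classify_call(ref, tumor, normal):
    """Classify a call as 'loh', 'gor', or 'keep'.

    ref    -- reference base, e.g. "A"
    tumor  -- tumor genotype as a string of alleles, e.g. "GG"
    normal -- normal genotype as a string of alleles, e.g. "AG"
    """
    tumor_alleles = set(tumor)
    normal_alleles = set(normal)
    # Loss of heterozygosity: the tumor alleles are a proper subset of
    # the normal alleles (assumed reading of "fully a subset").
    if tumor_alleles < normal_alleles:
        return "loh"
    # Gain of reference: the reference allele is absent from the normal
    # genotype but present in the tumor genotype.
    if ref not in normal_alleles and ref in tumor_alleles:
        return "gor"
    return "keep"


if __name__ == "__main__":
    for ref, tumor, normal in [("A", "GG", "AG"), ("A", "AG", "GG")]:
        print(ref, tumor, normal, classify_call(ref, tumor, normal))
```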

Minor Changes

  • build-common is now utilized as a git subtree. Downloads from github will now be functional and recursive clones are no longer necessary.

You can download this release on Github.

MendelScan v1.2.1

This release is a minor fix to v1.2 and addresses a couple of VEP annotation (native output) format bugs. A secondary output file (.excluded) now lists variants excluded from the main output file due to lack of VEP annotation. Also, sites with 2+ alternate alleles will be reported in multiple lines (one per alternate allele) when VEP native output format is provided.

You can download this release on Github.

MendelScan v1.2.0

We fixed a bug related to segmentation scoring, added broader VEP output format compatibility, and enabled preliminary recessive scoring functionality for the “score” subcommand.

If you use MendelScan to aid your research, please cite the publication: Koboldt DC, Larson DE, Sullivan LS, Bowne SJ, Steinberg KM, et al. Exome-based mapping and variant prioritization for inherited Mendelian disorders. Am J Hum Genet. doi:10.1016/j.ajhg.2014.01.016

You can download this release on Github.

MendelScan v1.1.1

This is the initial release of MendelScan, a software package for analyzing variants identified by family-based sequencing of rare genetic diseases. There are three main functions included with this release:

  • java -jar MendelScan.jar score: prioritize variants in a VCF based upon segregation, annotation, population frequency, and gene expression
  • java -jar MendelScan.jar rhro: apply rare heterozygote rule out to map dominant disease genes
  • java -jar MendelScan.jar sibd: apply shared IBD analysis to map disease genes

A manuscript describing this tool has been submitted; in the interim, please cite MendelScan by noting the version number and citing its home URL: http://gmt.genome.wustl.edu/mendelscan.

You can download this release on Github.

Introducing DGIdb

In the era of high throughput genomics, investigators are frequently presented with lists of mutated or otherwise altered genes implicated in disease. Numerous resources exist to help form hypotheses about how such genomic events might be targeted therapeutically or prioritized for drug development. However, utilizing these resources typically requires tedious manual review of literature and knowledge bases. No informatics tools currently exist which mine these resources and provide a simple interface for searching lists of genes against the existing compendia of both known or predicted drug-gene interactions and potentially druggable genes. The Drug-Gene Interaction database (DGIdb) addresses this challenge. Drug-gene interactions have been mined from existing databases and literature to populate DGIdb. Similarly, genes have been categorized as potentially druggable according to membership in selected pathways, molecular functions and gene families. Currently, DGIdb contains 2,611 genes and 6,307 drugs involved in 14,144 drug-gene interactions, and 6,761 genes that belong to one or more of 39 potentially druggable gene categories. Users can enter a gene or list of genes to retrieve all known or potentially druggable genes in that list. Results can be filtered by source, interaction type, or treatment type and are sorted by source trust level. Our goal was to create a user-friendly search tool and comprehensive database of genes that have the potential to be druggable with a particular focus on cancer. DGIdb can be accessed programmatically or through a web-based interface at dgidb.org.

Pindel 0.2.4 Available

An updated version of Pindel is available for download.

Pindel is a program that detects short indels and complex structural variants (large deletions, inversions, tandem duplications, mobile element insertions and translocations) from next-generation sequence data using pattern growth.

It takes either extracted reads (using sam2pindel or bam2pindel.pl) or multiple bam files as input. A pindel2vcf converter is provided to report variant calls in VCF format.

The source code for Pindel is available on GitHub, and pre-built packages for Ubuntu 10.04 systems are available from The Genome Institute. For installation instructions, see the Pindel project page.

iBWA Alpha v0.5 Released

Source code and Ubuntu 10.04 (64-bit) packages of iBWA Alpha v0.5 are now available for testing. iBWA is a fork of Heng Li's BWA aligner with support for iteratively adding alternate haplotypes, reference patches, and variant hypotheses. This enables you to leverage existing tools and pipelines in the diverse BWA ecosystem while also creating new analysis opportunities.

  • Take advantage of improvements to the human reference made available by the Genome Reference Consortium.
  • Represent alternate alignments in the context of the primary human reference.

For additional information about BWA please see Heng Li's BWA @ SourceForge or BWA's manual page.

Genome MuSiC v0.4 Released

MuSiC v0.4 is now available for download. This release adds new visualization tools, performance improvements, support for TCGA MAF v2.3, and coverage files in UCSC WIG format when BAMs are impractical. Here is a complete changelog:

  • Added tools to generate typical visualizations like Kaplan-Meier survival estimates, and mutation status matrices.
  • Support for TCGA Mutation Annotation Format (MAF) version 2.3.
  • Performance improvements in mutation rate calculations, and more efficient memory usage.
  • Added support for wiggle track format files describing coverage, if BAMs are unavailable.

RefCov v0.3 Released

RefCov v0.3 provides critical fixes and several new features. Fixes include restoring several modules that were absent in the previous release. New features include: 1) cluster-coverage for detecting contiguous clusters of sequence reads across a reference, 2) the ability to evaluate coverage of entire chromosomes using the BAM file header as the region-of-interest, e.g., --roi-file-path=$BAM --roi-file-format=bam, 3) normalization of coverage using a defined Perl-compatible equation, 4) relative coverage based on a defined list of size bins, and 5) optional output of the chromosome start and end as BED-style columns.

More Information

SomaticSniper v1.0.0

The latest release of SomaticSniper, The Genome Institute’s somatic SNV calling workhorse, adds an alternative statistical model that better accounts for the rarity of somatic events by jointly considering the tumor and normal genotypes. This version also adds native support for the VCF and BED formats as output. The VCF output contains information useful for downstream filtering, e.g., fraction of reads on the forward and reverse strands, average read mapping quality, and average base quality for reads/bases supporting the variant allele and those supporting the reference allele.

You can download this release on Github.

Genome MuSiC v0.3 Released

MuSiC (Mutational Significance in Cancer) 0.3 is now available for download featuring numerous fixes and several new features. MuSiC performs a variety of statistical analyses on the somatic (and germline) alterations discovered in any cancer cohort. Improvements in this version include an enhanced significantly mutated gene test which introduces the ability to 1) take into account sample-specific mutation rates and 2) identify significantly mutated non-genic regions of the genome. The clinical correlation module now features a generalized linear model option allowing for the elimination of covariate influences on mutation-phenotype relationships. Support for MAF 2.2 and for Pfam annotation of GRCh37 (hg19) are now standard. Additionally, several MuSiC components have been optimized and parallelized for faster execution.

More Information

TIGRA-SV v0.1 Released

The initial release of TIGRA-SV is available for download. TIGRA-SV is a program that conducts targeted local assembly of structural variants (SV) using the iterative graph routing assembly (TIGRA) algorithm (L. Chen, unpublished). It takes as input a list of putative SV calls and a set of bam files that contain reads mapped to a reference genome. For each SV call, it assembles the set of reads that were mapped or partially mapped to the region of interest (ROI) in the corresponding bam files. TIGRA-SV is quite effective at improving SV prediction accuracy in short-read analysis and can produce accurate breakpoint sequences.

More information

SomaticSniper v0.7.4 Released

The purpose of this program is to identify single nucleotide positions that differ between tumor and normal (or, in theory, any two bam files). It takes a tumor bam and a normal bam and compares the two to determine the differences. It outputs a file in a format very similar to Samtools consensus format. It uses the genotype likelihood model of MAQ (as implemented in Samtools) and then calculates the probability that the tumor and normal genotypes are different. This probability is reported as a somatic score. The somatic score is the Phred-scaled probability (between 0 and 255) that the tumor and normal genotypes are not different, where 0 means there is no probability that the genotypes are different and 255 means there is a probability of 1 - 10^(255/-10) that the genotypes are different between tumor and normal. This is consistent with how the SAM format reports such probabilities.
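The relationship between the somatic score and the underlying probability can be sketched in Python as below. The function names are hypothetical and the rounding behaviour is an assumption made for illustration; SomaticSniper's own code may round or cap differently.

```python
import math

MAX_SCORE = 255  # upper bound of the somatic score, as stated above


def score_to_prob_different(score):
    """Probability that tumor and normal genotypes differ, given a score."""
    if not 0 <= score <= MAX_SCORE:
        raise ValueError("somatic score must be between 0 and 255")
    # The score is the Phred-scaled probability that the genotypes are
    # NOT different, so that probability is 10^(-score / 10).
    return 1.0 - 10 ** (-score / 10)


def prob_same_to_score(p_same):
    """Phred-scale the probability that the genotypes are the same.

    Rounding to the nearest integer and capping at 255 are assumptions
    made for this sketch.
    """
    if p_same <= 0:
        return MAX_SCORE
    return min(MAX_SCORE, round(-10 * math.log10(p_same)))


if __name__ == "__main__":
    for score in (0, 10, 20, 255):
        print(score, score_to_prob_different(score))
```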

You can download this release on Github.

BreakDancer v1.2 Released

The latest release of BreakDancer, the Genome Institute's structural variation (SV) detector, is now available for download.
The new release has small bug fixes to ensure it runs reliably on the latest Ubuntu distribution. It is now available as a deb package, readily installable on Ubuntu Linux or other Debian-based systems.

Installation Instructions

JoinX v1.1 Released

Joinx is a lightweight tool for performing set operations (e.g., intersection, difference, …) on genomic data contained in .bed files. It also provides some limited analysis functions (concordance reports). An important assumption that joinx makes is that the input data is always sorted. This allows it to compute its results in an efficient manner.
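To show why sorted input matters, here is a minimal Python sketch of an intersection computed in a single pass over two sorted interval lists. It is not joinx's implementation. It assumes each list is sorted by chromosome and start, that intervals within one list do not overlap each other, that coordinates are half-open as in BED, and that chromosome names sort as plain strings. The name intersect_sorted is hypothetical.

```python
def intersect_sorted(a, b):
    """Return overlaps between two sorted lists of (chrom, start, end)."""
    overlaps = []
    i = j = 0
    while i < len(a) and j < len(b):
        chrom_a, start_a, end_a = a[i]
        chrom_b, start_b, end_b = b[j]
        if chrom_a != chrom_b:
            # Advance whichever list is behind in chromosome order.
            if chrom_a < chrom_b:
                i += 1
            else:
                j += 1
            continue
        start = max(start_a, start_b)
        end = min(end_a, end_b)
        if start < end:  # half-open intervals: touching is not overlap
            overlaps.append((chrom_a, start, end))
        # The interval that ends first cannot overlap anything later.
        if end_a <= end_b:
            i += 1
        else:
            j += 1
    return overlaps


if __name__ == "__main__":
    first = [("1", 0, 10), ("1", 20, 30), ("2", 5, 15)]
    second = [("1", 5, 25), ("2", 0, 5), ("2", 10, 20)]
    print(intersect_sorted(first, second))
```

Because both inputs are sorted, each interval is visited once, so the work grows linearly with the combined input size and nothing needs to be indexed or held in memory beyond the current pair.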

More Information

Genome MuSiC v0.2 Released

The decreasing cost of sequencing has moved the focus of cancer genomics beyond single genome studies to the analysis of tens or hundreds of patients diagnosed with similar cancers. Besides the routine discovery and validation of SNVs, indels, and SVs in individual genomes, it is now paramount to systematically analyze the function and recurrence of mutations across a cohort, and to describe how they interact with one another and with the associated clinical data. To this end we have developed the Mutational Significance In Cancer package (MuSiC). It consists of a suite of downstream analysis tools designed to (1) apply statistical methods to identify significantly mutated genes, (2) highlight significantly altered pathways, (3) investigate the proximity of amino acid mutations in the same gene, (4) search for gene-based or site-based correlations to mutations and relationships between mutations themselves, (5) correlate mutations to clinical features, and (6) cross-reference findings with relevant databases such as Pfam, COSMIC, and OMIM.

More Information