
Introduction


The purpose of this program is to identify single nucleotide positions that differ between tumor and normal (or, in theory, any two BAM files). It takes a tumor BAM and a normal BAM and compares the two to determine the differences. It outputs a file in a format very similar to the Samtools consensus format. It uses the genotype likelihood model of MAQ (as implemented in Samtools) and then calculates the probability that the tumor and normal genotypes are different. This probability is reported as a somatic score. The somatic score is the Phred-scaled probability (between 0 and 255) that the tumor and normal genotypes are not different, where 0 means there is no probability that the genotypes are different and 255 means there is a probability of 1 - 10^(-255/10) that the genotypes are different between tumor and normal. This is consistent with how the SAM format reports such probabilities. It is currently available as source code via GitHub or as a Debian APT package.
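As an illustration of the Phred scaling described above, the following minimal sketch converts a somatic score back into the probability that the tumor and normal genotypes differ. The function name `prob_genotypes_differ` is hypothetical and not part of SomaticSniper; it only restates the arithmetic in the paragraph above.

```python
def prob_genotypes_differ(somatic_score: int) -> float:
    """Convert a Phred-scaled somatic score to P(tumor and normal genotypes differ).

    Hypothetical helper for illustration; not part of SomaticSniper itself.
    """
    if not 0 <= somatic_score <= 255:
        raise ValueError("somatic score must be between 0 and 255")
    # The score is -10 * log10 of the probability that the genotypes are NOT different.
    prob_same = 10 ** (-somatic_score / 10)
    return 1.0 - prob_same


if __name__ == "__main__":
    for score in (0, 10, 20, 40, 255):
        print(score, prob_genotypes_differ(score))
```

A score of 0 corresponds to no probability that the genotypes differ, and each additional 10 points reduces the probability that the genotypes are the same by a factor of ten.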

David E. Larson, Travis E. Abbott and Christopher C. Harris October 26, 2011

How to Cite


Citations for SomaticSniper should reference the Bioinformatics paper.

Latest SomaticSniper News:


SomaticSniper v1.0.5.0

Major Changes

This release alters how counts and average base qualities are reported in the VCF. Previous versions double counted ambiguous bases and improperly restricted the BCOUNT field to bases called in the tumor and normal genotypes (see #5). With this release, DP4 is now more stringent, only counting a read as a reference base if it matches that base exactly. BCOUNT is similarly more stringent and no longer reports ambiguous bases.

For calculating VAFs from the VCF output, we recommend using the BCOUNT field’s value for your base of interest and dividing by the sum of the values in the BCOUNT field. This will prevent you from including ambiguous bases in your overall depth measure.
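The recommended VAF calculation can be sketched as follows. This example assumes that BCOUNT holds four comma-separated integer counts in the order A,C,G,T; check the header of your own VCF to confirm the order before relying on it. The function name `vaf_from_bcount` is hypothetical.

```python
BASES = ("A", "C", "G", "T")  # assumed order of the counts in BCOUNT


def vaf_from_bcount(bcount: str, base: str) -> float:
    """Return the variant allele fraction for `base` from a BCOUNT string.

    Hypothetical helper: divides the count for the base of interest by the
    sum of all BCOUNT values, so ambiguous bases never enter the depth.
    """
    parts = bcount.split(",")
    if len(parts) != len(BASES):
        raise ValueError("expected four comma-separated counts")
    counts = dict(zip(BASES, (int(p) for p in parts)))
    depth = sum(counts.values())
    if depth == 0:
        return 0.0  # no unambiguous bases at this site
    return counts[base.upper()] / depth


if __name__ == "__main__":
    # Made-up counts for illustration: 30 reads with A, 10 reads with G.
    print(vaf_from_bcount("30,0,10,0", "G"))
```

Using the BCOUNT sum as the denominator, rather than a separate depth field, keeps the numerator and denominator consistent with each other.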

Minor Changes

  • CMAKE_INSTALL_PREFIX is now respected if specified during compilation.
  • A compilation error with unit tests on Mac OS X (and possibly other platforms) was fixed.
  • Numerous documentation updates

You can download this release on GitHub.

SomaticSniper v1.0.4.2

Major Changes

This is a patch release that fixes an edge case where the program would enter an infinite loop.

Minor Changes

  • The source tree now supports four-digit versions, and some documentation has been updated.

You can download this release on GitHub.

SomaticSniper v1.0.4

Major Changes

This release adds options to filter loss-of-heterozygosity and gain-of-reference allele calls from SomaticSniper output. Loss-of-heterozygosity calls are defined as calls where the tumor genotype is fully a subset of the normal genotype. Gain-of-reference calls are defined as those where the reference allele is not present in the normal genotype call, but is present in the tumor. For example, the following calls would both be suppressed:

Ref  Tumor  Normal
A    GG     AG
A    AG     GG
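The two definitions above can be expressed as simple set checks. This is a sketch of the stated definitions only, not SomaticSniper's actual implementation. It assumes genotypes are written as two-letter strings as in the table, and it reads "fully a subset" as a strict subset so that identical genotypes are not flagged. The function names are hypothetical.

```python
def is_loh(tumor: str, normal: str) -> bool:
    """Loss of heterozygosity: tumor alleles are a strict subset of normal alleles."""
    return set(tumor) < set(normal)


def is_gor(ref: str, tumor: str, normal: str) -> bool:
    """Gain of reference: reference allele absent from normal but present in tumor."""
    return ref not in normal and ref in tumor


if __name__ == "__main__":
    # (ref, tumor, normal); the first two rows come from the table above,
    # the third is a made-up call that neither filter should suppress.
    calls = [("A", "GG", "AG"), ("A", "AG", "GG"), ("A", "AG", "AA")]
    for ref, tumor, normal in calls:
        suppressed = is_loh(tumor, normal) or is_gor(ref, tumor, normal)
        print(ref, tumor, normal, "suppressed" if suppressed else "kept")
```

In the table, the first row is a loss-of-heterozygosity call (GG is a subset of AG) and the second is a gain-of-reference call (A appears in the tumor but not in the normal).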

Minor Changes

  • build-common is now utilized as a git subtree. Downloads from GitHub will now be functional, and recursive clones are no longer necessary.

You can download this release on GitHub.

SomaticSniper v1.0.0

The latest release of SomaticSniper, The Genome Institute’s somatic SNV calling workhorse, adds an alternative statistical model that better accounts for the rarity of somatic events by jointly considering the tumor and normal genotypes. This version also adds native support for the VCF and BED formats as output. The VCF output contains information useful for downstream filtering, e.g., the fraction of reads on the forward and reverse strands, average read mapping quality, and average base quality for reads/bases supporting the variant allele and those supporting the reference allele.

You can download this release on GitHub.

SomaticSniper v0.7.4 Released

The purpose of this program is to identify single nucleotide positions that differ between tumor and normal (or, in theory, any two BAM files). It takes a tumor BAM and a normal BAM and compares the two to determine the differences. It outputs a file in a format very similar to the Samtools consensus format. It uses the genotype likelihood model of MAQ (as implemented in Samtools) and then calculates the probability that the tumor and normal genotypes are different. This probability is reported as a somatic score. The somatic score is the Phred-scaled probability (between 0 and 255) that the tumor and normal genotypes are not different, where 0 means there is no probability that the genotypes are different and 255 means there is a probability of 1 - 10^(-255/10) that the genotypes are different between tumor and normal. This is consistent with how the SAM format reports such probabilities.

You can download this release on GitHub.