PANATI is a tool for mapping and aligning short-read next-generation resequencing data to a reference sequence for the purpose of detecting SNP variation and short insertions or deletions, and for calling genotypes across multiple samples. PANATI's design separates the mapping and alignment stage (primary analysis) from subsequent biological inquiries (secondary analysis) by outputting an annotated version of the reference genome. Separate programs can then use this output to focus solely on the biological question and the challenges inherent to it, rather than on the tedious and time-consuming parsing of alignments. Thus, PANATI's primary analysis stage might be viewed as a data compression technique in which the vast quantity of mostly redundant input data is reduced to the essential information needed for many potential biological analyses. Presently, only SNP discovery and genotype calling across multiple samples are well developed, but PANATI's primary analysis output is intended to be useful for other inquiries, such as efficient RNA-seq expression analysis across multiple samples and detection of copy-number-variable regions of the genome.
- Smith-Waterman alignment
- No hard limits imposed by the algorithm on the number of mismatches and indels
- Designed for, and best suited to, analysis of population samples with high diversity, or use of a divergent proxy reference sequence for species that have no adequate reference of their own
- Fast execution even when divergence between the sample and the reference sequence is high
- Single-end and paired-end read mapping
- Read lengths of any size
- Input can be mixes of different read lengths and single-end or paired-end formats
- Very fast SNP discovery and genotype calling
- Flexible trade-offs between speed and memory usage
- Multithreaded parallel execution of mapping and alignment, with performance scaling linearly up to 64 CPUs (higher counts have not been tested)
- Ability to read FASTQ files compressed in bzip2 or gzip format directly. PANATI will automatically use pbzip2 for parallel decompression of pbzip2-compressed files if the program is available
- Free for academic use
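To illustrate the Smith-Waterman alignment named in the feature list, here is a minimal sketch of local alignment with a linear gap penalty. It is not PANATI's implementation: the function name `smith_waterman` and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are assumptions chosen for illustration only, and a production aligner would use a far more optimized approach.

```python
def smith_waterman(read, ref, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_read, aligned_ref) for one best local alignment.

    Scoring values are illustrative assumptions, not PANATI's parameters.
    """
    rows, cols = len(read) + 1, len(ref) + 1
    # H[i][j] holds the best score of a local alignment ending at read[i-1], ref[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            H[i][j] = max(
                0,                      # start a new local alignment here
                H[i - 1][j - 1] + sub,  # match or mismatch
                H[i - 1][j] + gap,      # base in read with no partner in ref
                H[i][j - 1] + gap,      # base in ref with no partner in read
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best-scoring cell until the score drops to zero.
    i, j = best_pos
    aligned_read, aligned_ref = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if read[i - 1] == ref[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            aligned_read.append(read[i - 1])
            aligned_ref.append(ref[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            aligned_read.append(read[i - 1])
            aligned_ref.append("-")
            i -= 1
        else:
            aligned_read.append("-")
            aligned_ref.append(ref[j - 1])
            j -= 1

    return best, "".join(reversed(aligned_read)), "".join(reversed(aligned_ref))


if __name__ == "__main__":
    print(smith_waterman("ACGTACGT", "ACGAACGT"))
```

Because the recurrence only penalizes mismatches and gaps through the score, nothing in it caps how many may occur in an alignment, which is the property the second feature above refers to.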
PANATI's development and release site is hosted by SourceForge.net. Presently there are no official release files, and the development version control repositories have been disabled and out of date since April 2012, while a manuscript describing PANATI is in review for publication.
PANATI is written in C and developed on the GNU/Linux platform. It should compile on any current GNU/Linux distribution with development versions of the gsl and zlib libraries installed. It may also compile and run on Mac OS X if gsl and zlib are installed.
Presently there is no peer-reviewed publication for PANATI. Authors of academic publications that use PANATI should cite it as "Wright, MH. 2009-2012 http://panati.sourceforge.net/". A manuscript is in review; please check back here for current information.
Other tools for short-read sequence analysis:
- SAMtools - Presently PANATI does not output alignments in SAM format, but a future version will most likely let users request alignment output in SAM format. (I also ripped off their stylesheet for this web page)
- Burrows-Wheeler Aligner (BWA)
- Genome Analysis Toolkit (GATK)
- ALCHEMY - Genotype calling for Affymetrix and Illumina products, and others
PANATI is a Sanskrit alchemical term for the process of distillation, or the extraction of a desired substance from a crude mixture. After a fashion, PANATI attempts to "distill" meaningful information out of a large volume of crude data.