SHORE Documentation
Latest revision as of 17:05, 20 June 2013
Introduction
SHORE is a mapping and analysis pipeline for short DNA sequences produced on Illumina Genome Analyzer and HiSeq 2000, Life Technologies SOLiD, 454 Genome Sequencer FLX and PacBio RS platforms. It is designed for projects whose analysis strategy involves mapping reads to a reference sequence. The reference sequence does not have to be from the same species, since weighted and gapped alignments allow for accurate mapping even in diverged regions.
SHORE provides prediction algorithms for genomic polymorphisms, including SNPs and structural variants (indels, CNVs, unsequenced regions), and supports SNP and SV prediction in heterozygous or pooled samples. It also provides peak detection for ChIP-Seq analysis and quantitative analysis of mRNA-Seq and sRNA-Seq data.
SHORE stores read data, alignments and result files in a predefined directory structure, which makes it possible to keep track of all intermediate steps and, if desired, to repeat parts of the analysis. This fixed hierarchy is a trade-off: users cannot choose how the data is organized, but large projects comprising multiple flowcells become easier to handle. Such projects often take weeks to gather all data, while the initial alignments have to be performed as soon as the first flowcell is finished. This requires an extensible mapping and analysis approach that structures all information in a transparent way. Smaller projects can benefit from the same data partitioning.
SHORE was designed to run on a multi-core server (32 or 64 bit) running Linux, Mac OS X (10.5+) or another POSIX-compliant operating system, and it takes advantage of multi-core architectures. Required memory depends on both the application and the reference sequence. Medium-sized genomes (e.g. D. melanogaster, A. thaliana) can be analyzed with 2-8 GB RAM, large genomes (e.g. H. sapiens) with 8-24 GB RAM. SHORE incorporates several alignment tools (e.g. GenomeMapper, bwa, bowtie, ELAND), each with its own hardware requirements.
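As a rough illustration of the memory guidance above, the following Python sketch compares the physical memory of the current machine with the quoted ranges. It is not part of SHORE: the names `RAM_GUIDANCE_GB`, `physical_ram_gb` and `classify_ram`, and the labels "medium" and "large", are hypothetical and chosen only for this example. Actual requirements also depend on the application and the alignment tool used.

```python
import os

# RAM guidance in GB (lower end, upper end) as quoted in the text above.
# The keys are hypothetical labels chosen for this sketch.
RAM_GUIDANCE_GB = {
    "medium": (2, 8),   # e.g. D. melanogaster, A. thaliana
    "large": (8, 24),   # e.g. H. sapiens
}


def physical_ram_gb():
    """Return installed physical memory in GB, or None if it cannot be determined."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (ValueError, OSError, AttributeError):
        # The sysconf names are not available on every platform.
        return None
    if pages <= 0 or page_size <= 0:
        return None
    return pages * page_size / 1024 ** 3


def classify_ram(available_gb, genome_class):
    """Compare available memory with the quoted range for a genome class."""
    low, high = RAM_GUIDANCE_GB[genome_class]
    if available_gb < low:
        return "below range"
    if available_gb < high:
        return "within range"
    return "at or above range"


if __name__ == "__main__":
    ram = physical_ram_gb()
    if ram is None:
        print("Could not determine physical memory on this system.")
    else:
        for genome_class in RAM_GUIDANCE_GB:
            status = classify_ram(ram, genome_class)
            print(f"{genome_class} genomes: {ram:.1f} GB is {status}")
```

A result of "below range" suggests that the machine is likely too small for that genome class, while "within range" means that the outcome depends on the specific analysis.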
Getting Help
Please see the Frequently Asked Questions page for solutions to some common issues.
Contact information can be found on the 1001 Genomes website (http://1001genomes.org/software/shore.html) or by running the command shore help.
Getting Started
Before Using SHORE
- System Requirements
- Downloading and Installing SHORE
Using SHORE
- SHORE Overview
- Running SHORE for the First Time - A Quick Guide
- SHORE Subprograms
- SHORE File Formats
SHORE Development
- Using the libshore C++ library
- How to Contribute
- Coding Style
How to Cite SHORE
- Ossowski S, Schneeberger K, Clark RM et al. Sequencing of natural strains of Arabidopsis thaliana with short reads. Genome Res. 2008.