Difference between revisions of "SHORE Documentation"

From SHORE wiki
(Getting help)
Line 16:
  Please see the [[Frequently Asked Questions]] page for solutions to some common issues.
- Contact information can be found on [http://1001genomes.org/software/shore.html 1001 genomes] or by running the command ''shore about''.
+ Contact information can be found on [http://1001genomes.org/software/shore.html 1001 genomes] or by running the command ''shore help''.
  == Getting started ==

Revision as of 15:56, 6 March 2013

Introduction

SHORE is a mapping and analysis pipeline for short DNA sequences produced on Illumina Genome Analyzer and HiSeq 2000, Life Technologies SOLiD, 454 Genome Sequencer FLX and PacBio RS platforms. It is designed for projects whose analysis strategy involves mapping reads to a reference sequence. The reference does not have to come from the same species, since weighted and gapped alignments remain accurate even in diverged regions.

SHORE provides prediction algorithms for genomic polymorphisms: SNPs, structural variants (indels, CNVs, unsequenced regions), and SNP and SV prediction in heterozygous or pooled samples. It also provides peak detection for ChIP-Seq analysis and quantitative analysis of mRNA-Seq and sRNA-Seq data.

SHORE stores read data, alignments and result files in a predefined directory structure, which makes it possible to keep track of all intermediate steps and, if desired, repeat parts of the analysis. This fixed hierarchy is a trade-off: SHORE leaves no freedom in how the data is structured, but it makes large projects comprising multiple flowcells more convenient to handle. In such projects it can take weeks to gather all the data, while the initial alignments have to be performed as soon as the first flowcell is finished. This requires an extendable mapping and analysis approach that structures all information in a transparent way. Smaller projects can benefit from this data partitioning as well.

SHORE was designed to run on, and take advantage of, a multi-core server (32 or 64 bit) with Linux, MacOS (10.5+) or another POSIX-compliant operating system. The required memory depends on both the application and the reference sequence: medium-sized genomes (e.g. D. melanogaster, A. thaliana) can be analyzed with 2-8 GB RAM, large genomes (e.g. H. sapiens) with 8-24 GB RAM. SHORE incorporates several alignment tools (e.g. GenomeMapper, bwa, bowtie, ELAND), each with its own hardware requirements.

Getting help

Please see the Frequently Asked Questions page for solutions to some common issues.

Contact information can be found on the 1001 genomes website or by running the command shore help.

Getting started

Before using SHORE

Using SHORE

SHORE Development

How to Cite SHORE