Shore peak

shore peak predicts enriched regions for ChIP-Seq experiments. The significance of each predicted region is assessed by comparison with the specified control samples.

Replicate experiments may be processed simultaneously by specifying multiple experiment and control paths. The significance of each peak region is then tested independently for each replicate, but the region prediction itself is performed jointly for all experiments, so that the results are immediately comparable.

The output generated by shore peak is described below.

Command line options

Usage: shore peak [OPTIONS]

Mandatory
-o, --outdir=STRING	(Default: PeakAnalysis)	Output directory (will be created)
-i, --chip-paths=STRING[:...][,...]		ChIP experiment alignment files or shore directories (replicates)
-c, --ctrl-paths=STRING[:...][,...]		Control experiment alignment files or shore directories
Segmentation
-S, --window-size=INT	(Default: 2000)	Sliding window size for dynamic segmentation
-P, --poisson-threshold=FLOAT	(Default: 0.05)	Poisson probability threshold [<=] for dynamic segmentation
-V, --probation=INT	(Default: 0)	Allow a mitigated threshold for at most <arg> base pairs inside a segment
-Q, --mitigator=FLOAT	(Default: 1)	Modifier for calculation of the mitigated threshold, value in [0,1]
-J, --minsize=INT	(Default: 131)	Segment size threshold [>=]
Normalization
-b, --binsize=INT	(Default: 4000)	Size of read bins for normalization
-q, --rankmaxquant-ubound=FLOAT	(Default: 1)	Quantile upper bound for the rank maxima of the bins used
Read filter
-H, --hits-range=INT,INT		Set the allowed range of repetitiveness ('1,1' = nonrep reads)
-M, --mm-range=INT,INT		Set the allowed range of mismatches
-R, --region=STRING		Only use reads that overlap with the range [chr1:pos1..[chr2:]pos2]; range indexing the alignment files beforehand using shore 2dex is recommended
--assume-length=INT	(Default: 400)	Assume maximal alignment length <arg>, enables fast range queries
-X, --p3fix=INT	(Default: 130)	Set the 3' end to a fixed distance from the 5' end (set to 0 to disable)
-N, --read-lengths=INT[,...]		Use only reads of the given length(s)
-B, --duplicates=FLOAT		Report at most <arg> reads with the 5' end at the same position on the same strand
--sam-ref=STRING		Reference sequence for SAM file parsing
--peflags=INT[,...]		Use only reads with the given PE flag(s)
-F, --poissonifier-width=INT	(Default: 13)	Set the window size for the adaptive duplicate read filter (set to zero to disable)
Peak filtering
-n, --nsigma=FLOAT	(Default: 6)	Allow the mean segment coverage of any control sample to be at most <arg> std. deviations higher than the median before discarding the segment
--min-xshift=FLOAT	(Default: 10)	Require a certain shift for the reverse strand peak in at least one experiment
--min-foldchange=FLOAT	(Default: 2)	Require a minimum normalized fold change of <arg> for experiment vs. control for at least one experiment
Other
--non-directional		Assume that any clone may be sequenced from both ends (calculates a more conservative FDR)
-d, --rankproduct=INT	(Default: 10000)	Number of simulations for rankproduct PFP estimation (set to zero to disable PFP estimation)
--rplot=INT	(Default: 100)	Plot the first <arg> peaks using R; alignment files must be indexed using shore 2dex
-r, --index-file=STRING		Extract sequence information for each segment from *.shore index file
-a, --annotation-file=STRING		Annotation file (sequenceontology.org GFF3 format; numerical sequence IDs required)
--so-filter=STRING[,...]	(Default: gene,transposable_element_gene)	Only parse toplevel annotation features of the given SO types
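
As an illustration of how the options above combine, the following Python sketch assembles a shore peak argument list for two replicates. The input paths and the helper name are hypothetical, and separating replicates with commas is an assumption read from the [,...] part of the option syntax. The command is only built and printed, not executed.

```python
import shlex


def build_shore_peak_command(chip_paths, ctrl_paths, outdir="PeakAnalysis",
                             min_foldchange=None, annotation_file=None):
    """Assemble an argument list for `shore peak` (nothing is executed)."""
    # Assumption: replicates are passed as one comma-separated list.
    argv = [
        "shore", "peak",
        f"--outdir={outdir}",
        f"--chip-paths={','.join(chip_paths)}",
        f"--ctrl-paths={','.join(ctrl_paths)}",
    ]
    # Optional settings are added only when the caller asks for them,
    # so that the documented defaults stay in effect otherwise.
    if min_foldchange is not None:
        argv.append(f"--min-foldchange={min_foldchange}")
    if annotation_file is not None:
        argv.append(f"--annotation-file={annotation_file}")
    return argv


# Hypothetical input paths, for illustration only.
argv = build_shore_peak_command(
    chip_paths=["chip_rep1", "chip_rep2"],
    ctrl_paths=["ctrl_rep1", "ctrl_rep2"],
    min_foldchange=3,
)
print(shlex.join(argv))
```

Keeping the command as a list rather than a single string avoids quoting problems if it is later handed to a process launcher.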

SHORE peak result files

The main result file produced by shore peak is named SUMMARY.txt:

id	An arbitrary numerical ID for the peak region
chr	Sequence / chromosome ID
pos	Left-most position of the peak region on the reference sequence
size	Size of the peak region
log2_orp	Observed rank product, base 2 logarithm (only present for multiple replicates)
orp_rank	Rank of the observed rank product (only present for multiple replicates)
p_rank1	Replicate 1 rank of the P-value of the peak
fdr_bh_q1	Replicate 1 Benjamini-Hochberg adjusted FDR of the peak
rc_chip1	Replicate 1 number of reads contributing to the peak in the sample
rc_ctrl1	Replicate 1 number of reads in the same region of the control
pbexcess1	Replicate 1 per-base-excess: mean_coverage_sample(peak) - (mean_coverage_control(peak) * normalization_constant)
fc_score1	Replicate 1 fold change score: 4 * atan(mean_coverage_sample(peak) / (mean_coverage_control(peak) * normalization_constant)) / PI - 1.0
height_excess1	Replicate 1 peak height excess:
frfc_score1	Replicate 1 forward-reverse fold change score: Calculated like fc_score, but compares the sample forward strand and reverse strand coverage
cog_xshift1	Replicate 1 forward-reverse peak shift
overlap_names	Identifiers of the genes overlapping with the center of the peak region (only present when the option -a was specified)
overlap_types	Parts of the genes that overlap (exon, 5' UTR etc.) (only present when the option -a was specified)
up_names	Identifiers of the closest genes 'to the left' from the center of the peak (only present when the option -a was specified)
up_dist	Distance of the closest genes 'to the left' from the center of the peak (only present when the option -a was specified)
up_strands	Strands of the closest genes 'to the left' from the center of the peak (only present when the option -a was specified)
down_names	Identifiers of the closest genes 'to the right' from the center of the peak (only present when the option -a was specified)
down_dist	Distance of the closest genes 'to the right' from the center of the peak (only present when the option -a was specified)
down_strands	Strands of the closest genes 'to the right' from the center of the peak (only present when the option -a was specified)
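
The per-replicate scores and the column layout described above can be explored with a short Python sketch. It assumes that SUMMARY.txt is tab-separated with a header row carrying the column names listed above, that replicate n uses the suffix n, and that the normalization constant scales the control coverage in both formulas. None of these assumptions is confirmed here. The function names and the sample rows are made up for illustration.

```python
import csv
import io
import math


def pbexcess(cov_sample, cov_control, norm_const):
    """Per-base excess: sample coverage minus normalized control coverage."""
    return cov_sample - cov_control * norm_const


def fc_score(cov_sample, cov_control, norm_const):
    """Fold change score: 4 * atan(ratio) / pi - 1 for non-negative coverage.

    atan2 equals atan(sample / scaled_control) for positive values and,
    as a choice of this sketch, avoids a division by zero when the
    control coverage is zero.
    """
    scaled_control = cov_control * norm_const
    return 4.0 * math.atan2(cov_sample, scaled_control) / math.pi - 1.0


def read_summary(handle):
    """Read a tab-separated summary table into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def select_peaks(rows, replicate=1, max_fdr=0.05):
    """Keep rows whose adjusted FDR for the given replicate is small enough."""
    key = f"fdr_bh_q{replicate}"
    return [row for row in rows if float(row[key]) <= max_fdr]


# Made-up rows with a subset of the documented columns.
example = io.StringIO(
    "id\tchr\tpos\tsize\tp_rank1\tfdr_bh_q1\n"
    "1\t1\t10500\t420\t1\t0.0001\n"
    "2\t3\t88210\t310\t2\t0.2\n"
)
rows = read_summary(example)
significant = select_peaks(rows, replicate=1, max_fdr=0.05)
print([row["id"] for row in significant])
print(fc_score(20.0, 10.0, 1.0), pbexcess(20.0, 10.0, 1.0))
```

Under this reading of the formula, the score is 0 when the sample and normalized control coverage are equal, approaches 1 as the sample dominates, and approaches -1 as the control dominates.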
