12 Demo 3 - Processing of ChIP-seq data
Aims
- Manually process Scc1 ChIP-seq reads
- Generate IP/input ratios with bamCompare
- Call peaks and inspect them visually
Datasets
Cohesin (Scc1) ChIP-seq data was published in Verzijlbergen et al., eLife 2014.
- Scc1 ChIP-seq IP: SRR1103930
- Scc1 ChIP-seq input: SRR1103928
12.1 Getting data
12.1.1 Downloading reads from internet
We can download the single-end reads directly from the internet.
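As a minimal sketch of this step, the helpers below build the download URL of a run on the European Nucleotide Archive (ENA) FASTQ mirror and fetch it with wget. Using ENA as the source, the function names `ena_fastq_url` and `download_reads`, and the output paths are assumptions made for illustration; they are not necessarily the commands used in the original demo.

```shell
# Build the ENA FASTQ URL of a single-end run (assumed layout of the ENA mirror):
#   vol1/fastq/<first 6 characters>/<zero-padded extra digits>/<accession>/<accession>.fastq.gz
ena_fastq_url() {
    acc="$1"
    prefix=$(printf '%s' "${acc}" | cut -c1-6)   # e.g. SRR110
    extra=$(printf '%s' "${acc}" | cut -c10-)    # characters after the 9th, e.g. 0
    base="ftp://ftp.sra.ebi.ac.uk/vol1/fastq/${prefix}"
    # Accessions longer than 9 characters get an extra, 3-character subdirectory
    case "${#extra}" in
        0) ;;
        1) base="${base}/00${extra}" ;;
        2) base="${base}/0${extra}" ;;
        *) base="${base}/${extra}" ;;
    esac
    printf '%s/%s/%s.fastq.gz\n' "${base}" "${acc}" "${acc}"
}

# Download one run to a chosen file (hypothetical helper)
# usage: download_reads <accession> <output.fastq.gz>
download_reads() {
    mkdir -p "$(dirname "$2")"
    wget -O "$2" "$(ena_fastq_url "$1")"
}
```

With these definitions, `download_reads SRR1103930 data/reads/Scc1_IP.fastq.gz` and `download_reads SRR1103928 data/reads/Scc1_Inp.fastq.gz` would fetch the IP and input reads. The `data/reads/` location is hypothetical.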
12.1.2 FastQC reads
The fastqc program runs a quick set of quality checks on each FASTQ file separately.
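One possible way to run it is sketched below, assuming the reads were saved as gzipped FASTQ files. The wrapper name `qc_reads`, the thread count and the directory names are illustrative choices. FastQC expects the output directory to exist already, so the wrapper creates it first.

```shell
# Run FastQC on one or more FASTQ files (hypothetical wrapper)
# usage: qc_reads <output directory> <reads.fastq.gz> [more files...]
qc_reads() {
    outdir="$1"
    shift
    mkdir -p "${outdir}"
    # One HTML report and one zip archive are written per input file
    fastqc --outdir "${outdir}" --threads 2 "$@"
}
```

For example, `qc_reads data/fastqc data/reads/Scc1_IP.fastq.gz data/reads/Scc1_Inp.fastq.gz` would write one report per file to the (hypothetical) `data/fastqc/` directory.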
12.2 Align reads to a genome reference
12.2.1 Mapping single-end IP and input reads
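The filtering step in 12.2.2 reads SAM files named `data/mapping/Scc1_IP_R64-1-1.sam` and `data/mapping/Scc1_Inp_R64-1-1.sam`. A minimal sketch that would produce them is given below. It assumes bowtie2 as the aligner and an existing bowtie2 index of the R64-1-1 genome; the aligner choice, the function name `map_reads`, and the index and read paths are all assumptions, since the original mapping command is not shown here.

```shell
# Map single-end reads of one sample to the reference (hypothetical wrapper)
# usage: map_reads <IP|Inp> <bowtie2 index prefix> <reads.fastq.gz>
map_reads() {
    sample="$1"
    index="$2"
    reads="$3"
    mkdir -p data/mapping
    # -U: unpaired (single-end) reads, -S: SAM output file
    bowtie2 --threads 12 \
        -x "${index}" \
        -U "${reads}" \
        -S data/mapping/Scc1_"${sample}"_R64-1-1.sam
}
```

Calling `map_reads IP data/genome/R64-1-1 data/reads/Scc1_IP.fastq.gz` and then the same with `Inp` would map both samples. The index prefix `data/genome/R64-1-1` is hypothetical and would first have to be built with `bowtie2-build`.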
12.2.2 Filtering mapped fragments
sh
# Expanded unquoted below on purpose: the shell must split it into separate arguments
SAMTOOLS_OPTIONS="-@ 12 --output-fmt bam"
for FILE in IP Inp
do
    # Sort by coordinate, remove duplicates, keep reads with MAPQ >= 20, then write a compressed BAM
    samtools sort ${SAMTOOLS_OPTIONS} data/mapping/Scc1_"${FILE}"_R64-1-1.sam | \
    samtools markdup ${SAMTOOLS_OPTIONS} -r - - | \
    samtools view ${SAMTOOLS_OPTIONS} -q 20 --fast -b - | \
    samtools sort ${SAMTOOLS_OPTIONS} -l 9 -o data/mapping/Scc1_"${FILE}"_R64-1-1.bam
    samtools index -@ 12 data/mapping/Scc1_"${FILE}"_R64-1-1.bam
done
12.2.3 Create coverage track
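A coverage track for each filtered BAM file can be sketched with bamCoverage from deepTools, as below. The bin size, read extension and processor count mirror the bamCompare call in 12.2.4. The CPM normalization, the function name `make_coverage_track` and the output file names are assumptions for illustration.

```shell
# Write a normalized bigWig coverage track for one sample (hypothetical wrapper)
# usage: make_coverage_track <IP|Inp>
make_coverage_track() {
    sample="$1"
    mkdir -p data/tracks
    # --extendReads 220: extend single-end reads to an assumed fragment length of 220 bp
    bamCoverage \
        --bam data/mapping/Scc1_"${sample}"_R64-1-1.bam \
        --outFileName data/tracks/Scc1_"${sample}".CPM.bw \
        --normalizeUsing CPM \
        --binSize 1 \
        --extendReads 220 \
        --numberOfProcessors 16
}
```

Running `for FILE in IP Inp; do make_coverage_track "${FILE}"; done` would then create one track per sample in `data/tracks/`.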
12.2.4 Create IP/inp track
sh
bamCompare \
    -b1 data/mapping/Scc1_IP_R64-1-1.bam \
    -b2 data/mapping/Scc1_Inp_R64-1-1.bam \
    --outFileName data/tracks/Scc1_IP-vs-Inp.log2.bw \
    --scaleFactorsMethod readCount \
    --operation log2 \
    --skipZeroOverZero \
    --skipNonCoveredRegions \
    --skipNAs \
    --numberOfProcessors 16 \
    --binSize 1 \
    --extendReads 220