We will process a dataset from the Koszul lab, generated in 2024 and published in Science.
We can download the paired-end reads (R1 and R2 fastq files) directly from the internet.
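The ENA FTP links used in this tutorial (for run SRR31398330) follow a regular layout, so they can in principle be rebuilt from a run accession alone. The sketch below is illustrative only: the function name `ena_fastq_url` is hypothetical, and the sub-directory rule (the digits beyond the ninth character, left-padded with zeros to three characters) is an assumption about ENA's directory layout that happens to match the links used here. Checking the links in SRA-explorer remains the safer option.

```sh
# Hypothetical helper: rebuild the ENA FTP URL of a paired-end fastq file
# from a run accession (e.g. SRR31398330) and a mate number (1 or 2).
ena_fastq_url() {
    acc="$1"
    mate="$2"
    prefix=$(printf '%s' "$acc" | cut -c1-6)   # first six characters, e.g. SRR313
    extra=$(printf '%s' "$acc" | cut -c10-)    # characters beyond the ninth, e.g. 30
    base="ftp://ftp.sra.ebi.ac.uk/vol1/fastq/${prefix}"
    if [ -n "$extra" ]; then
        # Assumed rule: pad the extra digits with zeros to three characters.
        # String padding is used so that values such as 08 are not read as octal.
        case ${#extra} in
            1) subdir="00${extra}" ;;
            2) subdir="0${extra}" ;;
            *) subdir="${extra}" ;;
        esac
        echo "${base}/${subdir}/${acc}/${acc}_${mate}.fastq.gz"
    else
        # Short accessions (nine characters) are assumed to have no sub-directory.
        echo "${base}/${acc}/${acc}_${mate}.fastq.gz"
    fi
}

# Usage: print the URL of the R1 file for the run used in this tutorial
ena_fastq_url SRR31398330 1
```

The printed URL can then be passed to `curl -L ... -o <output file>` exactly as in the download commands of this section.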
The reads are available under the SRR ID SRR31398330. You can go to SRA-explorer to easily recover the download links.
sh
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR313/030/SRR31398330/SRR31398330_1.fastq.gz -o MNase_20_R1.fq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR313/030/SRR31398330/SRR31398330_2.fastq.gz -o MNase_20_R2.fq.gz
Trim the reads with trim_galore.
sh
trim_galore \
--cores 2 \
--length 20 \
--gzip \
--paired \
--output_dir ./ \
Share/reads/MNase_20_R1.fq.gz Share/reads/MNase_20_R2.fq.gz
sh
head -n 40 MNase_20_R1.fq.gz_trimming_report.txt
Can STAR be used to map MNase-seq reads to a genome reference?
Build a genome (R64-1-1) index for bowtie2.
sh
bowtie2-build Share/genome/R64-1-1.fa genomes/R64-1-1
Map the trimmed reads with bowtie2.
What does the resulting .sam file contain, in terms of reads? In terms of aligned and unaligned reads?
sh
bowtie2 \
--threads 2 \
-x genomes/R64-1-1 \
-1 MNase_20_R1_val_1.fq.gz \
-2 MNase_20_R2_val_2.fq.gz \
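> MNase_20.sam
Filter mapped pairs using the following procedure:
Fix mates
Filter pairs:
  Keep paired reads (0x001)
  Keep reads mapped in a proper pair (0x002)
  Remove unmapped reads (0x004)
  Remove reads whose mate is unmapped (0x008)
Sort and index reads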
To better understand the combinations of information described by the SAM “flag”, check the Decoding SAM flags page.
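As a complement to that page, the bitwise logic behind SAM flags can be sketched directly in the shell. This is a minimal illustration, not part of samtools: the function name `decode_sam_flag` and the short property labels are hypothetical, while the bit values follow the SAM specification.

```sh
# Hypothetical helper: list the properties encoded in a SAM flag value.
decode_sam_flag() {
    flag=$(( $1 ))   # accepts decimal (99) or hexadecimal (0x63) input
    out=""
    for pair in \
        "1:paired" "2:proper_pair" "4:unmapped" "8:mate_unmapped" \
        "16:reverse" "32:mate_reverse" "64:read1" "128:read2" \
        "256:secondary" "512:qc_fail" "1024:duplicate" "2048:supplementary"
    do
        bit=${pair%%:*}    # numeric value of the bit
        name=${pair#*:}    # label chosen for this sketch
        # A property is set when its bit is present in the flag
        if [ $(( flag & bit )) -ne 0 ]; then
            out="${out}${out:+,}${name}"
        fi
    done
    echo "$out"
}

# Usage: a typical flag for the first mate of a properly paired fragment
decode_sam_flag 99      # prints paired,proper_pair,mate_reverse,read1

# The four tests of the filtering procedure, combined into two masks:
# bits that must be set (0x001 and 0x002) and bits that must be unset (0x004 and 0x008)
printf 'required=%d excluded=%d\n' $(( 0x001 | 0x002 )) $(( 0x004 | 0x008 ))
```

The last line prints `required=3 excluded=12`, which shows that the procedure amounts to keeping reads with both bits of 0x003 set and neither bit of 0x00C set.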
sh
## Fixing mates
# -r: "Remove unmapped reads and secondary alignments"
# -m: "Add mate score tag"
samtools fixmate \
-@ 2 \
--output-fmt bam \
-r -m \
MNase_20.sam MNase_20.bam
## Filtering read pairs
# -f 0x001: "Keep read paired"
# -f 0x002: "Keep read mapped in proper pair"
# -F 0x004: "Remove read unmapped"
# -F 0x008: "Remove mate unmapped"
# -q 20: "MAPQ >= 20"
# --fast: "Use fast bam compression"
samtools view \
-@ 2 \
--output-fmt bam \
-f 0x001 -f 0x002 -F 0x004 -F 0x008 -q 20 \
--fast \
MNase_20.bam \
-o MNase_20_filtered.bam
## Sorting read pairs
# -l 9: "Use best compression"
samtools sort \
-@ 2 \
--output-fmt bam \
-l 9 \
--write-index \
MNase_20_filtered.bam \
-o MNase_20_filtered_sorted.bam
sh
bamCoverage \
--bam MNase_20_filtered_sorted.bam \
--outFileName MNase_20_filtered_sorted.CPM.bw \
--binSize 10 \
--numberOfProcessors 4 \
--normalizeUsing CPM \
--skipNonCoveredRegions \
--extendReads
sh
bamCoverage \
--bam MNase_20_filtered_sorted.bam \
--outFileName MNase_20_filtered_sorted.135-160bp.nuc-center.CPM.bw \
--binSize 1 \
--smoothLength 10 \
--numberOfProcessors 4 \
--normalizeUsing CPM \
--skipNonCoveredRegions \
--minFragmentLength 135 \
--maxFragmentLength 160 \
--MNase
Load the resulting tracks in IGV and inspect them, for example around the HHO1 gene.
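To make the idea behind the nucleosome-center track more concrete, the sketch below computes fragment midpoints from SAM records with awk. It is only an approximation of the principle (one midpoint per fragment, restricted to a fragment-length window), not a reimplementation of bamCoverage. The function name `fragment_midpoints`, the 135-160 bp window taken from the output file name, and the three SAM records are all assumptions made up for illustration.

```sh
# Hypothetical helper: print chromosome, fragment length and fragment midpoint
# for each leftmost mate (TLEN > 0) whose fragment length lies within [min, max].
fragment_midpoints() {
    awk -v min="$1" -v max="$2" 'BEGIN { FS = OFS = "\t" }
        /^@/ { next }                              # skip SAM header lines
        $9 >= min && $9 <= max {                   # column 9 is TLEN (signed fragment length)
            print $3, $9, $4 + int(($9 - 1) / 2)   # column 4 is the 1-based leftmost position
        }'
}

# Made-up records for illustration
# (fields: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL)
{
    printf 'r1\t99\tchrI\t1000\t42\t50M\t=\t1097\t147\t*\t*\n'
    printf 'r1\t147\tchrI\t1097\t42\t50M\t=\t1000\t-147\t*\t*\n'
    printf 'r2\t99\tchrI\t5000\t42\t50M\t=\t5250\t300\t*\t*\n'
} | fragment_midpoints 135 160
# prints chrI, 147 and 1073 separated by tabs
```

Only the first record is reported: its mate has a negative TLEN, so the fragment is not counted twice, and the 300 bp fragment falls outside the window. On real data, the same function could be fed with the output of `samtools view MNase_20_filtered_sorted.bam`.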