4 Lab 1 - Processing of MNase-seq data
Aims
- Fetch an MNase-seq dataset from GEO
- Index a genome with bowtie2
- Map paired-end reads with bowtie2
- Generate a sequencing-depth-normalized track
- Generate a nucleosome track
- Check the relevance of filtering out duplicates
Datasets
4.1 Getting data
4.1.1 Downloading reads from the internet
We can download the paired-end reads (R1 and R2 fastq files) directly from the internet.
- Find the download links associated with the SRR3193263 SRR ID. You can go to SRA-explorer to easily recover links.
- Download the two fastq files for the SRR3193263 SRR ID.
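To give an idea of what such links look like, the sketch below builds candidate ENA FTP URLs for a paired-end run. The directory layout is an assumption based on ENA's usual convention for 10-character run accessions, and `ena_fastq_urls` is a hypothetical helper; check the result against the links reported by SRA-explorer before downloading anything.

```python
def ena_fastq_urls(run_accession: str) -> list[str]:
    """Guess the ENA FTP URLs of the R1/R2 fastq files of a paired-end run.

    Assumes the usual ENA layout for 10-character accessions:
    vol1/fastq/<first 6 characters>/00<last digit>/<accession>/
    """
    if len(run_accession) != 10:
        raise ValueError("this sketch only handles 10-character run accessions")
    base = (
        "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/"
        f"{run_accession[:6]}/00{run_accession[-1]}/{run_accession}"
    )
    # Paired-end runs are assumed to be split into _1 (R1) and _2 (R2) files
    return [f"{base}/{run_accession}_{mate}.fastq.gz" for mate in (1, 2)]


for url in ena_fastq_urls("SRR3193263"):
    print(url)
```

Each printed URL can then be passed to a download tool such as wget or curl.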
4.1.2 FastQC reads
- Run fastqc on each fastq file individually.
4.2 Pre-process reads
4.2.1 Trimming reads with trim_galore
- Did you find any adapter contamination in the two original fastq files?
- If so, proceed to fastq file trimming with trim_galore (a wrapper around cutadapt). Read its documentation to see how to automatically run fastqc after trimming reads.
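The quality-control and trimming steps can be sketched as command lines assembled in Python. This is only an illustration: the file names and output directories are placeholders, the helper functions are hypothetical, and the options shown (`-o` for fastqc; `--paired`, `--fastqc` and `-o` for trim_galore, the tool named in this section's heading) should be checked against the version installed on your machine.

```python
import shlex


def fastqc_cmd(fastq_files, outdir="fastqc"):
    # fastqc writes one report per input file into an existing output directory
    return ["fastqc", "-o", outdir, *fastq_files]


def trim_galore_cmd(r1, r2, outdir="trimmed"):
    # --paired keeps the two mates in sync while trimming;
    # --fastqc runs fastqc again on the trimmed files
    return ["trim_galore", "--paired", "--fastqc", "-o", outdir, r1, r2]


# Placeholder file names for the two downloaded fastq files
r1, r2 = "SRR3193263_1.fastq.gz", "SRR3193263_2.fastq.gz"
print(shlex.join(fastqc_cmd([r1, r2])))
print(shlex.join(trim_galore_cmd(r1, r2)))
```

The printed lines can be pasted into a terminal, or each list can be handed to `subprocess.run(cmd, check=True)`.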
4.3 Align reads to a genome reference
4.3.1 Indexing sacCer3 genome
Genome references for model systems can be fetched from iGenomes.
- Build a bowtie2 index for the S. cerevisiae genome (R64-1-1, the assembly that UCSC names sacCer3).
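A minimal sketch of the indexing command, assuming the genome sequence has been saved as `genome.fa` (a placeholder path) and that the index should be written under the prefix `R64-1-1`. The helper name and the thread count are illustrative choices, not part of the lab.

```python
import shlex


def bowtie2_build_cmd(genome_fasta, index_prefix, threads=4):
    # bowtie2-build takes the reference fasta, then the prefix of the index files
    return ["bowtie2-build", "--threads", str(threads), genome_fasta, index_prefix]


cmd = bowtie2_build_cmd("genome.fa", "R64-1-1")
print(shlex.join(cmd))
```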
4.3.2 Mapping paired-end trimmed reads
- Map the trimmed paired-end reads with bowtie2.
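The mapping step could look like the sketch below. The index prefix, the names of the trimmed fastq files and the output SAM file are all placeholders; only the `-x`, `-1`, `-2`, `-S` and `--threads` options of bowtie2 are used, with every other parameter left at its default.

```python
import shlex


def bowtie2_map_cmd(index_prefix, r1, r2, out_sam, threads=4):
    return [
        "bowtie2",
        "--threads", str(threads),
        "-x", index_prefix,  # prefix of the index built in the previous step
        "-1", r1,            # first mates (R1)
        "-2", r2,            # second mates (R2)
        "-S", out_sam,       # output alignments in SAM format
    ]


cmd = bowtie2_map_cmd("R64-1-1", "trimmed_R1.fq.gz", "trimmed_R2.fq.gz", "mapped.sam")
print(shlex.join(cmd))
```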
4.3.3 Filtering mapped fragments
Filter mapped pairs using the following procedure:
1. Fixing mates
2. Sorting reads
3. Removing duplicates
4. Filtering pairs:
   - Only keep paired reads (0x001)
   - Only keep reads mapped in proper pair (0x002)
   - No unmapped reads (0x004)
   - No reads with unmapped mate (0x008)
   - Reads mapped with a MAPQ >= 20
5. Sorting reads
6. Indexing reads
To better understand the combinations of information described by the SAM “flag”, check the Decoding SAM flags page.
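To make the flag arithmetic concrete, the sketch below combines the four flag bits into one "required" mask and one "excluded" mask and tests a few example flags. It then lists one possible series of samtools commands mirroring the procedure. Using samtools is an assumption (the lab does not name the tool), and all file names and helper functions are hypothetical.

```python
# SAM flag bits used by the filter (values from the SAM specification)
PAIRED = 0x001
PROPER_PAIR = 0x002
UNMAPPED = 0x004
MATE_UNMAPPED = 0x008

REQUIRED = PAIRED | PROPER_PAIR      # 3: every one of these bits must be set
EXCLUDED = UNMAPPED | MATE_UNMAPPED  # 12: none of these bits may be set
MIN_MAPQ = 20


def keep_read(flag: int, mapq: int) -> bool:
    """Return True if a read passes the flag and MAPQ criteria listed above."""
    has_required = (flag & REQUIRED) == REQUIRED
    has_excluded = (flag & EXCLUDED) != 0
    return has_required and not has_excluded and mapq >= MIN_MAPQ


def samtools_filter_cmds(in_sam, prefix):
    """Assumed samtools equivalent of the six steps; file names are placeholders."""
    return [
        # 1. fixmate expects reads grouped by name, as written by the aligner
        ["samtools", "fixmate", "-m", in_sam, f"{prefix}.fixmate.bam"],
        # 2. coordinate sort, required before duplicate marking
        ["samtools", "sort", "-o", f"{prefix}.sorted.bam", f"{prefix}.fixmate.bam"],
        # 3. -r removes duplicates instead of only marking them
        ["samtools", "markdup", "-r", f"{prefix}.sorted.bam", f"{prefix}.dedup.bam"],
        # 4. -f: required bits, -F: excluded bits, -q: minimum MAPQ
        ["samtools", "view", "-b", "-f", str(REQUIRED), "-F", str(EXCLUDED),
         "-q", str(MIN_MAPQ), "-o", f"{prefix}.filtered.bam", f"{prefix}.dedup.bam"],
        # 5. and 6. final sort and index
        ["samtools", "sort", "-o", f"{prefix}.final.bam", f"{prefix}.filtered.bam"],
        ["samtools", "index", f"{prefix}.final.bam"],
    ]


print(keep_read(99, 42))  # paired, proper pair, mate reverse, first in pair
print(keep_read(77, 0))   # paired, read unmapped, mate unmapped, first in pair
print(keep_read(99, 10))  # same flags as the first example, but MAPQ below 20
```

Only the first example passes: flag 77 carries both excluded bits, and the third read fails on mapping quality alone.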
4.3.4 Create coverage track
- Create a sequencing-depth-normalized track, filling in each fragment (that is, covering the whole span between the two mates of a pair, not only the sequenced ends).
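The arithmetic behind such a track can be illustrated on a toy example. The sketch below assumes a counts-per-million scaling, which is one common choice (the lab does not prescribe a normalization method), and represents each fragment as a 0-based, end-exclusive `(start, end)` pair on a single chromosome. A dedicated tool would normally do this on the bam file; `normalized_coverage` is a hypothetical helper.

```python
def normalized_coverage(fragments, chrom_length, scale=1_000_000):
    """Per-base coverage of filled-in fragments, scaled per million fragments."""
    coverage = [0.0] * chrom_length
    for start, end in fragments:
        # Fill in the whole fragment, from the start of one mate to the end of the other
        for pos in range(start, end):
            coverage[pos] += 1
    if not fragments:
        return coverage
    # In this toy example the library size is the number of fragments given
    factor = scale / len(fragments)
    return [depth * factor for depth in coverage]


toy_fragments = [(0, 4), (2, 6)]
print(normalized_coverage(toy_fragments, chrom_length=8))
```

With two fragments, each raw count is multiplied by 500,000, so positions covered by both fragments reach 1,000,000.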
4.3.5 Create nucleosome track
- Create a nucleosome track by keeping the fragments between 130 and 165 bp long and resizing each of them to 40 bp, centered on the fragment midpoint.
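The size selection and recentring can be written out on plain coordinates. This is a sketch under stated assumptions: fragments are 0-based, end-exclusive `(start, end)` pairs, the 130 and 165 bp bounds are treated as inclusive, and `nucleosome_fragments` is a hypothetical helper, not a function of any particular package.

```python
def nucleosome_fragments(fragments, min_len=130, max_len=165, width=40):
    """Keep nucleosome-sized fragments and resize them around their center."""
    kept = []
    for start, end in fragments:
        length = end - start
        if min_len <= length <= max_len:
            center = start + length // 2
            new_start = center - width // 2
            kept.append((new_start, new_start + width))
    return kept


toy_fragments = [
    (1000, 1147),  # 147 bp: kept
    (2000, 2100),  # 100 bp: too short, dropped
    (3000, 3200),  # 200 bp: too long, dropped
]
print(nucleosome_fragments(toy_fragments))
```

Only the 147 bp fragment survives, and it is reduced to a 40 bp interval around its midpoint. Coverage computed from these resized intervals peaks at the nucleosome centers.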
4.3.6 Create coverage track without removing duplicates
- Reprocess the bam file to generate coverage and nucleosome tracks from fragments without removing the read duplicates.
- Compare the different tracks generated in IGV. Comment.
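Before opening IGV, it can help to quantify how many fragments are affected. The sketch below treats two fragments as duplicates when they share the same chromosome, start and end. This is a simplification of what a duplicate-marking tool does (strand and orientation are ignored here), and `duplication_summary` is a hypothetical helper. In MNase-seq, identical fragments may reflect well-positioned nucleosomes as much as PCR duplication, which is why this comparison is worth making.

```python
from collections import Counter


def duplication_summary(fragments):
    """Count total and unique fragments given (chrom, start, end) tuples."""
    counts = Counter(fragments)
    total = sum(counts.values())
    unique = len(counts)
    fraction = (total - unique) / total if total else 0.0
    return {"total": total, "unique": unique, "duplicate_fraction": fraction}


toy_fragments = [
    ("chrI", 100, 247),
    ("chrI", 100, 247),
    ("chrI", 100, 247),
    ("chrI", 300, 450),
    ("chrII", 100, 247),
]
print(duplication_summary(toy_fragments))
```

Here two of the five fragments are redundant copies, so removing duplicates would lower the coverage at chrI:100-247 threefold while leaving the other positions unchanged.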