We will use the SRR9929264 RNA-seq data published in Nuño-Cabanes et al., Scientific Data 2020.
FastQC
The fastq files have already been downloaded from SRA using the following commands:
bash
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR992/004/SRR9929264/SRR9929264_1.fastq.gz -o Share/reads/Ctl_rep1_R1.fq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR992/004/SRR9929264/SRR9929264_2.fastq.gz -o Share/reads/Ctl_rep1_R2.fq.gz
Check the Share/reads folder where the reads are stored (for Ctl_rep1 at least).
bash
zcat Share/reads/Ctl_rep1_R1.fq.gz | wc -l
zcat Share/reads/Ctl_rep1_R1.fq.gz | head -n 8
Read the FastQC help to understand the different options available.
Run FastQC to generate a QC report of the quality of the raw data. Think about the Share/adapters.txt file that you may provide as an option…
bash
fastqc --help
mkdir -p qc/
fastqc --outdir qc/ \
--noextract \
--threads 2 \
--adapters Share/adapters.txt \
Share/reads/Ctl_rep1_R1.fq.gz
Open the Ctl_rep1_R1_fastqc.html file (written to qc/) in your browser.
STAR
Read the STAR help to understand the different options available.
What reference files does STAR require? Is there already a genome index for STAR somewhere?
bash
STAR --help
ls Share/genome/R64-1-1*
If there is no genome index for STAR, generate it.
bash
mkdir -p genomes/R64-1-1/STAR/
STAR \
--runMode genomeGenerate \
--genomeDir genomes/R64-1-1/STAR/ \
--genomeFastaFiles Share/genome/R64-1-1.fa \
--sjdbGTFfile Share/genome/R64-1-1.gtf \
--genomeSAindexNbases 10
Check the genomes/R64-1-1/STAR folder. What files are generated?
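The value given to --genomeSAindexNbases in the indexing command is not arbitrary: for small genomes, the STAR manual recommends scaling it down to min(14, log2(GenomeLength)/2 - 1). The sketch below estimates that value from a FASTA file with awk. The function name sa_index_nbases is made up for this illustration, and the calculation assumes a plain (uncompressed) FASTA file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of STAR): suggest a --genomeSAindexNbases value
# for a FASTA file, following min(14, log2(GenomeLength)/2 - 1).
sa_index_nbases() {
  awk '
    !/^>/ { len += length($0) }            # sum sequence lengths, skip header lines
    END {
      if (len == 0) { print "no sequence found" > "/dev/stderr"; exit 1 }
      n = int(log(len) / log(2) / 2 - 1)   # awk log() is the natural logarithm
      if (n > 14) n = 14
      print n
    }
  ' "$1"
}

# Usage with the genome from this tutorial (uncomment to run):
# sa_index_nbases Share/genome/R64-1-1.fa
```

For a genome of about 12 Mb, such as yeast, the formula gives 10, which is consistent with the indexing command.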
Run STAR to map the Ctl_rep1 reads against the yeast genome.
Use the --runThreadN option to speed up the mapping.
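The thread count passed to --runThreadN is best kept at or below the number of cores available on the machine. A minimal sketch for picking that number follows; the pick_threads helper and its cap argument are illustrative assumptions, not STAR features.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a thread count that is no larger than the
# requested cap and no larger than the number of online CPU cores.
pick_threads() {
  local cap="${1:-2}"
  local cores
  # getconf reports the number of online cores; fall back to 1 if it fails.
  cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || cores=1
  case "$cores" in
    ''|*[!0-9]*) cores=1 ;;   # guard against empty or non-numeric output
  esac
  if [ "$cores" -lt "$cap" ]; then
    echo "$cores"
  else
    echo "$cap"
  fi
}

THREADS=$(pick_threads 2)
echo "STAR would be run with --runThreadN $THREADS"
```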
bash
STAR \
--genomeDir genomes/R64-1-1/STAR/ \
--readFilesIn Share/reads/Ctl_rep1_R1.fq.gz Share/reads/Ctl_rep1_R2.fq.gz \
--readFilesCommand zcat \
--runThreadN 2 \
--outFileNamePrefix Ctl_rep1. \
--outSAMtype BAM Unsorted \
--outSAMunmapped None \
--outSAMattributes Standard
Check the mapping summary reported by STAR.
bash
cat Ctl_rep1.Log.final.out
Inspect the alignments with samtools view. Don’t forget to check the -h option to include the header!
What is the size and content of the Ctl_rep1.Aligned.out.bam file?
bash
ls -lh Ctl_rep1.Aligned.out.bam
samtools view -h Ctl_rep1.Aligned.out.bam | head -n 30
samtools view Ctl_rep1.Aligned.out.bam | cut -f 3 | sort | uniq -c
Compute mapping statistics with samtools stats. The summary numbers can be extracted with grep ^SN.
bash
samtools stats Ctl_rep1.Aligned.out.bam | grep ^SN
bamCoverage
Read the bamCoverage help to understand the different options available.
What input does bamCoverage require?
bash
bamCoverage --help
Run bamCoverage to generate a track from the Ctl_rep1 mapped reads.
Use the --binSize option to set the bin size to 10 bp.
bash
samtools sort --write-index -o Ctl_rep1_sorted.bam Ctl_rep1.Aligned.out.bam
bamCoverage \
--bam Ctl_rep1_sorted.bam \
--outFileName Ctl_rep1.bw \
--outFileFormat bigwig \
--binSize 10 \
--numberOfProcessors 2 \
--normalizeUsing CPM \
--extendReads
Run bamCoverage again to generate two stranded tracks. Use the --filterRNAstrand option to set the strand orientation.
bash
bamCoverage \
--bam Ctl_rep1_sorted.bam \
--outFileName Ctl_rep1.fwd.bw \
--outFileFormat bigwig \
--binSize 10 \
--normalizeUsing CPM \
--extendReads \
--filterRNAstrand forward
bamCoverage \
--bam Ctl_rep1_sorted.bam \
--outFileName Ctl_rep1.rev.bw \
--outFileFormat bigwig \
--binSize 10 \
--normalizeUsing CPM \
--extendReads \
--filterRNAstrand reverse
Visualize the resulting tracks in IGV.
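The two stranded bamCoverage calls from the previous step differ only in the strand name and the output file, so they can be generated in a loop. The sketch below only prints the commands (a dry run) so they can be reviewed before anything is executed. The stranded_track_cmd function is a hypothetical helper, and the options are copied from the stranded commands in this tutorial.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the bamCoverage command for one sample and strand.
# It only echoes the command line; nothing is executed.
stranded_track_cmd() {
  local sample="$1" strand="$2" suffix
  case "$strand" in
    forward) suffix="fwd" ;;
    reverse) suffix="rev" ;;
    *) echo "unknown strand: $strand" >&2; return 1 ;;
  esac
  echo "bamCoverage --bam ${sample}_sorted.bam" \
       "--outFileName ${sample}.${suffix}.bw" \
       "--outFileFormat bigwig --binSize 10 --normalizeUsing CPM" \
       "--extendReads --filterRNAstrand ${strand}"
}

# Dry run: print one command per strand for the Ctl_rep1 sample.
for strand in forward reverse; do
  stranded_track_cmd Ctl_rep1 "$strand"
done
```

Once the printed commands look right, they could be run by piping the loop output to bash.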