21 Lab 5 - Multi-omics data integration
Aims
- Visually inspect results from MNase-seq, Scc1 ChIP-seq and RNA-seq in yeast
- Plot aggregated profiles of stranded RNA-seq coverage at Scc1 ChIP-seq peaks
- Compare stranded RNA-seq coverages at strong/weak Scc1 peaks
Datasets
- Scc1 ChIP-seq: unpublished
- MNase-seq: unpublished
- RNA-seq from Nuño-Cabanes et al., Scientific Data 2020: SRR9929263
21.1 Inspecting data in IGV
- Open the `.bw` files and the Scc1 ChIP-seq peak file in IGV.
- Compare the signal of the stranded RNA-seq coverage tracks with the location of Scc1 peaks. Comment.
- Compare the location of transcription start sites (TSSs) with nucleosome profiles. Comment.
21.2 Plotting stranded RNA-seq signal over Scc1 ChIP-seq peaks
- Load forward and reverse RNA-seq coverage as `Rle` in R
- Load Scc1 peaks in R
- Resize Scc1 peaks so that they are all centered on their summit (check the `peak` column from the `narrowPeak` file), and extend them by ± 4000 bp
- Compute a `seqinfo` from the `rna_fwd_track` `Rle`, add `seqlengths` and convert it into a `GRanges` object.
- Remove the extended Scc1 peaks that lie outside of genome boundaries. To do this, check the operators `%over%` and `%within%`.
- Extract the forward RNA-seq coverage signal over the first Scc1 peak
- Convert it into a tibble and add coordinates (± 4000 bp centered over Scc1 peak)
- Do the same for reverse RNA-seq coverage
- Plot the two stranded RNA-seq coverages together
- Iterate over all Scc1 peaks to extract stranded RNA-seq coverage around Scc1 peaks
- Calculate mean, standard deviation and 95% confidence interval of forward and reverse RNA-seq coverage around Scc1 peaks
- Plot the average forward and reverse RNA-seq coverages, with a ribbon showing the 95% confidence interval.
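The steps above can be sketched on simulated data. This is only one possible approach, not the reference solution: the two coverage tracks and the peak set below are randomly generated stand-ins (in the lab they would be read from the `.bw` files and the `narrowPeak` file), the helper `extract_cov()` is a hypothetical name, and the confidence interval uses a t-based formula chosen here for illustration.

```r
library(GenomicRanges)
library(tibble)
library(dplyr)
library(ggplot2)

set.seed(42)

# Simulated stand-ins for the real tracks and peaks (assumption, not lab data)
rna_fwd_track <- RleList(
  chrI  = Rle(rpois(50000, lambda = 5)),
  chrII = Rle(rpois(60000, lambda = 5))
)
rna_rev_track <- RleList(
  chrI  = Rle(rpois(50000, lambda = 3)),
  chrII = Rle(rpois(60000, lambda = 3))
)
peaks <- GRanges(
  seqnames = rep(c("chrI", "chrII"), each = 5),
  ranges = IRanges(
    start = c(1000, 10000, 20000, 30000, 47000,
              5000, 15000, 25000, 40000, 58000),
    width = 400
  ),
  peak = rep(200L, 10)  # summit offset from the peak start, as in narrowPeak
)

# Center each peak on its summit, then extend to +/- 4000 bp (width 8001)
peaks_ext <- peaks
ranges(peaks_ext) <- IRanges(start = start(peaks) + peaks$peak, width = 1)
peaks_ext <- resize(peaks_ext, width = 8001, fix = "center")

# Build a Seqinfo from the coverage track and turn it into genome-wide ranges
genome_si <- Seqinfo(
  seqnames = names(rna_fwd_track),
  seqlengths = unname(lengths(rna_fwd_track))
)
genome_gr <- as(genome_si, "GRanges")

# Keep only the extended peaks that lie entirely within a chromosome
peaks_ext <- peaks_ext[peaks_ext %within% genome_gr]

# Hypothetical helper: per-base coverage around every peak, as a long tibble
extract_cov <- function(track, peaks, strandness) {
  per_peak <- lapply(seq_along(peaks), function(i) {
    p <- peaks[i]
    chr <- as.character(seqnames(p))
    tibble(
      peak_id = i,
      strandness = strandness,
      distance = seq(-4000, 4000),
      coverage = as.vector(track[[chr]][start(p):end(p)])
    )
  })
  bind_rows(per_peak)
}

cov_tbl <- bind_rows(
  extract_cov(rna_fwd_track, peaks_ext, "forward"),
  extract_cov(rna_rev_track, peaks_ext, "reverse")
)

# Mean, standard deviation and t-based 95% confidence interval per position
cov_summary <- cov_tbl %>%
  group_by(strandness, distance) %>%
  summarise(
    mean_cov = mean(coverage),
    sd_cov = sd(coverage),
    n_peaks = n(),
    .groups = "drop"
  ) %>%
  mutate(
    ci = qt(0.975, df = n_peaks - 1) * sd_cov / sqrt(n_peaks),
    ci_low = mean_cov - ci,
    ci_high = mean_cov + ci
  )

p <- ggplot(cov_summary, aes(x = distance, y = mean_cov)) +
  geom_ribbon(aes(ymin = ci_low, ymax = ci_high, fill = strandness), alpha = 0.3) +
  geom_line(aes(colour = strandness)) +
  labs(x = "Distance from Scc1 peak summit (bp)", y = "Mean RNA-seq coverage")
```

Printing `p` draws the two average profiles. With the simulated peaks used here, three of the ten extended peaks overlap a chromosome end and are dropped by the `%within%` filter.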
21.3 Plot a heatmap of RNA-seq coverage over ordered Scc1 peaks
- Order peaks according to their Scc1 peak signal
- Re-compute the stranded RNA-seq coverages around ordered Scc1 peaks
- Plot a (rasterized) heatmap using `geom_tile()`, with the distance from the Scc1 peak on the x-axis and each row representing an individual Scc1 peak locus. You can split forward and reverse coverage with `facet_wrap(~ strandness)`.
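A minimal sketch of such a heatmap, under stated assumptions: the coverage tibble is simulated and binned every 100 bp to keep the example light, and the column names `peak_id`, `signalValue`, `distance`, `coverage` and `peak_rank` are illustrative choices (only `strandness` comes from the text above). Peak strength is assumed to be stored in a `signalValue` column, as in `narrowPeak` files.

```r
library(tibble)
library(dplyr)
library(ggplot2)

set.seed(1)

# Simulated peak strengths and coverage (assumption, not lab data)
peak_info <- tibble(peak_id = 1:20, signalValue = runif(20, min = 2, max = 50))

cov_tbl <- expand.grid(
  peak_id = 1:20,
  strandness = c("forward", "reverse"),
  distance = seq(-4000, 4000, by = 100),
  stringsAsFactors = FALSE
) %>%
  as_tibble() %>%
  left_join(peak_info, by = "peak_id") %>%
  mutate(coverage = rpois(n(), lambda = 5))

# Rank peaks by strength so that rows are ordered from weakest (1) to strongest
cov_ordered <- cov_tbl %>%
  mutate(peak_rank = dense_rank(signalValue))

p <- ggplot(cov_ordered, aes(x = distance, y = peak_rank, fill = coverage)) +
  geom_tile() +
  facet_wrap(~ strandness) +
  labs(
    x = "Distance from Scc1 peak (bp)",
    y = "Scc1 peaks, ordered by signal",
    fill = "RNA-seq coverage"
  )
```

With per-base coverage over hundreds of peaks, the number of tiles grows quickly, which is why a rasterized layer is suggested in the exercise.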
21.4 Plot stranded RNA-seq coverages for Scc1 peaks grouped by their strength
- Group the coverage tibble into 4 groups, containing the 0-25% (weakest) peaks, then 25-50%, 50-75% and 75-100% (strongest) peaks. This can be done with the `ntile` function.
- Re-plot the average forward and reverse RNA-seq coverages, with a ribbon showing the 95% confidence interval, splitting the peaks from each group into a different facet (with `facet_wrap(~ group)`).
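The grouping and faceted plot could look like the sketch below. It runs on simulated data, and apart from `group` and `ntile()` the names are illustrative assumptions (`signalValue` is assumed to hold the peak strength). One design choice worth noting: `ntile()` is applied to the per-peak table rather than to the long coverage tibble, so that all positions of a given peak always end up in the same group.

```r
library(tibble)
library(dplyr)
library(ggplot2)

set.seed(2)

# Simulated peak strengths (assumption), split into quartiles: 1 = weakest
peak_info <- tibble(peak_id = 1:20, signalValue = runif(20, min = 2, max = 50)) %>%
  mutate(group = ntile(signalValue, 4))

# Simulated coverage, binned every 100 bp to keep the example light
cov_tbl <- expand.grid(
  peak_id = 1:20,
  strandness = c("forward", "reverse"),
  distance = seq(-4000, 4000, by = 100),
  stringsAsFactors = FALSE
) %>%
  as_tibble() %>%
  left_join(peak_info, by = "peak_id") %>%
  mutate(coverage = rpois(n(), lambda = 5))

# Mean and t-based 95% confidence interval per group, strand and position
cov_summary <- cov_tbl %>%
  group_by(group, strandness, distance) %>%
  summarise(
    mean_cov = mean(coverage),
    sd_cov = sd(coverage),
    n_peaks = n(),
    .groups = "drop"
  ) %>%
  mutate(
    ci = qt(0.975, df = n_peaks - 1) * sd_cov / sqrt(n_peaks),
    ci_low = mean_cov - ci,
    ci_high = mean_cov + ci
  )

p <- ggplot(cov_summary, aes(x = distance, y = mean_cov)) +
  geom_ribbon(aes(ymin = ci_low, ymax = ci_high, fill = strandness), alpha = 0.3) +
  geom_line(aes(colour = strandness)) +
  facet_wrap(~ group) +
  labs(x = "Distance from Scc1 peak (bp)", y = "Mean RNA-seq coverage")
```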