9 Lab 2 - ATAC-seq downstream analysis
Aims
- Check ATAC-seq fragment sizes
- Overlap ATAC-seq peaks with annotated regulatory elements (REs)
- Check tissue-specific enrichment of ATAC-seq peaks
Datasets
- The set of ATAC-seq peaks identified with yapc, as detailed in the demonstration.
- The two ATAC-seq bam files used to generate tracks and call peaks.
- A set of regulatory elements identified across development, aging and tissues of C. elegans (see section 9.4 for the link).
9.1 Import ATAC-seq peaks in R
- Check documentation from the rtracklayer package to see how to import a bed file in R.
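As a starting point, the import step can be sketched as follows. This is only an illustration: the bed file is a toy file written to a temporary location (not the actual yapc output), and the object name peaks is an assumption reused in later sketches.

```r
library(rtracklayer)

# Toy bed file standing in for the yapc peak file (hypothetical content).
bed_path <- tempfile(fileext = ".bed")
writeLines(
    c(
        "chrI\t100\t200\tpeak_1",
        "chrI\t500\t650\tpeak_2",
        "chrII\t1000\t1300\tpeak_3"
    ),
    bed_path
)

# import() guesses the format from the file extension and returns a GRanges.
# bed starts are 0-based; import() converts them to 1-based coordinates.
peaks <- import(bed_path)
peaks
```

With the real dataset, bed_path would simply point to the peak file produced during the demonstration.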
9.2 Import ATAC-seq fragments in R
- Check documentation from the Rsamtools package to see how to create a connection to disk-stored bam files.
- Read GenomicAlignments package documentation to see how to import fragments from a BamFile connection.
- Import fragments from paired-end reads, in proper pairs, no duplicates and no secondary alignments, with a MAPQ >= 20. The important bam column to recover is isize (insert size).
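One possible way to set up these filters is sketched below. The bam file used here is the small example file shipped with Rsamtools, used only as a stand-in for the lab's ATAC-seq bam files; the object names are assumptions.

```r
library(Rsamtools)
library(GenomicAlignments)

# Stand-in bam file shipped with Rsamtools (not the lab's ATAC-seq data).
bam_path <- system.file("extdata", "ex1.bam", package = "Rsamtools", mustWork = TRUE)
bf <- BamFile(bam_path)

# Keep paired reads, in proper pairs, excluding duplicates and secondary
# alignments, with MAPQ >= 20, and recover the isize field.
param <- ScanBamParam(
    flag = scanBamFlag(
        isPaired = TRUE,
        isProperPair = TRUE,
        isDuplicate = FALSE,
        isSecondaryAlignment = FALSE
    ),
    mapqFilter = 20,
    what = "isize"
)

# Each element of the result is one fragment (a pair of mates).
frags <- readGAlignmentPairs(bf, param = param)

# isize is stored as a metadata column of each mate.
head(mcols(first(frags))$isize)
```

The two ATAC-seq bam files can be imported the same way, one BamFile connection per file.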
9.3 Check distribution of ATAC fragment sizes
- Coerce fragments to GRanges with the as function.
- Subset fragments to retain those overlapping imported ATAC-seq peaks.
- Plot the width distribution for filtered ATAC-seq fragments.
- How can you interpret the resulting distribution?
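The steps above could look like the sketch below. It again relies on the example bam file from Rsamtools as a stand-in, and on made-up peak coordinates, so the resulting histogram is not representative of ATAC-seq data. On real ATAC-seq fragments, one would typically look for a population of short (sub-nucleosomal) fragments and a second population around the size of one nucleosome.

```r
library(Rsamtools)
library(GenomicAlignments)

# Stand-in data: example bam file from Rsamtools, proper pairs only.
bam_path <- system.file("extdata", "ex1.bam", package = "Rsamtools", mustWork = TRUE)
param <- ScanBamParam(flag = scanBamFlag(isPaired = TRUE, isProperPair = TRUE))
frags <- readGAlignmentPairs(BamFile(bam_path), param = param)

# Coerce each pair to a single range spanning both mates.
frags_gr <- as(frags, "GRanges")

# Hypothetical peaks on the stand-in reference sequence.
peaks <- GRanges("seq1", IRanges(start = c(1, 800), end = c(500, 1200)))

# Retain fragments overlapping at least one peak.
frags_in_peaks <- subsetByOverlaps(frags_gr, peaks)

# Width distribution of the retained fragments.
hist(
    width(frags_in_peaks),
    breaks = 50,
    xlab = "Fragment width (bp)",
    main = "Fragments overlapping peaks"
)
```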
9.4 Import regulatory elements in R
A comprehensive set of regulatory elements in C. elegans is provided here: https://genome.cshlp.org/content/suppl/2020/11/16/gr.265934.120.DC1/Supplemental_Table_S2.xlsx.
- Check documentation from the readxl package to see how to import an xlsx file in R.
- Check documentation from the GenomicRanges package to see how to convert a data.frame into a GRanges.
The sequence names (or seqlevels) are not exactly the same in REs and peaks. This can be fixed by changing the seqlevelsStyle of one of the two objects, so that both use the same naming convention.
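A minimal sketch of the conversion and of the seqlevels harmonization is given below. The table is a toy data.frame with made-up column names and coordinates, not the actual content of Supplemental Table S2; the commented read_xlsx() call assumes a hypothetical local copy of the file. Which of the two objects needs its style changed in the real data is not shown here.

```r
library(GenomicRanges)
library(GenomeInfoDb)

# With the real table (hypothetical local path, columns may differ):
# REs_df <- readxl::read_xlsx("Supplemental_Table_S2.xlsx")

# Toy table standing in for the regulatory element annotations.
REs_df <- data.frame(
    chrom = c("chrI", "chrI", "chrII"),
    start = c(100, 500, 1000),
    end = c(200, 650, 1300),
    annot = c("promoter", "enhancer", "promoter")
)

# Coordinate columns are detected by name; other columns become metadata.
REs <- makeGRangesFromDataFrame(REs_df, keep.extra.columns = TRUE)

# Toy peaks using a different naming convention ("I" instead of "chrI").
peaks <- GRanges(c("I", "II"), IRanges(start = c(120, 1000), width = 100))

# Inspect both styles, then convert one object to match the other.
seqlevelsStyle(REs)
seqlevelsStyle(peaks)
seqlevelsStyle(peaks) <- "UCSC"

seqlevels(peaks)
```

Without this step, overlap functions would treat "I" and "chrI" as different sequences and report no overlap.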
9.5 Compare peaks and REs
- Now, check how many peaks overlap with REs.
- Check how many peaks overlap with germline-specific REs.
- Perform a fisher.test to evaluate whether this overlap is significant.
- Can you speculate on the origin of the ATAC-seq dataset?
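The comparison can be sketched as follows on toy data. The coordinates and the germline metadata column are made up for illustration, and the contingency table shown (RE class against overlap with a peak) is one reasonable way to frame the test rather than the only one.

```r
library(GenomicRanges)

# Toy regulatory elements; the germline column is a hypothetical annotation.
REs <- GRanges(
    "chrI",
    IRanges(start = seq(100, 1500, by = 200), width = 50),
    germline = rep(c(TRUE, FALSE), each = 4)
)

# Toy peaks: four overlap an RE, one does not.
peaks <- GRanges("chrI", IRanges(start = c(90, 290, 490, 890, 2000), width = 100))

# Number of peaks overlapping any RE, and germline-specific REs only.
n_overlap <- sum(peaks %over% REs)
n_germline <- sum(peaks %over% REs[REs$germline])

# 2x2 table: is each RE germline-specific, and is it covered by a peak?
tab <- table(
    germline = factor(REs$germline, levels = c(FALSE, TRUE)),
    in_peak = factor(REs %over% peaks, levels = c(FALSE, TRUE))
)
tab

# Fisher's exact test for association between the two classifications.
ft <- fisher.test(tab)
ft
```

An odds ratio well above 1 with a small p-value would suggest that peaks preferentially cover germline-specific REs.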