5  Improving package integration

Aims
  • Add supporting functions to access AnnotationHub-hosted files;
  • Write a vignette to document a use case for the package;
  • Create a pkgdown website for package documentation.

Tip

If at any point you are lost or do not understand how a function in the proposed solution works, type ?<function> in the R console to open its help page.

You can also check the help tab in the corresponding quadrant.

Reminder

We aim to create a package which can plot the aggregated coverage of a genomic track over a set of GRanges of interest (at a fixed width).

5.1 Use TxDb and AnnotationHub resources

5.1.1 TxDb database

TxDb packages provide transcript annotations for model organisms.

Question

Find the TxDb package associated with UCSC-provided human gene annotations and install it.

Note: You will need annotations from the hg19 human genome reference.

BiocManager::install('TxDb.Hsapiens.UCSC.hg19.knownGene')
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene

GenomicFeatures is a package that facilitates the conversion of TxDb objects into GRanges objects.

Question

Which function from GenomicFeatures can be used to find the gene loci for all genes present in the human TxDb?

GenomicFeatures::genes(txdb)

5.1.2 AnnotationHub database

AnnotationHub provides access to a large collection of genomic resources, including many NGS datasets.

Question
  • Can you query the AnnotationHub to explore which datasets are available?

  • Which species are most represented?

Hint: The mcols() function will extract metadata associated with each entry in the AnnotationHub.

ah <- AnnotationHub::AnnotationHub()
ah_df <- AnnotationHub::mcols(ah)
table(ah_df$species) |> sort() |> tail(10)
table(ah_df$sourcetype) |> sort() |> tail(10)
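
To see what the counting idiom above computes, here is a minimal sketch on a made-up data frame. The object `toy_df` and its values are hypothetical stand-ins for the metadata returned by mcols(ah); they are not real AnnotationHub content.

```r
# Hypothetical metadata table: one row per hub entry (values are invented)
toy_df <- data.frame(
    species = c("Homo sapiens", "Mus musculus", "Homo sapiens",
                "Danio rerio", "Homo sapiens", "Mus musculus"),
    sourcetype = c("BigWig", "BED", "BigWig", "GTF", "BED", "BigWig")
)

# table() counts entries per species, sort() orders the counts in
# increasing order, and tail() keeps the last (most frequent) ones
species_counts <- table(toy_df$species) |> sort() |> tail(2)
species_counts
```

The most represented species ends up last in the printed table. Note that the native pipe |> requires R 4.1 or later.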

As part of the Roadmap Epigenomics project, the Broad Institute performed ChIP-seq for H3K4me3 (among other marks) in human samples. One of the resulting libraries is labelled ‘LL227’ (GEO ID: GSM409308).

Question

Can you query the AnnotationHub for this library?

AnnotationHub::query(ah, 'LL227')

Question

Which signal corresponds to a ChIP-seq coverage track? How can you download the associated file?

bw <- ah[['AH34904']]

Question

What is the path of the file now stored on disk? How can you load the file in R?

# Path to the bigWig file now cached on disk
H3K4me3_bw <- BiocIO::resource(bw)
# Import the coverage track into R
cov <- BiocIO::import(bw)

5.2 Create an AggregatedCoverage object

This AggregatedCoverage object should contain the aggregated H3K4me3 coverage over the TSSs of all human genes located on the forward strand.

Question

Three arguments are required:

  • bw_file: the path to a disk-stored coverage track file (e.g. the path to a bw file)
  • features_file: the path to a disk-stored features file (e.g. the path to a bed file)
  • width: an integer specifying the width to use for each genomic locus

  • Preparing bw_file argument:
bw_file <- H3K4me3_bw
  • Preparing features_file argument:
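library(GenomicRanges)
library(rtracklayer)
# Retrieve the loci of all genes annotated in the hg19 TxDb
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene
genes <- GenomicFeatures::genes(txdb)
# Keep only the genes located on the forward strand
forward_genes <- genes[strand(genes) == '+']
# Reduce each gene to its first base, i.e. its TSS
forward_tss <- resize(forward_genes, width = 1, fix = 'start')
# Write the TSSs to a BED file and keep its path
export(forward_tss, "forward_tss.bed")
features_file <- 'forward_tss.bed'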
  • Preparing width argument:
width <- 8000L
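
The following sketch, on made-up coordinates, illustrates why resize() with fix = 'start' yields TSSs and how an 8000 bp width relates to the ± 4 kb window. The object names (`toy_genes`, `toy_tss`, `toy_windows`) are hypothetical, and centring the window on each feature is an assumption about how AggregatedCoverage uses the width, not something confirmed here.

```r
library(GenomicRanges)

# Two hypothetical genes, one per strand (coordinates are invented)
toy_genes <- GRanges(
    seqnames = "chr1",
    ranges = IRanges(start = c(10001, 50001), end = c(12000, 53000)),
    strand = c("+", "-")
)

# resize() is strand-aware: fix = "start" keeps the 5' end of each range,
# i.e. the leftmost base on "+" and the rightmost base on "-"
toy_tss <- resize(toy_genes, width = 1, fix = "start")
start(toy_tss)

# Assumption: the window is centred on the feature, so a width of 8000
# covers roughly 4 kb on each side of a 1-bp TSS
toy_windows <- resize(toy_tss, width = 8000, fix = "center")
width(toy_windows)
```

Because the strand is taken into account, the same call would also return correct TSSs for reverse genes, should you later want to include them.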

Now that all the arguments are ready, one can create an AggregatedCoverage object.

Question

Create an AggregatedCoverage object summarizing the coverage of H3K4me3 over the TSSs (± 4kb) of all forward human genes.

AC <- AggregatedCoverage(bw_file, features_file, width = width)
AC

Question

Plot the resulting AC object.

plot(AC)

5.3 Add a vignette to your package

The previous section reuses publicly available datasets and resources provided by Bioconductor to illustrate a use case for our package.

Question
  • Initiate a vignette using the biocthis template;

  • Fill out the vignette header with relevant information (in particular, the author field);

  • Describe the use case developed in the previous section.

biocthis::use_bioc_vignette(name = "JacquesTestPackage", title = "Introduction to my package")

Question

Because the vignette relies on several packages, these need to be declared in the DESCRIPTION file. Make sure to update it.

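# Packages needed only by the vignette belong in Suggests rather than Imports
usethis::use_package("BiocStyle", type = "Suggests")
usethis::use_package("TxDb.Hsapiens.UCSC.hg19.knownGene", type = "Suggests")
usethis::use_package("GenomicFeatures", type = "Suggests")
usethis::use_package("GenomicRanges", type = "Suggests")
usethis::use_package("AnnotationHub", type = "Suggests")
usethis::use_package("BiocIO", type = "Suggests")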

5.4 Create pkgdown website and host it on GitHub

Question

Check pkgdown documentation. What is the recommended way to add pkgdown support for your package?

usethis::use_pkgdown()

Question

Can you now build the website for your package? What does the building step do exactly?

pkgdown::build_site()
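
As a rough guide to what the build step does, pkgdown::build_site() chains several lower-level builders. The sketch below calls the main ones in turn; `build_site_stepwise` is a hypothetical helper name introduced for illustration, and the list of steps is not exhaustive.

```r
# Hypothetical helper: runs the main pkgdown builders one at a time.
# `pkg` is the path to the package source directory.
build_site_stepwise <- function(pkg = ".") {
    pkgdown::init_site(pkg)        # create docs/ and copy the site assets
    pkgdown::build_home(pkg)       # home page from the README and DESCRIPTION
    pkgdown::build_reference(pkg)  # one page per help topic, running the examples
    pkgdown::build_articles(pkg)   # render each vignette as an article
    pkgdown::build_news(pkg)       # changelog from NEWS.md, if present
    invisible(pkg)
}

# Usage, from the root of the package:
# build_site_stepwise(".")
```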

Question
  • Why does setting up a pkgdown website help you check that your package works reliably?

  • Commit your changes. Did that include your newly created docs folder? Why?

  • Remove docs from .gitignore, then commit the docs/ folder with git.

  • Now push all your commits to your GitHub remote repository, navigate to its webpage, and enable GitHub Pages deployment from the docs/ folder. Check the resulting website.