## Introduction

This R package helps identify k-mers (e.g. di- or tri-nucleotides) that occur periodically in a set of genomic loci (typically regulatory elements). It is not designed to identify motifs separated by a conserved distance; for that type of analysis, please see the MEME website.

## Installation

periodicDNA is available in Bioconductor. To install the current release use:

```r
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("periodicDNA")
```

For advanced users, the most recent development version of periodicDNA can be installed from GitHub as follows (it may be buggy):

```r
install.packages("devtools")
devtools::install_github("js2264/periodicDNA")
library(periodicDNA)
```

## Citation

periodicDNA: an R/Bioconductor package to investigate k-mer periodicity in DNA
J. Serizay & J. Ahringer
F1000Research, 2021

Distinctive regulatory architectures of germline-active and somatic genes in C. elegans
J. Serizay, Y. Dong, J. Jänes, M. Chesney, C. Cerrato & J. Ahringer
Genome Research, 2020

## How to use periodicDNA

periodicDNA includes a vignette illustrating its usage. To open it, run:

```r
vignette('periodicDNA')
```

## Main functions

The two main user-level functions of periodicDNA are getPeriodicity() and getPeriodicityTrack().

• getPeriodicity() computes the power spectral density (PSD) of a chosen k-mer (e.g. TT) in a set of sequences. The PSD score at a given period indicates how strongly the k-mer recurs at that period.
• getPeriodicityTrack() can be used to generate linear tracks representing the periodicity strength of a given k-mer at a chosen period, over genomic loci of interest.
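To give an intuition for what a PSD score at a given period means, here is a minimal base-R sketch on a synthetic sequence. It is a toy illustration under stated assumptions, not the implementation used by periodicDNA: the sequence design (no T in the background, a TT planted every 10 bp), the 200-bp distance cutoff and all variable names are made up for this example. The idea is to collect the pairwise distances between TT occurrences, build a histogram of those distances, and look for the dominant period in its periodogram.

```r
# Toy illustration in base R; NOT the periodicDNA implementation.
set.seed(42)

# Synthetic 400-bp sequence with no T in the background, then a "TT"
# dinucleotide planted every 10 bp (assumed toy design).
n <- 400
bases <- sample(c("A", "C", "G"), n, replace = TRUE)
starts <- seq(1, n - 1, by = 10)
bases[starts] <- "T"
bases[starts + 1] <- "T"

# Start positions of all TT occurrences (overlapping ones included).
hits <- which(bases[-n] == "T" & bases[-1] == "T")

# Histogram of pairwise distances between occurrences, up to 200 bp.
max_dist <- 200
d <- abs(outer(hits, hits, "-"))
d <- d[upper.tri(d)]
counts <- tabulate(d, nbins = max_dist)

# Periodogram of the mean-centred histogram, computed with the FFT.
psd <- Mod(fft(counts - mean(counts)))^2 / max_dist
k <- seq_len(max_dist / 2)   # frequency indices 1..N/2
periods <- max_dist / k      # corresponding periods, in bp
psd <- psd[k + 1]            # drop the zero-frequency term

# Restrict to periods between 6 and 50 bp, which excludes the
# harmonics (5 bp, 3.33 bp, ...) of the planted 10-bp signal.
keep <- periods >= 6 & periods <= 50
best_period <- periods[keep][which.max(psd[keep])]
print(best_period)
```

In this contrived case the strongest period in the inspected range is the planted one. Real genomic sequences are much noisier, which is presumably why the getPeriodicity() example below also takes an n_shuffling argument.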

### getPeriodicity() function

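```r
library(periodicDNA)
library(BiocParallel) # provides MulticoreParam()

# Load the example TSS annotations shipped with the package
data(ce11_TSSs)
# Compute the PSD of TT over the 'Ubiq.' set of TSSs
PSDs <- getPeriodicity(
    ce11_TSSs[['Ubiq.']],
    genome = 'BSgenome.Celegans.UCSC.ce11',
    motif = 'TT',
    BPPARAM = MulticoreParam(12),
    n_shuffling = 100
)
plotPeriodicityResults(PSDs)
```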

### getPeriodicityTrack() function

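```r
library(periodicDNA)
library(BiocParallel) # provides MulticoreParam()

# Load the example promoter annotations shipped with the package
data(ce11_proms)
# Generate a bigWig track of 10-bp WW periodicity over promoters
WW_10bp <- getPeriodicityTrack(
    genome = 'BSgenome.Celegans.UCSC.ce11',
    granges = ce11_proms,
    motif = 'WW',
    period = 10,
    bw_file = 'WW-10-bp-periodicity_over-proms.bw',
    BPPARAM = MulticoreParam(12)
)
```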

Warning: This command is slow and should be run across many processors using BiocParallel. With default parameters and BPPARAM = MulticoreParam(12), it typically takes about one day to produce a periodicity track over 15,000 GRanges of 150 bp each. It is highly recommended to run it in a dedicated screen session so that it keeps running if the terminal is closed.

## Contributions

Code contributions, bug reports, fixes and feature requests are most welcome. Please make any pull requests against the master branch at https://github.com/js2264/periodicDNA and file issues at https://github.com/js2264/periodicDNA/issues.