This function takes a set of GRanges in a genome, recovers the corresponding sequences and divides them using a sliding window. For each sub-sequence, it then computes the power spectral density (PSD) value of a k-mer of interest at a chosen period, and generates a linear .bigWig track from these values.
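
The per-window computation described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper name `psd_at_period`, the raw-periodogram normalization and the choice of the nearest Fourier frequency are all assumptions made for illustration, and the package's own normalization and smoothing may differ. The sketch locates every WW dinucleotide (W is the IUPAC code for A or T), tabulates the pairwise distances that fall in `range_spectrum`, and reads the power at the frequency closest to `1 / period`.

```r
# Hypothetical helper, base R only; not part of the package API.
psd_at_period <- function(seq_string, period = 10,
                          range_spectrum = seq(5, 50)) {
  chars <- strsplit(seq_string, "")[[1]]
  is_w <- chars %in% c("A", "T")               # IUPAC W = A or T
  # Start positions of "WW" dinucleotides (overlapping matches allowed)
  pos <- which(is_w[-length(is_w)] & is_w[-1])
  # All pairwise distances between occurrences
  d <- abs(outer(pos, pos, "-"))
  d <- d[upper.tri(d)]
  # Count the distances that fall in range_spectrum (others are ignored)
  counts <- tabulate(match(d, range_spectrum),
                     nbins = length(range_spectrum))
  # Center the counts, then compute a raw periodogram with the FFT
  x <- counts - mean(counts)
  n <- length(x)
  power <- Mod(fft(x))^2 / n
  freqs <- (seq_len(n) - 1) / n
  # Keep positive frequencies only and pick the one closest to 1 / period
  half <- seq_len(floor(n / 2)) + 1
  idx <- half[which.min(abs(freqs[half] - 1 / period))]
  power[idx]
}

# A toy sequence with an "AA" every 10 bp, and one without any WW
periodic_seq <- strrep("AAGCGCGCGC", 10)
flat_seq <- strrep("GC", 50)

psd_at_period(periodic_seq, period = 10)  # positive: distances cluster at 10, 20, ...
psd_at_period(flat_seq, period = 10)      # 0: no WW occurrence at all
```

In the function documented here, a value of this kind is computed for each window and then written along the genome to form the track.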

getPeriodicityTrack(
  genome = NULL,
  granges,
  motif = "WW",
  period = 10,
  BPPARAM = setUpBPPARAM(1),
  extension = 1000,
  window_size = 100,
  step_size = 2,
  range_spectrum = seq(5, 50),
  smooth_track = 20,
  bw_file = NULL
)

Arguments

genome

DNAStringSet, BSgenome or genome ID

granges

GRanges object

motif

character, k-mer of interest.

period

Integer, the period of the k-mer to study (default 10).

BPPARAM

BiocParallel parameter object used to split the workload over several processors (default setUpBPPARAM(1)).

extension

Integer, the width the GRanges are going to be extended to (default 1000).

window_size

Integer, the width of the bins to split the GRanges objects in (default 100).

step_size

Integer, the increment between bins over GRanges (default 2).

range_spectrum

Numeric vector, the distances between nucleotides to take into consideration when performing the Fast Fourier Transform (default seq(5, 50)).

smooth_track

Integer, the amount of smoothing applied to the resulting track (default 20).

bw_file

Character, the name of the output bigWig track (default NULL; when no name is given, one is generated automatically, as in the example output).
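
To make the interplay of window_size and step_size concrete, here is a small base-R sketch. The helper `sliding_windows` is hypothetical and not part of the package; it only keeps windows that fit entirely inside the interval, which is a simplifying assumption. The function itself reports the windows that overlap each locus (29 in the example output for the same settings), so its counts differ from this simplified tiling.

```r
# Hypothetical helper, base R only; not part of the package API.
# Tiles the interval [start, end] with fixed-width windows whose
# starts are step_size apart, keeping only windows that fit entirely.
sliding_windows <- function(start, end, window_size = 100, step_size = 2) {
  last_start <- end - window_size + 1
  if (last_start < start) {
    # Interval shorter than one window: no window fits
    return(data.frame(start = integer(0), end = integer(0)))
  }
  starts <- seq(start, last_start, by = step_size)
  data.frame(start = starts, end = starts + window_size - 1)
}

# A 200-bp interval, 100-bp windows, 10-bp steps
w <- sliding_windows(1, 200, window_size = 100, step_size = 10)
nrow(w)   # 11 windows, starting at 1, 11, ..., 101
head(w, 3)
```

A smaller step_size gives a finer track at the cost of more windows to process, which is where splitting the work with BPPARAM becomes useful.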

Value

An RleList, and a bigWig track written to the working directory.

Examples

data(ce11_proms)
track <- getPeriodicityTrack(
  genome = 'BSgenome.Celegans.UCSC.ce11',
  ce11_proms[1],
  extension = 200,
  window_size = 100,
  step_size = 10,
  smooth_track = 1,
  motif = 'WW',
  period = 10,
  BPPARAM = setUpBPPARAM(1)
)
#>
#> The genome has been divided in 100-bp long windows
#> with a sliding window of 10-bp.
#> 29 windows overlap the 1 input loci (extended to 200-bp).
#> The mapping will be split into 1 cores. Each core will
#> process 29 windows.
#>
#> Generating the following track: BSgenome.Celegans.UCSC.ce11_WW_10-bp-periodicity_g-100^10_smooth-1.bw
#> GENOME: BSgenome.Celegans.UCSC.ce11
#> MOTIF: WW
#> PERIOD: 10
#> # LOCI: 1
#> # BASES: 380
#>
#>
#> Now starting [2021-04-19 08:54:43]
#>
#> Merging the results...
#>
#>
#> SUCCESS: BSgenome.Celegans.UCSC.ce11_WW_10-bp-periodicity_g-100^10_smooth-1.bw has been created!
#> 290 bases covered by the generated track.
#>
#> Finished without errors [2021-04-19 08:54:46]
#>
track
#> RleList of length 7
#> $chrI
#> numeric-Rle of length 15072434 with 31 runs
#>   Lengths:       11202          10          10 ...          10    15060942
#>   Values : 0.00000e+00 2.14680e+00 1.15040e+00 ... 8.78000e-02 8.74301e-16
#>
#> $chrII
#> numeric-Rle of length 15279421 with 1 run
#>   Lengths: 15279421
#>   Values :        0
#>
#> $chrIII
#> numeric-Rle of length 13783801 with 1 run
#>   Lengths: 13783801
#>   Values :        0
#>
#> $chrIV
#> numeric-Rle of length 17493829 with 1 run
#>   Lengths: 17493829
#>   Values :        0
#>
#> $chrV
#> numeric-Rle of length 20924180 with 1 run
#>   Lengths: 20924180
#>   Values :        0
#>
#> ...
#> <2 more elements>
unlink( 'BSgenome.Celegans.UCSC.ce11_WW_10-bp-periodicity_g-100^10_smooth-1.bw' )