HiContacts arithmetics functionalities — arithmetics • HiContacts

Different arithmetic operations can be performed on Hi-C contact matrices:

normalize a contact matrix using iterative correction;
detrend a contact matrix, i.e. remove the distance-dependent contact trend;
autocorrelate a contact matrix: this is is typically done to highlight large-scale compartments;
divide one contact matrix by another;
merge multiple contact matrices;
despeckle (i.e. smooth out) a contact matrix out;
aggregate (average) a contact matrices over a set of genomic loci of interest;
boost Hi-C signal by enhancing long-range interactions while preserving short- range interactions (this is adapted from Boost-HiC);
subsample interactions using a proportion or a fixed number of final interactions.
coarsen a contact matrix to a larger (coarser) resolution

# S4 method for class 'HiCExperiment'
aggregate(
  x,
  targets,
  flankingBins = 51,
  maxDistance = NULL,
  BPPARAM = BiocParallel::bpparam()
)

detrend(x, use.scores = "balanced")

autocorrelate(x, use.scores = "balanced", detrend = TRUE, ignore_ndiags = 3)

divide(x, by, use.scores = "balanced", pseudocount = 0)

# S4 method for class 'HiCExperiment,HiCExperiment'
merge(x, y, ..., use.scores = "balanced", FUN = mean)

despeckle(x, use.scores = "balanced", focal.size = 1)

boost(x, use.scores = "balanced", alpha = 1, full.replace = FALSE)

coarsen(x, bin.size)

# S4 method for class 'HiCExperiment'
normalize(
  object,
  use.scores = "count",
  niters = 200,
  min.nnz = 10,
  mad.max = 3
)

subsample(x, prop)

Arguments

x, y, object: a HiCExperiment object
targets: Set of chromosome coordinates for which interaction counts are extracted from the Hi-C contact file, provided as a GRanges object (for diagnoal-centered loci) or as a GInteractions object (for off-diagonal coordinates).
flankingBins: Number of bins on each flank of the bins containing input targets.
maxDistance: Maximum distance to use when compiling distance decay
BPPARAM: BiocParallel parameters
use.scores: Which scores to use to perform operations
detrend: Detrend matrix before performing autocorrelation
ignore_ndiags: ignore N diagonals when calculating correlations
by: a HiCExperiment object
pseudocount: Add a pseudocount when dividing matrices (Default: 0)
...: HiCExperiment objects. For aggregate, targets (a set of GRanges or GInteractions).
FUN: merging function
focal.size: Size of the smoothing rectangle
alpha: Power law scaling factor. As indicated in Boost-HiC documentation, the alpha parameter influences the weighting of contacts: if alpha < 1 long-range interactions are prioritized; if alpha >> 1 short-range interactions have more weight when computing the distance matrix.
full.replace: Whether to replace the entire set of contacts, rather than only filling the missing interactions (count=0) (Default: FALSE)
bin.size: Bin size to coarsen a HiCExperiment at
niters: Number of iterations for ICE matrix balancing
min.nnz: Filter bins with less than min.nnz non-zero elements when performing ICE matrix balancing
mad.max: Filter out bins whose log coverage is less than mad.max median absolute deviations below the median bin log coverage.
prop: Float between 0 and 1, or integer corresponding to the # of

Value

a HiCExperiment object with extra scores

Examples

#### -----
#### Normalize a contact matrix
#### -----

library(HiContacts)
contacts_yeast <- contacts_yeast()
#> see ?HiContactsData and browseVignettes('HiContactsData') for documentation
#> loading from cache
normalize(contacts_yeast)
#> 
#> `HiCExperiment` object with 8,757,906 contacts over 763 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/b13267d1ded5_7752" 
#> focus: "whole genome" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 16000 
#> interactions: 267709 
#> scores(3): count balanced ICE 
#> topologicalFeatures: compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(0):

#### -----
#### Detrending a contact matrix
#### -----

detrend(contacts_yeast)
#> `HiCExperiment` object with 8,757,906 contacts over 763 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/b13267d1ded5_7752" 
#> focus: "whole genome" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 16000 
#> interactions: 267709 
#> scores(4): count balanced expected detrended 
#> topologicalFeatures: compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(0):

#### -----
#### Auto-correlate a contact matrix
#### -----

autocorrelate(contacts_yeast)
#> 
#> `HiCExperiment` object with 8,757,906 contacts over 763 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/b13267d1ded5_7752" 
#> focus: "whole genome" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 16000 
#> interactions: 267709 
#> scores(5): count balanced expected detrended autocorrelated 
#> topologicalFeatures: compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(0):

#### -----
#### Divide 2 contact matrices
#### -----

contacts_yeast <- refocus(contacts_yeast, 'II')
contacts_yeast_eco1 <- contacts_yeast_eco1() |> refocus('II')
#> see ?HiContactsData and browseVignettes('HiContactsData') for documentation
#> loading from cache
divide(contacts_yeast_eco1, by = contacts_yeast)
#> `HiCExperiment` object with 996,154 contacts over 51 regions 
#> -------
#> fileName: N/A 
#> focus: "II" 
#> resolutions(1): 16000
#> active resolution: 16000 
#> interactions: 1326 
#> scores(6): count.x balanced.x count.by balanced.by balanced.fc balanced.l2fc 
#> topologicalFeatures: () 
#> pairsFile: N/A 
#> metadata(2): hce_list operation

#### -----
#### Merge 2 contact matrices
#### -----

merge(contacts_yeast_eco1, contacts_yeast)
#> `HiCExperiment` object with 733,836.5 contacts over 51 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/b13210f4fa7c_7754" 
#> focus: "II" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 16000 
#> interactions: 1326 
#> scores(2): count balanced 
#> topologicalFeatures: () 
#> pairsFile: N/A 
#> metadata(2): hce_list operation

#### -----
#### Despeckle (smoothen) a contact map
#### -----

despeckle(contacts_yeast)
#> `HiCExperiment` object with 471,364 contacts over 51 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/b13267d1ded5_7752" 
#> focus: "II" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 16000 
#> interactions: 1326 
#> scores(3): count balanced balanced.despeckled 
#> topologicalFeatures: compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(0):

#### -----
#### Aggregate a contact matrix over centromeres, at different scales
#### -----

contacts <- contacts_yeast() |> zoom(resolution = 1000)
#> see ?HiContactsData and browseVignettes('HiContactsData') for documentation
#> loading from cache
centros <- topologicalFeatures(contacts, 'centromeres')
aggregate(contacts, targets = centros, flankingBins = 51)
#> Going through preflight checklist...
#> Parsing the entire contact matrice as a sparse matrix...
#> Modeling distance decay...
#> Filtering for contacts within provided targets...
#> `AggrHiCExperiment` object over 16 targets 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/b13267d1ded5_7752" 
#> focus: 16 targets 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 1000 
#> interactions: 10609 
#> scores(4): count balanced expected detrended 
#> slices(4): count balanced expected detrended 
#> topologicalFeatures: targets(16) compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(0):

#### -----
#### Enhance long-range interaction signal
#### -----

contacts <- contacts_yeast() |> zoom(resolution = 1000) |> refocus('II')
#> see ?HiContactsData and browseVignettes('HiContactsData') for documentation
#> loading from cache
boost(contacts, alpha = 1)
#> `HiCExperiment` object with 469,569 contacts over 814 regions 
#> -------
#> fileName: "/github/home/.cache/R/ExperimentHub/b13267d1ded5_7752" 
#> focus: "II" 
#> resolutions(5): 1000 2000 4000 8000 16000
#> active resolution: 1000 
#> interactions: 330763 
#> scores(3): count balanced boosted 
#> topologicalFeatures: compartments(0) borders(0) loops(0) viewpoints(0) centromeres(16) 
#> pairsFile: N/A 
#> metadata(1): alpha

#### -----
#### Subsample & "coarsen" contact matrix 
#### -----

subcontacts <- subsample(contacts, prop = 100000)
coarsened_subcontacts <- coarsen(subcontacts, bin.size = 4000)