factorise computes the angle matrix of the input gene expression matrix using the specified method, permutes the input to build a null distribution, and transforms the observed angles into z-scores against that null. The function is optimized for large datasets through the bigstatsr package.

factorise(
  x_mat,
  method = "cosine",
  seed = 1,
  permute_row_or_column = "column",
  permutation_function = "sample",
  normalization_method = "divide_by_total_counts"
)

Arguments

x_mat

An FBM object representing the normalized and scaled gene expression matrix.

method

A character string specifying the method for calculating the relationship between gene pairs. Default is "cosine". Other options include "spearman".

seed

An integer value for setting the seed for reproducibility during permutation. Default is 1.

permute_row_or_column

Character, either "row" or "column". Specifies whether permutations are executed row-wise or column-wise. Default is "column".

permutation_function

Character, either "sample" or "permute_nonzero". If "sample", sample is used to construct the background distributions. If "permute_nonzero", only non-zero values are permuted. Default is "sample".
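
The difference between the two options can be illustrated with a small base R sketch. This is an illustration only, not the package's code: permute_nonzero_sketch is a hypothetical helper, and it assumes that "permute_nonzero" shuffles the non-zero entries among themselves while leaving the zeros in place.

```r
# Hypothetical helper (not part of the package): shuffle only the non-zero
# entries of a vector, keeping every zero at its original position.
permute_nonzero_sketch <- function(v) {
  nz <- which(v != 0)
  if (length(nz) > 1) {
    # sample.int avoids the pitfall of sample() on a length-one vector
    v[nz] <- v[nz][sample.int(length(nz))]
  }
  v
}

set.seed(1)
v <- c(5, 0, 2, 0, 1, 3)

shuffled_all <- sample(v)                 # "sample": every value can move
shuffled_nz  <- permute_nonzero_sketch(v) # zeros stay at positions 2 and 4
```

Under this assumption, "permute_nonzero" preserves the sparsity pattern of the data, whereas "sample" also redistributes the zeros.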

normalization_method

Character, either "divide_by_total_counts" or "scale_by_total_counts". Default is "divide_by_total_counts".

Value

An FBM object containing the z-score-transformed angle matrix.

Details

The function performs the following steps:

  1. Permutation: The input matrix is permuted (column-wise by default, or row-wise if permute_row_or_column = "row") to disrupt existing angles, creating a null distribution.

  2. Angle Computation: Computes the angle matrix for both the original and permuted matrices using extract_angles.

  3. Method-Specific Processing:

    • For the "cosine" and "spearman" methods, statistical measures are computed from the permuted data.

  4. Statistical Measures: Calculates mean, variance, and standard deviation using get_dstat.

  5. Z-Score Transformation: Transforms the original angle matrix into z-scores.

This process allows for the identification of invariant gene-gene relationships by comparing them to a null distribution derived from the permuted data.
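
The steps above can be sketched in base R for intuition. This is a minimal sketch under stated assumptions, not the package's implementation: it does not call extract_angles or get_dstat, cosine_rows is a hypothetical helper, the permutation is assumed to shuffle values within each column independently, cosine similarity stands in for the angle, and a single global mean and standard deviation are taken from the permuted matrix.

```r
# Hypothetical helper: cosine similarity between all pairs of rows (genes).
cosine_rows <- function(m) {
  norms <- sqrt(rowSums(m^2))
  tcrossprod(m) / outer(norms, norms)
}

set.seed(1)
mat <- matrix(
  c(5, 3, 0, 0,
    0, 0, 0, 3,
    2, 1, 3, 4,
    0, 0, 1, 0,
    1, 2, 1, 2,
    3, 4, 3, 4),
  nrow = 6, byrow = TRUE
)

# 1. Permutation (assumption: shuffle the values within each column)
perm <- apply(mat, 2, sample)

# 2. Relationship matrices for the original and the permuted data
obs  <- cosine_rows(mat)
null <- cosine_rows(perm)

# 3-4. Null statistics from the off-diagonal permuted values; na.rm guards
#      against a permuted row that happens to be all zeros (norm 0 -> NaN)
null_vals <- null[upper.tri(null)]
mu    <- mean(null_vals, na.rm = TRUE)
sigma <- sd(null_vals, na.rm = TRUE)

# 5. Z-score transformation of the observed values; the diagonal is zeroed
z <- (obs - mu) / sigma
diag(z) <- 0
```

Because this sketch uses one global mean and standard deviation, its result is symmetric. The values it produces are not expected to match the output of factorise.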

Examples

mat <- matrix(
  c(
    5, 3, 0, 0,
    0, 0, 0, 3,
    2, 1, 3, 4,
    0, 0, 1, 0,
    1, 2, 1, 2,
    3, 4, 3, 4
  ),
  nrow = 6, # 6 genes
  ncol = 4, # 4 cells
  byrow = TRUE
)

mat <- bigstatsr::FBM(nrow = nrow(mat), ncol = ncol(mat), init = mat)

# Run factorise with method "cosine" and a fixed seed
result_fbm <- factorise(mat, method = "cosine", seed = 1)
#> Creating directory "/tmp/Rtmpt97azZ/file1d43113e744f" which didn't exist..
result_fbm[]
#>            [,1]         [,2]         [,3]       [,4]       [,5]       [,6]
#> [1,]  0.0000000 -0.511657999 -1.026193139 -0.5116580  0.3120330  0.2450271
#> [2,] -0.6935391  0.000000000  0.954089829 -0.2862776  0.6648609 -0.2644019
#> [3,] -1.5135484  0.732358147  0.000000000  1.1189324 -0.6519721 -0.3581465
#> [4,] -0.4346556  0.001120625  1.726785664  0.0000000  0.2759029  1.3592689
#> [5,] -0.1043352  0.386696746 -0.575948146 -0.2533670  0.0000000  1.1680039
#> [6,]  0.2679001 -0.262671929  0.007957691  1.7464077  2.5990906  0.0000000