factorise computes the angle matrix of the input gene expression
matrix using the specified method, permutes the matrix to create a null
distribution, and transforms the observed angles into z-scores. This function
is optimized for large datasets using the bigstatsr package.
factorise(
  x_mat,
  method = "cosine",
  seed = 1,
  permute_row_or_column = "column",
  permutation_function = "sample",
  normalization_method = "divide_by_total_counts"
)
x_mat: An FBM object representing the normalized and scaled gene expression matrix.
method: A character string specifying the method for calculating the relationship between gene pairs. Default is "cosine". Other options include "spearman".
seed: An integer value for setting the seed for reproducibility during permutation. Default is 1.
permute_row_or_column: Character, "row" or "column"; whether permutations should be executed row-wise or column-wise. Default is "column".
permutation_function: Character, "sample" or "permute_nonzero". If "sample", then sample is used for constructing background distributions. If "permute_nonzero", then only non-zero values are permuted. Default is "sample".
normalization_method: Character, "divide_by_total_counts" or "scale_by_total_counts". Default is "divide_by_total_counts".
Returns an FBM object containing the z-score-transformed angle matrix.
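The difference between the two permutation_function options can be pictured with a small base R sketch. The helpers shuffle_all and shuffle_nonzero are hypothetical names introduced here for illustration, and the assumption that zeros keep their positions under "permute_nonzero" is an interpretation of the argument description, not a statement about the package's internals.

```r
set.seed(1)

# One gene's counts across six cells (made-up values).
counts <- c(5, 0, 2, 0, 1, 3)

# "sample"-style permutation: every value, zeros included, may move.
shuffle_all <- function(x) x[sample.int(length(x))]

# "permute_nonzero"-style permutation (assumed behaviour): zeros stay where
# they are and only the non-zero entries trade positions.
shuffle_nonzero <- function(x) {
  idx <- which(x != 0)
  x[idx] <- x[idx][sample.int(length(idx))]
  x
}

shuffled_all <- shuffle_all(counts)
shuffled_nonzero <- shuffle_nonzero(counts)
```

Under this reading, "permute_nonzero" preserves the sparsity pattern of the data, whereas "sample" does not.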
The function performs the following steps:
1. Permutation: The input matrix is permuted (column-wise by default) to disrupt existing angles, creating a null distribution.
2. Angle Computation: Computes the angle matrix for both the original and permuted matrices using extract_angles.
3. Method-Specific Processing: For the "cosine" and "spearman" methods, statistical measures are computed from the permuted data.
4. Statistical Measures: Calculates the mean, variance, and standard deviation using get_dstat.
5. Z-Score Transformation: Transforms the original angle matrix into z-scores.
This process allows for the identification of invariant gene-gene relationships by comparing them to a null distribution derived from the permuted data.
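The permutation-based z-score idea can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: cosine_sim and permute_columns are hypothetical stand-ins for extract_angles and the internal permutation step, the null is summarised by a single mean and standard deviation over all off-diagonal pairs, and "column-wise" is assumed to mean shuffling values within each column. The numbers it produces will therefore not match the output of factorise.

```r
set.seed(1)

# Toy expression matrix: 6 genes (rows) x 4 cells (columns).
expr <- matrix(
  c(5, 3, 0, 0,
    0, 0, 0, 3,
    2, 1, 3, 4,
    0, 0, 1, 0,
    1, 2, 1, 2,
    3, 4, 3, 4),
  nrow = 6, byrow = TRUE
)

# Hypothetical helper: cosine similarity between every pair of rows (genes).
cosine_sim <- function(m) {
  norms <- sqrt(rowSums(m^2))
  tcrossprod(m) / tcrossprod(norms)
}

# Assumed permutation scheme: shuffle values within each column independently.
permute_columns <- function(m) apply(m, 2, sample)

observed <- cosine_sim(expr)
null_sim <- cosine_sim(permute_columns(expr))

# Summarise the null using off-diagonal entries only. na.rm = TRUE guards
# against a permuted gene that ends up all zero (its similarity is undefined).
off_diag <- null_sim[row(null_sim) != col(null_sim)]
null_mean <- mean(off_diag, na.rm = TRUE)
null_sd <- sd(off_diag, na.rm = TRUE)

# Z-score the observed similarities against the null; self-pairs are set to 0.
z <- (observed - null_mean) / null_sd
diag(z) <- 0
```

A single permutation gives a noisy null; repeating the shuffle and pooling the results would be a natural refinement of this sketch.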
mat <- matrix(
  c(
    5, 3, 0, 0,
    0, 0, 0, 3,
    2, 1, 3, 4,
    0, 0, 1, 0,
    1, 2, 1, 2,
    3, 4, 3, 4
  ),
  nrow = 6, # 6 genes
  ncol = 4, # 4 cells
  byrow = TRUE
)
mat <- bigstatsr::FBM(nrow = nrow(mat), ncol = ncol(mat), init = mat)
# Run factorise with method "cosine" and a fixed seed
result_fbm <- factorise(mat, method = "cosine", seed = 1)
#> Creating directory "/tmp/Rtmpfdgcgb/file30062c3e214b" which didn't exist..
result_fbm[]
#>            [,1]         [,2]         [,3]       [,4]       [,5]       [,6]
#> [1,]  0.0000000 -0.511657999 -1.026193139 -0.5116580  0.3120330  0.2450271
#> [2,] -0.6935391  0.000000000  0.954089829 -0.2862776  0.6648609 -0.2644019
#> [3,] -1.5135484  0.732358147  0.000000000  1.1189324 -0.6519721 -0.3581465
#> [4,] -0.4346556  0.001120625  1.726785664  0.0000000  0.2759029  1.3592689
#> [5,] -0.1043352  0.386696746 -0.575948146 -0.2533670  0.0000000  1.1680039
#> [6,]  0.2679001 -0.262671929  0.007957691  1.7464077  2.5990906  0.0000000
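One possible way to inspect a z-score matrix like the one above is to rank gene pairs by the magnitude of their z-score. The sketch below uses a small made-up matrix with hypothetical gene names in place of result_fbm[]; ranking by absolute z-score is an illustrative choice, not something the function prescribes.

```r
# Toy z-score matrix standing in for result_fbm[]; values are made up.
z <- matrix(
  c( 0.0,  2.1, -0.4,
     1.8,  0.0,  0.3,
    -0.2,  0.5,  0.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("gene", 1:3))
)

# Collect every off-diagonal entry as a (row gene, column gene, z) record.
pairs <- which(row(z) != col(z), arr.ind = TRUE)
ranked <- data.frame(
  gene_a = rownames(z)[pairs[, "row"]],
  gene_b = colnames(z)[pairs[, "col"]],
  z = z[pairs]
)

# Sort by absolute z-score so the strongest deviations from the null come first.
ranked <- ranked[order(-abs(ranked$z)), ]
head(ranked, 3)
```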