Integrate Samples in a Seurat Object Using Selected Features from an anglemania_object
Source: R/integrate_by_features.R
integrate_by_features integrates samples or batches within a Seurat
object using canonical correlation analysis (CCA) based on a set of
selected features (genes). The function uses an anglemania_object to
extract anglemania genes and handles the integration process, including
optional downstream processing steps such as scaling, PCA, and UMAP
visualization.
Usage
integrate_by_features(
seurat_object,
angl,
int_order = NULL,
process = TRUE,
verbose = FALSE
)
Arguments
- seurat_object
A Seurat object containing all samples or batches to be integrated.
- angl
An anglemania_object-class object previously generated using create_anglemania_object and anglemania. It is important that the dataset_key and batch_key are correctly set in the anglemania_object.
- int_order
An optional data frame specifying the integration order of samples within the Seurat list. See the sample.tree argument in IntegrateData for more details. If not provided, Seurat will construct the integration order using hierarchical clustering. Default is NULL.
- process
Logical value indicating whether to further process the data after integration (i.e., scale it, run PCA, and compute UMAP embeddings). Default is TRUE.
- verbose
Logical value indicating whether to display progress messages during integration. Default is FALSE.
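The int_order argument points to the sample.tree argument of Seurat's IntegrateData, which expects the merge order in the same layout as the merge matrix returned by hclust(). The sketch below only illustrates that layout with base R. The batch names and distances are made up, and whether integrate_by_features needs this as a matrix or as a data frame should be checked against the package itself.

```r
# Hypothetical pairwise distances between three batches (illustrative values only)
batch_dist <- as.dist(matrix(
  c(0, 1, 5,
    1, 0, 4,
    5, 4, 0),
  nrow = 3,
  dimnames = list(c("b1", "b2", "b3"), c("b1", "b2", "b3"))
))

# hclust() returns a merge matrix with one row per pairwise merge step:
# negative entries refer to original items (here: batches),
# positive entries refer to the result of an earlier row
merge_order <- hclust(batch_dist)$merge
print(merge_order)

# Hand-written order for the simplest case: merge batch 1 with batch 2
two_batch_order <- matrix(c(-1, -2), ncol = 2)
```

In this toy case the first row merges batches 1 and 2 (the closest pair), and the second row merges batch 3 with the result of row 1.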
Value
A Seurat object containing the integrated data. The default assay is set to "integrated".
Details
The function performs the following steps:
- Batch Key Addition: Adds a unique batch key to the Seurat object's metadata to distinguish different batches or samples. The batch key is set to the anglemania_object's batch_key.
- Splitting: Splits the Seurat object into a list of Seurat objects based on the batch key.
- Integration: Calls integrate_seurat_list to integrate the list of Seurat objects using the features extracted from the anglemania_object.
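The first two steps (batch key addition and splitting) can be pictured with a plain data frame standing in for the Seurat metadata. This is an illustration only, not the package's internal code: the column names, the ":" separator, and the idea of combining a dataset column with a sample column are assumptions made for the sketch.

```r
# Hypothetical cell-level metadata; column names are illustrative
meta <- data.frame(
  cell = paste0("cell", 1:6),
  dataset = c("d1", "d1", "d1", "d2", "d2", "d2"),
  sample = c("s1", "s2", "s2", "s1", "s1", "s2"),
  stringsAsFactors = FALSE
)

# Step 1: one batch label per cell. Combining dataset and sample (an
# assumption here) keeps labels unique when sample names repeat across datasets.
meta$batch <- paste(meta$dataset, meta$sample, sep = ":")

# Step 2: split cells by batch label; each list element stands in for
# one of the per-batch Seurat objects
cells_by_batch <- split(meta$cell, meta$batch)
print(lengths(cells_by_batch))
```

Each element of cells_by_batch corresponds to one batch that would then be passed on to the integration step.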
The integration is performed using Seurat's CCA-based methods, and
parameters are adjusted based on the smallest dataset to ensure
compatibility with small sample sizes (e.g., metacells or SEACells). If
process = TRUE, the function will also scale the data, run PCA,
and compute UMAP embeddings.
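To make the idea of adjusting parameters to the smallest dataset concrete, here is a minimal sketch of one plausible approach: capping a requested size at one less than the number of cells in the smallest batch. The helper cap_to_smallest and the names k_weight and n_dims are hypothetical, and the batch sizes are invented. Which parameters the package actually adjusts, and how, is not stated here.

```r
# Hypothetical number of cells per batch (e.g. metacells)
batch_sizes <- c(b1 = 40, b2 = 25, b3 = 120)
smallest <- min(batch_sizes)

# Hypothetical helper: never request more neighbors or dimensions than
# the smallest batch can support
cap_to_smallest <- function(requested, smallest) {
  min(requested, smallest - 1)
}

k_weight <- cap_to_smallest(100, smallest)  # capped by the smallest batch
n_dims <- cap_to_smallest(30, smallest)     # capped by the smallest batch
small_k <- cap_to_smallest(10, smallest)    # already small enough, unchanged
```

With a smallest batch of 25 cells, requests above 24 are reduced to 24, while smaller requests pass through unchanged.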
Examples
# Integrate samples using an anglemania_object.
# The batch key is read automatically from the anglemania_object,
# the Seurat object is split into batches, and the batches are
# integrated with CCA using the anglemania genes previously extracted
# with anglemania() or select_genes().
se <- SeuratObject::pbmc_small
angl <- create_anglemania_object(se, batch_key = "groups")
#> No dataset_key specified.
#> Assuming that all samples belong to the same dataset and are separated by batch_key: groups
#> Extracting count matrices...
#> Filtering each batch to at least 1 cells per gene...
#> Using the intersection of filtered genes from all batches...
#> Number of genes in intersected set: 228
angl <- anglemania(angl)
#> Computing angles and transforming to z-scores...
#> Computing statistics...
#> Weighting matrix_list...
#> Calculating mean...
#> Calculating sds...
#> Filtering features...
options(future.globals.maxSize = 4000 * 1024^2)
integrated_object <- integrate_by_features(se, angl)
#> Log normalizing data...
#> Finding integration anchors...
#> Integrating samples...
#> Warning: Layer counts isn't present in the assay object; returning NULL
#> Running PCA with 30 PCs
#> Running UMAP with 30 PCs and 10 neighbors
#> Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
#> To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
#> This message will be shown once per session