Integrate a List of Seurat Objects Using Selected Features
Source: R/integrate_by_features.R
integrate_seurat_list.Rd
integrate_seurat_list integrates a list of Seurat objects (e.g.,
representing different samples or batches) using canonical correlation
analysis (CCA) based on a set of selected features (genes). The function
handles normalization, finding integration anchors, integrating data, and
optional downstream processing steps such as scaling, PCA, and UMAP
visualization.
Usage
integrate_seurat_list(
seurat_list,
features,
int_order = NULL,
process = TRUE,
verbose = FALSE
)
Arguments
- seurat_list
A list of Seurat objects to be integrated.
- features
A character vector of gene names (features) used for integration.
- int_order
An optional data frame specifying the integration order of samples within the Seurat list. See the sample.tree argument in IntegrateData for more details. If not provided, Seurat will construct the integration order using hierarchical clustering. Default is NULL.
- process
Logical value indicating whether to further process the data after integration (i.e., scale it, run PCA, and compute UMAP embeddings). Default is TRUE.
- verbose
Logical value indicating whether to display progress messages during integration. Default is FALSE.
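As an illustration of int_order, the sketch below builds an integration order for three samples in the layout that Seurat documents for sample.tree (the same layout as the merge component returned by hclust): each row is one pairwise integration step, a negative number refers to a dataset by its position in the list, and a positive number refers to the result of an earlier row. The helper describe_order is hypothetical and only prints the steps in words. Seurat describes this structure as a matrix, while this page describes int_order as a data frame; whether a conversion such as as.data.frame() is needed is not stated here.

```r
# Hypothetical integration order for a list of three Seurat objects.
# Row 1: integrate dataset 2 with dataset 3.
# Row 2: integrate the result of row 1 with dataset 1.
int_order <- matrix(
  c(-2, -3,
     1, -1),
  ncol = 2, byrow = TRUE
)

# Hypothetical helper (not part of Seurat or this package): spell out each
# integration step encoded in a merge-style matrix.
describe_order <- function(tree) {
  label <- function(x) {
    if (x < 0) paste0("dataset ", -x) else paste0("result of step ", x)
  }
  vapply(seq_len(nrow(tree)), function(i) {
    paste0("step ", i, ": ", label(tree[i, 1]), " + ", label(tree[i, 2]))
  }, character(1))
}

print(describe_order(int_order))
```

Leaving int_order as NULL skips this entirely and lets Seurat derive the order by hierarchical clustering.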
Value
A Seurat object containing the integrated data. The default assay is set to "integrated".
Details
The function performs the following steps:
1. Normalization: Each Seurat object in the list is log-normalized using NormalizeData.
2. Parameter Adjustment: Integration parameters are adjusted based on the smallest dataset to accommodate cases with a small number of cells (e.g., metacells).
3. Finding Integration Anchors: Uses FindIntegrationAnchors to find anchors between datasets based on the provided features.
4. Integration: Integrates the datasets using IntegrateData.
5. Optional Processing: If process = TRUE, the function scales the data, runs PCA, and computes UMAP embeddings.
The integration is performed using Seurat's CCA-based methods, and the function is designed to handle datasets with varying sizes efficiently.
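The steps described in this section can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the names cap_integration_params and sketch_integrate are hypothetical, and the capping rule (limiting dims, k.filter, k.score, and k.weight by the cell count of the smallest dataset) is an assumption about what "parameter adjustment" might look like. The starting values are Seurat's documented defaults for FindIntegrationAnchors and IntegrateData.

```r
# Hypothetical helper: cap neighbourhood-related parameters by the size of
# the smallest dataset. The exact rule used by the package is not documented
# here; this is one plausible choice.
cap_integration_params <- function(n_cells, dims = 30, k_filter = 200,
                                   k_score = 30, k_weight = 100) {
  smallest <- min(n_cells)
  list(
    dims     = seq_len(min(dims, smallest - 1)),
    k.filter = min(k_filter, smallest),
    k.score  = min(k_score, smallest - 1),
    k.weight = min(k_weight, smallest - 1)
  )
}

# Hypothetical sketch of the workflow using Seurat's CCA-based functions.
sketch_integrate <- function(seurat_list, features, int_order = NULL,
                             process = TRUE, verbose = FALSE) {
  stopifnot(requireNamespace("Seurat", quietly = TRUE))

  # 1. Log-normalize each object.
  seurat_list <- lapply(seurat_list, Seurat::NormalizeData, verbose = verbose)

  # 2. Adjust parameters to the smallest dataset (ncol() gives cell counts).
  p <- cap_integration_params(vapply(seurat_list, ncol, integer(1)))

  # 3. Find anchors on the supplied features.
  anchors <- Seurat::FindIntegrationAnchors(
    object.list = seurat_list, anchor.features = features,
    reduction = "cca", dims = p$dims,
    k.filter = p$k.filter, k.score = p$k.score, verbose = verbose
  )

  # 4. Integrate; sample.tree = NULL lets Seurat choose the order.
  integrated <- Seurat::IntegrateData(
    anchorset = anchors, dims = p$dims, k.weight = p$k.weight,
    sample.tree = int_order, verbose = verbose
  )

  # 5. Optional downstream processing.
  if (process) {
    integrated <- Seurat::ScaleData(integrated, verbose = verbose)
    integrated <- Seurat::RunPCA(integrated, npcs = max(p$dims), verbose = verbose)
    integrated <- Seurat::RunUMAP(integrated, dims = p$dims, verbose = verbose)
  }
  integrated
}
```

For example, cap_integration_params(c(500, 20)) reduces dims to 1:19 and k.weight to 19, while two datasets with thousands of cells each keep the defaults. Very small datasets may still need further adjustment, such as a smaller n.neighbors in RunUMAP, which this sketch does not handle.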
Examples
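if (FALSE) { # \dontrun{
# Integrate a list of Seurat objects using selected features
# (e.g., anglemania genes or HVGs) with the CCA integration method.
# seurat_object1 and seurat_object2 are existing Seurat objects;
# features is a character vector of gene names defined beforehand.
seurat_list <- list(seurat_object1, seurat_object2)
integrated_seurat <- integrate_seurat_list(seurat_list, features)
} # }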