Constructs an anglemania_object-class from a given Seurat or SingleCellExperiment object. This includes extracting and processing count matrices, filtering genes based on expression in a minimum number of cells, and storing results along with dataset and batch information. It also calculates weights for each dataset or batch based on the number of samples.

create_anglemania_object(object, ...)

.core_create_anglemania_object(
  meta,
  matrix_list,
  batch_key,
  dataset_key = NA_character_,
  min_cells_per_gene = 1,
  seurat_assay = "RNA",
  allow_missing_features = FALSE,
  min_samples_per_gene = 2
)

# S4 method for class 'Seurat'
create_anglemania_object(
  object,
  batch_key,
  dataset_key = NA_character_,
  min_cells_per_gene = 1,
  allow_missing_features = FALSE,
  min_samples_per_gene = 2,
  seurat_assay = "RNA",
  ...
)

# S4 method for class 'SingleCellExperiment'
create_anglemania_object(
  object,
  batch_key,
  dataset_key = NA_character_,
  min_cells_per_gene = 1,
  allow_missing_features = FALSE,
  min_samples_per_gene = 2,
  ...
)

# S4 method for class 'list'
create_anglemania_object(
  object,
  min_cells_per_gene = 1,
  allow_missing_features = FALSE,
  min_samples_per_gene = 2,
  seurat_assay = "RNA",
  ...
)

Arguments

object

A list of Seurat or SingleCellExperiment objects

...

Additional arguments passed to .core_create_anglemania_object

meta

A data frame containing metadata for the dataset. Will be extracted from the Seurat or SingleCellExperiment object or constructed from a list of objects.

matrix_list

A list of count matrices for each batch.

batch_key

A character string that names the column in the metadata that contains the batch information

dataset_key

Column name for dataset information (optional)

min_cells_per_gene

Minimum cells expressing a gene

seurat_assay

Assay of the Seurat object to extract counts from

allow_missing_features

Logical; allow genes missing in some batches

min_samples_per_gene

Minimum samples required if allow_missing_features is TRUE

Value

An anglemania_object-class containing:

matrix_list

A list of filtered count matrices for each unique batch.

dataset_key

The dataset key used for splitting the object.

batch_key

The batch key used for splitting the object.

data_info

A data frame summarizing the number of samples per dataset and their weights.

weights

A numeric vector of weights for each dataset or batch based on the number of samples.

intersect_genes

A character vector of genes expressed in at least the specified number of cells across all batches.

min_cells_per_gene

The minimum number of cells per gene threshold used for filtering.

anglemania_object

Details

The function performs the following steps:

  1. Adds a unique batch key to the metadata using add_unique_batch_key.

  2. Extracts count matrices for each batch.

  3. Filters genes based on the min_cells_per_gene threshold.

  4. Identifies intersected genes present across all batches.

  5. Converts count matrices to FBM objects.

  6. Computes weights for each batch or dataset.

Examples

sce <- sce_example()
angl <- create_anglemania_object(sce, batch_key = "batch")
#> No dataset_key specified.
#> Assuming that all samples belong to the same dataset and are separated by batch_key: batch
#> Using the intersection of filtered genes from all batches...
#> Number of genes in intersected set: 300
#> Extracting count matrices...
#> Filtering each batch to at least 1 cells per gene...
#> 
  |                                                  | 0 % elapsed=00s   
  |==================================================| 100% elapsed=00s, remaining~00s
angl
#> anglemania_object
#> --------------
#> Dataset key: NA 
#> Batch key: batch 
#> Number of datasets: 1 
#> Total number of batches: 2 
#> Batches (showing first 5):
#> batch1, batch2 
#> Number of intersected genes: 300 
#> Intersected genes (showing first 10):
#> gene1, gene2, gene3, gene4, gene5, gene6, gene7, gene8, gene9, gene10 , ...
#> Min cells per gene: 1 
sce <- sce_example()
batch_key <- "batch"
meta <- SingleCellExperiment::colData(sce) %>% 
  as.data.frame() %>% 
  add_unique_batch_key(batch_key=batch_key)

matrix_list <- lapply(unique(meta[[batch_key]]), function(batch) {
  batch_cells <- rownames(meta[meta[[batch_key]] == batch, ])
  batch_mat <- SingleCellExperiment::counts(sce[, batch_cells])
  return(batch_mat)
})

angl <- .core_create_anglemania_object(
  meta = meta,
  matrix_list = matrix_list,
  batch_key = batch_key
)
#> No dataset_key specified.
#> Assuming that all samples belong to the same dataset and are separated by batch_key: batch
#> Using the intersection of filtered genes from all batches...
#> Number of genes in intersected set: 300
#> Extracting count matrices...
#> Filtering each batch to at least 1 cells per gene...
#> 
  |                                                  | 0 % elapsed=00s   
  |==================================================| 100% elapsed=00s, remaining~00s

angl
#> anglemania_object
#> --------------
#> Dataset key: NA 
#> Batch key: batch 
#> Number of datasets: 1 
#> Total number of batches: 2 
#> Batches (showing first 5):
#> batch1, batch2 
#> Number of intersected genes: 300 
#> Intersected genes (showing first 10):
#> gene1, gene2, gene3, gene4, gene5, gene6, gene7, gene8, gene9, gene10 , ...
#> Min cells per gene: 1 
sce <- sce_example()
se <- Seurat::as.Seurat(sce, data = "counts")
se <- SeuratObject::RenameAssays(se, "originalexp", "RNA")
#> Renaming default assay from originalexp to RNA
#> Warning: Key ‘originalexp_’ taken, using ‘rna_’ instead
angl <- create_anglemania_object(se, batch_key = "batch")
#> No dataset_key specified.
#> Assuming that all samples belong to the same dataset and are separated by batch_key: batch
#> Using the intersection of filtered genes from all batches...
#> Number of genes in intersected set: 300
#> Extracting count matrices...
#> Filtering each batch to at least 1 cells per gene...
#> 
  |                                                  | 0 % elapsed=00s   
  |==================================================| 100% elapsed=00s, remaining~00s
angl
#> anglemania_object
#> --------------
#> Dataset key: NA 
#> Batch key: batch 
#> Number of datasets: 1 
#> Total number of batches: 2 
#> Batches (showing first 5):
#> batch1, batch2 
#> Number of intersected genes: 300 
#> Intersected genes (showing first 10):
#> gene1, gene2, gene3, gene4, gene5, gene6, gene7, gene8, gene9, gene10 , ...
#> Min cells per gene: 1 
sce <- sce_example()
angl <- create_anglemania_object(sce, batch_key = "batch")
#> No dataset_key specified.
#> Assuming that all samples belong to the same dataset and are separated by batch_key: batch
#> Using the intersection of filtered genes from all batches...
#> Number of genes in intersected set: 300
#> Extracting count matrices...
#> Filtering each batch to at least 1 cells per gene...
#> 
  |                                                  | 0 % elapsed=00s   
  |==================================================| 100% elapsed=00s, remaining~00s
angl
#> anglemania_object
#> --------------
#> Dataset key: NA 
#> Batch key: batch 
#> Number of datasets: 1 
#> Total number of batches: 2 
#> Batches (showing first 5):
#> batch1, batch2 
#> Number of intersected genes: 300 
#> Intersected genes (showing first 10):
#> gene1, gene2, gene3, gene4, gene5, gene6, gene7, gene8, gene9, gene10 , ...
#> Min cells per gene: 1 
sce <- sce_example()
sce_list <- list(sce1 = sce, sce2 = sce)
angl <- create_anglemania_object(sce_list)
#> No dataset_key specified.
#> Assuming that all samples belong to the same dataset and are separated by batch_key: batch
#> Using the intersection of filtered genes from all batches...
#> Number of genes in intersected set: 300
#> Extracting count matrices...
#> Filtering each batch to at least 1 cells per gene...
#> 
  |                                                  | 0 % elapsed=00s   
  |==================================================| 100% elapsed=00s, remaining~00s
angl
#> anglemania_object
#> --------------
#> Dataset key: NA 
#> Batch key: batch 
#> Number of datasets: 1 
#> Total number of batches: 2 
#> Batches (showing first 5):
#> sce1, sce2 
#> Number of intersected genes: 300 
#> Intersected genes (showing first 10):
#> gene1, gene2, gene3, gene4, gene5, gene6, gene7, gene8, gene9, gene10 , ...
#> Min cells per gene: 1 
se <- Seurat::as.Seurat(sce, data = "counts")
se <- SeuratObject::RenameAssays(se, "originalexp", "RNA")
#> Renaming default assay from originalexp to RNA
#> Warning: Key ‘originalexp_’ taken, using ‘rna_’ instead
se_list <- list(se1 = se, se2 = se)
angl <- create_anglemania_object(se_list)
#> No dataset_key specified.
#> Assuming that all samples belong to the same dataset and are separated by batch_key: batch
#> Using the intersection of filtered genes from all batches...
#> Number of genes in intersected set: 300
#> Extracting count matrices...
#> Filtering each batch to at least 1 cells per gene...
#> 
  |                                                  | 0 % elapsed=00s   
  |==================================================| 100% elapsed=00s, remaining~00s
angl
#> anglemania_object
#> --------------
#> Dataset key: NA 
#> Batch key: batch 
#> Number of datasets: 1 
#> Total number of batches: 2 
#> Batches (showing first 5):
#> se1, se2 
#> Number of intersected genes: 300 
#> Intersected genes (showing first 10):
#> gene1, gene2, gene3, gene4, gene5, gene6, gene7, gene8, gene9, gene10 , ...
#> Min cells per gene: 1