Downsamples cells from each group for IDER-based similarity calculation.
Usage
downsampling(
  metadata,
  n.size = 35,
  seed = NULL,
  include = FALSE,
  replace = FALSE,
  lower.cutoff = 3
)
Arguments
- metadata
A data frame containing at least two columns: one for group labels and one for batch information. Each row corresponds to a single cell. Required.
- n.size
Numeric value specifying the number of cells to use in each group. Default is 35.
- seed
Numeric value to set the random seed for sampling. Default is NULL.
- include
Logical value indicating whether to include groups that have fewer cells than n.size. Default is FALSE.
- replace
Logical value specifying whether to sample with replacement if a group is smaller than n.size. Default is FALSE.
- lower.cutoff
Numeric value indicating the minimum group size required for inclusion. Default is 3.
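The interaction between n.size, include, replace and lower.cutoff can be easier to follow in code. The sketch below is a minimal base R illustration of one plausible reading of the argument descriptions above. It is not the package's implementation, and its results are not expected to reproduce the package's output. The function name downsample_sketch is hypothetical. The sketch assumes that the metadata columns are named 'label' and 'batch', that a group is one label-by-batch combination, and that replace only takes effect when include is TRUE.

```r
# Hypothetical helper, NOT part of the package: one possible reading of the
# documented arguments. Assumes columns named 'label' and 'batch'.
downsample_sketch <- function(metadata, n.size = 35, seed = NULL,
                              include = FALSE, replace = FALSE,
                              lower.cutoff = 3) {
  # Only fix the random seed when one is supplied
  if (!is.null(seed)) set.seed(seed)

  # Assumption: a "group" is one label-by-batch combination
  groups <- split(seq_len(nrow(metadata)),
                  list(metadata$label, metadata$batch), drop = TRUE)

  selected <- lapply(groups, function(idx) {
    n <- length(idx)
    if (n < lower.cutoff) {
      integer(0)                                  # below the minimum size: dropped
    } else if (n > n.size) {
      idx[sample.int(n, n.size)]                  # downsample without replacement
    } else if (n == n.size) {
      idx                                         # exactly n.size: keep all cells
    } else if (!include) {
      integer(0)                                  # small group, not included
    } else if (replace) {
      idx[sample.int(n, n.size, replace = TRUE)]  # upsample to n.size
    } else {
      idx                                         # small group kept as is
    }
  })

  # Return the selected row indices in ascending order
  sort(as.integer(unlist(selected, use.names = FALSE)))
}

# Three groups: A/X (40 cells), A/Y (35 cells), B/X (20 cells)
meta <- data.frame(
  label = c(rep("A", 40), rep("A", 35), rep("B", 20)),
  batch = c(rep("X", 40), rep("Y", 35), rep("X", 20))
)
length(downsample_sketch(meta, seed = 1))                                  # 35 + 35
length(downsample_sketch(meta, seed = 1, include = TRUE))                  # 35 + 35 + 20
length(downsample_sketch(meta, seed = 1, include = TRUE, replace = TRUE))  # 35 + 35 + 35
```

Under these assumptions, a group smaller than n.size contributes nothing by default, contributes all of its cells when include = TRUE, and contributes n.size cells drawn with replacement when replace = TRUE as well.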
Examples
# 'meta' is a data frame with columns 'label' and 'batch'
meta <- data.frame(
  label = c(rep("A", 40), rep("A", 35), rep("B", 20)),
  batch = c(rep("X", 40), rep("Y", 35), rep("X", 20))
)
keep_cells <- downsampling(meta, n.size = 35, seed = 12345)
# Display the selected indices
print(keep_cells)
#> [1] 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
#> [26] 66 67 68 69 70 71 72 73 74 75
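
The returned values are row indices into the metadata, so they can be used directly for subsetting. The following sketch rebuilds the same data frame and hard-codes the indices printed above rather than calling downsampling(), so that it runs on its own. The name meta_sub is illustrative only.

```r
# Same metadata as in the example above
meta <- data.frame(
  label = c(rep("A", 40), rep("A", 35), rep("B", 20)),
  batch = c(rep("X", 40), rep("Y", 35), rep("X", 20))
)

# Indices copied from the printed output above (rows 41 to 75)
keep_cells <- 41:75

# Keep only the selected cells
meta_sub <- meta[keep_cells, ]

# Count the retained cells per label and batch
table(meta_sub$label, meta_sub$batch)
```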