Downsampling Cells — downsampling • CIDER

Downsamples cells from each group for IDER-based similarity calculation.

Usage

downsampling(
  metadata,
  n.size = 35,
  seed = NULL,
  include = FALSE,
  replace = FALSE,
  lower.cutoff = 3
)

Arguments

metadata: A data frame containing at least two columns: one for group labels and one for batch information (named label and batch in the example below). Each row corresponds to a single cell. Required.
n.size: Numeric value specifying the number of cells to use in each group. Default is 35.
seed: Numeric value used to set the random seed for sampling. Default is NULL, as in the usage signature above; supply a value such as 12345 for reproducible sampling.
include: Logical value indicating whether to include groups that have fewer cells than n.size. Default is FALSE.
replace: Logical value specifying whether to sample with replacement if a group is smaller than n.size. Default is FALSE.
lower.cutoff: Numeric value indicating the minimum group size required for inclusion. Default is 3.
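
To illustrate how these arguments might interact, here is a minimal base-R sketch. It is not CIDER's implementation, and its results are not expected to match those of downsampling(). The helper name downsample_sketch is hypothetical. The sketch assumes that a "group" is one label-by-batch combination, that the columns are named label and batch, and that lower.cutoff is applied before include and replace are considered.

```r
# Hypothetical helper, NOT CIDER's implementation: a base-R sketch of
# per-group downsampling driven by the arguments described above.
downsample_sketch <- function(metadata, n.size = 35, seed = NULL,
                              include = FALSE, replace = FALSE,
                              lower.cutoff = 3) {
  # Seed the generator only when a seed is supplied.
  if (!is.null(seed)) set.seed(seed)

  # Assumption: a "group" is one label-by-batch combination.
  groups <- split(seq_len(nrow(metadata)),
                  list(metadata$label, metadata$batch), drop = TRUE)

  keep <- lapply(groups, function(idx) {
    n <- length(idx)
    # Groups below lower.cutoff are dropped regardless of 'include'.
    if (n < lower.cutoff) return(integer(0))
    # Groups with at least n.size cells are reduced to exactly n.size.
    if (n >= n.size) return(idx[sample.int(n, n.size)])
    # Smaller groups are dropped unless 'include' is TRUE.
    if (!include) return(integer(0))
    # With 'replace', smaller groups are resampled up to n.size.
    if (replace) return(idx[sample.int(n, n.size, replace = TRUE)])
    # Otherwise the smaller group is kept whole.
    idx
  })

  # Return the retained row indices in ascending order.
  sort(unlist(keep, use.names = FALSE))
}
```

Under these assumptions, a group of 20 cells with n.size = 35 is dropped by default, kept whole with include = TRUE, and resampled to 35 indices (with repeats) when both include and replace are TRUE.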

Value

A vector of numeric indices (or cell names) identifying the cells to keep for downstream computation.

Examples

  # 'meta' is a data frame with columns 'label' and 'batch'
  meta <- data.frame(
    label = c(rep("A", 40), rep("A", 35), rep("B", 20)),
    batch = c(rep("X", 40), rep("Y", 35), rep("X", 20))
  )
  keep_cells <- downsampling(meta, n.size = 35, seed = 12345)
  
  # Display the selected indices
  print(keep_cells)
#>  [1] 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
#> [26] 66 67 68 69 70 71 72 73 74 75
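
As a follow-up, the returned indices can be used to subset the metadata. The sketch below is self-contained and does not call downsampling(); it assumes keep_cells holds the indices printed above (41 to 75). The names meta_sub and counts are introduced here for illustration only.

```r
# Same toy metadata as in the example above.
meta <- data.frame(
  label = c(rep("A", 40), rep("A", 35), rep("B", 20)),
  batch = c(rep("X", 40), rep("Y", 35), rep("X", 20))
)

# Assumption: these are the indices printed in the example output.
keep_cells <- 41:75

# Keep only the selected rows (cells).
meta_sub <- meta[keep_cells, , drop = FALSE]

# Count the retained cells per label-by-batch combination.
counts <- table(meta_sub$label, meta_sub$batch)
print(counts)
```

The same index vector could be used to subset the columns of an expression matrix, provided its columns are in the same order as the rows of the metadata.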