Downsamples cells from each group for IDER-based similarity calculation.

Usage

downsampling(
  metadata,
  n.size = 35,
  seed = NULL,
  include = FALSE,
  replace = FALSE,
  lower.cutoff = 3
)

Arguments

metadata

A data frame with one row per cell and at least two columns: one for group labels and one for batch information (named 'label' and 'batch' in the example). Required.

n.size

Numeric value specifying the number of cells to use in each group. Default is 35.

seed

Numeric value to set the random seed for sampling. Default is NULL.

include

Logical value indicating whether to include groups that have fewer cells than n.size. Default is FALSE.

replace

Logical value specifying whether to sample with replacement if a group is smaller than n.size. Default is FALSE.

lower.cutoff

Numeric value indicating the minimum group size required for inclusion. Default is 3.
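
Before choosing n.size and lower.cutoff, it can help to see how many cells each group holds. The following is a minimal base R sketch, not part of the package. It assumes that a "group" is a label-batch combination and that the columns are named 'label' and 'batch' as in the example; 'sizes', 'n_cells', 'below_n_size' and 'below_cutoff' are hypothetical names chosen for illustration.

```r
# Hypothetical metadata with 'label' and 'batch' columns (one row per cell)
meta <- data.frame(
  label = c(rep("A", 40), rep("A", 35), rep("B", 20)),
  batch = c(rep("X", 40), rep("Y", 35), rep("X", 20))
)

n.size <- 35
lower.cutoff <- 3

# Count cells per label-batch combination (assumed definition of a group)
sizes <- as.data.frame(
  table(label = meta$label, batch = meta$batch),
  responseName = "n_cells"
)

# Drop combinations that do not occur in the data
sizes <- sizes[sizes$n_cells > 0, ]

# Flag how each group compares with the two thresholds
sizes$below_n_size <- sizes$n_cells < n.size
sizes$below_cutoff <- sizes$n_cells < lower.cutoff

print(sizes)
```

Groups flagged in 'below_n_size' are the ones affected by the include and replace arguments; groups flagged in 'below_cutoff' fall under the minimum size set by lower.cutoff.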

Value

A vector of numeric indices (or cell names) identifying the cells to keep for downstream computation.

Examples

  # 'meta' is a data frame with columns 'label' and 'batch'
  meta <- data.frame(
    label = c(rep("A", 40), rep("A", 35), rep("B", 20)),
    batch = c(rep("X", 40), rep("Y", 35), rep("X", 20))
  )
  keep_cells <- downsampling(meta, n.size = 35, seed = 12345)
  
  # Display the selected indices
  print(keep_cells)
#>  [1] 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
#> [26] 66 67 68 69 70 71 72 73 74 75
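
As a follow-up sketch, the returned indices can be used to subset the metadata before downstream computation. To stay self-contained, the snippet hard-codes the indices printed above (41 to 75) instead of calling downsampling(); 'meta_sub' is a hypothetical name used only for illustration.

```r
# Same example metadata as above
meta <- data.frame(
  label = c(rep("A", 40), rep("A", 35), rep("B", 20)),
  batch = c(rep("X", 40), rep("Y", 35), rep("X", 20))
)

# Indices copied from the printed output above, not recomputed here
keep_cells <- 41:75

# Keep only the selected cells
meta_sub <- meta[keep_cells, ]

# Tabulate the retained cells by label and batch
print(table(meta_sub$label, meta_sub$batch))
```

The same index vector could be applied to the columns of an expression matrix, assuming its columns are in the same cell order as the rows of the metadata.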