Use cross-validation to find the rank that minimizes the mean squared error of test set reconstruction.

Usage

cross_validate_nmf(
  A,
  ranks,
  n_replicates = 3,
  tol = 1e-04,
  maxit = 100,
  verbose = 1,
  L1 = 0.01,
  L2 = 0,
  threads = 0,
  test_density = 0.05
)

# S3 method for cross_validate_nmf_data
plot(x, ...)

Arguments

A

sparse matrix of data (ideally variance-stabilized), with genes as rows and cells as columns

ranks

a vector of ranks at which to fit a model and compute test set reconstruction error

n_replicates

number of random test sets to draw; each rank is fit once per test set

tol

tolerance of the fit (1e-5 for publication quality, 1e-3 for cross-validation)

maxit

maximum number of iterations

verbose

verbosity level

L1

L1/LASSO penalty to increase sparsity of model

L2

L2/Ridge penalty to increase angles between factors

threads

number of threads for parallelization across CPUs, 0 = use all available threads

test_density

fraction of values to include in the test set

x

the result of cross_validate_nmf

...

additional arguments (not implemented)
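
As a rough illustration of what test_density and ranks control, the sketch below holds out a random fraction of matrix entries, fits a low-rank model without them, and scores each rank by the mean squared error on the held-out entries only. It is a conceptual sketch in base R, not this function's implementation: the toy dense matrix, the mean imputation of held-out entries, and the truncated SVD used as a stand-in for NMF are all assumptions made to keep the example self-contained.

```r
set.seed(42)

# Toy non-negative, low-rank "genes x cells" matrix (dense here for simplicity)
k_true <- 3
A <- matrix(runif(100 * k_true), 100) %*% matrix(runif(k_true * 40), k_true)

# Flag roughly test_density of all entries as the held-out test set
test_density <- 0.05
is_test <- matrix(runif(length(A)) < test_density, nrow = nrow(A))

# Hide test entries from the fit (assumption: replace them with the training mean)
A_train <- A
A_train[is_test] <- mean(A[!is_test])

# Stand-in for a rank-k factorization: truncated SVD (NOT NMF)
test_mse <- function(k) {
  s <- svd(A_train, nu = k, nv = k)
  A_hat <- s$u %*% diag(s$d[1:k], k, k) %*% t(s$v)
  # Error is measured on held-out entries only
  mean((A[is_test] - A_hat[is_test])^2)
}

ranks <- 1:6
errors <- vapply(ranks, test_mse, numeric(1))
data.frame(rank = ranks, test_error = errors)
```

Repeating this with several independent random masks corresponds to the role of n_replicates: each replicate gives one error curve over ranks.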

Value

a data.frame of class cross_validate_nmf_data giving test set reconstruction error at each rank. Use the plot method to visualize the results, or use min to find the lowest error and the corresponding optimal rank.
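
A minimal sketch of picking the optimal rank from such a result is shown below. The column names rep, rank, and test_error are assumptions, since this page does not list the columns of the returned data.frame, and the values are made up for illustration only. Check names() on the actual result and adjust accordingly.

```r
# Hypothetical result: 3 replicates at 4 ranks (values are invented for illustration)
cv <- data.frame(
  rep = rep(1:3, each = 4),
  rank = rep(c(5, 10, 15, 20), times = 3),
  test_error = c(0.92, 0.81, 0.78, 0.80,
                 0.94, 0.80, 0.77, 0.79,
                 0.91, 0.82, 0.79, 0.81)
)

# Average the test error over replicates at each rank
mean_err <- aggregate(test_error ~ rank, data = cv, FUN = mean)

# The optimal rank is the one with the lowest mean test error
best_rank <- mean_err$rank[which.min(mean_err$test_error)]
best_rank
```

Averaging over replicates before taking the minimum keeps a single lucky test set from deciding the rank.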