
Find an "optimal" rank for a non-negative matrix factorization (NMF) using cross-validation. Returns a data.frame of class nmfCrossValidate. Plot the results with the plot method for that class.

Usage

crossValidate(data, k, reps = 3, n = 0.05, verbose = FALSE, ...)

# S3 method for nmfCrossValidate
plot(x, ...)

Arguments

data

dense or sparse matrix with features in rows and samples in columns; prefer matrix or Matrix::dgCMatrix, respectively

k

vector of factorization ranks to test

reps

number of independent replicates to run

n

fraction of values to mask and treat as missing during fitting (default 0.05, i.e. 5%)

verbose

logical; if TRUE, display an update each time a factorization completes

...

additional arguments passed to RcppML::nmf, other than data and k

x

nmfCrossValidate object, the result of crossValidate

Value

data.frame of class nmfCrossValidate with columns rep, k, and value
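
A minimal usage sketch follows, assuming an installed RcppML version that exports crossValidate with the signature shown under Usage. The simulated matrix A and the chosen ranks are illustrative assumptions, not part of the package; the call is guarded so the snippet does nothing if the function is unavailable.

```r
# Simulate a non-negative matrix (features in rows, samples in columns)
# with an underlying rank of 3. This data is purely illustrative.
set.seed(42)
w <- matrix(runif(100 * 3), nrow = 100, ncol = 3)
h <- matrix(runif(3 * 50), nrow = 3, ncol = 50)
A <- w %*% h + matrix(runif(100 * 50, 0, 0.05), nrow = 100, ncol = 50)

# Only run if the installed RcppML actually exports crossValidate
if (requireNamespace("RcppML", quietly = TRUE) &&
    "crossValidate" %in% getNamespaceExports("RcppML")) {
  # Test ranks 1 to 6, three replicates each, masking 5% of the values
  cv <- RcppML::crossValidate(A, k = 1:6, reps = 3, n = 0.05)
  str(cv)   # expect columns rep, k, and value
  plot(cv)  # dispatches to the nmfCrossValidate plot method
}
```

Any further arguments given after n would be forwarded to RcppML::nmf through the ... argument.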

Details

A random speckled pattern of values is masked during model fitting, and the mean squared error of prediction on the masked values is evaluated once the model has reached the desired tolerance. The rank at which the model achieves the lowest error (best prediction accuracy) is taken as the optimal rank.
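
The masking idea described above can be sketched in base R. This is an illustration only, not RcppML's solver: it substitutes simple masked multiplicative updates with a fixed iteration count for the package's fitting routine and tolerance check. The helper names masked_nmf and heldout_mse, the simulated data, and the iteration count are all hypothetical.

```r
# Illustrative sketch only: masked NMF by multiplicative updates in base R.
# Function names and settings here are hypothetical, not RcppML internals.
set.seed(1)

# Simulated non-negative 60 x 40 matrix with underlying rank 3 plus noise
W0 <- matrix(runif(60 * 3), nrow = 60, ncol = 3)
H0 <- matrix(runif(3 * 40), nrow = 3, ncol = 40)
A <- W0 %*% H0 + matrix(runif(60 * 40, 0, 0.05), nrow = 60, ncol = 40)

# Fit A ~ W %*% H using only the entries where keep == 1
masked_nmf <- function(A, keep, k, iters = 200, eps = 1e-9) {
  W <- matrix(runif(nrow(A) * k), nrow = nrow(A), ncol = k)
  H <- matrix(runif(k * ncol(A)), nrow = k, ncol = ncol(A))
  Ak <- A * keep  # masked entries contribute nothing to the updates
  for (i in seq_len(iters)) {
    H <- H * (t(W) %*% Ak) / (t(W) %*% (keep * (W %*% H)) + eps)
    W <- W * (Ak %*% t(H)) / ((keep * (W %*% H)) %*% t(H) + eps)
  }
  list(w = W, h = H)
}

# Mean squared error of prediction on a random held-out fraction n
heldout_mse <- function(A, k, n = 0.05) {
  keep <- matrix(as.numeric(runif(length(A)) >= n), nrow(A), ncol(A))
  fit <- masked_nmf(A, keep, k)
  test <- keep == 0
  mean((A[test] - (fit$w %*% fit$h)[test])^2)
}

# Evaluate each candidate rank and pick the one with the lowest error
ranks <- 1:6
cv <- data.frame(k = ranks,
                 value = sapply(ranks, function(k) heldout_mse(A, k)))
best_k <- cv$k[which.min(cv$value)]
```

Because the held-out entries never influence the fit, their error keeps falling only while extra factors capture real structure, which is why the minimum is a reasonable guide to rank.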

See also