Find an "optimal" rank for a Non-Negative Matrix Factorization using cross-validation. Returns a data.frame with class nmfCrossValidate. Plot results using the plot class method.
Usage
crossValidate(data, k, reps = 3, n = 0.05, verbose = FALSE, ...)
# S3 method for nmfCrossValidate
plot(x, ...)
Arguments
- data
dense or sparse matrix of features in rows and samples in columns. Prefer matrix or Matrix::dgCMatrix, respectively
- k
array of factorization ranks to test
- reps
number of independent replicates to run
- n
fraction of values to handle as missing (default is 5%, or 0.05)
- verbose
should updates be displayed when each factorization is completed
- ...
parameters to RcppML::nmf, not including data or k
- x
nmfCrossValidate object, the result of crossValidate
Details
A random speckled pattern of values is masked during model fitting, and the mean squared error of prediction on the masked values is evaluated after the model has reached the desired tolerance. The rank at which the model achieves the lowest error (best prediction accuracy) is taken as the optimal rank.
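The masking procedure described above can be illustrated with a small base-R sketch. This is not the RcppML implementation: it assumes a simple weighted multiplicative-update NMF as a stand-in for RcppML::nmf, and the helpers masked_nmf and holdout_mse, the synthetic data, and the column names of the result are all hypothetical, chosen only for illustration.

```r
# Minimal base-R sketch of speckled-mask cross-validation for NMF.
# Assumption: weighted multiplicative updates stand in for RcppML::nmf.
set.seed(1)

# Synthetic non-negative data: a rank-3 product plus small uniform noise.
n_features <- 40
n_samples <- 30
true_rank <- 3
W0 <- matrix(runif(n_features * true_rank), n_features, true_rank)
H0 <- matrix(runif(true_rank * n_samples), true_rank, n_samples)
A <- W0 %*% H0 + matrix(runif(n_features * n_samples, 0, 0.05), n_features, n_samples)

# Hypothetical helper: fit NMF using only entries where mask == 1.
masked_nmf <- function(A, mask, k, iters = 200, eps = 1e-10) {
  W <- matrix(runif(nrow(A) * k), nrow(A), k)
  H <- matrix(runif(k * ncol(A)), k, ncol(A))
  MA <- mask * A  # masked entries contribute nothing to the updates
  for (i in seq_len(iters)) {
    W <- W * (MA %*% t(H)) / ((mask * (W %*% H)) %*% t(H) + eps)
    H <- H * (t(W) %*% MA) / (t(W) %*% (mask * (W %*% H)) + eps)
  }
  list(w = W, h = H)
}

# Hypothetical helper: mask a random fraction n of the values,
# fit on the rest, and return the MSE on the masked values.
holdout_mse <- function(A, k, n = 0.05) {
  held_out <- matrix(runif(length(A)) < n, nrow(A), ncol(A))
  fit <- masked_nmf(A, mask = 1 - held_out, k = k)
  pred <- fit$w %*% fit$h
  mean((A[held_out] - pred[held_out])^2)
}

# Average the held-out error over 3 replicates for each candidate rank.
ranks <- 1:6
mse <- sapply(ranks, function(k) mean(replicate(3, holdout_mse(A, k))))
cv <- data.frame(k = ranks, mse = mse)

# The rank with the lowest held-out error is taken as the "optimal" rank.
best_k <- cv$k[which.min(cv$mse)]
```

Because the masked values never influence the fit, their prediction error tends to rise once the rank is large enough for the model to start fitting noise, which is why the minimum of this curve is used to choose the rank. The data.frame built here only mirrors the idea of the result; the actual columns returned by crossValidate may differ.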