Project a factor matrix onto new data to solve for the complementary matrix. Given \(W\) and \(A\), find \(H\) such that \(||WH - A||^2\) is minimized. Alternatively, given \(H\) and \(A\), find \(W\) such that \(||WH - A||^2\) is minimized.
Arguments
- w
factor model matrix (n_features x k) for solving H. Set to NULL to solve for W.
- h
factor model matrix (k x n_samples) for solving W. Set to NULL to solve for H.
- A
data matrix. Can be dense (matrix) or sparse (dgCMatrix).
- L1
L1/LASSO penalty, length 1 or 2. Range [0, 1).
- L2
Ridge penalty, length 1 or 2. Range [0, Inf).
- loss
loss function:
"mse"(default) or others via NMF dispatch.- upper_bound
maximum value in solution, length 1 or 2. 0 = no bound.
- nonneg
non-negativity constraints, length 1 or 2.
- threads
number of threads for OpenMP parallelization (0 = all available).
- verbose
print progress information.
- ...
advanced parameters. See Advanced Parameters section.
Details
This function solves NNLS projection problems with full flexibility.
Projection Modes:
wprovided,h = NULL: Solve for H given W and A (standard projection)hprovided,w = NULL: Solve for W given H and A (transpose projection)Exactly one of
worhmust be NULL
Advanced Parameters (via ...)
L21L2,1 group sparsity penalty, length 1 or 2 (default
c(0,0))angularAngular decorrelation penalty, length 1 or 2 (default
c(0,0))cd_maxitMax coordinate descent iterations (default 100)
cd_tolCD stopping tolerance (default 1e-8)
warm_startOptional initial solution matrix
dispersionDispersion estimation mode:
"per_row"(default) or"global"theta_initInitial GP theta (default 0.1)
theta_maxMaximum GP theta (default 5.0)
theta_minMinimum GP theta (default 0.0)
nb_size_initInitial NB size parameter (default 10.0)
nb_size_maxMaximum NB size (default 1e6)
nb_size_minMinimum NB size (default 0.01)
gamma_phi_initInitial Gamma shape parameter (default 1.0)
gamma_phi_maxMaximum Gamma shape (default 1e4)
gamma_phi_minMinimum Gamma shape (default 1e-6)
tweedie_powerTweedie variance power (default 1.5)
irls_max_iterMaximum IRLS iterations per solve (default 5)
irls_tolIRLS convergence tolerance (default 1e-4)
Target Regularization
When target_H and target_lambda are provided (via ...),
nnls() routes the solve through a single NMF iteration internally.
Positive target_lambda attracts the solution toward target_H
(label enrichment). Negative target_lambda uses eigenvalue-projected
adversarial removal (PROJ_ADV) to suppress target-correlated structure
(batch removal). See vignette("guided-nmf") for details.
References
DeBruine, ZJ, Melcher, K, and Triche, TJ. (2021). "High-performance non-negative matrix factorization for large single-cell data." BioRXiv.
Franc, VC, Hlavac, VC, and Navara, M. (2005). "Sequential Coordinate-Wise Algorithm for the Non-negative Least Squares Problem."
Examples
# \donttest{
# Generate synthetic NMF problem
set.seed(123)
w <- matrix(runif(100), 20, 5) # 20 features, 5 factors
h_true <- matrix(runif(50), 5, 10) # 5 factors, 10 samples
A <- w %*% h_true + matrix(rnorm(200, 0, 0.1), 20, 10) # Add noise
# Project W onto new data to find H
h_recovered <- nnls(w = w, A = A)
cor(as.vector(h_true), as.vector(h_recovered))
#> [1] 0.9668001
# With L1 penalty for sparse H
h_sparse <- nnls(w = w, A = A, L1 = c(0, 0.1))
# }