Compute a truncated SVD or PCA of a matrix using multiple algorithm backends. Supports sparse and dense matrices, implicit PCA centering, per-element regularization (L1/L2/non-negativity/bounds), Gram-level regularization (L2,1/angular/graph Laplacian), and automatic rank selection via speckled holdout cross-validation.
Usage
svd(
A,
k = 10,
tol = 1e-05,
maxit = 200,
center = FALSE,
scale = FALSE,
verbose = FALSE,
seed = NULL,
L1 = 0,
L2 = 0,
nonneg = FALSE,
upper_bound = 0,
test_fraction = 0,
mask = NULL,
robust = FALSE,
resource = "auto",
method = "auto",
...
)Arguments
- A
Input matrix. May be dense (
matrix), sparse (dgCMatrix), or a path to a.spzfile for out-of-core streaming SVD.- k
Number of factors (rank). Use
"auto"for automatic rank selection via cross-validation. Default: 10.- tol
Convergence tolerance per rank-1 subproblem. Default: 1e-5.
- maxit
Maximum ALS iterations per factor. Default: 200.
- center
If
TRUE, subtract row means (PCA mode). Default:FALSE.- scale
If
TRUE, divide each row by its standard deviation after centering. Default:FALSE.- verbose
Print per-factor diagnostics. Default:
FALSE.- seed
Random seed for initialization and cross-validation. Default: NULL.
- L1
L1 (lasso) penalty. Single value or length-2 vector. Default: 0.
- L2
L2 (ridge) penalty. Single value or length-2 vector. Default: 0.
- nonneg
Non-negativity constraints. Single logical or length-2 vector. Default:
FALSE.- upper_bound
Upper bound constraints. Single value or length-2 vector. Default: 0.
- test_fraction
Fraction of entries to hold out for cross-validation. Default: 0.
- mask
Masking control.
NULL(default, no masking),"zeros"(mask zero entries — only non-zero entries can be holdout in CV), adgCMatrix(custom mask matrix where non-zero entries are excluded), orlist("zeros", <dgCMatrix>)to combine both.- robust
Robustness control:
FALSE(default),TRUE(Huber delta=1.345),"mae"(near-MAE), or a positive numeric Huber delta.- resource
Compute backend:
"auto"(default),"cpu", or"gpu".- method
Algorithm:
"auto"(default),"deflation","krylov","lanczos","irlba","randomized".- ...
Advanced parameters. See Advanced Parameters section.
Value
An S4 object of class svd with slots:
uLeft singular vectors (scores), m x k matrix
dSingular values, length-k numeric vector
vRight singular vectors (loadings), n x k matrix
miscList with metadata:
centered,row_means,test_loss,iters_per_factor,wall_time_ms
Details
Multiple algorithms are available via the method parameter:
"deflation"Sequential rank-1 ALS with deflation correction. Supports all constraints/CV/auto-rank. Best for small k with mixed constraints.
"krylov"Krylov-Seeded Projected Refinement (KSPR). Block method: computes all k factors simultaneously via Lanczos seed + Gram-solve-then-project. Faster than deflation for larger k. Supports all regularization except robust. CV is evaluated post-convergence.
"lanczos"Unconstrained Lanczos bidiagonalization. Fast for unconstrained SVD. No regularization support.
"irlba"Implicitly Restarted Lanczos. Good general-purpose unconstrained SVD. No regularization support.
"randomized"Randomized SVD with power iterations. Fast approximate SVD for large matrices. No regularization support.
Note
This function shadows base::svd(). To use the base R version,
call base::svd() explicitly.
Auto-rank
When k = "auto", a speckled holdout set is created and test MSE is
evaluated after each factor. Rank selection stops when test MSE fails to
improve for patience consecutive factors. The returned model is
truncated to the best rank.
Advanced Parameters (via ...)
The following parameters can be passed via ...:
L21L2,1 (group sparsity) penalty. Single value or length-2 vector (default 0).
angularAngular (orthogonality) penalty. Single value or length-2 vector (default 0).
graph_USparse graph Laplacian for features (m x m). Default NULL.
graph_VSparse graph Laplacian for samples (n x n). Default NULL.
graph_lambdaGraph regularization strength. Single value or length-2 (default 0).
convergenceConvergence criterion:
"factor"(default) or"global".cv_seedSeparate seed for holdout mask. Default NULL.
patienceAuto-rank non-improving factor patience (default 3).
k_maxMaximum rank for auto-rank mode (default 50).
threadsNumber of OpenMP threads. 0 = all (default 0).
Examples
# \donttest{
library(RcppML)
data(iris)
A <- as.matrix(iris[, 1:4])
# Standard SVD (3 factors)
s <- svd(A, k = 3)
# PCA with centering (convenience wrapper)
p <- pca(A, k = 3)
# Auto-rank PCA
ar <- svd(A, k = "auto", center = TRUE, verbose = TRUE)
#> auto method: deflation (k=50, gpu=FALSE)
#> CV: 39 test entries held out (39 zeroed), denom_correction=0.9500
#> Factor 1: sigma=4.0891e+01 iters=4 test_mse=2.966425e+00
#> Factor 2: sigma=1.9770e+01 iters=10 test_mse=5.827467e+00
#> Factor 3: sigma=1.3789e+01 iters=6 test_mse=1.264707e+01
#> Factor 4: sigma=1.1120e+01 iters=2 test_mse=1.705114e+01
#> Auto-rank: patience exhausted at factor 4, best rank = 1 (test_mse=2.966425e+00)
# Robust PCA (outlier-resistant)
rp <- svd(A, k = 3, center = TRUE, robust = TRUE, verbose = TRUE)
#> auto method: deflation (k=3, gpu=FALSE)
#> Robust SVD: Huber delta=1.3450, IRLS reweighting active
#> Factor 1: sigma=4.0970e+01 iters=2
#> Factor 2: sigma=1.7062e+01 iters=3
#> Factor 3: sigma=2.1213e+00 iters=3
# }