Regularization
Additional loss terms that penalize model complexity, typically by adding weight magnitude penalties to prevent overfitting.
Regularization modifies the optimization objective beyond prediction error by adding terms based on the magnitudes of the model's weights. L2 regularization (ridge, also called weight decay) adds the sum of squared weights, pushing values toward zero without eliminating them. L1 regularization (lasso) adds the sum of absolute weight values, often driving many weights exactly to zero and thereby performing implicit feature selection. A regularization-strength hyperparameter (commonly λ) controls the tradeoff between fitting the training data and keeping the model simple.
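The penalty terms above can be sketched in a few lines of NumPy; `regularized_loss` and its `lam` parameter are illustrative names, not from any particular library:

```python
import numpy as np

def regularized_loss(w, X, y, lam=0.1, penalty="l2"):
    """Mean squared error plus a weight-magnitude penalty."""
    residual = X @ w - y
    mse = np.mean(residual ** 2)
    if penalty == "l2":
        # Ridge: sum of squared weights, shrinks values toward zero
        reg = lam * np.sum(w ** 2)
    else:
        # Lasso: sum of absolute weights, can zero weights out entirely
        reg = lam * np.sum(np.abs(w))
    return mse + reg
```

Raising `lam` increases the penalty's share of the objective, trading training fit for smaller weights; setting `lam=0` recovers the unregularized loss.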
Also known as
weight decay, L1 regularization, L2 regularization, lasso, ridge