Classical & Probabilistic Models
Study classical unsupervised, latent-variable, and sequential-state models that still shape modern machine learning.
Study this path with flashcards
7 cards
- Step 1A one-class SVM learns a boundary around mostly normal data by separating mapped training points from the origin with maximum margin in feature space. It is a classic novelty-detection method because it needs examples of the inlier class but not labeled anomalies.
- Step 2Anomaly detection identifies observations that look unlikely under the pattern of normal data. The main families are density-based methods, reconstruction-based methods, and one-class classification methods, and the right choice depends on whether you have labels, strong feature engineering, or only normal examples.
- Step 3Latent Dirichlet Allocation models each document as a mixture of latent topics and each topic as a distribution over words. It is a generative model for uncovering coarse semantic structure in bag-of-words corpora, not a modern contextual language model.
- Step 4Factor analysis models observed variables as linear combinations of a small number of latent factors plus variable-specific noise. It is useful when the goal is to explain covariance structure rather than merely reduce dimension, which is the key difference from PCA.
- Step 5The Kalman filter recursively estimates the hidden state of a linear Gaussian dynamical system by alternating a prediction step with a measurement update. It is optimal for that model class because the posterior remains Gaussian and is fully described by a mean and covariance.
- Step 6A particle filter approximates the posterior over a hidden state with weighted samples, or particles, instead of a single Gaussian. It is useful for nonlinear or non-Gaussian state-space models, but resampling and weight degeneracy are central practical issues.
- Step 7Canonical correlation analysis finds linear combinations of two random vectors that are maximally correlated with each other. It is the right tool when the question is about shared structure between two views of the same examples rather than variance within a single view.