
Statistical Decision Theory & Inference

Bridge the probabilistic and decision-theoretic ideas that connect Bayesian inference, calibrated prediction, and classical statistical learning.

Estimated time: ~75 min

  1. Step 1
    The law of total probability computes an event probability by summing over mutually exclusive, exhaustive cases. In machine learning it is the basic marginalization identity behind latent-variable models, mixture models, and many Bayesian calculations.
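The marginalization identity above can be seen in a toy latent-variable setup. A minimal sketch, assuming a hypothetical "bag of coins" model where the chosen coin is the latent variable:

```python
# Law of total probability: marginalize out a latent "which coin" variable.
# A coin is drawn from a bag: 30% chance it is fair, 70% chance it is
# biased toward heads. P(heads) sums over the exclusive, exhaustive cases.

prior = {"fair": 0.3, "biased": 0.7}          # P(Z = z), the latent variable
p_heads_given = {"fair": 0.5, "biased": 0.8}  # P(heads | Z = z)

# P(heads) = sum_z P(heads | Z = z) * P(Z = z)
p_heads = sum(prior[z] * p_heads_given[z] for z in prior)
print(p_heads)  # 0.3*0.5 + 0.7*0.8 = 0.71 (up to float rounding)
```

The same sum-over-cases is what a mixture model performs when it evaluates the marginal likelihood of an observation.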
  2. Step 2
    Conditional independence means two variables become unrelated once a third variable is known. It is the simplifying assumption that makes graphical models tractable and explains why conditioning can either remove dependence or, in collider structures, create it.
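The collider effect mentioned above can be demonstrated by simulation. A small sketch with two independent causes and their common effect (the variable names and probabilities are illustrative):

```python
import random

random.seed(0)
n = 100_000
xs = [random.random() < 0.5 for _ in range(n)]  # cause 1
ys = [random.random() < 0.5 for _ in range(n)]  # cause 2, independent of xs
cs = [x or y for x, y in zip(xs, ys)]           # collider: common effect C = X OR Y

def p(event, given):
    """Empirical P(event | given) from paired indicator lists."""
    sel = [e for e, g in zip(event, given) if g]
    return sum(sel) / len(sel)

# Marginally, X and Y are independent: P(X=1 | Y=1) is ~0.5, same as P(X=1).
print(p(xs, ys))
# Conditioning on the collider C=1 creates dependence between X and Y:
print(p(xs, [c and y for c, y in zip(cs, ys)]))        # P(X=1 | Y=1, C=1) ~ 0.5
print(p(xs, [c and not y for c, y in zip(cs, ys)]))    # P(X=1 | Y=0, C=1) = 1.0
```

Within the C = 1 slice, learning Y = 0 forces X = 1, so the two previously unrelated causes become dependent once their common effect is observed.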
  3. Step 3
    A sufficient statistic is a summary of the sample that retains all information about a parameter relevant for inference. This is why many classical models can replace an entire dataset with counts, sums, or means without changing the likelihood-based conclusions about the parameter.
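For a Bernoulli sample, the count of successes is sufficient: two datasets with the same count yield the same likelihood function. A minimal sketch (the sample values are illustrative):

```python
from math import prod

def bernoulli_likelihood(data, p):
    """Likelihood of an i.i.d. Bernoulli(p) sample of 0/1 outcomes."""
    return prod(p if x else (1 - p) for x in data)

# Two different orderings with the same sufficient statistic (sum = 3, n = 5).
a = [1, 1, 1, 0, 0]
b = [0, 1, 0, 1, 1]

# Both likelihoods equal p**3 * (1 - p)**2, so any likelihood-based
# inference about p is unchanged if we keep only the count.
for p in (0.2, 0.5, 0.9):
    print(p, bernoulli_likelihood(a, p), bernoulli_likelihood(b, p))
```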
  4. Step 4
    Bayes risk is the minimum achievable expected loss under the true data distribution, and the Bayes-optimal classifier attains it by minimizing posterior expected loss for each input. Under ordinary 0–1 loss, that rule becomes “predict the class with highest posterior probability.”
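The 0–1 rule can be computed directly for a toy discrete problem. A sketch, assuming made-up posteriors over three classes at two inputs:

```python
# Bayes-optimal prediction under 0-1 loss for a toy discrete problem.
# posterior[x] lists P(Y = y | X = x) for classes y = 0, 1, 2 (illustrative numbers).
posterior = {
    "x1": [0.7, 0.2, 0.1],
    "x2": [0.1, 0.5, 0.4],
}
p_x = {"x1": 0.6, "x2": 0.4}  # marginal P(X = x)

# Under 0-1 loss, the Bayes rule predicts the argmax class at each input,
# and the conditional risk at x is 1 - max_y P(Y = y | X = x).
bayes_rule = {x: max(range(3), key=lambda y: post[y]) for x, post in posterior.items()}
bayes_risk = sum(p_x[x] * (1 - max(posterior[x])) for x in p_x)

print(bayes_rule)  # {'x1': 0, 'x2': 1}
print(bayes_risk)  # 0.6*0.3 + 0.4*0.5 = 0.38 (up to float rounding)
```

No classifier can achieve expected 0–1 loss below this 0.38 on this distribution, which is what makes it the Bayes risk.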
  5. Step 5
    A scoring rule is proper if a forecaster minimizes expected score by reporting their true predictive distribution. Proper scoring rules matter because they reward honest, calibrated probabilities rather than merely getting the top-ranked class right.
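Properness can be checked numerically for the log score: over a grid of possible reports, the expected loss is smallest at the true probability. A sketch with an illustrative true probability of 0.7:

```python
from math import log

p_true = 0.7  # true probability of the event (illustrative)

def expected_log_loss(q):
    """Expected negative log score when reporting q for an event with prob p_true."""
    return -(p_true * log(q) + (1 - p_true) * log(1 - q))

# Search reported probabilities q in {0.01, ..., 0.99}.
reports = [i / 100 for i in range(1, 100)]
best = min(reports, key=expected_log_loss)
print(best)  # 0.7 — the honest report minimizes expected loss, so log loss is proper
```

A rule like "score = 1 if the argmax class is correct" would not have this property; it rewards confident reports of the most likely class rather than calibrated probabilities.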
  6. Step 6
    Surrogate losses replace hard-to-optimize 0–1 classification loss with tractable objectives such as logistic or hinge loss. A surrogate is classification-calibrated if optimizing it still drives the classifier toward the Bayes-optimal decision rule.
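Both claims in the final step can be checked numerically: the logistic loss (scaled to equal 1 at margin 0) upper-bounds the 0–1 loss, and minimizing its conditional risk recovers the Bayes decision sign. A sketch with an illustrative class probability eta = 0.8:

```python
from math import exp, log

def zero_one(margin):
    """0-1 loss as a function of the margin y * f(x)."""
    return 1.0 if margin <= 0 else 0.0

def logistic(margin):
    """Logistic surrogate, scaled so logistic(0) = 1 (an upper bound on 0-1 loss)."""
    return log(1 + exp(-margin)) / log(2)

# The surrogate dominates the 0-1 loss at every margin, so it is a valid
# convex relaxation to optimize instead.
for m in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert logistic(m) >= zero_one(m)

# Classification calibration: minimizing the conditional logistic risk at a
# point with eta = P(Y = +1 | x) = 0.8 gives a positive score, matching the
# Bayes-optimal decision sign (eta > 1/2 => predict +1).
eta = 0.8
def cond_risk(f):
    return eta * log(1 + exp(-f)) + (1 - eta) * log(1 + exp(f))

f_star = min((i / 100 for i in range(-500, 501)), key=cond_risk)
print(f_star)  # ~log(0.8 / 0.2) ~= 1.39 > 0, so the surrogate minimizer agrees with Bayes
```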