Statistical Inference & Calibration
Move from hypothesis testing and likelihood methods to uncertainty quantification and probabilistic scoring.
- Step 1: Multiple hypothesis testing asks how to control false positives when many tests are run at once. False discovery rate control, especially the Benjamini–Hochberg procedure, limits the expected fraction of rejected hypotheses that are actually null and is usually less conservative than family-wise error control.
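A minimal sketch of the Benjamini–Hochberg step-up procedure in plain Python (the p-values below are made up for illustration): sort the p-values, find the largest rank k with p_(k) ≤ (k/m)·q, and reject every hypothesis at or below that rank.

```python
def benjamini_hochberg(pvals, q=0.05):
    # Step-up FDR control: find the largest rank k (1-indexed) such that
    # p_(k) <= (k/m) * q, then reject all hypotheses with rank <= k.
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.5, 0.9]
print(benjamini_hochberg(pvals, q=0.05))  # rejects only the first two
```

Note the step-up character: a p-value can be rejected even if it fails its own threshold, as long as some larger p-value passes its threshold.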
- Step 2: A likelihood ratio test compares how well two nested statistical models explain the same data by taking the ratio of their maximized likelihoods. Large likelihood-ratio statistics indicate that the larger model fits substantially better than the restricted one, and under regularity conditions the test statistic is asymptotically chi-squared.
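As a toy worked case (assuming Gaussian data with known variance 1), testing H0: mu = 0 against a free mean: the statistic 2·(ll1 − ll0) reduces to n·x̄², which is compared to a chi-squared distribution with one degree of freedom.

```python
def lrt_statistic(xs):
    # Gaussian log-likelihoods with sigma = 1; constant terms cancel
    # in the difference, so only the quadratic parts are needed.
    n = len(xs)
    xbar = sum(xs) / n
    ll0 = -0.5 * sum(x * x for x in xs)            # mu fixed at 0 (restricted)
    ll1 = -0.5 * sum((x - xbar) ** 2 for x in xs)  # mu at its MLE, xbar
    return 2 * (ll1 - ll0)                         # equals n * xbar**2 here

xs = [0.9, 1.1, 0.7, 1.3, 1.0, 0.8]
stat = lrt_statistic(xs)
print(round(stat, 3), stat > 3.841)  # 3.841 ~ 95th percentile of chi^2(1)
```

Since the statistic exceeds the 5% critical value, the restricted model (mean zero) would be rejected for this sample.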
- Step 3: Importance sampling estimates an expectation under a target distribution by drawing samples from a different proposal distribution and reweighting them. It is powerful when the proposal places more mass in the important regions of the integrand, but unstable weights can make the variance explode.
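A small illustration under stated assumptions: estimating the tail probability P(X > 3) for X ~ N(0, 1), where naive Monte Carlo rarely lands in the tail. Sampling instead from a proposal N(3, 1) centered on the region of interest and reweighting by the density ratio (which simplifies to exp(4.5 − 3x) for these two normals) gives a far lower-variance estimate.

```python
import math
import random

random.seed(0)

def tail_prob_is(n=100_000):
    # Estimate P(X > 3) for X ~ N(0,1) via a proposal N(3,1) that puts
    # mass in the tail, reweighting each sample by phi(x) / q(x).
    total = 0.0
    for _ in range(n):
        x = random.gauss(3.0, 1.0)  # draw from the proposal
        if x > 3.0:
            # density ratio of N(0,1) over N(3,1): exp(4.5 - 3x)
            total += math.exp(4.5 - 3.0 * x)
    return total / n

print(tail_prob_is())  # true value is about 0.00135
```

The indicator-times-weight average converges to the target expectation; the hazard noted above appears when the proposal's tails are lighter than the target's, making a few weights enormous.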
- Step 4: Bootstrap confidence intervals estimate uncertainty by resampling the observed dataset with replacement and recomputing the statistic many times. They are useful when analytic standard errors are awkward, but they inherit the sample's biases and can fail when the original sample is too small or unrepresentative.
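A minimal percentile-bootstrap sketch (the data and the 5000-replicate count are illustrative choices, not prescribed values): resample with replacement, recompute the statistic each time, and read the interval off the empirical percentiles of the replicates.

```python
import random

random.seed(1)

def bootstrap_ci(data, stat, n_boot=5000, alpha=0.05):
    # Percentile bootstrap: resample with replacement, recompute the
    # statistic, and take the alpha/2 and 1 - alpha/2 percentiles.
    reps = sorted(
        stat([random.choice(data) for _ in data]) for _ in range(n_boot)
    )
    lo = reps[int(alpha / 2 * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

data = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 5.2, 4.8, 5.0]
mean = lambda xs: sum(xs) / len(xs)
lo, hi = bootstrap_ci(data, mean)
print(round(lo, 2), round(hi, 2))
```

Because every replicate is drawn from the same ten observations, any quirk of that sample is baked into the interval, which is the failure mode flashcard 4 warns about.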
- Step 5: The Brier score measures the mean squared error of probabilistic predictions, so it rewards both correctness and calibration. Lower is better, and unlike accuracy it penalizes a confidently wrong 0.99 prediction much more than a cautious 0.6 prediction.
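The definition fits in one line, and the 0.99-versus-0.6 comparison from the card can be checked directly: when the true outcome is 0, the confident prediction costs 0.9801 while the cautious one costs only 0.36.

```python
def brier_score(probs, outcomes):
    # Mean squared error between predicted probabilities and 0/1 outcomes.
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

# A confidently wrong 0.99 is penalized far more than a cautious 0.6.
print(brier_score([0.99], [0]))  # 0.9801
print(brier_score([0.6], [0]))   # 0.36
```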