Understanding the Kolmogorov-Smirnov (K-S) Test: A Comprehensive Guide
The Kolmogorov-Smirnov test is one of the most useful and general nonparametric methods for comparing two samples, as it is sensitive to differences in both location and shape of the empirical cumulative distribution functions of the two samples.
What is a Kolmogorov-Smirnov (K-S) Test?
The Kolmogorov-Smirnov (K-S) test is a nonparametric statistical test used to determine whether there is a significant difference between the distributions of two samples. Unlike many statistical tests that focus on specific parameters like mean or variance, the K-S test compares the entire distributions, making it more comprehensive for detecting various types of differences.
Named after Russian mathematicians Andrey Kolmogorov and Nikolai Smirnov, this test has become a fundamental tool in statistical analysis across various fields, from finance to natural sciences.
Core Principles of the K-S Test
- Distribution Assessment: The K-S test is specifically designed to assess distributional differences between datasets.
- Maximum Distance Measurement: It measures the maximum vertical distance (supremum) between the cumulative distribution functions (CDFs) of two distributions.
- Similarity Indicator: The smaller this maximum distance, the more consistent the data are with both samples having been drawn from the same distribution.
- Statistical Testing: While often visualized as a "plot," the K-S test is primarily interpreted as a statistical test that provides a p-value to help determine whether the distributional differences are statistically significant.
Understanding Cumulative Distribution Functions (CDFs)
Before diving deeper into the K-S test, it's essential to understand what a cumulative distribution function (CDF) is:
A cumulative distribution function (CDF) describes the probability that a random variable X takes on a value less than or equal to x. For a continuous distribution, it's the integral of the probability density function (PDF) up to point x.
For a discrete dataset of observations, the empirical CDF (ECDF) is formed by arranging the data in ascending order and calculating the cumulative probability at each point. The ECDF step function jumps at each observed data point.
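To make this concrete, here is a minimal Python sketch of an ECDF, assuming NumPy is available; the function name and sample values are illustrative, not part of any particular library:

```python
import numpy as np

def ecdf(data):
    """Return sorted x-values and the ECDF evaluated at each of them.

    The ECDF at x is the fraction of observations <= x, so it jumps
    by 1/n at every sorted data point.
    """
    x = np.sort(data)
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y

# Example: ECDF of a small sample
sample = np.array([2.1, 0.4, 3.3, 1.7, 2.8])
x, y = ecdf(sample)
print(list(zip(x, y)))  # cumulative probability at each sorted observation
```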
Properties of CDFs:
- Always non-decreasing (moves from 0 to 1)
- Right-continuous
- Approaches 0 as x approaches negative infinity
- Approaches 1 as x approaches positive infinity
How the K-S Test Works
The K-S test works by finding the maximum absolute difference between two cumulative distribution functions. The test statistic, denoted as D, is defined as:
D = supₓ |F₁(x) − F₂(x)|

Where:
- sup denotes the supremum (maximum) of the set of distances
- F₁(x) is the CDF of the first distribution
- F₂(x) is the CDF of the second distribution
The null hypothesis (H₀) of the K-S test is that both samples are drawn from the same distribution. The alternative hypothesis (H₁) is that they come from different distributions.
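To make the definition concrete, the following sketch computes D directly from the two ECDFs and cross-checks it against SciPy's scipy.stats.ks_2samp; the generated samples are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample1 = rng.normal(loc=0.0, scale=1.0, size=200)
sample2 = rng.normal(loc=0.5, scale=1.0, size=200)

# Evaluate both ECDFs at every pooled observation; the supremum of
# |F1 - F2| is attained at one of these jump points.
pooled = np.sort(np.concatenate([sample1, sample2]))
ecdf1 = np.searchsorted(np.sort(sample1), pooled, side="right") / len(sample1)
ecdf2 = np.searchsorted(np.sort(sample2), pooled, side="right") / len(sample2)
d_manual = np.max(np.abs(ecdf1 - ecdf2))

# SciPy's two-sample K-S test computes the same statistic plus a p-value.
result = stats.ks_2samp(sample1, sample2)
print(f"manual D = {d_manual:.4f}, scipy D = {result.statistic:.4f}, p = {result.pvalue:.4g}")
```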
One-Sample vs. Two-Sample K-S Test
The K-S test comes in two flavors:
- One-sample K-S test: Compares a sample distribution to a reference theoretical distribution (like normal, uniform, etc.)
- Two-sample K-S test: Compares two empirical sample distributions
In this article, we'll focus primarily on the two-sample test, but the principles apply to both variations.
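For reference, here is how both variants might be run with SciPy; the distributions and parameters are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(loc=10.0, scale=2.0, size=300)

# One-sample test: compare the sample against a fully specified reference
# distribution, here N(10, 2). Caveat: if the reference parameters are
# estimated from the same data, the standard p-value is no longer valid
# (a Lilliefors-type correction would be needed).
one_sample = stats.kstest(data, "norm", args=(10.0, 2.0))
print("one-sample:", one_sample.statistic, one_sample.pvalue)

# Two-sample test: compare two empirical samples directly.
other = rng.normal(loc=10.0, scale=2.0, size=300)
two_sample = stats.ks_2samp(data, other)
print("two-sample:", two_sample.statistic, two_sample.pvalue)
```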
Interactive K-S Test Visualization
[Interactive visualization: configure Distribution 1, Distribution 2, and the sample settings to compare their ECDFs.]
How to Interpret the Results:
The K-S test compares the value of the test statistic D (the maximum vertical distance between CDFs) to a critical value. If D exceeds the critical value (or equivalently, if the p-value is less than α), we reject the null hypothesis that the samples come from the same distribution.
Try adjusting the parameters of each distribution to see how it affects the test results. Notice how the test becomes more sensitive as the sample size increases.
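One way to see this sample-size effect without the interactive demo is a small simulation. In the sketch below, the same 0.2-standard-deviation shift that is often missed at n = 50 is almost always detected at n = 5,000; exact p-values vary from run to run:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Same small location shift (0.2 standard deviations) at two sample sizes.
for n in (50, 5000):
    a = rng.normal(0.0, 1.0, size=n)
    b = rng.normal(0.2, 1.0, size=n)
    res = stats.ks_2samp(a, b)
    print(f"n = {n:5d}: D = {res.statistic:.3f}, p = {res.pvalue:.4f}")
# With n = 50 the shift is often not detected (p > 0.05); with n = 5000
# the same shift is almost always flagged as significant.
```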
K-S Test Critical Values
The critical value for the K-S test depends on the sample size and the significance level (α). For large samples, the two-sample critical value can be approximated as:

D_crit = c(α) · √((n₁ + n₂) / (n₁ · n₂))

Where n₁ and n₂ are the two sample sizes, and c(α) depends on the significance level (see the sketch after this list):
- For α = 0.1: c(α) = 1.22
- For α = 0.05: c(α) = 1.36
- For α = 0.01: c(α) = 1.63
- For α = 0.001: c(α) = 1.95
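To sanity-check these constants, here is a minimal sketch, assuming the standard large-sample approximation c(α) = √(−ln(α/2) / 2); the function name ks_critical_value is illustrative:

```python
import numpy as np

def ks_critical_value(n1, n2, alpha=0.05):
    """Large-sample approximation to the two-sample K-S critical value."""
    c_alpha = np.sqrt(-np.log(alpha / 2.0) / 2.0)
    return c_alpha * np.sqrt((n1 + n2) / (n1 * n2))

# Reproduce the c(alpha) constants from the list above.
for alpha in (0.1, 0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: c = {np.sqrt(-np.log(alpha / 2) / 2):.2f}")

# The critical value shrinks as samples grow, so smaller values of D
# become statistically significant.
print(ks_critical_value(100, 100), ks_critical_value(1000, 1000))
```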
Understanding Critical Values:
The critical value represents the threshold for the K-S statistic (D). If D is greater than the critical value, we reject the null hypothesis that the samples come from the same distribution.
As sample size increases, smaller differences can be detected, which is why the critical value decreases with larger samples.
Real-World Applications of the K-S Test
Application Insights:
The K-S test is widely used across different fields:
- Finance: Testing if market returns follow a normal distribution or comparing market behavior across different periods.
- Medicine: Comparing treatment outcomes between control and experimental groups.
- Quality Control: Determining if a manufacturing process has changed by comparing product measurements.
- Environmental Science: Comparing distributions of pollution levels from different sources or locations.
Advantages and Limitations of the K-S Test
Advantages
- Distribution-Free: As a nonparametric test, it doesn't assume that data comes from a specific distribution.
- Sensitivity to Shape: Detects differences in both location and shape of distributions, not just central tendency.
- Visual Interpretation: Can be easily visualized, making it accessible for non-statisticians.
- Applicable to Various Data Types: Works with continuous data without further assumptions; it can also be applied to discrete or ordinal data, where the standard test becomes conservative.
Limitations
- Sample Size Sensitivity: Less reliable for very small sample sizes.
- Two-Distribution Focus: Only compares two distributions at a time.
- Limited Insight: Tells you if distributions differ but not how they differ.
- Tail Sensitivity: Sometimes less sensitive to differences in the tails of distributions.
When to Use the K-S Test
The K-S test is particularly useful in scenarios where:
- You need to compare complete distributions, not just means or medians.
- You don't want to make assumptions about the underlying distributions of your data.
- You're looking for any kind of difference between distributions (location, spread, shape).
- You need a well-established statistical test with recognized power and validity.
While there are other tests for comparing distributions (like the Anderson-Darling test or the Chi-square test), the K-S test remains popular due to its simplicity, effectiveness, and straightforward interpretation.
Conclusion
The Kolmogorov-Smirnov test provides a robust method for determining whether two samples are drawn from the same distribution. By measuring the maximum distance between cumulative distribution functions, it offers a comprehensive way to assess distributional differences. Although the test is often accompanied by visualizations (the "K-S plot"), its primary output is a statistical result that helps researchers make evidence-based decisions about their data.
When applied correctly, the K-S test serves as a valuable tool in a statistician's toolbox, especially when comparing distributions without making assumptions about their parametric forms. As with any statistical test, it's important to understand both its strengths and limitations to ensure appropriate application and interpretation.