Statistics + Probability Formula Sheet

597 formulas. Zero scroll panic.

Q: What formulas are included in this statistics and probability formula bank?

It includes 597 K-12 statistics and probability formulas covering data displays, averages, spread, probability rules, combinatorics, distributions, inference, regression, sampling, errors, time series, and senior secondary extensions.

Q: Can I use this formula page for AP, IB, GCSE, IGCSE, CBSE, and A-Level-style revision?

Yes. The formula bank is organized for broad global K-12 revision and includes formulas commonly seen across AP Statistics, IB Mathematics, GCSE and IGCSE, CBSE or NCERT, Common Core, and senior secondary pathways.

Q: Why are some advanced formulas listed on a school formula page?

Some advanced formulas appear in AP, IB, A-Level-style courses, optional senior secondary statistics topics, or extension work. The grade-band labels help students choose the right level.

Q: Should students memorize every formula on this page?

No. Students should follow their teacher, syllabus, and exam-board formula sheet. This page is designed as a searchable revision and reference bank.

A searchable K-12 statistics and probability formula bank for data, averages, spread, probability, combinatorics, distributions, inference, regression, sampling, and exam-style helpers.

Formulas by grade band: K–2: 9 · 3–5: 35 · 6–8: 109 · 9–10: 198 · 11–12: 246

597 total formulas

19 topic units

5 grade bands

K-12 global scope

How to Use This Formula Bank

1. Pick a topic

Use the table of contents to jump to the unit you are studying.

2. Filter by level

Choose a grade band to narrow the bank to your course level.

3. Search fast

Type a formula name, symbol, distribution, or keyword.

4. Check the note

Use the note to confirm the formula context before applying it.

Unit 1

Data, Frequency Tables, Graphs, and Basic Descriptors

43 / 43 formulas

#1 K–2

Number of observations

\(n=\text{total number of data values}\)

Variables\(n\)Number of observations\(\text{total number of data values}\)count of all data values

Count every recorded value once.

#2 K–2

Frequency

\(f=\text{number of times a value or category occurs}\)

Variables\(f\)Frequency\(\text{number of times a value or category occurs}\)the named quantity shown in the formula

Used in tally charts and frequency tables.

#3 K–2

Total frequency

\(N=\sum f_i\)

Variables\(N\)Total frequency\(f_i\)frequency of class or category i

Add all class or category frequencies.

#4 K–2

Category total

\(T_c=\sum f_{\text{category }c}\)

Variables: \(T_c\)=category total, \(f_{\text{category }c}\)=frequency of each entry that belongs to category \(c\).

Count all observations in a category.

#5 K–2

Difference between two counts

\(D=a-b\)

Variables: \(D\)=difference between two counts, \(a\)=first count, \(b\)=second count.

Used in simple bar-chart comparison.

#6 K–2

Combined count

\(T=a+b+c+\cdots\)

Variables: \(T\)=combined count, \(a,b,c,\dots\)=the individual counts being added.

Used in pictographs and simple tables.

#7 K–2

More than comparison

\(\text{More}=a-b\)

Variables: \(\text{More}\)=how many more, \(a\)=larger count, \(b\)=smaller count.

Assume \(a>b\).

#8 K–2

Fewer than comparison

\(\text{Fewer}=b-a\)

Variables: \(\text{Fewer}\)=how many fewer, \(a\)=smaller count, \(b\)=larger count.

Assume \(b>a\).

#9 K–2

Pictograph value

\(\text{Total}=(\text{number of symbols})(\text{key value})\)

Variables\(\text{Total}\)Pictograph value\(\text{number of symbols}\)the named quantity shown in the formula\(\text{key value}\)the named quantity shown in the formula

Example: 1 icon represents 5 students.

#10 3–5

Relative frequency

\(\text{Relative frequency}=\frac{f}{N}\)

Variables\(\text{Relative frequency}\)Relative frequency\(N\)total count, population size, or total frequency\(f\)frequency or class frequency

Fraction of the total.

#11 3–5

Relative frequency percent

\(\text{Relative frequency percent}=\frac{f}{N}\times100\%\)

Variables\(\text{Relative frequency percent}\)Relative frequency percent\(N\)total count, population size, or total frequency\(f\)frequency or class frequency

Used in tables and bar charts.

#12 3–5

Cumulative frequency

\(CF_k=\sum_{i=1}^{k}f_i\)

Variables\(CF_k\)Cumulative frequency\(f_i\)frequency of class or category i\(CF\)cumulative frequency before or up to a class\(k\)class position, selected count, number of categories, or period length

Running total up to class \(k\).

#13 3–5

Cumulative relative frequency

\(CRF_k=\frac{CF_k}{N}\)

Variables\(CRF_k\)Cumulative relative frequency\(N\)total count, population size, or total frequency\(CF\)cumulative frequency before or up to a class

Running proportion.

#14 3–5

Cumulative percentage

\(CP_k=\frac{CF_k}{N}\times100\%\)

Variables\(CP_k\)Cumulative percentage\(N\)total count, population size, or total frequency\(CF\)cumulative frequency before or up to a class

Running percent.

#15 3–5

Pie-chart sector angle

\(\theta=\frac{f}{N}\times360^\circ\)

Variables\(\theta\)Pie-chart sector angle\(N\)total count, population size, or total frequency\(f\)frequency or class frequency

Used for circle graphs.

#16 3–5

Pie-chart sector percent

\(p=\frac{f}{N}\times100\%\)

Variables\(p\)Pie-chart sector percent\(N\)total count, population size, or total frequency\(f\)frequency or class frequency

Percent for one category.
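The frequency-table formulas #10 to #16 can be computed in one pass. The following is a minimal Python sketch, assuming the frequencies arrive as a plain list; the function name `frequency_table` and the dictionary keys are illustrative and not part of the formula bank.

```python
from itertools import accumulate

def frequency_table(freqs):
    """Build one row per category from a list of frequencies f_i."""
    total = sum(freqs)                     # N = sum of all f_i
    running = list(accumulate(freqs))      # CF_k = f_1 + ... + f_k
    rows = []
    for f, cf in zip(freqs, running):
        rows.append({
            "f": f,
            "relative": f / total,          # f / N
            "cumulative": cf,               # CF_k
            "cumulative_pct": 100 * cf / total,
            "angle": 360 * f / total,       # pie-chart sector angle in degrees
        })
    return rows

rows = frequency_table([5, 10, 5])
```

For the frequencies 5, 10, 5 the first category takes a quarter of the circle, so its sector angle is 90 degrees.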

#17 3–5

Bar height from scale

\(\text{Value}=(\text{number of scale units})(\text{scale value})\)

Variables\(\text{Value}\)Bar height from scale\(\text{number of scale units}\)the named quantity shown in the formula\(\text{scale value}\)the named quantity shown in the formula

Reads scaled bar charts.

#18 3–5

Line plot total

\(N=\sum \text{marks above all values}\)

Variables\(N\)Line plot total\(\text{marks above all values}\)the named quantity shown in the formula

Each mark represents one value.

#19 3–5

Frequency table total value

\(\sum fx\)

Variables: \(f\)=frequency of a value, \(x\)=the data value, \(\sum fx\)=sum of each value multiplied by its frequency.

Used before calculating a mean from a frequency table.

#20 6–8

Frequency density

\(\text{Frequency density}=\frac{f}{\text{class width}}\)

Variables\(\text{Frequency density}\)Frequency density\(\text{class width}\)width of the interval or class\(f\)frequency or class frequency

Used for histograms with unequal widths.

#21 6–8

Histogram frequency

\(f=(\text{frequency density})(\text{class width})\)

Variables: \(f\)=frequency of the class, \(\text{frequency density}\)=frequency per unit class width, \(\text{class width}\)=width of the interval or class.

Area of a histogram bar represents frequency.

#22 6–8

Class width

\(w=U-L\)

Variables\(w\)Class width\(L\)lower boundary, likelihood, or index value named by the formula\(U\)upper class boundary or upper endpoint named by the formula

Upper class boundary minus lower class boundary.
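Frequency density, histogram frequency, and class width fit together as bar height, bar area, and bar width. A short Python sketch, assuming the classes are given as a sorted list of boundaries (a hypothetical input layout chosen for illustration):

```python
def class_widths(boundaries):
    """w = U - L for each pair of adjacent class boundaries."""
    return [upper - lower for lower, upper in zip(boundaries, boundaries[1:])]

def frequency_densities(boundaries, freqs):
    """Frequency density = f / class width, one value per class."""
    return [f / w for f, w in zip(freqs, class_widths(boundaries))]

boundaries = [0, 10, 30, 40]      # classes 0-10, 10-30, 30-40 (unequal widths)
freqs = [5, 20, 8]
densities = frequency_densities(boundaries, freqs)

# Multiplying density by width recovers each class frequency (bar area).
recovered = [d * w for d, w in zip(densities, class_widths(boundaries))]
```

The middle class has the largest frequency but is twice as wide, so its bar is only twice as tall as the first bar rather than four times.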

#23 6–8

Class midpoint

\(m=\frac{L+U}{2}\)

Variables\(m\)Class midpoint\(L\)lower boundary, likelihood, or index value named by the formula\(U\)upper class boundary or upper endpoint named by the formula

Used for grouped-data estimates.

#24 6–8

Grouped total

\(N=\sum f_i\)

Variables\(N\)Grouped total\(f_i\)frequency of class or category i

Total observations in grouped data.

#25 6–8

Grouped sum estimate

\(\sum f_im_i\)

Variables\(f_i\)frequency of class or category i\(m_i\)midpoint of class i

Approximate total using midpoints.

#26 6–8

Frequency polygon point

\(\left(m_i,f_i\right)\)

Variables\(f_i\)frequency of class or category i\(m_i\)midpoint of class i

Plot class midpoint against frequency.

#27 6–8

Relative frequency polygon point

\(\left(m_i,\frac{f_i}{N}\right)\)

Variables\(N\)total count, population size, or total frequency\(f_i\)frequency of class or category i\(m_i\)midpoint of class i

Plot midpoint against relative frequency.

#28 6–8

Ogive point

\(\left(U_i,CF_i\right)\)

Variables: \(U_i\)=upper boundary of class \(i\), \(CF_i\)=cumulative frequency up to and including class \(i\).

Plot upper boundary against cumulative frequency.

#29 6–8

Percent ogive point

\(\left(U_i,\frac{CF_i}{N}\times100\%\right)\)

Variables: \(U_i\)=upper boundary of class \(i\), \(CF_i\)=cumulative frequency up to and including class \(i\), \(N\)=total frequency.

Used to estimate percentiles.

#30 6–8

Stem-and-leaf total

\(N=\sum \text{leaves}\)

Variables\(N\)Stem-and-leaf total\(\text{leaves}\)the named quantity shown in the formula

Each leaf usually represents one observation.

#31 6–8

Grouped class boundary midpoint

\(m_i=\frac{\text{lower boundary}+\text{upper boundary}}{2}\)

Variables\(m_i\)Grouped class boundary midpoint\(\text{lower boundary}\)the named quantity shown in the formula\(\text{upper boundary}\)the named quantity shown in the formula

Preferred for continuous grouped data.

#32 6–8

Scale factor for graph values

\(\text{Actual value}=(\text{graph reading})(\text{scale factor})\)

Variables\(\text{Actual value}\)Scale factor for graph values\(\text{graph reading}\)the named quantity shown in the formula\(\text{scale factor}\)the named quantity shown in the formula

Used with scaled graphs.

#33 6–8

Percentage error in graph reading

\(\text{Error \%}=\frac{\text{absolute error}}{\text{true value}}\times100\%\)

Variables\(\text{Error \%}\)Percentage error in graph reading\(\text{absolute error}\)absolute difference from the true value\(\text{true value}\)accepted or exact value

General measurement-statistics link.

#34 9–10

Data density in grouped interval

\(d_i=\frac{f_i}{w_i}\)

Variables: \(d_i\)=data density of class \(i\), \(f_i\)=frequency of class \(i\), \(w_i\)=width of class \(i\).

Alternative symbol for histogram density.

#35 9–10

Proportion in interval

\(p_i=\frac{f_i}{N}\)

Variables\(p_i\)Proportion in interval\(N\)total count, population size, or total frequency\(f_i\)frequency of class or category i

Distribution share for class \(i\).

#36 9–10

Expected class count from percentage

\(f_i=\frac{p_i}{100}N\)

Variables: \(f_i\)=expected count for class \(i\), \(N\)=total frequency, \(p_i\)=percentage of the total in class \(i\) (a percent value, not a proportion).

Used in grouped tables.

#37 9–10

Missing frequency from total

\(f_{\text{missing}}=N-\sum f_{\text{known}}\)

Variables\(f_{\text{missing}}\)Missing frequency from total\(\text{missing}\)the named quantity shown in the formula\(\text{known}\)the named quantity shown in the formula\(N\)total count, population size, or total frequency

Used in table completion.

#38 9–10

Weighted table total

\(T=\sum f_ix_i\)

Variables\(T\)Weighted table total\(x_i\)ith data value or observation\(f_i\)frequency of class or category i

Weighted sum from values and frequencies.

#39 9–10

Binned data midpoint estimate

\(x_i\approx m_i\)

Variables\(x_i\)Binned data midpoint estimate\(m_i\)midpoint of class i

Approximation used for grouped statistics.

#40 9–10

Cumulative frequency class location

\(CF_{\text{before}}<k\le CF_{\text{class}}\)

Variables: \(CF_{\text{before}}\)=cumulative frequency before the class, \(CF_{\text{class}}\)=cumulative frequency up to and including the class, \(k\)=position of the value being located (for example \(N/2\) for the median).

Locates median, quartile, or percentile class.

#41 9–10

Less-than cumulative frequency

\(CF_{<U_i}=\sum_{j\le i}f_j\)

Variables\(CF_{<U_i}\)Less-than cumulative frequency\(CF\)cumulative frequency before or up to a class

Cumulative count below upper boundary.

#42 9–10

More-than cumulative frequency

\(CF_{\ge L_i}=N-\sum_{j<i}f_j\)

Variables\(CF_{\ge L_i}\)More-than cumulative frequency\(N\)total count, population size, or total frequency\(CF\)cumulative frequency before or up to a class

Cumulative count at or above lower boundary.

#43 9–10

Frequency percentage angle reverse

\(f=\frac{\theta}{360^\circ}N\)

Variables: \(f\)=frequency recovered for the category, \(\theta\)=sector angle in degrees, \(N\)=total frequency.

Recover count from a pie-chart angle.

Unit 2

Measures of Central Tendency

38 / 38 formulas

#44 3–5

Arithmetic mean

\(\bar{x}=\frac{\sum x_i}{n}\)

Variables\(\bar{x}\)Arithmetic mean\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

Average of raw data.

#45 3–5

Mean as total divided by count

\(\text{Mean}=\frac{\text{total}}{\text{number of values}}\)

Variables\(\text{Mean}\)Mean as total divided by count\(\text{total}\)sum of the relevant values or counts\(\text{number of values}\)count of data values

Elementary form.

#46 3–5

Total from mean

\(\text{Total}=n\bar{x}\)

Variables\(\text{Total}\)Total from mean\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

Useful for missing-value problems.

#47 3–5

Mean after adding one value

\(\bar{x}_{new}=\frac{n\bar{x}+a}{n+1}\)

Variables: \(\bar{x}_{new}\)=mean after adding the value, \(\bar{x}\)=original mean, \(n\)=original number of values, \(a\)=value added.

Add value \(a\).

#48 3–5

Mean after removing one value

\(\bar{x}_{new}=\frac{n\bar{x}-a}{n-1}\)

Variables: \(\bar{x}_{new}\)=mean after removing the value, \(\bar{x}\)=original mean, \(n\)=original number of values, \(a\)=value removed.

Remove value \(a\).
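Formulas #46 to #48 all rest on the same idea: the total is \(n\bar{x}\), so adding or removing a value changes the total and the count together. A small Python sketch with illustrative function names:

```python
def mean_after_adding(n, mean, a):
    """New mean when value a joins n values whose mean is `mean`."""
    return (n * mean + a) / (n + 1)

def mean_after_removing(n, mean, a):
    """New mean when value a is removed from n values whose mean is `mean`."""
    if n <= 1:
        raise ValueError("need at least two values to remove one")
    return (n * mean - a) / (n - 1)

# Four values with mean 5 have total 20; adding 10 gives total 30 over 5 values.
added = mean_after_adding(4, 5, 10)
# Removing the 10 again returns the original mean.
removed = mean_after_removing(5, added, 10)
```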

#49 3–5

Median position for ordered data

\(\text{Median position}=\frac{n+1}{2}\)

Variables\(\text{Median position}\)Median position for ordered data\(n\)sample size, number of observations, or number of trials

If \(n\) is odd, this is a single data position. If \(n\) is even, it falls halfway between two positions, so average the two middle values.

#50 3–5

Median for odd \(n\)

\(\text{Median}=x_{\frac{n+1}{2}}\)

Variables\(\text{Median}\)Median for odd \(n\)\(n\)sample size, number of observations, or number of trials

After sorting.

#51 3–5

Median for even \(n\)

\(\text{Median}=\frac{x_{\frac{n}{2}}+x_{\frac{n}{2}+1}}{2}\)

Variables\(\text{Median}\)Median for even \(n\)\(n\)sample size, number of observations, or number of trials

After sorting.
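The odd and even cases of the median can be handled in a single function. A minimal Python sketch; note that the formulas use 1-based positions while Python lists are 0-based, so position \(\frac{n+1}{2}\) becomes index `n // 2`.

```python
def median(values):
    """Median of raw data: sort first, then pick or average the middle."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return data[mid]                      # position (n + 1) / 2
    return (data[mid - 1] + data[mid]) / 2    # positions n / 2 and n / 2 + 1

odd_case = median([7, 1, 5])        # sorted: 1, 5, 7
even_case = median([4, 1, 3, 2])    # sorted: 1, 2, 3, 4
```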

#52 3–5

Mode

\(\text{Mode}=\text{value with highest frequency}\)

Variables\(\text{Mode}\)Mode\(\text{value with highest frequency}\)the named quantity shown in the formula

Can be none, one, or multiple.

#53 3–5

Midrange

\(\text{Midrange}=\frac{\text{minimum}+\text{maximum}}{2}\)

Variables\(\text{Midrange}\)Midrange\(\text{minimum}\)the named quantity shown in the formula\(\text{maximum}\)the named quantity shown in the formula

Sometimes used in early data work.

#54 6–8

Frequency mean

\(\bar{x}=\frac{\sum f_ix_i}{\sum f_i}\)

Variables\(\bar{x}\)Frequency mean\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(f_i\)frequency of class or category i

Mean from frequency table.

#55 6–8

Grouped-data mean estimate

\(\bar{x}\approx\frac{\sum f_im_i}{\sum f_i}\)

Variables\(\bar{x}\)Grouped-data mean estimate\(x\)data value, outcome, or input value\(f_i\)frequency of class or category i\(m_i\)midpoint of class i

Use class midpoints \(m_i\).
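The frequency mean and the grouped-data estimate differ only in what plays the role of \(x_i\): exact values in the first case, class midpoints in the second. A Python sketch, assuming grouped classes are described by a sorted list of boundaries (an illustrative input layout):

```python
def frequency_mean(values, freqs):
    """Mean from a frequency table: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)

def grouped_mean(boundaries, freqs):
    """Estimated mean for grouped data, using class midpoints m = (L + U) / 2."""
    midpoints = [(lower + upper) / 2
                 for lower, upper in zip(boundaries, boundaries[1:])]
    return frequency_mean(midpoints, freqs)

exact = frequency_mean([1, 2, 3], [1, 2, 1])          # (1 + 4 + 3) / 4
estimate = grouped_mean([0, 10, 20, 30], [2, 5, 3])   # midpoints 5, 15, 25
```

The grouped result is an estimate because every value in a class is treated as if it sat at the midpoint.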

#56 6–8

Weighted mean

\(\bar{x}_w=\frac{\sum w_ix_i}{\sum w_i}\)

Variables\(\bar{x}_w\)Weighted mean\(\bar{x}\)sample mean or average of x-values\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(w_i\)weight for value i

Weights may be credits, marks, or frequencies.

#57 6–8

Combined mean

\(\bar{x}_{combined}=\frac{n_1\bar{x}_1+n_2\bar{x}_2}{n_1+n_2}\)

Variables: \(\bar{x}_{combined}\)=combined mean, \(n_1,n_2\)=sizes of the two groups, \(\bar{x}_1,\bar{x}_2\)=means of the two groups.

For two groups.

#58 6–8

Combined mean for many groups

\(\bar{x}_{combined}=\frac{\sum n_j\bar{x}_j}{\sum n_j}\)

Variables: \(\bar{x}_{combined}\)=combined mean, \(n_j\)=size of group \(j\), \(\bar{x}_j\)=mean of group \(j\).

For several groups.
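A combined mean weights each group mean by its group size, which is why it usually differs from the plain average of the group means. A short Python sketch with an illustrative function name:

```python
def combined_mean(sizes, means):
    """Combined mean of several groups: sum(n_j * mean_j) / sum(n_j)."""
    if len(sizes) != len(means):
        raise ValueError("sizes and means must have the same length")
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

# A class of 10 with mean 70 and a class of 30 with mean 80.
overall = combined_mean([10, 30], [70, 80])
naive = (70 + 80) / 2    # ignores group sizes, shown only for contrast
```

Here the combined mean is 77.5, closer to 80 because the larger group carries more weight.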

#59 6–8

Missing value from mean

\(x_{\text{missing}}=n\bar{x}-\sum x_{\text{known}}\)

Variables\(x_{\text{missing}}\)Missing value from mean\(\text{missing}\)the named quantity shown in the formula\(\text{known}\)the named quantity shown in the formula\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

One missing value.

#60 6–8

Missing frequency from mean

\(f_m=\frac{\bar{x}\sum f_{\text{known}}-\sum f_{\text{known}}x_{\text{known}}}{x_m-\bar{x}}\)

Variables: \(f_m\)=missing frequency, \(x_m\)=value whose frequency is missing, \(\bar{x}\)=given mean, \(f_{\text{known}}\)=known frequencies, \(x_{\text{known}}\)=values with known frequencies.

One missing frequency at value \(x_m\).
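The missing-frequency formula follows from writing the mean with the unknown included, \(\bar{x}=\frac{\sum f_{\text{known}}x_{\text{known}}+f_mx_m}{\sum f_{\text{known}}+f_m}\), and solving for \(f_m\). A Python sketch that applies it and then checks the answer by recomputing the mean; the function name and argument order are illustrative.

```python
def missing_frequency(mean, known_values, known_freqs, x_m):
    """Solve for the one unknown frequency at value x_m, given the mean."""
    if x_m == mean:
        raise ValueError("x_m equal to the mean leaves the frequency undetermined")
    sum_f = sum(known_freqs)
    sum_fx = sum(f * x for x, f in zip(known_values, known_freqs))
    return (mean * sum_f - sum_fx) / (x_m - mean)

# Values 10, 20, 30 with frequencies 2, ?, 3 and a stated mean of 21.
f_m = missing_frequency(21, [10, 30], [2, 3], 20)

# Check: recompute the mean with the recovered frequency included.
check = (10 * 2 + 20 * f_m + 30 * 3) / (2 + f_m + 3)
```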

#61 6–8

Median class condition

\(CF_{\text{before}}<\frac{N}{2}\le CF_{\text{median class}}\)

Variables\(\text{before}\)the named quantity shown in the formula\(\text{median class}\)the named quantity shown in the formula\(N\)total count, population size, or total frequency\(CF\)cumulative frequency before or up to a class

Grouped-data median class.

#62 6–8

Modal class

\(\text{Modal class}=\text{class with largest }f\)

Variables\(\text{Modal class}\)Modal class\(\text{class with largest}\)the named quantity shown in the formula\(f\)frequency or class frequency

Grouped-data mode class.

#63 9–10

Grouped median

\(\text{Median}=L+\left(\frac{\frac{N}{2}-CF}{f}\right)h\)

Variables\(\text{Median}\)Grouped median\(N\)total count, population size, or total frequency\(f\)frequency or class frequency\(CF\)cumulative frequency before or up to a class\(L\)lower boundary, likelihood, or index value named by the formula\(h\)class width, step size, or subscript named by the formula

\(L\)=lower boundary of the median class, \(CF\)=cumulative frequency before the median class, \(f\)=frequency of the median class, \(h\)=class width.

#64 9–10

Grouped mode

\(\text{Mode}=L+\frac{f_1-f_0}{2f_1-f_0-f_2}h\)

Variables\(\text{Mode}\)Grouped mode\(L\)lower boundary, likelihood, or index value named by the formula\(h\)class width, step size, or subscript named by the formula

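\(f_1\)=modal class frequency, \(f_0\)=frequency of the class before it, \(f_2\)=frequency of the class after it, \(L\)=lower boundary of the modal class, \(h\)=class width.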
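The grouped median and grouped mode formulas can be sketched in Python as follows. This assumes classes are given as a sorted list of boundaries, and it treats a missing neighbour of the modal class as frequency 0, which is one common convention rather than something the formula bank states.

```python
from itertools import accumulate

def grouped_median(boundaries, freqs):
    """Median = L + ((N/2 - CF) / f) * h, using the median class."""
    total = sum(freqs)
    if total <= 0:
        raise ValueError("need at least one observation")
    half = total / 2
    cumulative = list(accumulate(freqs))
    for i, cf in enumerate(cumulative):
        if half <= cf:      # first class with CF_before < N/2 <= CF_class
            cf_before = cumulative[i - 1] if i > 0 else 0
            width = boundaries[i + 1] - boundaries[i]
            return boundaries[i] + (half - cf_before) / freqs[i] * width

def grouped_mode(boundaries, freqs):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * h, using the modal class."""
    i = max(range(len(freqs)), key=lambda j: freqs[j])
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                 # assumed 0 at the edge
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0    # assumed 0 at the edge
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 = f0 + f2")
    width = boundaries[i + 1] - boundaries[i]
    return boundaries[i] + (f1 - f0) / denominator * width

boundaries = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 5]              # N = 30, cumulative 5, 13, 25, 30
med = grouped_median(boundaries, freqs)
mode = grouped_mode(boundaries, freqs)
```

Both results fall inside the 20 to 30 class, which is the median class and the modal class for this table.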

#65 9–10

Empirical relation

\(\text{Mode}\approx3(\text{Median})-2(\text{Mean})\)

Variables\(\text{Mode}\)Empirical relation\(\text{Median}\)the named quantity shown in the formula\(\text{Mean}\)the named quantity shown in the formula

Approximate relation for moderately skewed data.

#66 9–10

Mean using assumed mean

\(\bar{x}=A+\frac{\sum f_id_i}{\sum f_i}\)

Variables\(\bar{x}\)Mean using assumed mean\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(f_i\)frequency of class or category i\(A\)event, assumed mean, actual value, or starting value named by the formula\(d_i\)deviation, rank difference, or transformed value for item i

\(d_i=x_i-A\).

#67 9–10

Mean using step deviation

\(\bar{x}=A+h\frac{\sum f_iu_i}{\sum f_i}\)

Variables: \(\bar{x}\)=mean, \(A\)=assumed mean, \(h\)=class width used as the step size, \(f_i\)=frequency of class \(i\), \(x_i\)=value or midpoint of class \(i\), \(u_i\)=step-deviation coded value for class \(i\).

\(u_i=\frac{x_i-A}{h}\).
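The assumed-mean and step-deviation methods are shortcuts for hand calculation and give the same result as the direct frequency mean. A Python sketch that checks this on one table; the choice of \(A=25\) and \(h=10\) is an arbitrary example.

```python
def direct_mean(values, freqs):
    """Direct frequency mean: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)

def assumed_mean_method(values, freqs, A):
    """Mean = A + sum(f * d) / sum(f), with d = x - A."""
    return A + sum(f * (x - A) for x, f in zip(values, freqs)) / sum(freqs)

def step_deviation_method(values, freqs, A, h):
    """Mean = A + h * sum(f * u) / sum(f), with u = (x - A) / h."""
    return A + h * sum(f * (x - A) / h for x, f in zip(values, freqs)) / sum(freqs)

midpoints = [5, 15, 25, 35]
freqs = [5, 8, 12, 5]
m_direct = direct_mean(midpoints, freqs)
m_assumed = assumed_mean_method(midpoints, freqs, A=25)
m_step = step_deviation_method(midpoints, freqs, A=25, h=10)
```

All three values agree up to floating-point rounding, whatever \(A\) is chosen.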

#68 9–10

Trimmed mean

\(\bar{x}_{trim}=\frac{\sum x_{\text{remaining}}}{n-2k}\)

Variables\(\bar{x}_{trim}\)Trimmed mean\(\text{remaining}\)the named quantity shown in the formula\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials\(k\)class position, selected count, number of categories, or period length

Remove \(k\) smallest and \(k\) largest values.

#69 9–10

Winsorized mean

\(\bar{x}_{win}=\frac{\sum x_{\text{winsorized}}}{n}\)

Variables\(\bar{x}_{win}\)Winsorized mean\(\text{winsorized}\)the named quantity shown in the formula\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

Extreme values are capped.
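Trimmed and winsorized means both reduce the pull of extreme values, by dropping them or by capping them. A Python sketch, assuming that capping means replacing the \(k\) smallest values with the next smallest value and the \(k\) largest with the next largest; other capping rules exist.

```python
def trimmed_mean(values, k):
    """Drop the k smallest and k largest values, then average the rest."""
    data = sorted(values)
    n = len(data)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= 2k < n")
    remaining = data[k:n - k]
    return sum(remaining) / (n - 2 * k)

def winsorized_mean(values, k):
    """Cap the k smallest and k largest values, then average all n values."""
    data = sorted(values)
    n = len(data)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= 2k < n")
    low, high = data[k], data[n - k - 1]
    capped = [min(max(x, low), high) for x in data]
    return sum(capped) / n

scores = [1, 2, 3, 4, 100]
plain = sum(scores) / len(scores)      # pulled upward by the 100
trimmed = trimmed_mean(scores, 1)      # averages 2, 3, 4
winsorized = winsorized_mean(scores, 1)  # averages 2, 2, 3, 4, 4
```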

#70 9–10

Geometric mean

\(GM=\sqrt[n]{x_1x_2\cdots x_n}\)

Variables\(GM\)Geometric mean\(n\)sample size, number of observations, or number of trials

For positive values, growth factors, rates.

#71 9–10

Geometric mean via logs

\(GM=\exp\left(\frac{1}{n}\sum\ln x_i\right)\)

Variables\(GM\)Geometric mean via logs\(x_i\)ith data value or observation\(n\)sample size, number of observations, or number of trials

Positive values only.

#72 9–10

Harmonic mean

\(HM=\frac{n}{\sum\frac{1}{x_i}}\)

Variables\(HM\)Harmonic mean\(x_i\)ith data value or observation\(n\)sample size, number of observations, or number of trials

Useful for average rates.

#73 9–10

Weighted geometric mean

\(GM_w=\prod x_i^{w_i/\sum w_i}\)

Variables\(GM_w\)Weighted geometric mean\(x_i\)ith data value or observation\(w_i\)weight for value i

Positive values only.

#74 9–10

Weighted harmonic mean

\(HM_w=\frac{\sum w_i}{\sum\frac{w_i}{x_i}}\)

Variables\(HM_w\)Weighted harmonic mean\(x_i\)ith data value or observation\(w_i\)weight for value i

Positive values only.
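The geometric and harmonic means above can be sketched in Python as follows. The function names are illustrative; Python's standard `statistics` module also offers `geometric_mean` and `harmonic_mean`, but the versions below follow the formulas on this page term by term.

```python
import math

def geometric_mean(values):
    """GM = exp(mean of ln x); requires every value to be positive."""
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs positive values")
    return math.exp(sum(math.log(x) for x in values) / len(values))

def harmonic_mean(values):
    """HM = n / sum(1 / x); requires every value to be positive."""
    if any(x <= 0 for x in values):
        raise ValueError("harmonic mean needs positive values")
    return len(values) / sum(1 / x for x in values)

def weighted_harmonic_mean(values, weights):
    """HM_w = sum(w) / sum(w / x)."""
    return sum(weights) / sum(w / x for x, w in zip(values, weights))

gm = geometric_mean([2, 8])       # square root of 16
hm = harmonic_mean([40, 60])      # average speed over two equal distances
whm = weighted_harmonic_mean([40, 60], [1, 1])
```

With equal weights the weighted harmonic mean reduces to the ordinary harmonic mean.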

#75 11–12

Population mean

\(\mu=\frac{1}{N}\sum_{i=1}^{N}x_i\)

Variables\(\mu\)Population mean\(x_i\)ith data value or observation\(N\)total count, population size, or total frequency

Population parameter.

#76 11–12

Sample mean

\(\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i\)

Variables\(\bar{x}\)Sample mean\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

Sample statistic.

#77 11–12

Expected value as mean

\(E(X)=\sum x\,P(X=x)\)

Variables\(E(X)\)Expected value as mean\(x\)data value, outcome, or input value\(X,Y,Z\)random variables or standardized variables used in the formula

Discrete random variable.

#78 11–12

Continuous expected value

\(E(X)=\int_{-\infty}^{\infty}x f(x)\,dx\)

Variables: \(E(X)\)=expected value of \(X\), \(X\)=continuous random variable, \(x\)=a value of \(X\), \(f(x)\)=probability density function of \(X\).

Continuous random variable.

#79 11–12

Centering identity

\(\sum(x_i-\bar{x})=0\)

Variables\(\bar{x}\)sample mean or average of x-values\(x_i\)ith data value or observation\(x\)data value, outcome, or input value

Deviations from the mean sum to zero.

#80 11–12

Grand mean

\(\bar{x}_{..}=\frac{\sum_{j=1}^{g}\sum_{i=1}^{n_j}x_{ij}}{\sum_{j=1}^{g}n_j}\)

Variables: \(\bar{x}_{..}\)=grand mean, \(x_{ij}\)=observation \(i\) in group \(j\), \(n_j\)=number of observations in group \(j\), \(g\)=number of groups.

Used in grouped or ANOVA settings.

#81 11–12

Weighted grand mean

\(\bar{x}_{..}=\frac{\sum n_j\bar{x}_j}{\sum n_j}\)

Variables\(\bar{x}_{..}\)Weighted grand mean\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value

Equivalent to combined mean.

Unit 3

Measures of Position: Quartiles, Percentiles, Standard Scores

35 / 35 formulas

#82 3–5

Minimum

\(\min(x)=\text{smallest value}\)

Variables\(\min(x)\)Minimum\(\text{smallest value}\)the named quantity shown in the formula\(x\)data value, outcome, or input value

Lowest observation.

#83 3–5

Maximum

\(\max(x)=\text{largest value}\)

Variables\(\max(x)\)Maximum\(\text{largest value}\)the named quantity shown in the formula\(x\)data value, outcome, or input value

Highest observation.

#84 6–8

Lower quartile position

\(Q_1\text{ position}=\frac{n+1}{4}\)

Variables\(Q_1\text{ position}\)Lower quartile position\(\text{position}\)the named quantity shown in the formula\(n\)sample size, number of observations, or number of trials

One common school convention.

#85 6–8

Upper quartile position

\(Q_3\text{ position}=\frac{3(n+1)}{4}\)

Variables\(Q_3\text{ position}\)Upper quartile position\(\text{position}\)the named quantity shown in the formula\(n\)sample size, number of observations, or number of trials

One common school convention.

#86 6–8

Percentile position

\(P_k\text{ position}=\frac{k}{100}(n+1)\)

Variables\(P_k\text{ position}\)Percentile position\(\text{position}\)the named quantity shown in the formula\(n\)sample size, number of observations, or number of trials\(k\)class position, selected count, number of categories, or period length

One common convention.

#87 6–8

Decile position

\(D_k\text{ position}=\frac{k}{10}(n+1)\)

Variables\(D_k\text{ position}\)Decile position\(\text{position}\)the named quantity shown in the formula\(n\)sample size, number of observations, or number of trials\(k\)class position, selected count, number of categories, or period length

One common convention.

#88 6–8

Median as second quartile

\(Q_2=\text{Median}\)

Variables\(Q_2\)Median as second quartile\(\text{Median}\)the named quantity shown in the formula

Middle quartile.

#89 6–8

Interquartile range

\(IQR=Q_3-Q_1\)

Variables\(IQR\)Interquartile range

Middle 50 percent spread.

#90 6–8

Semi-interquartile range

\(SIQR=\frac{Q_3-Q_1}{2}\)

Variables\(SIQR\)Semi-interquartile range

Also called quartile deviation.

#91 6–8

Five-number summary

\(\{\min,Q_1,Q_2,Q_3,\max\}\)

Variables: \(\min\)=smallest value, \(Q_1\)=lower quartile, \(Q_2\)=median, \(Q_3\)=upper quartile, \(\max\)=largest value.

Used for box plots.

#92 6–8

Lower fence

\(LF=Q_1-1.5(IQR)\)

Variables\(LF\)Lower fence

Box-plot outlier rule.

#93 6–8

Upper fence

\(UF=Q_3+1.5(IQR)\)

Variables\(UF\)Upper fence

Box-plot outlier rule.

#94 6–8

Extreme lower fence

\(ELF=Q_1-3(IQR)\)

Variables\(ELF\)Extreme lower fence

Extreme outlier rule.

#95 6–8

Extreme upper fence

\(EUF=Q_3+3(IQR)\)

Variables\(EUF\)Extreme upper fence

Extreme outlier rule.
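The quartile positions, five-number summary, and 1.5 IQR fences can be combined into one routine. The sketch below uses the \((n+1)\) position convention from this unit and assumes linear interpolation when a position falls between two data values; the formula bank does not state an interpolation rule, and other conventions give slightly different quartiles.

```python
def value_at(sorted_data, position):
    """Value at a 1-based position, interpolating linearly between neighbours."""
    n = len(sorted_data)
    position = min(max(position, 1), n)   # keep the position inside the data
    whole = int(position)
    fraction = position - whole
    if whole == n:
        return sorted_data[-1]
    below, above = sorted_data[whole - 1], sorted_data[whole]
    return below + fraction * (above - below)

def box_plot_summary(values):
    """Five-number summary, IQR, fences, and values outside the fences."""
    data = sorted(values)
    n = len(data)
    q1 = value_at(data, (n + 1) / 4)
    q2 = value_at(data, (n + 1) / 2)
    q3 = value_at(data, 3 * (n + 1) / 4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "five_number": (data[0], q1, q2, q3, data[-1]),
        "iqr": iqr,
        "fences": (lower_fence, upper_fence),
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }

summary = box_plot_summary([1, 2, 3, 4, 5, 6, 20])
```

For these seven values the quartile positions are 2, 4, and 6, so no interpolation is needed and 20 lies above the upper fence of 12.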

#96 9–10

Grouped quartile

\(Q_k=L+\left(\frac{\frac{kN}{4}-CF}{f}\right)h\)

Variables\(Q_k\)Grouped quartile\(f\)frequency or class frequency\(CF\)cumulative frequency before or up to a class\(L\)lower boundary, likelihood, or index value named by the formula\(h\)class width, step size, or subscript named by the formula\(k\)class position, selected count, number of categories, or period length

For \(k=1,2,3\). \(L\), \(CF\), \(f\), and \(h\) refer to the class that contains position \(\frac{kN}{4}\).

#97 9–10

Grouped decile

\(D_k=L+\left(\frac{\frac{kN}{10}-CF}{f}\right)h\)

Variables\(D_k\)Grouped decile\(f\)frequency or class frequency\(CF\)cumulative frequency before or up to a class\(L\)lower boundary, likelihood, or index value named by the formula\(h\)class width, step size, or subscript named by the formula\(k\)class position, selected count, number of categories, or period length

For \(k=1,\dots,9\). \(L\), \(CF\), \(f\), and \(h\) refer to the class that contains position \(\frac{kN}{10}\).

#98 9–10

Grouped percentile

\(P_k=L+\left(\frac{\frac{kN}{100}-CF}{f}\right)h\)

Variables\(P_k\)Grouped percentile\(f\)frequency or class frequency\(CF\)cumulative frequency before or up to a class\(L\)lower boundary, likelihood, or index value named by the formula\(h\)class width, step size, or subscript named by the formula\(k\)class position, selected count, number of categories, or period length

For \(k=1,\dots,99\). \(L\), \(CF\), \(f\), and \(h\) refer to the class that contains position \(\frac{kN}{100}\).
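Grouped quartiles, deciles, and percentiles share one pattern: only the divisor 4, 10, or 100 changes. A Python sketch with a single hypothetical helper, `grouped_quantile`, where `parts` is that divisor and classes are given as a sorted list of boundaries:

```python
from itertools import accumulate

def grouped_quantile(boundaries, freqs, k, parts):
    """L + ((k*N/parts - CF) / f) * h for the class containing position k*N/parts."""
    if not 0 < k < parts:
        raise ValueError("k must lie strictly between 0 and parts")
    total = sum(freqs)
    target = k * total / parts
    cumulative = list(accumulate(freqs))
    for i, cf in enumerate(cumulative):
        if target <= cf:    # first class with CF_before < target <= CF_class
            cf_before = cumulative[i - 1] if i > 0 else 0
            width = boundaries[i + 1] - boundaries[i]
            return boundaries[i] + (target - cf_before) / freqs[i] * width

boundaries = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 5]                                  # N = 30
q1 = grouped_quantile(boundaries, freqs, 1, 4)         # lower quartile
d5 = grouped_quantile(boundaries, freqs, 5, 10)        # fifth decile
p90 = grouped_quantile(boundaries, freqs, 90, 100)     # 90th percentile
```

The fifth decile equals the grouped median, since both locate position \(N/2\).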

#99 9–10

Percentile rank

\(PR=\frac{\#\text{ values below }x+0.5(\#\text{ values equal }x)}{n}\times100\)

Variables\(PR\)Percentile rank\(\text{values below}\)the named quantity shown in the formula\(\text{values equal}\)the named quantity shown in the formula\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

Common rank formula.
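The percentile rank formula counts values strictly below \(x\) in full and values equal to \(x\) at half weight. A minimal Python sketch with an illustrative function name:

```python
def percentile_rank(values, x):
    """PR = (count below x + 0.5 * count equal to x) / n * 100."""
    below = sum(1 for v in values if v < x)
    equal = sum(1 for v in values if v == x)
    return 100 * (below + 0.5 * equal) / len(values)

scores = [50, 60, 70, 70, 80]
rank_of_70 = percentile_rank(scores, 70)   # 2 below, 2 equal, n = 5
```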

#100 9–10

Rank from percentile

\(R=\frac{p}{100}(n+1)\)

Variables\(R\)Rank from percentile\(n\)sample size, number of observations, or number of trials\(p\)probability, population proportion, or success probability

Another school convention.

#101 9–10

Z-score

\(z=\frac{x-\mu}{\sigma}\)

Variables\(z\)Z-score\(\mu\)population mean\(\sigma\)population standard deviation\(x\)data value, outcome, or input value

Population standard score.

#102 9–10

Sample z-score

\(z=\frac{x-\bar{x}}{s}\)

Variables\(z\)Sample z-score\(\bar{x}\)sample mean or average of x-values\(s\)sample standard deviation\(x\)data value, outcome, or input value

Uses sample mean and sample standard deviation.

#103 9–10

T-score transformation

\(T=50+10z\)

Variables\(T\)T-score transformation\(z\)standard score or normal critical value

Common standardized score scale.

#104 9–10

IQ-style score transformation

\(S=100+15z\)

Variables\(S\)IQ-style score transformation\(z\)standard score or normal critical value

Example standard-score scale.

#105 9–10

Stanine score approximation

\(\text{Stanine}\approx2z+5\)

Variables\(\text{Stanine}\)Stanine score approximation\(z\)standard score or normal critical value

Usually rounded and bounded from 1 to 9.

#106 9–10

Normal percentile

\(\text{Percentile}=\Phi(z)\times100\%\)

Variables\(\text{Percentile}\)Normal percentile\(z\)standard score or normal critical value\(\Phi\)standard normal cumulative distribution function

\(\Phi\) is standard normal CDF.

#107 9–10

Value from z-score

\(x=\mu+z\sigma\)

Variables: \(x\)=value recovered from the z-score, \(\mu\)=population mean, \(\sigma\)=population standard deviation, \(z\)=standard score.

Reverse standardization.

#108 9–10

Value from sample z-score

\(x=\bar{x}+zs\)

Variables: \(x\)=value recovered from the z-score, \(\bar{x}\)=sample mean, \(s\)=sample standard deviation, \(z\)=standard score.

Reverse standardization.
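Standardizing, rescaling, and reversing a z-score are all linear steps, and the normal percentile needs only the standard normal CDF \(\Phi\). A Python sketch that builds \(\Phi\) from the standard library's `math.erf`; the function names are illustrative.

```python
import math

def z_score(x, mean, sd):
    """z = (x - mean) / sd."""
    return (x - mean) / sd

def value_from_z(z, mean, sd):
    """Reverse standardization: x = mean + z * sd."""
    return mean + z * sd

def normal_cdf(z):
    """Standard normal CDF: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

z = z_score(130, 100, 15)           # an IQ-style score of 130
t_score = 50 + 10 * z               # T-score scale
back = value_from_z(z, 100, 15)     # recovers the original score
percentile = normal_cdf(z) * 100    # normal percentile, in percent
```

A score two standard deviations above the mean sits at roughly the 97.7th percentile of a normal distribution.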

#109 11–12

Standardization of random variable

\(Z=\frac{X-\mu}{\sigma}\)

Variables\(Z\)Standardization of random variable\(\mu\)population mean\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula

Transforms to mean 0 and standard deviation 1.

#110 11–12

Chebyshev lower bound

\(P(\lvert X-\mu\rvert<k\sigma)\ge1-\frac{1}{k^2}\)

Variables: \(X\)=random variable, \(\mu\)=mean of \(X\), \(\sigma\)=standard deviation of \(X\), \(k\)=number of standard deviations from the mean.

For \(k>1\).

#111 11–12

Chebyshev outside bound

\(P(\lvert X-\mu\rvert\ge k\sigma)\le\frac{1}{k^2}\)

Variables: \(X\)=random variable, \(\mu\)=mean of \(X\), \(\sigma\)=standard deviation of \(X\), \(k\)=number of standard deviations from the mean.

For \(k>0\).

#112 11–12

Empirical rule 68%

\(P(\mu-\sigma<X<\mu+\sigma)\approx0.68\)

Variables\(P(\mu-\sigma<X<\mu+\sigma)\)Empirical rule 68%\(\mu\)population mean\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula

Approximately normal data.

#113 11–12

Empirical rule 95%

\(P(\mu-2\sigma<X<\mu+2\sigma)\approx0.95\)

Variables\(P(\mu-2\sigma<X<\mu+2\sigma)\)Empirical rule 95%\(\mu\)population mean\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula

Approximately normal data.

#114 11–12

Empirical rule 99.7%

\(P(\mu-3\sigma<X<\mu+3\sigma)\approx0.997\)

Variables\(P(\mu-3\sigma<X<\mu+3\sigma)\)Empirical rule 99.7%\(\mu\)population mean\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula

Approximately normal data.

#115 11–12

Normal interquartile range

\(IQR\approx1.349\sigma\)

Variables: \(IQR\)=interquartile range of the normal distribution, \(\sigma\)=population standard deviation.

For a normal distribution.

#116 11–12

Normal quartile deviation

\(Q_3-\mu\approx0.674\sigma\)

Variables: \(Q_3\)=upper quartile, \(\mu\)=population mean, \(\sigma\)=population standard deviation.

For a normal distribution.
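The empirical-rule percentages, the normal quartile constants, and the Chebyshev bound can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the helper names `within` and `chebyshev_lower_bound` are illustrative.

```python
from statistics import NormalDist

standard = NormalDist()    # mean 0, standard deviation 1

def within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for a normal distribution."""
    return standard.cdf(k) - standard.cdf(-k)

def chebyshev_lower_bound(k):
    """Chebyshev: P(|X - mu| < k*sigma) >= 1 - 1/k^2, for any distribution."""
    return 1 - 1 / k ** 2

normal_probs = [within(k) for k in (1, 2, 3)]       # about 0.68, 0.95, 0.997
bounds = [chebyshev_lower_bound(k) for k in (2, 3)]

q3_offset = standard.inv_cdf(0.75)                  # (Q3 - mu) / sigma
iqr_in_sigmas = standard.inv_cdf(0.75) - standard.inv_cdf(0.25)
```

The Chebyshev bounds are much lower than the normal probabilities because they must hold for every distribution with a finite variance, not only the normal one.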

Unit 4

Measures of Spread, Dispersion, and Variation

41 / 41 formulas

#117 3–5

Range

\(R=\max-\min\)

Variables\(R\)Range

Basic spread.

#118 3–5

Deviation from mean

\(d_i=x_i-\bar{x}\)

Variables\(d_i\)Deviation from mean\(\bar{x}\)sample mean or average of x-values\(x_i\)ith data value or observation\(x\)data value, outcome, or input value

Individual difference from mean.

#119 3–5

Absolute deviation

\(\lvert d_i\rvert=\lvert x_i-\bar{x}\rvert\)

Variables\(\lvert d_i\rvert\)Absolute deviation\(\bar{x}\)sample mean or average of x-values\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(d_i\)deviation, rank difference, or transformed value for item i

Distance from mean.

#120 6–8

Mean absolute deviation

\(MAD=\frac{\sum \lvert x_i-\bar{x}\rvert}{n}\)

Variables\(MAD\)Mean absolute deviation\(\bar{x}\)sample mean or average of x-values\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

Average absolute distance from mean.

#121 6–8

Frequency mean absolute deviation

\(MAD=\frac{\sum f_i\lvert x_i-\bar{x}\rvert}{\sum f_i}\)

Variables\(MAD\)Frequency mean absolute deviation\(\bar{x}\)sample mean or average of x-values\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(f_i\)frequency of class or category i

Frequency-table version.

#122 6–8

Grouped mean absolute deviation

\(MAD\approx\frac{\sum f_i\lvert m_i-\bar{x}\rvert}{\sum f_i}\)

Variables\(MAD\)Grouped mean absolute deviation\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value\(f_i\)frequency of class or category i\(m_i\)midpoint of class i

Uses class midpoints.

#123 6–8

Population variance

\(\sigma^2=\frac{\sum(x_i-\mu)^2}{N}\)

Variables\(\sigma^2\)Population variance\(\mu\)population mean\(\sigma\)population standard deviation\(x_i\)ith data value or observation\(N\)total count, population size, or total frequency

Population parameter.

#124 6–8

Population standard deviation

\(\sigma=\sqrt{\frac{\sum(x_i-\mu)^2}{N}}\)

Variables\(\sigma\)Population standard deviation\(\mu\)population mean\(x_i\)ith data value or observation\(N\)total count, population size, or total frequency

Population spread.

#125 6–8

Sample variance

\(s^2=\frac{\sum(x_i-\bar{x})^2}{n-1}\)

Variables\(s^2\)Sample variance\(\bar{x}\)sample mean or average of x-values\(s\)sample standard deviation\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

Unbiased sample variance.

#126 6–8

Sample standard deviation

\(s=\sqrt{\frac{\sum(x_i-\bar{x})^2}{n-1}}\)

Variables\(s\)Sample standard deviation\(\bar{x}\)sample mean or average of x-values\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

Sample spread.

#127 6–8

Frequency population variance

\(\sigma^2=\frac{\sum f_i(x_i-\mu)^2}{\sum f_i}\)

Variables\(\sigma^2\)Frequency population variance\(\mu\)population mean\(\sigma\)population standard deviation\(x_i\)ith data value or observation\(f_i\)frequency of class or category i

Frequency table.

#128 6–8

Frequency sample variance

\(s^2=\frac{\sum f_i(x_i-\bar{x})^2}{\left(\sum f_i\right)-1}\)

Variables\(s^2\)Frequency sample variance\(\bar{x}\)sample mean or average of x-values\(s\)sample standard deviation\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(f_i\)frequency of class or category i

Frequency table sample version.

#129 9–10

Computational population variance

\(\sigma^2=\frac{\sum x_i^2}{N}-\mu^2\)

Variables\(\sigma^2\)Computational population variance\(\mu\)population mean\(\sigma\)population standard deviation\(x_i\)ith data value or observation\(N\)total count, population size, or total frequency

Shortcut formula.

#130 9–10

Computational sample variance

\(s^2=\frac{\sum x_i^2-\frac{(\sum x_i)^2}{n}}{n-1}\)

Variables\(s^2\)Computational sample variance\(s\)sample standard deviation\(x_i\)ith data value or observation\(n\)sample size, number of observations, or number of trials

Shortcut formula.
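
The shortcut forms should agree with the definition-based variances. The sketch below checks this on a small made-up data set, using `statistics.pvariance` (divides by \(N\)) and `statistics.variance` (divides by \(n-1\)) as the reference values.

```python
from statistics import mean, pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]    # made-up example values
n = len(data)
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)

# Definition-based values from the standard library
pop_var = pvariance(data)          # population variance, divides by N
samp_var = variance(data)          # sample variance, divides by n - 1

# Shortcut (computational) forms
pop_var_short = sum_x2 / n - mean(data) ** 2
samp_var_short = (sum_x2 - sum_x ** 2 / n) / (n - 1)

print(pop_var, pop_var_short)
print(samp_var, samp_var_short)
```

For this data the population variance is 4 and the sample variance is \(32/7\), so the sample value is larger, as expected from the smaller denominator.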

#131 9–10

Frequency computational variance

\(\sigma^2=\frac{\sum f_ix_i^2}{\sum f_i}-\bar{x}^2\)

Variables\(\sigma^2\)Frequency computational variance\(\bar{x}\)sample mean or average of x-values\(\sigma\)population standard deviation\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(f_i\)frequency of class or category i

Population-style denominator.

#132 9–10

Grouped variance estimate

\(s^2\approx\frac{\sum f_i(m_i-\bar{x})^2}{\left(\sum f_i\right)-1}\)

Variables\(s^2\)Grouped variance estimate\(\bar{x}\)sample mean or average of x-values\(s\)sample standard deviation\(x\)data value, outcome, or input value\(f_i\)frequency of class or category i\(m_i\)midpoint of class i

Grouped sample estimate.

#133 9–10

Variance using assumed mean

\(\sigma^2=\frac{\sum f_id_i^2}{N}-\left(\frac{\sum f_id_i}{N}\right)^2\)

Variables\(\sigma^2\)Variance using assumed mean\(\sigma\)population standard deviation\(x_i\)ith data value or observation\(N\)total count, population size, or total frequency\(f_i\)frequency of class or category i\(A\)event, assumed mean, actual value, or starting value named by the formula\(d_i\)deviation, rank difference, or transformed value for item i

\(d_i=x_i-A\).

#134 9–10

Standard deviation using assumed mean

\(\sigma=\sqrt{\frac{\sum f_id_i^2}{N}-\left(\frac{\sum f_id_i}{N}\right)^2}\)

Variables\(\sigma\)Standard deviation using assumed mean\(N\)total count, population size, or total frequency\(f_i\)frequency of class or category i\(d_i\)deviation, rank difference, or transformed value for item i

Population-style.

#135 9–10

Variance using step deviation

\(\sigma^2=h^2\left[\frac{\sum f_iu_i^2}{N}-\left(\frac{\sum f_iu_i}{N}\right)^2\right]\)

Variables\(\sigma^2\)Variance using step deviation\(\sigma\)population standard deviation\(x_i\)ith data value or class midpoint\(N\)total frequency\(f_i\)frequency of class or category i\(h\)class width or step size\(A\)assumed mean\(u_i\)step-deviation coded value for class i

\(u_i=(x_i-A)/h\).
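
A minimal sketch of the step-deviation method on a hypothetical grouped table, compared with the direct population-style variance. The midpoints, frequencies, assumed mean \(A\), and class width \(h\) are made up for illustration.

```python
midpoints = [5, 15, 25, 35, 45]    # hypothetical class midpoints x_i
freqs = [2, 3, 5, 4, 1]            # hypothetical frequencies f_i
A, h = 25, 10                      # assumed mean and class width, chosen by hand

N = sum(freqs)
u = [(x - A) / h for x in midpoints]                  # coded values u_i
sum_fu = sum(f * ui for f, ui in zip(freqs, u))
sum_fu2 = sum(f * ui ** 2 for f, ui in zip(freqs, u))
var_step = h ** 2 * (sum_fu2 / N - (sum_fu / N) ** 2)

# Direct population-style variance for comparison
mean_x = sum(f * x for f, x in zip(freqs, midpoints)) / N
var_direct = sum(f * (x - mean_x) ** 2 for f, x in zip(freqs, midpoints)) / N

print(round(var_step, 4), round(var_direct, 4))
```

Both routes give the same variance; the coded values just keep the arithmetic small when working by hand.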

#136 9–10

Coefficient of range

\(\text{Coefficient of range}=\frac{\max-\min}{\max+\min}\)

Variables\(\text{Coefficient of range}\)Coefficient of range

Sometimes used in applied statistics.

#137 9–10

Coefficient of quartile deviation

\(\frac{Q_3-Q_1}{Q_3+Q_1}\)

Variables\(Q_1\)lower quartile\(Q_3\)upper quartile

Relative quartile spread.

#138 9–10

Coefficient of mean deviation

\(\frac{MD}{\text{average used}}\)

Variables\(MD\)mean deviation about the chosen average\(\text{average used}\)the mean or median about which deviations are taken

Average may be mean or median.

#139 9–10

Coefficient of variation

\(CV=\frac{s}{\bar{x}}\times100\%\)

Variables\(CV\)Coefficient of variation\(\bar{x}\)sample mean or average of x-values\(s\)sample standard deviation\(x\)data value, outcome, or input value

Sample version.

#140 9–10

Population coefficient of variation

\(CV=\frac{\sigma}{\mu}\times100\%\)

Variables\(CV\)Population coefficient of variation\(\mu\)population mean\(\sigma\)population standard deviation

Population version.

#141 9–10

Relative standard deviation

\(RSD=\frac{s}{\bar{x}}\times100\%\)

Variables\(RSD\)Relative standard deviation\(\bar{x}\)sample mean or average of x-values\(s\)sample standard deviation\(x\)data value, outcome, or input value

Equivalent to sample CV.

#142 9–10

Pooled variance, two samples

\(s_p^2=\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}\)

Variables\(s_p^2\)Pooled variance, two samples\(s_p\)pooled sample standard deviation\(n_1,n_2\)sizes of the two samples\(s_1^2,s_2^2\)sample variances of the two samples

Equal-variance two-sample procedures.

#143 11–12

Variance identity

\(Var(X)=E(X^2)-[E(X)]^2\)

Variables\(Var(X)\)Variance identity\(X,Y,Z\)random variables or standardized variables used in the formula

Random variable.

#144 11–12

Standard deviation of random variable

\(SD(X)=\sqrt{Var(X)}\)

Variables\(SD(X)\)Standard deviation of random variable\(X,Y,Z\)random variables or standardized variables used in the formula

Population random-variable spread.

#145 11–12

Variance of shifted variable

\(Var(X+c)=Var(X)\)

Variables\(Var(X+c)\)Variance of shifted variable\(X,Y,Z\)random variables or standardized variables used in the formula\(c,d\)cell counts, constants, or additional values named by the formula

Adding constant does not change variance.

#146 11–12

Variance of scaled variable

\(Var(aX)=a^2Var(X)\)

Variables\(Var(aX)\)Variance of scaled variable\(X,Y,Z\)random variables or standardized variables used in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Scaling by \(a\).

#147 11–12

Standard deviation of scaled variable

\(SD(aX)=\lvert a\rvert SD(X)\)

Variables\(SD(aX)\)Standard deviation of scaled variable\(X,Y,Z\)random variables or standardized variables used in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Scaling rule.

#148 11–12

Variance of linear transformation

\(Var(aX+b)=a^2Var(X)\)

Variables\(Var(aX+b)\)Variance of linear transformation\(X,Y,Z\)random variables or standardized variables used in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

General linear transformation.

#149 11–12

Mean of linear transformation

\(E(aX+b)=aE(X)+b\)

Variables\(E(aX+b)\)Mean of linear transformation\(X,Y,Z\)random variables or standardized variables used in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Transformation of center.

#150 11–12

Variance of independent sum

\(Var(X+Y)=Var(X)+Var(Y)\)

Variables\(Var(X+Y)\)Variance of independent sum\(X,Y,Z\)random variables or standardized variables used in the formula

For independent \(X,Y\).

#151 11–12

Variance of independent difference

\(Var(X-Y)=Var(X)+Var(Y)\)

Variables\(Var(X-Y)\)Variance of independent difference\(X,Y,Z\)random variables or standardized variables used in the formula

For independent \(X,Y\).

#152 11–12

Variance of general sum

\(Var(X+Y)=Var(X)+Var(Y)+2Cov(X,Y)\)

Variables\(Var(X+Y)\)Variance of general sum\(X,Y,Z\)random variables or standardized variables used in the formula

General rule.

#153 11–12

Variance of general difference

\(Var(X-Y)=Var(X)+Var(Y)-2Cov(X,Y)\)

Variables\(Var(X-Y)\)Variance of general difference\(X,Y,Z\)random variables or standardized variables used in the formula

General rule.
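
The general sum and difference rules can be checked numerically. The sketch below treats two short made-up lists as a complete population, so every quantity uses the divide-by-\(N\) form.

```python
from statistics import mean, pvariance

x = [1, 2, 3, 4]                   # made-up paired values, treated as a whole population
y = [2, 1, 4, 3]

# Population covariance via the shortcut E(XY) - E(X)E(Y)
cov_xy = mean(a * b for a, b in zip(x, y)) - mean(x) * mean(y)

var_sum = pvariance([a + b for a, b in zip(x, y)])
var_diff = pvariance([a - b for a, b in zip(x, y)])

print(var_sum, pvariance(x) + pvariance(y) + 2 * cov_xy)
print(var_diff, pvariance(x) + pvariance(y) - 2 * cov_xy)
```

Here the covariance is 0.75, so the independent-variable rules would give the wrong answer for both the sum and the difference.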

#154 11–12

Covariance definition

\(Cov(X,Y)=E[(X-\mu_X)(Y-\mu_Y)]\)

Variables\(Cov(X,Y)\)Covariance definition\(\mu\)population mean\(\mu_X,\mu_Y\)population means of random variables X and Y\(X,Y,Z\)random variables or standardized variables used in the formula

Population covariance.

#155 11–12

Covariance computational form

\(Cov(X,Y)=E(XY)-E(X)E(Y)\)

Variables\(Cov(X,Y)\)Covariance computational form\(X,Y,Z\)random variables or standardized variables used in the formula

Shortcut.

#156 11–12

Sample covariance

\(s_{xy}=\frac{\sum(x_i-\bar{x})(y_i-\bar{y})}{n-1}\)

Variables\(s_{xy}\)Sample covariance\(\bar{x}\)sample mean or average of x-values\(\bar{y}\)sample mean or average of y-values\(x_i\)ith data value or observation\(y_i\)ith y-value or response observation\(x\)data value, outcome, or input value\(y\)response value or transformed value\(n\)sample size, number of observations, or number of trials

Sample covariance.

#157 11–12

Population covariance from data

\(\sigma_{xy}=\frac{\sum(x_i-\mu_x)(y_i-\mu_y)}{N}\)

Variables\(\sigma_{xy}\)Population covariance from data\(\mu_x,\mu_y\)population means of x and y\(x_i\)ith data value or observation\(y_i\)ith y-value or response observation\(N\)total count or population size

Population covariance.

Unit 5

Data Transformations and Standardization

18 / 18 formulas

#158 6–8

Add constant to all values: mean

\(\bar{x}_{new}=\bar{x}+c\)

Variables\(\bar{x}_{new}\)Add constant to all values: mean\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value\(c,d\)cell counts, constants, or additional values named by the formula

Every value becomes \(x+c\).

#159 6–8

Add constant to all values: median

\(\text{Median}_{new}=\text{Median}+c\)

Variables\(\text{Median}_{new}\)Add constant to all values: median\(\text{Median}\)the named quantity shown in the formula\(x\)data value, outcome, or input value\(c,d\)cell counts, constants, or additional values named by the formula

Every value becomes \(x+c\).

#160 6–8

Add constant to all values: range

\(R_{new}=R\)

Variables\(R_{new}\)Add constant to all values: range\(R\)range, return, rank, or number of simulation repetitions named by the formula

Spread unchanged.

#161 6–8

Multiply all values: mean

\(\bar{x}_{new}=a\bar{x}\)

Variables\(\bar{x}_{new}\)Multiply all values: mean\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Every value becomes \(ax\).

#162 6–8

Multiply all values: median

\(\text{Median}_{new}=a\cdot\text{Median}\)

Variables\(\text{Median}_{new}\)Multiply all values: median\(\text{Median}\)the named quantity shown in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

For \(a>0\).

#163 6–8

Multiply all values: range

\(R_{new}=\lvert a\rvert R\)

Variables\(R_{new}\)Multiply all values: range\(R\)range, return, rank, or number of simulation repetitions named by the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Spread scales by \(\lvert a\rvert\).

#164 6–8

Linear transformation of mean

\(\bar{y}=a\bar{x}+b\)

Variables\(\bar{y}\)Linear transformation of mean\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value\(y\)response value or transformed value\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

For \(y=ax+b\).

#165 6–8

Linear transformation of standard deviation

\(s_y=\lvert a\rvert s_x\)

Variables\(s_y\)Linear transformation of standard deviation\(s_x,s_y\)sample standard deviations of x and y\(y\)response value or transformed value\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

For \(y=ax+b\).

#166 9–10

Linear transformation of variance

\(s_y^2=a^2s_x^2\)

Variables\(s_y^2\)Linear transformation of variance\(s_x,s_y\)sample standard deviations of x and y\(y\)response value or transformed value\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

For \(y=ax+b\).

#167 9–10

Linear transformation of z-score

\(z=\frac{x-\bar{x}}{s}\)

Variables\(z\)Linear transformation of z-score\(\bar{x}\)sample mean or average of x-values\(s\)sample standard deviation\(x\)data value, outcome, or input value

The z-score is a linear transformation of \(x\) with \(a=1/s\) and \(b=-\bar{x}/s\); standardized values have mean 0 and SD 1.

#168 9–10

Standardized mean

\(\bar{z}=0\)

Variables\(\bar{z}\)Standardized mean\(z\)standard score or normal critical value

For standardizing by sample mean and SD.

#169 9–10

Standardized standard deviation

\(s_z=1\)

Variables\(s_z\)Standardized standard deviation

For standardizing by sample mean and SD.
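
A short sketch confirming that z-scores built from the sample mean and sample SD have mean 0 and sample SD 1, up to floating-point rounding. The data values are made up.

```python
from statistics import mean, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]    # made-up example values
x_bar, s = mean(data), stdev(data) # sample mean and sample SD (n - 1 denominator)

z = [(x - x_bar) / s for x in data]

print(abs(mean(z)) < 1e-12, abs(stdev(z) - 1) < 1e-12)
```

If the population SD were used to standardize instead, the sample SD of the z-scores would not be exactly 1.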

#170 9–10

Rescaling to new mean and SD

\(y=\mu_y+\sigma_y\left(\frac{x-\mu_x}{\sigma_x}\right)\)

Variables\(y\)Rescaling to new mean and SD\(x\)data value on the original scale\(\mu_x,\sigma_x\)mean and standard deviation of the original scale\(\mu_y,\sigma_y\)target mean and standard deviation of the new scale

Maps \(x\) scale to new scale.

#171 9–10

Min-max scaling

\(x'=\frac{x-\min}{\max-\min}\)

Variables\(x'\)Min-max scaling\(x\)data value, outcome, or input value

Maps values to 0–1 range.

#172 9–10

Percent scaling

\(x_{\%}=\frac{x}{\text{maximum possible}}\times100\%\)

Variables\(x_{\%}\)Percent scaling\(\text{maximum possible}\)the named quantity shown in the formula\(x\)data value, outcome, or input value

Used for marks and scores.

#173 9–10

Index number

\(I=\frac{\text{current value}}{\text{base value}}\times100\)

Variables\(I\)Index number\(\text{current value}\)value in the current period\(\text{base value}\)value in the base period

The base period's index is 100.

#174 9–10

Relative change

\(\frac{\text{new}-\text{old}}{\text{old}}\)

Variables\(\text{new}\)new or final value\(\text{old}\)old or initial value

Decimal change.

#175 9–10

Percentage change

\(\frac{\text{new}-\text{old}}{\text{old}}\times100\%\)

Variables\(\text{new}\)new or final value\(\text{old}\)old or initial value

Percent change.

Unit 6

Bivariate Data, Correlation, Regression, and Residuals

39 / 39 formulas

#176 6–8

Ordered pair

\((x_i,y_i)\)

Variables\(x_i\)ith data value or observation\(y_i\)ith y-value or response observation

One bivariate observation.

#177 6–8

Residual

\(e_i=y_i-\hat{y}_i\)

Variables\(e_i\)Residual\(\hat{y}\)predicted value of y\(y_i\)ith y-value or response observation\(y\)response value or transformed value

Observed minus predicted.

#178 6–8

Prediction from line

\(\hat{y}=a+bx\)

Variables\(\hat{y}\)Prediction from line\(y\)response value or transformed value\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Simple linear prediction.

#179 6–8

Line slope from two points

\(b=\frac{y_2-y_1}{x_2-x_1}\)

Variables\(b\)Line slope from two points\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Used for trend lines.

#180 6–8

Line intercept from point

\(a=y_1-bx_1\)

Variables\(a\)Line intercept from point\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

For line through a known point.

#181 9–10

Least-squares regression line

\(\hat{y}=a+bx\)

Variables\(\hat{y}\)Least-squares regression line\(x\)data value, outcome, or input value\(y\)response value or transformed value\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Predict \(y\) from \(x\).

#182 9–10

Regression slope

\(b=\frac{\sum(x_i-\bar{x})(y_i-\bar{y})}{\sum(x_i-\bar{x})^2}\)

Variables\(b\)Regression slope\(\bar{x}\)sample mean or average of x-values\(\bar{y}\)sample mean or average of y-values\(x_i\)ith data value or observation\(y_i\)ith y-value or response observation\(x\)data value, outcome, or input value\(y\)response value or transformed value\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Least-squares slope.

#183 9–10

Regression intercept

\(a=\bar{y}-b\bar{x}\)

Variables\(a\)Regression intercept\(\bar{x}\)sample mean or average of x-values\(\bar{y}\)sample mean or average of y-values\(x\)data value, outcome, or input value\(y\)response value or transformed value\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Line passes through \((\bar{x},\bar{y})\).
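
The slope and intercept formulas can be applied directly. A minimal sketch on made-up data, written without any library fitting routine so each sum is visible:

```python
x = [1, 2, 3, 4, 5]                # made-up example data
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

b = s_xy / s_xx                    # least-squares slope
a = y_bar - b * x_bar              # least-squares intercept

print(round(a, 4), round(b, 4))
```

For this data the fitted line is approximately \(\hat{y}=2.2+0.6x\), and substituting \(x=\bar{x}=3\) returns \(\bar{y}=4\).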

#184 9–10

Correlation coefficient

\(r=\frac{\sum(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum(x_i-\bar{x})^2\sum(y_i-\bar{y})^2}}\)

Variables\(r\)Correlation coefficient\(\bar{x}\)sample mean or average of x-values\(\bar{y}\)sample mean or average of y-values\(x_i\)ith data value or observation\(y_i\)ith y-value or response observation\(x\)data value, outcome, or input value\(y\)response value or transformed value

Pearson correlation.

#185 9–10

Correlation using standardized scores

\(r=\frac{1}{n-1}\sum z_{x_i}z_{y_i}\)

Variables\(r\)Correlation using standardized scores\(z_{x_i},z_{y_i}\)standardized x- and y-values for observation i\(n\)sample size or number of observations

Sample correlation, with \(z_{x_i}=(x_i-\bar{x})/s_x\) and \(z_{y_i}=(y_i-\bar{y})/s_y\).

#186 9–10

Population correlation

\(\rho=\frac{Cov(X,Y)}{\sigma_X\sigma_Y}\)

Variables\(\rho\)Population correlation\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula

Population parameter.

#187 9–10

Sample correlation from covariance

\(r=\frac{s_{xy}}{s_xs_y}\)

Variables\(r\)Sample correlation from covariance\(s_x,s_y\)sample standard deviations of x and y

Sample version.

#188 9–10

Slope from correlation

\(b=r\frac{s_y}{s_x}\)

Variables\(b\)Slope from correlation\(s_x,s_y\)sample standard deviations of x and y\(x\)data value, outcome, or input value\(y\)response value or transformed value\(r\)correlation coefficient, rate, rank, or period count named by the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Regression of \(y\) on \(x\).
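
The sketch below computes Pearson's \(r\) from the sums of products and then recovers the regression slope as \(b=r\frac{s_y}{s_x}\), checking it against the direct least-squares slope. The data are made up.

```python
from math import sqrt
from statistics import stdev

x = [1, 2, 3, 4, 5]                # made-up example data
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

r = s_xy / sqrt(s_xx * s_yy)       # Pearson correlation
b_from_r = r * stdev(y) / stdev(x) # slope from correlation
b_direct = s_xy / s_xx             # least-squares slope

print(round(r, 4), round(b_from_r, 4), round(b_direct, 4))
```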

#189 9–10

Coefficient of determination

\(R^2=r^2\)

Variables\(R^2\)Coefficient of determination\(r\)correlation coefficient

Simple linear regression.

#190 9–10

Total sum of squares

\(SST=\sum(y_i-\bar{y})^2\)

Variables\(SST\)Total sum of squares\(\bar{y}\)sample mean or average of y-values\(y_i\)ith y-value or response observation\(y\)response value or transformed value\(SST,SSR,SSE\)total, regression, and residual sums of squares

Total variation in \(y\).

#191 9–10

Residual sum of squares

\(SSE=\sum(y_i-\hat{y}_i)^2\)

Variables\(SSE\)Residual sum of squares\(\hat{y}\)predicted value of y\(y_i\)ith y-value or response observation\(y\)response value or transformed value\(SST,SSR,SSE\)total, regression, and residual sums of squares

Unexplained variation.

#192 9–10

Regression sum of squares

\(SSR=\sum(\hat{y}_i-\bar{y})^2\)

Variables\(SSR\)Regression sum of squares\(\bar{y}\)sample mean or average of y-values\(\hat{y}\)predicted value of y\(y\)response value or transformed value\(SST,SSR,SSE\)total, regression, and residual sums of squares

Explained variation.

#193 9–10

SST decomposition

\(SST=SSR+SSE\)

Variables\(SST\)SST decomposition\(SST,SSR,SSE\)total, regression, and residual sums of squares

For regression with intercept.

#194 9–10

Coefficient of determination from sums

\(R^2=\frac{SSR}{SST}=1-\frac{SSE}{SST}\)

Variables\(R^2\)Coefficient of determination from sums\(SST,SSR,SSE\)total, regression, and residual sums of squares

Regression fit measure.
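
A small numerical check of \(SST=SSR+SSE\) and the two forms of \(R^2\), assuming a least-squares line with an intercept fitted to made-up data:

```python
x = [1, 2, 3, 4, 5]                # made-up example data
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares fit with an intercept
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar
y_hat = [a + b * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation

r_squared = 1 - sse / sst
print(round(sst, 4), round(ssr, 4), round(sse, 4), round(r_squared, 4))
```

The decomposition relies on the intercept being fitted; for a line forced through the origin the three sums generally do not add up this way.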

#195 9–10

Mean squared error

\(MSE=\frac{SSE}{n-2}\)

Variables\(MSE\)Mean squared error\(n\)sample size, number of observations, or number of trials\(MAE,MSE,RMSE,MAPE\)common prediction or error-size metrics\(SST,SSR,SSE\)total, regression, and residual sums of squares

Simple linear regression.

#196 9–10

Residual standard error

\(s_e=\sqrt{\frac{SSE}{n-2}}\)

Variables\(s_e\)Residual standard error\(n\)sample size, number of observations, or number of trials\(SST,SSR,SSE\)total, regression, and residual sums of squares

Typical residual size.

#197 9–10

Root mean square error

\(RMSE=\sqrt{\frac{1}{n}\sum(y_i-\hat{y}_i)^2}\)

Variables\(RMSE\)Root mean square error\(\hat{y}\)predicted value of y\(y_i\)ith y-value or response observation\(y\)response value or transformed value\(n\)sample size, number of observations, or number of trials\(MAE,MSE,RMSE,MAPE\)common prediction or error-size metrics

Prediction error metric.

#198 9–10

Mean absolute error

\(MAE=\frac{1}{n}\sum\lvert y_i-\hat{y}_i\rvert\)

Variables\(MAE\)Mean absolute error\(\hat{y}\)predicted value of y\(y_i\)ith y-value or response observation\(y\)response value or transformed value\(n\)sample size, number of observations, or number of trials\(MAE,MSE,RMSE,MAPE\)common prediction or error-size metrics

Prediction error metric.

#199 9–10

Mean absolute percentage error

\(MAPE=\frac{100\%}{n}\sum\left\lvert\frac{y_i-\hat{y}_i}{y_i}\right\rvert\)

Variables\(MAPE\)Mean absolute percentage error\(\hat{y}\)predicted value of y\(y_i\)ith y-value or response observation\(y\)response value or transformed value\(n\)sample size, number of observations, or number of trials\(MAE,MSE,RMSE,MAPE\)common prediction or error-size metrics

Requires nonzero \(y_i\).
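
The three prediction-error metrics can be computed side by side. In this sketch the observed and predicted values are hypothetical; any model could have produced the predictions.

```python
from math import sqrt

y = [2, 4, 5, 4, 5]                    # hypothetical observed values (all nonzero)
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]      # hypothetical predicted values
n = len(y)

errors = [yi - yh for yi, yh in zip(y, y_hat)]

rmse = sqrt(sum(e ** 2 for e in errors) / n)
mae = sum(abs(e) for e in errors) / n
mape = 100 / n * sum(abs(e / yi) for e, yi in zip(errors, y))  # in percent

print(round(rmse, 4), round(mae, 4), round(mape, 2))
```

RMSE is at least as large as MAE because squaring gives larger errors more weight.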

#200 9–10

Standardized residual

\(r_i=\frac{e_i}{s_e}\)

Variables\(r_i\)Standardized residual\(s_e\)residual standard error\(e_i\)residual or error for observation i

Simplified school-level form; it ignores each point's leverage.

#201 11–12

Standard error of slope

\(SE_b=\frac{s_e}{\sqrt{\sum(x_i-\bar{x})^2}}\)

Variables\(SE_b\)Standard error of slope\(\bar{x}\)sample mean or average of x-values\(s_e\)residual standard error\(x_i\)ith data value or observation\(x\)data value, outcome, or input value

Inference for regression slope.

#202 11–12

Standard error of intercept

\(SE_a=s_e\sqrt{\frac{1}{n}+\frac{\bar{x}^2}{\sum(x_i-\bar{x})^2}}\)

Variables\(SE_a\)Standard error of intercept\(\bar{x}\)sample mean or average of x-values\(s_e\)residual standard error\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

Inference for intercept.

#203 11–12

t statistic for slope

\(t=\frac{b-\beta_0}{SE_b}\)

Variables\(t\)t statistic for slope\(b\)sample regression slope\(\beta_0\)hypothesized population slope\(SE_b\)standard error of the slope

Usually test \(H_0:\beta=0\), so \(\beta_0=0\); here \(\beta_0\) is not the intercept.

#204 11–12

Regression slope confidence interval

\(b\pm t^*SE_b\)

Variables\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints\(t\)t statistic or time period\(t^*\)critical t-value

Simple linear regression.

#205 11–12

Degrees of freedom for slope test

\(df=n-2\)

Variables\(df\)Degrees of freedom for slope test\(n\)sample size, number of observations, or number of trials

Simple linear regression.
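
The sketch below chains the residual standard error, the standard error of the slope, and the t statistic for \(H_0:\beta=0\) on made-up data. The critical value \(t^*\) is not computed here; it would come from a t table or calculator with \(df=n-2\).

```python
from math import sqrt

x = [1, 2, 3, 4, 5]                # made-up example data
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
a = y_bar - b * x_bar

sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
df = n - 2                         # degrees of freedom for the slope test
s_e = sqrt(sse / df)               # residual standard error
se_b = s_e / sqrt(s_xx)            # standard error of the slope
t_stat = (b - 0) / se_b            # hypothesized slope is 0

print(df, round(se_b, 4), round(t_stat, 4))
```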

#206 11–12

Predicted mean response SE

\(SE_{\hat{y}}=s_e\sqrt{\frac{1}{n}+\frac{(x_0-\bar{x})^2}{\sum(x_i-\bar{x})^2}}\)

Variables\(SE_{\hat{y}}\)Predicted mean response SE\(\bar{x}\)sample mean or average of x-values\(s_e\)residual standard error\(x_i\)ith data value or observation\(n\)sample size or number of observations\(x_0\)x-value at which the mean response is estimated

CI for mean response.

#207 11–12

Prediction interval SE

\(SE_{pred}=s_e\sqrt{1+\frac{1}{n}+\frac{(x_0-\bar{x})^2}{\sum(x_i-\bar{x})^2}}\)

Variables\(SE_{pred}\)Prediction interval SE\(\bar{x}\)sample mean or average of x-values\(s_e\)residual standard error\(x_i\)ith data value or observation\(n\)sample size or number of observations\(x_0\)x-value at which the individual response is predicted

PI for individual response.

#208 11–12

Mean response CI

\(\hat{y}_0\pm t^*SE_{\hat{y}}\)

Variables\(\hat{y}_0\)predicted value of y at the chosen x-value\(t^*\)critical t-value\(SE_{\hat{y}}\)standard error of the predicted mean response\(x_0\)x-value at which the mean response is estimated

At \(x=x_0\).

#209 11–12

Individual prediction interval

\(\hat{y}_0\pm t^*SE_{pred}\)

At \(x=x_0\).

#210 11–12

Spearman rank correlation

\(r_s=1-\frac{6\sum d_i^2}{n(n^2-1)}\)

Variables\(r_s\)Spearman rank correlation\(n\)sample size, number of observations, or number of trials\(d_i\)deviation, rank difference, or transformed value for item i

Valid when there are no tied ranks; \(d_i\) is the difference between the two ranks of item \(i\).

#211 11–12

Kendall tau

\(\tau=\frac{C-D}{\binom{n}{2}}\)

Variables\(\tau\)Kendall tau\(n\)number of paired observations\(C\)number of concordant pairs\(D\)number of discordant pairs

\(C\)=concordant pairs, \(D\)=discordant pairs.
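
Spearman's \(r_s\) and Kendall's \(\tau\) can both be computed from the same pair of rankings. A minimal sketch, assuming made-up ranks with no ties:

```python
from itertools import combinations

rank_x = [1, 2, 3, 4, 5]           # made-up ranks, no ties
rank_y = [2, 1, 4, 3, 5]
n = len(rank_x)

# Spearman: based on squared rank differences
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Kendall: count concordant and discordant pairs of items
concordant = discordant = 0
for (x1, y1), (x2, y2) in combinations(zip(rank_x, rank_y), 2):
    sign = (x1 - x2) * (y1 - y2)
    if sign > 0:
        concordant += 1
    elif sign < 0:
        discordant += 1
tau = (concordant - discordant) / (n * (n - 1) / 2)

print(r_s, concordant, discordant, tau)
```

For these ranks \(r_s=0.8\) and \(\tau=0.6\); the two coefficients measure the same kind of agreement on different scales, so they usually differ in size.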

#212 11–12

Phi coefficient

\(\phi=\frac{ad-bc}{\sqrt{(a+b)(c+d)(a+c)(b+d)}}\)

Variables\(\phi\)Phi coefficient\(a,b\)cell counts in the first row of the table\(c,d\)cell counts in the second row of the table

For \(2\times2\) table.

#213 11–12

Cramér's V

\(V=\sqrt{\frac{\chi^2}{n(k-1)}}\)

Variables\(V\)Cramér's V\(\chi^2\)chi-square statistic\(n\)total number of observations\(r\)number of rows in the table\(c\)number of columns in the table\(k\)smaller of the number of rows and columns

\(k=\min(r,c)\).
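
For a \(2\times2\) table, Cramér's V equals the absolute value of the phi coefficient. The sketch below checks this on hypothetical cell counts, computing \(\chi^2\) from expected counts under independence.

```python
from math import sqrt

a, b, c, d = 10, 20, 30, 40        # hypothetical 2x2 cell counts
n = a + b + c + d

phi = (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

observed = [[a, b], [c, d]]
row_tot = [sum(row) for row in observed]
col_tot = [sum(col) for col in zip(*observed)]

# Chi-square statistic from observed and expected counts
chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_tot[i] * col_tot[j] / n
        chi2 += (observed[i][j] - expected) ** 2 / expected

k = min(len(row_tot), len(col_tot))  # smaller of rows and columns
V = sqrt(chi2 / (n * (k - 1)))

print(round(phi, 4), round(chi2, 4), round(V, 4))
```

Phi keeps its sign, which shows the direction of association, while V is always non-negative.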

#214 11–12

Adjusted \(R^2\)

\(R^2_{adj}=1-(1-R^2)\frac{n-1}{n-p-1}\)

Variables\(R^2_{adj}\)Adjusted \(R^2\)\(n\)sample size or number of observations\(p\)number of predictors\(R^2\)coefficient of determination

\(p\)=number of predictors.

Unit 7

Probability Basics and Set Rules

45 / 45 formulas

#215 3–5

Classical probability

\(P(A)=\frac{\text{favourable outcomes}}{\text{total equally likely outcomes}}\)

Variables\(P(A)\)Classical probability\(\text{favourable outcomes}\)outcomes that satisfy the event\(\text{total equally likely outcomes}\)all equally likely outcomes in the sample space\(A\)event, assumed mean, actual value, or starting value named by the formula\(A,B,C\)events or sets in the sample space

Elementary probability.

#216 3–5

Probability scale

\(0\le P(A)\le1\)

Variables\(A\)event, assumed mean, actual value, or starting value named by the formula\(P(A)\)probability of event A\(A,B,C\)events or sets in the sample space

All probabilities lie between 0 and 1.

#217 3–5

Impossible event

\(P(\varnothing)=0\)

Variables\(P(\varnothing)\)Impossible event

Never occurs.

#218 3–5

Certain event

\(P(S)=1\)

Variables\(P(S)\)Certain event\(S\)sample space, score, or smoothed value named by the formula

Sample space occurs.

#219 3–5

Complement rule

\(P(A^c)=1-P(A)\)

Variables\(P(A^c)\)Complement rule\(A\)event in the sample space\(A^c\)complement of event A\(P(A)\)probability of event A

Probability of not \(A\).

#220 3–5

Probability as percent

\(P(A)\times100\%\)

Variables\(A\)event, assumed mean, actual value, or starting value named by the formula\(P(A)\)probability of event A\(A,B,C\)events or sets in the sample space

Convert probability to percent.

#221 3–5

Probability from relative frequency

\(P(A)\approx\frac{f_A}{N}\)

Variables\(P(A)\)Probability from relative frequency\(f_A\)number of trials in which event A occurred\(N\)total number of trials\(A\)event in the sample space

Experimental probability.

#222 6–8

Odds in favour

\(\text{Odds in favour}=\frac{P(A)}{P(A^c)}\)

Variables\(\text{Odds in favour}\)Odds in favour\(A\)event in the sample space\(A^c\)complement of event A\(P(A)\)probability of event A

Often written as a ratio.

#223 6–8

Odds against

\(\text{Odds against}=\frac{P(A^c)}{P(A)}\)

Variables\(\text{Odds against}\)Odds against\(A\)event in the sample space\(A^c\)complement of event A\(P(A)\)probability of event A

Often written as a ratio.

#224 6–8

Probability from odds in favour \(a:b\)

\(P(A)=\frac{a}{a+b}\)

Variables\(P(A)\)Probability from odds in favour \(a:b\)\(A\)event in the sample space\(a\)number of parts in favour\(b\)number of parts against

Odds in favour \(a:b\).
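
As a rough illustration of how the two odds formulas and the odds-to-probability conversion fit together, here is a minimal Python sketch. The function names `odds_in_favour` and `probability_from_odds` are made up for this example, and exact fractions are used so the round trip can be checked without rounding.

```python
from fractions import Fraction

def odds_in_favour(p):
    """Odds in favour of A as the single ratio P(A) / P(A^c)."""
    p = Fraction(p)
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

def probability_from_odds(a, b):
    """P(A) when the odds in favour of A are a:b."""
    return Fraction(a, a + b)

p = probability_from_odds(3, 2)   # odds 3:2 in favour
print(p)                          # 3/5
print(odds_in_favour(p))          # 3/2, i.e. back to 3:2
```

Converting odds to a probability and back returns the starting ratio, which is a quick way to check which of the two numbers belongs in the denominator.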

#225 6–8

Union rule

\(P(A\cup B)=P(A)+P(B)-P(A\cap B)\)

Variables\(P(A\cup B)\)Union rule\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cup\)union: event A or event B occurs\(\cap\)intersection: both events occur\(A,B,C\)events or sets in the sample space

General addition rule.

#226 6–8

Mutually exclusive union

\(P(A\cup B)=P(A)+P(B)\)

Variables\(P(A\cup B)\)Mutually exclusive union\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cup\)union: event A or event B occurs\(\cap\)intersection: both events occur\(A,B,C\)events or sets in the sample space

When \(A\cap B=\varnothing\).

#227 6–8

Intersection of independent events

\(P(A\cap B)=P(A)P(B)\)

Variables\(P(A\cap B)\)Intersection of independent events\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cap\)intersection: both events occur\(A,B,C\)events or sets in the sample space

Independent events.

#228 6–8

Conditional probability

\(P(A\mid B)=\frac{P(A\cap B)}{P(B)}\)

Variables\(P(A\mid B)\)Conditional probability\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cap\)intersection: both events occur\(\mid\)given condition in conditional probability\(A,B,C\)events or sets in the sample space

Requires \(P(B)>0\).

#229 6–8

Multiplication rule

\(P(A\cap B)=P(A\mid B)P(B)\)

Variables\(P(A\cap B)\)Multiplication rule\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cap\)intersection: both events occur\(\mid\)given condition in conditional probability\(A,B,C\)events or sets in the sample space

General rule.

#230 6–8

Alternative multiplication rule

\(P(A\cap B)=P(B\mid A)P(A)\)

Variables\(P(A\cap B)\)Alternative multiplication rule\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cap\)intersection: both events occur\(\mid\)given condition in conditional probability\(A,B,C\)events or sets in the sample space

General rule.

#231 6–8

Independent conditional probability

\(P(A\mid B)=P(A)\)

Variables\(P(A\mid B)\)Independent conditional probability\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\mid\)given condition in conditional probability\(A,B,C\)events or sets in the sample space

Equivalent to independence of \(A\) and \(B\) when \(P(B)>0\).

#232 6–8

Complement of union

\(P((A\cup B)^c)=1-P(A\cup B)\)

Variables\(P((A\cup B)^c)\)Complement of union\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cup\)union: event A or event B occurs\(c,d\)cell counts, constants, or additional values named by the formula\(A,B,C\)events or sets in the sample space

Neither \(A\) nor \(B\) occurs.

#233 6–8

At least one rule

\(P(\text{at least one})=1-P(\text{none})\)

Variables\(P(\text{at least one})\)At least one rule\(\text{at least one}\)the named quantity shown in the formula\(\text{none}\)the named quantity shown in the formula

Very common shortcut.
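
A small worked check of the shortcut, assuming a hypothetical example of four fair die rolls and the event "at least one six". The sketch computes the answer with the complement and then confirms it by listing every outcome.

```python
from fractions import Fraction
from itertools import product

rolls = 4
p_none = Fraction(5, 6) ** rolls      # no six on any of the four rolls
p_at_least_one = 1 - p_none           # complement shortcut

# Brute-force check: enumerate all 6^4 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=rolls))
favourable = sum(1 for outcome in outcomes if 6 in outcome)
brute_force = Fraction(favourable, len(outcomes))

print(p_at_least_one)   # 671/1296
print(brute_force)      # 671/1296
```

The complement needs one multiplication, while the direct count has to consider the cases of exactly one, two, three, or four sixes.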

#234 6–8

Neither rule

\(P(\text{neither }A\text{ nor }B)=1-P(A\cup B)\)

Variables\(P(\text{neither }A\text{ nor }B)\)Neither rule\(\text{neither}\)the named quantity shown in the formula\(\text{nor}\)the named quantity shown in the formula\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cup\)union: event A or event B occurs\(A,B,C\)events or sets in the sample space

Two-event case.

#235 6–8

Venn two-set count

\(n(A\cup B)=n(A)+n(B)-n(A\cap B)\)

Variables\(n(A\cup B)\)Venn two-set count\(n\)sample size, number of observations, or number of trials\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(\cup\)union: event A or event B occurs\(\cap\)intersection: both events occur\(A,B,C\)events or sets in the sample space

Counting version.

#236 6–8

Complement count

\(n(A^c)=n(S)-n(A)\)

Variables\(n(A^c)\)Complement count\(n\)sample size, number of observations, or number of trials\(A\)event, assumed mean, actual value, or starting value named by the formula\(A^c\)complement of event A\(S\)sample space, score, or smoothed value named by the formula\(c,d\)cell counts, constants, or additional values named by the formula\(A,B,C\)events or sets in the sample space

Counting version.

#237 9–10

Three-event union probability

\(P(A\cup B\cup C)=P(A)+P(B)+P(C)-P(A\cap B)-P(A\cap C)-P(B\cap C)+P(A\cap B\cap C)\)

Variables\(P(A\cup B\cup C)\)Three-event union probability\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cup\)union: event A or event B occurs\(\cap\)intersection: both events occur\(A,B,C\)events or sets in the sample space

Inclusion-exclusion.

#238 9–10

Three-set count

\(n(A\cup B\cup C)=n(A)+n(B)+n(C)-n(A\cap B)-n(A\cap C)-n(B\cap C)+n(A\cap B\cap C)\)

Variables\(n(A\cup B\cup C)\)Three-set count\(n\)sample size, number of observations, or number of trials\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(\cup\)union: event A or event B occurs\(\cap\)intersection: both events occur\(A,B,C\)events or sets in the sample space

Counting version.

#239 9–10

Law of total probability, two cases

\(P(A)=P(A\mid B)P(B)+P(A\mid B^c)P(B^c)\)

Variables\(P(A)\)Law of total probability, two cases\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(\mid\)given condition in conditional probability\(c,d\)cell counts, constants, or additional values named by the formula\(A,B,C\)events or sets in the sample space

Partition \(B,B^c\).

#240 9–10

Law of total probability

\(P(A)=\sum_iP(A\mid B_i)P(B_i)\)

Variables\(P(A)\)Law of total probability\(A\)event of interest\(B_i\)ith event of a partition of the sample space\(P(B_i)\)probability of the ith partition event\(\mid\)given condition in conditional probability

\(\{B_i\}\) is a partition.

#241 9–10

Bayes theorem, two cases

\(P(B\mid A)=\frac{P(A\mid B)P(B)}{P(A\mid B)P(B)+P(A\mid B^c)P(B^c)}\)

Variables\(P(B\mid A)\)Bayes theorem, two cases\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\mid\)given condition in conditional probability\(c,d\)cell counts, constants, or additional values named by the formula\(A,B,C\)events or sets in the sample space

Common diagnostic form.

#242 9–10

Bayes theorem

\(P(B_j\mid A)=\frac{P(A\mid B_j)P(B_j)}{\sum_iP(A\mid B_i)P(B_i)}\)

Variables\(P(B_j\mid A)\)Bayes theorem\(A\)observed event\(B_i,B_j\)events of a partition of the sample space\(P(B_j)\)prior probability of the jth partition event\(\mid\)given condition in conditional probability

General partition form.
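
The partition form can be sketched in a few lines of Python. Everything numeric here is an assumption chosen for illustration: a hypothetical condition with 1% prevalence, a test that is positive for 95% of people with the condition, and positive for 5% of people without it. The helper name `bayes_posterior` is also made up for this sketch.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return P(B_j | A) for every j, given P(B_j) and P(A | B_j).

    Exact Fractions are assumed so the partition check below is exact.
    """
    if sum(priors) != 1:
        raise ValueError("priors must sum to 1 (the B_j must form a partition)")
    joint = [like * prior for like, prior in zip(likelihoods, priors)]
    total = sum(joint)                 # law of total probability: P(A)
    return [j / total for j in joint]

# B_1 = has the condition, B_2 = does not; A = positive test (illustrative numbers).
priors = [Fraction(1, 100), Fraction(99, 100)]
likelihoods = [Fraction(95, 100), Fraction(5, 100)]
posterior = bayes_posterior(priors, likelihoods)
print(posterior[0])   # 19/118, roughly 0.16
```

With these made-up numbers, a positive result raises the probability from 1% to about 16%, because the denominator is dominated by false positives from the much larger group without the condition.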

#243 9–10

De Morgan probability 1

\(P((A\cup B)^c)=P(A^c\cap B^c)\)

Variables\(P((A\cup B)^c)\)De Morgan probability 1\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(A^c\)complement of event A\(P(A)\)probability of event A\(\cup\)union: event A or event B occurs\(\cap\)intersection: both events occur\(c,d\)cell counts, constants, or additional values named by the formula\(A,B,C\)events or sets in the sample space

Set identity.

#244 9–10

De Morgan probability 2

\(P((A\cap B)^c)=P(A^c\cup B^c)\)

Variables\(P((A\cap B)^c)\)De Morgan probability 2\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(A^c\)complement of event A\(P(A)\)probability of event A\(\cup\)union: event A or event B occurs\(\cap\)intersection: both events occur\(c,d\)cell counts, constants, or additional values named by the formula\(A,B,C\)events or sets in the sample space

Set identity.

#245 9–10

Complement of intersection

\(P((A\cap B)^c)=1-P(A\cap B)\)

Variables\(P((A\cap B)^c)\)Complement of intersection\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cap\)intersection: both events occur\(c,d\)cell counts, constants, or additional values named by the formula\(A,B,C\)events or sets in the sample space

Not both.

#246 9–10

Complement of conditional event

\(P(A^c\mid B)=1-P(A\mid B)\)

Variables\(P(A^c\mid B)\)Complement of conditional event\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(A^c\)complement of event A\(P(A)\)probability of event A\(\mid\)given condition in conditional probability\(c,d\)cell counts, constants, or additional values named by the formula\(A,B,C\)events or sets in the sample space

Given \(B\).

#247 9–10

Independence of complements

\(A\perp B\Rightarrow A^c\perp B\)

Variables\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(A^c\)complement of event A\(c,d\)cell counts, constants, or additional values named by the formula\(A,B,C\)events or sets in the sample space

If \(A\) and \(B\) are independent.

#248 9–10

Joint probability from table

\(P(A\cap B)=\frac{n(A\cap B)}{n(S)}\)

Variables\(P(A\cap B)\)Joint probability from table\(n\)sample size, number of observations, or number of trials\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(S\)sample space, score, or smoothed value named by the formula\(P(A)\)probability of event A\(\cap\)intersection: both events occur\(A,B,C\)events or sets in the sample space

Two-way table.

#249 9–10

Marginal probability from table

\(P(A)=\frac{n(A)}{n(S)}\)

Variables\(P(A)\)Marginal probability from table\(n\)sample size, number of observations, or number of trials\(A\)event, assumed mean, actual value, or starting value named by the formula\(S\)sample space, score, or smoothed value named by the formula\(A,B,C\)events or sets in the sample space

Row or column total over grand total.

#250 9–10

Conditional table probability

\(P(A\mid B)=\frac{n(A\cap B)}{n(B)}\)

Variables\(P(A\mid B)\)Conditional table probability\(n\)sample size, number of observations, or number of trials\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cap\)intersection: both events occur\(\mid\)given condition in conditional probability\(A,B,C\)events or sets in the sample space

Restrict denominator to \(B\).
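
The three table formulas (joint, marginal, conditional) differ only in which count goes in the denominator. The sketch below uses a hypothetical two-way table with invented counts to show this.

```python
from fractions import Fraction

# Hypothetical two-way table of counts (illustrative numbers only).
counts = {
    ("A", "B"): 30, ("not A", "B"): 20,
    ("A", "not B"): 10, ("not A", "not B"): 40,
}

n_S = sum(counts.values())                                 # grand total n(S)
n_A = sum(v for (a, _), v in counts.items() if a == "A")   # total for A
n_B = sum(v for (_, b), v in counts.items() if b == "B")   # total for B
n_AB = counts[("A", "B")]                                  # cell for A and B

p_joint = Fraction(n_AB, n_S)       # P(A and B) = 3/10
p_A = Fraction(n_A, n_S)            # P(A)       = 2/5
p_A_given_B = Fraction(n_AB, n_B)   # P(A | B)   = 3/5

print(p_joint, p_A, p_A_given_B)    # 3/10 2/5 3/5
```

Dividing the joint probability by \(P(B)=\frac{50}{100}\) gives the same \(\frac{3}{5}\), so the table version agrees with the definition of conditional probability.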

#251 9–10

Expected frequency

\(E_i=Np_i\)

Variables\(E_i\)Expected frequency\(N\)total number of trials or observations\(p_i\)probability or proportion for category i

Expected count from probability.

#252 9–10

Relative risk

\(RR=\frac{P(A\mid B)}{P(A\mid B^c)}\)

Variables\(RR\)Relative risk\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\mid\)given condition in conditional probability\(c,d\)cell counts, constants, or additional values named by the formula\(RR,RD,OR,NNT\)relative risk, risk difference, odds ratio, and number needed to treat\(A,B,C\)events or sets in the sample space

Used in applied statistics.

#253 9–10

Odds ratio

\(OR=\frac{ad}{bc}\)

Variables\(OR\)Odds ratio\(a,b,c,d\)cell counts of the 2×2 table, with a and d on one diagonal and b and c on the other

For \(2\times2\) table with cells \(a,b,c,d\).
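
A minimal sketch comparing relative risk and the odds ratio on the same table. The cell layout is an assumption made for this example, since the entry does not fix one: the first row is group \(B\) with counts \(a\) (event \(A\)) and \(b\) (no event), and the second row is group \(B^c\) with counts \(c\) and \(d\). The counts themselves are invented.

```python
from fractions import Fraction

def relative_risk(a, b, c, d):
    """RR = P(A | B) / P(A | B^c), assuming rows (a, b) = B and (c, d) = B^c."""
    return Fraction(a, a + b) / Fraction(c, c + d)

def odds_ratio(a, b, c, d):
    """OR = ad / bc for the same assumed layout."""
    return Fraction(a * d, b * c)

# Hypothetical counts: 30 of 100 in group B, 10 of 100 in group B^c.
a, b, c, d = 30, 70, 10, 90
print(relative_risk(a, b, c, d))   # 3
print(odds_ratio(a, b, c, d))      # 27/7
```

The two measures agree in direction but not in size here (3 versus about 3.86); they tend to be close only when the event is rare in both groups.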

#254 11–12

Independent events product, many events

\(P(A_1\cap\cdots\cap A_k)=\prod_{i=1}^{k}P(A_i)\)

Variables\(P(A_1\cap\cdots\cap A_k)\)Independent events product, many events\(A_i\)ith event\(P(A_i)\)probability of the ith event\(\cap\)intersection: all listed events occur\(k\)number of events

Mutual independence.

#255 11–12

Chain rule of probability

\(P(A_1\cap\cdots\cap A_k)=P(A_1)\prod_{i=2}^{k}P(A_i\mid A_1\cap\cdots\cap A_{i-1})\)

Variables\(P(A_1\cap\cdots\cap A_k)\)Chain rule of probability\(A_i\)ith event\(P(A_1)\)probability of the first event\(\cap\)intersection: all listed events occur\(\mid\)given condition in conditional probability\(k\)number of events

General multiplication rule.
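
The chain rule is easiest to see with draws without replacement. The sketch below assumes a standard 52-card deck and the hypothetical event "three aces in three draws"; it multiplies the conditional probabilities in order and cross-checks the result with a counting argument.

```python
from fractions import Fraction
from math import comb

def chain_rule(conditionals):
    """Multiply P(A_1), P(A_2 | A_1), P(A_3 | A_1 and A_2), ... in order."""
    result = Fraction(1)
    for p in conditionals:
        result *= p
    return result

# Ace, then ace given one ace gone, then ace given two aces gone.
p_three_aces = chain_rule([Fraction(4, 52), Fraction(3, 51), Fraction(2, 50)])
print(p_three_aces)   # 1/5525

# Cross-check by counting unordered hands: C(4,3) / C(52,3).
by_counting = Fraction(comb(4, 3), comb(52, 3))
print(by_counting)    # 1/5525
```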

#256 11–12

Bonferroni inequality

\(P(A\cap B)\ge P(A)+P(B)-1\)

Variables\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cap\)intersection: both events occur\(A,B,C\)events or sets in the sample space

Lower bound.

#257 11–12

Union bound

\(P(A\cup B)\le P(A)+P(B)\)

Variables\(P(A\cup B)\)Union bound\(A,B\)events in the sample space\(P(A)\)probability of event A\(\cup\)union: event A or event B occurs

Boole's inequality.

#258 11–12

General union bound

\(P\left(\bigcup_i A_i\right)\le\sum_iP(A_i)\)

Variables\(P\left(\bigcup_i A_i\right)\)General union bound\(A_i\)ith event\(P(A_i)\)probability of the ith event\(\bigcup\)union: at least one of the events occurs

Boole's inequality.

#259 11–12

Conditional independence

\(P(A\cap B\mid C)=P(A\mid C)P(B\mid C)\)

Variables\(P(A\cap B\mid C)\)Conditional independence\(A\)event, assumed mean, actual value, or starting value named by the formula\(B,C\)events, groups, constants, or distribution labels named by the formula\(P(A)\)probability of event A\(\cap\)intersection: both events occur\(\mid\)given condition in conditional probability\(A,B,C\)events or sets in the sample space

Given event \(C\).

Unit 8

Counting, Permutations, Combinations, and Arrangements

24 / 24 formulas

#260 6–8

Addition principle

\(N=N_1+N_2+\cdots+N_k\)

Variables\(N\)Addition principle\(N_i\)number of options in the ith disjoint group\(k\)number of groups

For disjoint choices.

#261 6–8

Multiplication principle

\(N=N_1N_2\cdots N_k\)

Variables\(N\)Multiplication principle\(N_i\)number of options at the ith step\(k\)number of steps

For sequential choices.

#262 6–8

Factorial

\(n!=n(n-1)(n-2)\cdots1\)

Variables\(n!\)Factorial\(n\)sample size, number of observations, or number of trials

With \(0!=1\).

#263 6–8

Arrangements of \(n\) distinct objects

\(n!\)

Variables\(n\)sample size, number of observations, or number of trials

All objects used.

#264 6–8

Permutations

\({}^nP_r=\frac{n!}{(n-r)!}\)

Variables\({}^nP_r\)Permutations\(n\)number of distinct objects available\(r\)number of objects selected and arranged

Order matters.

#265 6–8

Combinations

\({}^nC_r=\binom{n}{r}=\frac{n!}{r!(n-r)!}\)

Variables\({}^nC_r\)Combinations\(n\)number of distinct objects available\(r\)number of objects selected

Order does not matter.
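
Python's standard library has `math.perm` and `math.comb` (Python 3.8 and later), which makes it easy to compare the two counts. The sketch below uses a hypothetical choice of 2 items from 5 and confirms each formula by listing the selections with `itertools`.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

n, r = 5, 2
items = range(n)

ordered = perm(n, r)      # n! / (n - r)!      -> order matters
unordered = comb(n, r)    # n! / (r!(n - r)!)  -> order does not matter

print(ordered, unordered)                  # 20 10
print(len(list(permutations(items, r))))   # 20
print(len(list(combinations(items, r))))   # 10

# Each unordered selection can be arranged in r! ways.
print(ordered == unordered * factorial(r))   # True
```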

#266 6–8

Combination symmetry

\(\binom{n}{r}=\binom{n}{n-r}\)

Variables\(\binom{n}{r}\)Combination symmetry\(n\)number of distinct objects available\(r\)number of objects selected

Choose \(r\) or leave \(n-r\).

#267 6–8

Combination recurrence

\(\binom{n}{r}=\binom{n-1}{r-1}+\binom{n-1}{r}\)

Variables\(\binom{n}{r}\)Combination recurrence\(n\)number of distinct objects available\(r\)number of objects selected

Pascal identity.

#268 9–10

Permutations with repetition allowed

\(n^r\)

Variables\(n\)number of options available for each position\(r\)number of positions to fill

\(r\) choices from \(n\) options each time.

#269 9–10

Combinations with repetition

\(\binom{n+r-1}{r}\)

Variables\(n\)number of distinct types to choose from\(r\)number of items selected, repeats allowed

Stars and bars.
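
A quick check of the stars-and-bars count, assuming a hypothetical case of choosing 4 items from 3 types with repeats allowed. The standard-library function `itertools.combinations_with_replacement` lists exactly these selections.

```python
from itertools import combinations_with_replacement
from math import comb

n, r = 3, 4   # 3 types, 4 items chosen, order ignored, repeats allowed

formula = comb(n + r - 1, r)
listed = list(combinations_with_replacement(range(n), r))

print(formula)       # 15
print(len(listed))   # 15
```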

#270 9–10

Permutations with identical objects

\(\frac{n!}{n_1!n_2!\cdots n_k!}\)

Variables\(n\)total number of objects\(n_1,n_2,\dots,n_k\)counts of identical objects of each type, summing to n\(k\)number of types

Repeated types.
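
The usual classroom use of this formula is counting distinct arrangements of the letters in a word. The sketch below is one possible implementation (the function name `distinct_arrangements` is invented here); it checks a short word by brute force and then applies the formula to a longer one.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def distinct_arrangements(items):
    """n! divided by the factorial of each repeat count."""
    result = factorial(len(items))
    for count in Counter(items).values():
        result //= factorial(count)   # exact at every step
    return result

print(distinct_arrangements("BANANA"))        # 60
print(len(set(permutations("BANANA"))))       # 60, brute-force check
print(distinct_arrangements("MISSISSIPPI"))   # 34650
```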

#271 9–10

Circular permutations

\((n-1)!\)

Variables\(n\)sample size, number of observations, or number of trials

Rotations considered identical.

#272 9–10

Circular permutations with reflection same

\(\frac{(n-1)!}{2}\)

Variables\(n\)sample size, number of observations, or number of trials

For reversible necklaces, \(n>2\).

#273 9–10

Ordered sample without replacement

\(\frac{N!}{(N-n)!}\)

Variables\(n\)sample size, number of observations, or number of trials\(N\)total count, population size, or total frequency

Same as \({}^NP_n\).

#274 9–10

Unordered sample without replacement

\(\binom{N}{n}\)

Variables\(n\)sample size, number of observations, or number of trials\(N\)total count, population size, or total frequency

Simple random samples.

#275 9–10

Ordered sample with replacement

\(N^n\)

Variables\(n\)sample size, number of observations, or number of trials\(N\)total count, population size, or total frequency

Repeated choices allowed.

#276 9–10

Unordered sample with replacement

\(\binom{N+n-1}{n}\)

Variables\(n\)sample size, number of observations, or number of trials\(N\)total count, population size, or total frequency

Multisets.

#277 9–10

Number of subsets

\(2^n\)

Variables\(n\)sample size, number of observations, or number of trials

All subsets of an \(n\)-element set.

#278 9–10

Number of proper subsets

\(2^n-1\)

Variables\(n\)sample size, number of observations, or number of trials

Excludes the full set.

#279 9–10

Binomial coefficient as probability count

\(\binom{n}{k}\)

Variables\(n\)sample size, number of observations, or number of trials\(k\)class position, selected count, number of categories, or period length

Number of ways to choose success positions.

#280 11–12

Multinomial coefficient

\(\binom{n}{n_1,n_2,\dots,n_k}=\frac{n!}{n_1!n_2!\cdots n_k!}\)

Variables\(\binom{n}{n_1,n_2,\dots,n_k}\)Multinomial coefficient\(n\)total number of items\(n_1,n_2,\dots,n_k\)category counts, summing to n\(k\)number of categories

Counts category allocations.

#281 11–12

Binomial theorem

\((a+b)^n=\sum_{k=0}^{n}\binom{n}{k}a^{n-k}b^k\)

Variables\((a+b)^n\)Binomial theorem\(n\)sample size, number of observations, or number of trials\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints\(k\)class position, selected count, number of categories, or period length

Used in binomial probabilities.

#282 11–12

Multinomial theorem

\((x_1+\cdots+x_k)^n=\sum\frac{n!}{n_1!\cdots n_k!}x_1^{n_1}\cdots x_k^{n_k}\)

Variables\((x_1+\cdots+x_k)^n\)Multinomial theorem\(n\)power of the expansion\(x_1,\dots,x_k\)terms being added\(n_1,\dots,n_k\)non-negative integer exponents, summing to n\(k\)number of terms

Sum over \(n_1+\cdots+n_k=n\).

#283 11–12

Inclusion-exclusion count general

\(\left\lvert\bigcup_iA_i\right\rvert=\sum\lvert A_i\rvert-\sum\lvert A_i\cap A_j\rvert+\sum\lvert A_i\cap A_j\cap A_k\rvert-\cdots\)

Variables\(\left\lvert\bigcup_iA_i\right\rvert\)Inclusion-exclusion count general\(A_i\)ith finite set\(\lvert A_i\rvert\)number of elements in the ith set\(\cap\)intersection: elements common to the listed sets

General finite version.

Unit 9

Random Variables and Discrete Distributions

41 / 41 formulas

#284 9–10

Discrete expected value

\(\mu_X=E(X)=\sum x_ip_i\)

Variables\(\mu_X\)Discrete expected value\(\mu\)population mean\(\mu_X,\mu_Y\)population means of random variables X and Y\(x_i\)ith data value or observation\(X,Y,Z\)random variables or standardized variables used in the formula\(p_i\)probability or proportion for category i\(x_i,p_i\)outcome value and its probability

Weighted average of outcomes.

#285 9–10

Discrete variance

\(\sigma_X^2=Var(X)=\sum(x_i-\mu_X)^2p_i\)

Variables\(\sigma_X^2\)Discrete variance\(\mu\)population mean\(\mu_X,\mu_Y\)population means of random variables X and Y\(\sigma\)population standard deviation\(x_i\)ith data value or observation\(X,Y,Z\)random variables or standardized variables used in the formula\(p_i\)probability or proportion for category i\(x_i,p_i\)outcome value and its probability

Population variance of \(X\).

#286 9–10

Discrete variance shortcut

\(Var(X)=\sum x_i^2p_i-\mu_X^2\)

Variables\(Var(X)\)Discrete variance shortcut\(\mu\)population mean\(\mu_X,\mu_Y\)population means of random variables X and Y\(x_i\)ith data value or observation\(X,Y,Z\)random variables or standardized variables used in the formula\(p_i\)probability or proportion for category i\(x_i,p_i\)outcome value and its probability

Equivalent form.
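
To see that the definition and the shortcut give the same variance, here is a minimal sketch for a fair six-sided die (an example chosen for illustration). Exact fractions avoid rounding, so the two results can be compared directly.

```python
from fractions import Fraction

values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6      # fair die: each outcome has probability 1/6

mean = sum(x * p for x, p in zip(values, probs))

# Definition: sum of (x - mu)^2 * p
var_definition = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Shortcut: sum of x^2 * p, minus mu^2
var_shortcut = sum(x**2 * p for x, p in zip(values, probs)) - mean**2

print(mean)             # 7/2
print(var_definition)   # 35/12
print(var_shortcut)     # 35/12
```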

#287 9–10

Discrete standard deviation

\(\sigma_X=\sqrt{Var(X)}\)

Variables\(\sigma_X\)Discrete standard deviation\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula

Spread of random variable.

#288 9–10

Expected value of function

\(E[g(X)]=\sum g(x_i)p_i\)

Variables\(E[g(X)]\)Expected value of function\(x_i\)ith data value or observation\(X,Y,Z\)random variables or standardized variables used in the formula\(p_i\)probability or proportion for category i\(x_i,p_i\)outcome value and its probability

Discrete case.

#289 9–10

Linearity of expectation

\(E(aX+b)=aE(X)+b\)

Variables\(E(aX+b)\)Linearity of expectation\(X,Y,Z\)random variables or standardized variables used in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Always true.

#290 9–10

Expected sum

\(E(X+Y)=E(X)+E(Y)\)

Variables\(E(X+Y)\)Expected sum\(X,Y,Z\)random variables or standardized variables used in the formula

Always true.

#291 9–10

Expected difference

\(E(X-Y)=E(X)-E(Y)\)

Variables\(E(X-Y)\)Expected difference\(X,Y,Z\)random variables or standardized variables used in the formula

Always true.

#292 9–10

Expected product, independent variables

\(E(XY)=E(X)E(Y)\)

Variables\(E(XY)\)Expected product, independent variables\(X,Y,Z\)random variables or standardized variables used in the formula

Holds when \(X\) and \(Y\) are independent; independence is sufficient but not necessary.

#293 9–10

Bernoulli distribution

\(P(X=x)=p^x(1-p)^{1-x},\;x\in\{0,1\}\)

Variables\(P(X=x)\)Bernoulli distribution\(x\)data value, outcome, or input value\(X,Y,Z\)random variables or standardized variables used in the formula\(p\)probability, population proportion, or success probability

Single success/failure trial.

#294 9–10

Bernoulli mean

\(E(X)=p\)

Variables\(E(X)\)Bernoulli mean\(X,Y,Z\)random variables or standardized variables used in the formula\(p\)probability, population proportion, or success probability

Success probability.

#295 9–10

Bernoulli variance

\(Var(X)=p(1-p)\)

Variables\(Var(X)\)Bernoulli variance\(X,Y,Z\)random variables or standardized variables used in the formula\(p\)probability, population proportion, or success probability

Also written \(pq\), where \(q=1-p\).

#296 9–10

Binomial probability

\(P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}\)

Variables\(P(X=k)\)Binomial probability\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials\(B,C\)events, groups, constants, or distribution labels named by the formula\(p\)probability, population proportion, or success probability\(k\)number of successes

\(X\sim B(n,p)\).

#297 9–10

Binomial mean

\(E(X)=np\)

Variables\(E(X)\)Binomial mean\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials\(B,C\)events, groups, constants, or distribution labels named by the formula\(p\)probability, population proportion, or success probability

\(X\sim B(n,p)\).

#298 9–10

Binomial variance

\(Var(X)=np(1-p)\)

Variables\(Var(X)\)Binomial variance\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials\(B,C\)events, groups, constants, or distribution labels named by the formula\(p\)probability, population proportion, or success probability

\(X\sim B(n,p)\).

#299 9–10

Binomial standard deviation

\(\sigma=\sqrt{np(1-p)}\)

Variables\(\sigma\)Binomial standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials\(B,C\)events, groups, constants, or distribution labels named by the formula\(p\)probability, population proportion, or success probability

\(X\sim B(n,p)\).

#300 9–10

Binomial at least one

\(P(X\ge1)=1-(1-p)^n\)

Variables\(P(X\ge1)\)Binomial at least one\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials\(p\)probability, population proportion, or success probability

At least one success.

#301 9–10

Binomial cumulative probability

\(P(X\le k)=\sum_{i=0}^{k}\binom{n}{i}p^i(1-p)^{n-i}\)

Variables\(P(X\le k)\)Binomial cumulative probability\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials\(p\)probability, population proportion, or success probability\(k\)class position, selected count, number of categories, or period length

CDF form.

#302 9–10

Binomial upper tail

\(P(X\ge k)=\sum_{i=k}^{n}\binom{n}{i}p^i(1-p)^{n-i}\)

Variables\(P(X\ge k)\)Binomial upper tail\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials\(p\)probability, population proportion, or success probability\(k\)class position, selected count, number of categories, or period length

Upper-tail probability.
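
The binomial point, cumulative, and upper-tail formulas can be sketched directly from their definitions. The example values \(n=10\) and \(p=\frac12\) are assumptions for illustration, and the helper names `binom_pmf` and `binom_cdf` are invented for this sketch.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k): add the point probabilities from 0 to k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 10, Fraction(1, 2)
print(binom_pmf(3, n, p))       # 15/128
print(binom_cdf(3, n, p))       # 11/64
print(1 - binom_cdf(3, n, p))   # 53/64, which is P(X >= 4)

mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
print(mean)                     # 5, matching np
```

Note that the upper tail \(P(X\ge k)\) equals \(1-P(X\le k-1)\), not \(1-P(X\le k)\); the off-by-one is a common slip.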

#303 9–10

Geometric probability

\(P(X=k)=(1-p)^{k-1}p\)

Variables\(P(X=k)\)Geometric probability\(X,Y,Z\)random variables or standardized variables used in the formula\(p\)probability, population proportion, or success probability\(k\)trial number on which the first success occurs

Number of trials up to and including the first success, \(k=1,2,3,\dots\).

#304 9–10

Geometric distribution mean

\(E(X)=\frac{1}{p}\)

Variables\(E(X)\)Geometric distribution mean\(X,Y,Z\)random variables or standardized variables used in the formula\(p\)probability, population proportion, or success probability

Trials until first success.

#305 9–10

Geometric variance

\(Var(X)=\frac{1-p}{p^2}\)

Variables\(Var(X)\)Geometric variance\(X,Y,Z\)random variables or standardized variables used in the formula\(p\)probability, population proportion, or success probability

Trials until first success.

#306 9–10

Geometric cumulative probability

\(P(X\le k)=1-(1-p)^k\)

Variables\(P(X\le k)\)Geometric cumulative probability\(X,Y,Z\)random variables or standardized variables used in the formula\(p\)probability, population proportion, or success probability\(k\)class position, selected count, number of categories, or period length

First success by trial \(k\).

#307 9–10

Geometric upper tail

\(P(X>k)=(1-p)^k\)

Variables\(P(X>k)\)Geometric upper tail\(X,Y,Z\)random variables or standardized variables used in the formula\(p\)probability, population proportion, or success probability\(k\)class position, selected count, number of categories, or period length

No success in first \(k\) trials.
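
A short sketch tying the geometric point, cumulative, and upper-tail formulas together, assuming \(X\) counts trials up to and including the first success. The example (waiting for a six on a fair die, \(p=\frac16\)) and the function names are chosen for illustration.

```python
from fractions import Fraction

def geom_pmf(k, p):
    """P(X = k): k - 1 failures followed by one success."""
    return (1 - p) ** (k - 1) * p

def geom_cdf(k, p):
    """P(X <= k): first success happens by trial k."""
    return 1 - (1 - p) ** k

p = Fraction(1, 6)
print(geom_pmf(3, p))       # 25/216
print(geom_cdf(3, p))       # 91/216
print(1 - geom_cdf(3, p))   # 125/216, which is P(X > 3)

# The cumulative formula agrees with adding the point probabilities.
print(sum(geom_pmf(k, p) for k in range(1, 4)) == geom_cdf(3, p))   # True
```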

#308 9–10

Negative binomial probability

\(P(X=k)=\binom{k-1}{r-1}p^r(1-p)^{k-r}\)

Variables\(P(X=k)\)Negative binomial probability\(X,Y,Z\)random variables or standardized variables used in the formula\(p\)probability, population proportion, or success probability\(r\)number of successes required\(k\)trial number on which the rth success occurs

The \(r\)th success occurs on trial \(k\), with \(k=r,r+1,\dots\).

#309 9–10

Negative binomial mean

\(E(X)=\frac{r}{p}\)

Variables\(E(X)\)Negative binomial mean\(X,Y,Z\)random variables or standardized variables used in the formula\(p\)probability, population proportion, or success probability\(r\)number of successes required

Trials until \(r\) successes.

#310 9–10

Negative binomial variance

\(Var(X)=\frac{r(1-p)}{p^2}\)

Variables\(Var(X)\)Negative binomial variance\(X,Y,Z\)random variables or standardized variables used in the formula\(p\)probability, population proportion, or success probability\(r\)number of successes required

Trials until \(r\) successes.

#311 11–12

Hypergeometric probability

\(P(X=k)=\frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}\)

Variables\(P(X=k)\)Hypergeometric probability\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials\(N\)total count, population size, or total frequency\(K\)number of successes in the population or category count\(k\)number of successes in the sample

Sampling without replacement.

#312 11–12

Hypergeometric mean

\(E(X)=n\frac{K}{N}\)

Variables\(E(X)\)Hypergeometric mean\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials\(N\)total count, population size, or total frequency\(K\)number of successes in the population or category count

Population successes \(K\), population size \(N\).

#313 11–12

Hypergeometric variance

\(Var(X)=n\frac{K}{N}\left(1-\frac{K}{N}\right)\frac{N-n}{N-1}\)

Variables\(Var(X)\)Hypergeometric variance\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials\(N\)total count, population size, or total frequency\(K\)number of successes in the population or category count

Includes finite population correction.
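
To check the hypergeometric mean and variance formulas against the probability formula, the sketch below uses a hypothetical population of \(N=10\) items with \(K=4\) successes and a sample of \(n=3\). The function name `hypergeom_pmf` is invented for this example.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in a sample of n drawn without replacement."""
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

N, K, n = 10, 4, 3
support = range(0, min(n, K) + 1)
pmf = {k: hypergeom_pmf(k, N, K, n) for k in support}

mean = sum(k * p for k, p in pmf.items())
variance = sum(k**2 * p for k, p in pmf.items()) - mean**2

print(pmf[2])     # 3/10
print(mean)       # 6/5, matching n * K / N
print(variance)   # 14/25, matching the formula with the (N - n)/(N - 1) factor
```

Without the correction factor \(\frac{N-n}{N-1}\) the variance would be \(\frac{18}{25}\), the binomial value for sampling with replacement.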

#314 11–12

Poisson probability

\(P(X=k)=\frac{e^{-\lambda}\lambda^k}{k!}\)

Variables\(P(X=k)\)Poisson probability\(X,Y,Z\)random variables or standardized variables used in the formula\(\lambda\)rate parameter or Poisson mean\(k\)number of events counted

Counts in fixed interval.

#315 11–12

Poisson mean

\(E(X)=\lambda\)

Variables\(E(X)\)Poisson mean\(X,Y,Z\)random variables or standardized variables used in the formula\(\lambda\)rate parameter or Poisson mean

\(X\sim Pois(\lambda)\).

#316 11–12

Poisson variance

\(Var(X)=\lambda\)

Variables\(Var(X)\)Poisson variance\(X,Y,Z\)random variables or standardized variables used in the formula\(\lambda\)rate parameter or Poisson mean

\(X\sim Pois(\lambda)\).

#317 11–12

Poisson standard deviation

\(\sigma=\sqrt{\lambda}\)

Variables\(\sigma\)Poisson standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula\(\lambda\)rate parameter or Poisson mean

\(X\sim Pois(\lambda)\).

#318 11–12

Poisson zero-event probability

\(P(X=0)=e^{-\lambda}\)

Variables\(P(X=0)\)Poisson zero-event probability\(X\)number of events in the interval\(\lambda\)rate parameter or Poisson mean

No events.

#319 11–12

Poisson at least one

\(P(X\ge1)=1-e^{-\lambda}\)

Variables\(P(X\ge1)\)Poisson at least one\(X,Y,Z\)random variables or standardized variables used in the formula\(\lambda\)rate parameter or Poisson mean

At least one event.

#320 11–12

Poisson rate scaling

\(\lambda_t=rt\)

Variables\(\lambda_t\)Poisson rate scaling\(r\)event rate per unit of time or space\(t\)length of the time or space interval\(\lambda\)rate parameter or Poisson mean

Rate \(r\) over time/space \(t\).
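The Poisson formulas above can be tried out with a short sketch. The rate \(r=2\) per hour and the window \(t=1.5\) hours are assumed values chosen only for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Pois(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

# Assumed example: r = 2 events per hour, observed for t = 1.5 hours.
r, t = 2.0, 1.5
lam = r * t                          # rate scaling: lambda_t = r * t

p_zero = poisson_pmf(0, lam)         # same as e^(-lambda)
p_at_least_one = 1 - p_zero          # complement of "no events"
p_exactly_three = poisson_pmf(3, lam)

# Numerical check that mean and variance both equal lambda.
ks = range(0, 60)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print(lam, p_zero, p_at_least_one, p_exactly_three, mean, variance)
```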

#321 11–12

Multinomial probability

\(P(X_1=x_1,\dots,X_k=x_k)=\frac{n!}{x_1!\cdots x_k!}p_1^{x_1}\cdots p_k^{x_k}\)

Variables\(P(X_1=x_1,\dots,X_k=x_k)\)Multinomial probability\(n\)number of trials, \(n=x_1+\cdots+x_k\)\(x_i\)count observed in category \(i\)\(p_i\)probability of category \(i\), with \(p_1+\cdots+p_k=1\)\(k\)number of categories

Counts across \(k\) categories.

#322 11–12

Multinomial mean

\(E(X_i)=np_i\)

Variables\(E(X_i)\)Multinomial mean\(p_i\)probability or proportion for category i

For category \(i\).

#323 11–12

Multinomial variance

\(Var(X_i)=np_i(1-p_i)\)

Variables\(Var(X_i)\)Multinomial variance\(p_i\)probability or proportion for category i

For category \(i\).

#324 11–12

Multinomial covariance

\(Cov(X_i,X_j)=-np_ip_j\)

Variables\(Cov(X_i,X_j)\)Multinomial covariance\(p_i\)probability or proportion for category i

For \(i\ne j\).
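A small sketch of the multinomial formulas, assuming three categories with probabilities 0.5, 0.3, 0.2 and six trials. These numbers and the helper name `multinomial_pmf` are illustrative only.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """P(X_1 = x_1, ..., X_k = x_k) for the given counts and category probabilities."""
    coef = factorial(sum(counts))
    for x in counts:
        coef //= factorial(x)        # stays an integer at every step
    return coef * prod(p ** x for p, x in zip(probs, counts))

# Assumed example: n = 6 trials over k = 3 categories.
probs = [0.5, 0.3, 0.2]
counts = [3, 2, 1]
n = sum(counts)

p_counts = multinomial_pmf(counts, probs)

# Closed-form moments from the formulas above.
mean_1 = n * probs[0]
var_1 = n * probs[0] * (1 - probs[0])
cov_12 = -n * probs[0] * probs[1]    # negative: more of one category leaves fewer for another

print(p_counts, mean_1, var_1, cov_12)
```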

Unit 10

Continuous Distributions and Density Functions

31 / 31 formulas

#325 11–12

Continuous probability

\(P(a\le X\le b)=\int_a^b f(x)\,dx\)

Variables\(P(a\le X\le b)\)Continuous probability\(x\)value of the variable\(X\)continuous random variable\(f\)probability density function\(a,b\)interval endpoints

Area under density.

#326 11–12

Total density area

\(\int_{-\infty}^{\infty}f(x)\,dx=1\)

Variables\(x\)value of the variable\(f\)probability density function

Valid PDF.

#327 11–12

Continuous expected value

\(E(X)=\int_{-\infty}^{\infty}x f(x)\,dx\)

Mean.

#328 11–12

Continuous variance

\(Var(X)=\int_{-\infty}^{\infty}(x-\mu)^2 f(x)\,dx\)

Variables\(Var(X)\)Continuous variance\(\mu\)mean of \(X\), \(\mu=E(X)\)\(x\)value of the variable\(X\)continuous random variable\(f\)probability density function

Variance.

#329 11–12

Continuous variance shortcut

\(Var(X)=E(X^2)-[E(X)]^2\)

Variables\(Var(X)\)Continuous variance shortcut\(X\)continuous random variable\(E(X)\)mean of \(X\)\(E(X^2)\)mean of \(X^2\)

With \(E(X^2)=\int x^2f(x)\,dx\).
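The integral formulas above can be checked numerically. This sketch assumes the density \(f(x)=2x\) on \([0,1]\) (zero elsewhere) and uses a simple midpoint rule. Both the density and the helper `integrate` are illustrative choices.

```python
def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

# Assumed example density: f(x) = 2x on [0, 1], zero elsewhere.
def f(x):
    return 2 * x

area = integrate(f, 0, 1)                                  # valid PDF: should be 1
p_interval = integrate(f, 0.25, 0.5)                       # P(0.25 <= X <= 0.5)
mean = integrate(lambda x: x * f(x), 0, 1)                 # E(X)
second_moment = integrate(lambda x: x * x * f(x), 0, 1)    # E(X^2)

var_definition = integrate(lambda x: (x - mean) ** 2 * f(x), 0, 1)
var_shortcut = second_moment - mean ** 2                   # E(X^2) - [E(X)]^2

print(area, p_interval, mean, var_definition, var_shortcut)
```

For this density the exact values are \(E(X)=\frac{2}{3}\) and \(Var(X)=\frac{1}{18}\), so both variance routes should agree to several decimal places.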

#330 11–12

CDF

\(F(x)=P(X\le x)=\int_{-\infty}^{x}f(t)\,dt\)

Variables\(F(x)\)CDF\(x\)value at which the CDF is evaluated\(X\)continuous random variable\(f\)probability density function\(t\)integration variable

Cumulative distribution function.

#331 11–12

PDF from CDF

\(f(x)=F'(x)\)

Variables\(f(x)\)PDF from CDF\(x\)value of the variable\(F\)cumulative distribution function

Where differentiable.

#332 11–12

Interval probability from CDF

\(P(a<X\le b)=F(b)-F(a)\)

Variables\(P(a<X\le b)\)Interval probability from CDF\(X,Y,Z\)random variables or standardized variables used in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Continuous variables.

#333 11–12

Uniform density

\(f(x)=\frac{1}{b-a},\;a\le x\le b\)

Variables\(f(x)\)Uniform density\(x\)value of the variable\(X\)continuous uniform random variable\(U(a,b)\)uniform distribution on \([a,b]\)\(a,b\)lower and upper endpoints, \(a<b\)

\(X\sim U(a,b)\).

#334 11–12

Uniform mean

\(E(X)=\frac{a+b}{2}\)

Variables\(E(X)\)Uniform mean\(X,Y,Z\)random variables or standardized variables used in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Continuous uniform.

#335 11–12

Uniform variance

\(Var(X)=\frac{(b-a)^2}{12}\)

Variables\(Var(X)\)Uniform variance\(X,Y,Z\)random variables or standardized variables used in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Continuous uniform.

#336 11–12

Uniform CDF

\(F(x)=\frac{x-a}{b-a},\;a\le x\le b\)

Variables\(F(x)\)Uniform CDF\(x\)data value, outcome, or input value\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints

Continuous uniform.

#337 11–12

Normal density

\(f(x)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\)

Variables\(f(x)\)Normal density\(\mu\)population mean\(\sigma\)population standard deviation\(\sigma^2\)population variance\(x\)value of the variable\(X\)normal random variable\(N(\mu,\sigma^2)\)normal distribution with mean \(\mu\) and variance \(\sigma^2\)

\(X\sim N(\mu,\sigma^2)\).

#338 11–12

Standard normal density

\(\phi(z)=\frac{1}{\sqrt{2\pi}}e^{-z^2/2}\)

Variables\(\phi(z)\)Standard normal density\(Z\)standard normal random variable\(N(0,1)\)normal distribution with mean 0 and variance 1\(z\)standard score\(\phi\)standard normal density

\(Z\sim N(0,1)\).

#339 11–12

Normal standardization

\(Z=\frac{X-\mu}{\sigma}\)

Variables\(Z\)Normal standardization\(\mu\)population mean\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula

Use standard normal tables/calculator.

#340 11–12

Normal probability

\(P(a<X<b)=\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right)\)

Variables\(P(a<X<b)\)Normal probability\(\mu\)population mean\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints\(\Phi\)standard normal cumulative distribution function

Normal interval probability.

#341 11–12

Normal upper tail

\(P(X>a)=1-\Phi\left(\frac{a-\mu}{\sigma}\right)\)

Variables\(P(X>a)\)Normal upper tail\(\mu\)population mean\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints\(\Phi\)standard normal cumulative distribution function

Upper tail.

#342 11–12

Normal lower tail

\(P(X<a)=\Phi\left(\frac{a-\mu}{\sigma}\right)\)

Variables\(P(X<a)\)Normal lower tail\(\mu\)population mean\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula\(a,b\)line intercept/slope, constants, cell counts, or interval endpoints\(\Phi\)standard normal cumulative distribution function

Lower tail.

#343 11–12

Inverse normal

\(x=\mu+z_p\sigma\)

Variables\(x\)Inverse normal\(\mu\)population mean\(\sigma\)population standard deviation\(p\)cumulative probability to the left of \(x\)\(z_p\)standard normal quantile for probability \(p\)\(\Phi\)standard normal cumulative distribution function

Where \(\Phi(z_p)=p\).
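A sketch of the normal formulas above using Python's `statistics.NormalDist`, which supplies \(\Phi\) as `cdf` and its inverse as `inv_cdf`. The values \(\mu=100\), \(\sigma=15\), and the endpoints 85 and 130 are assumed for illustration.

```python
from statistics import NormalDist

# Assumed example: X ~ N(100, 15^2).
mu, sigma = 100, 15
Z = NormalDist()                     # standard normal, N(0, 1)
a, b = 85, 130

z_a = (a - mu) / sigma               # standardization: z = (x - mu) / sigma
z_b = (b - mu) / sigma

p_between = Z.cdf(z_b) - Z.cdf(z_a)  # P(a < X < b)
p_upper = 1 - Z.cdf(z_b)             # P(X > b)
p_lower = Z.cdf(z_a)                 # P(X < a)

# Inverse normal: the value with 90% of the distribution below it.
z_p = Z.inv_cdf(0.90)
x_90 = mu + z_p * sigma

print(z_a, z_b, p_between, p_upper, p_lower, x_90)
```

The three probabilities cover the whole number line, so they should add to 1.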

#344 11–12

Exponential density

\(f(x)=\lambda e^{-\lambda x},\;x\ge0\)

Variables\(f(x)\)Exponential density\(x\)waiting time\(\lambda\)rate parameter (events per unit time), \(\lambda>0\)

Waiting-time model.

#345 11–12

Exponential CDF

\(F(x)=1-e^{-\lambda x}\)

Variables\(F(x)\)Exponential CDF\(x\)data value, outcome, or input value\(\lambda\)rate parameter or Poisson mean

\(x\ge0\).

#346 11–12

Exponential survival

\(P(X>x)=e^{-\lambda x}\)

Variables\(P(X>x)\)Exponential survival\(x\)data value, outcome, or input value\(X,Y,Z\)random variables or standardized variables used in the formula\(\lambda\)rate parameter or Poisson mean

Right-tail probability.

#347 11–12

Exponential mean

\(E(X)=\frac{1}{\lambda}\)

Variables\(E(X)\)Exponential mean\(X,Y,Z\)random variables or standardized variables used in the formula\(\lambda\)rate parameter or Poisson mean

Average waiting time.

#348 11–12

Exponential variance

\(Var(X)=\frac{1}{\lambda^2}\)

Variables\(Var(X)\)Exponential variance\(X,Y,Z\)random variables or standardized variables used in the formula\(\lambda\)rate parameter or Poisson mean

Waiting-time variance.

#349 11–12

Exponential memoryless property

\(P(X>s+t\mid X>s)=P(X>t)\)

Variables\(P(X>s+t\mid X>s)\)Exponential memoryless property\(s\)time already waited\(X\)exponential waiting time\(\mid\)given condition in conditional probability\(t\)additional waiting time

For exponential \(X\).
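The memoryless property follows from the survival formula, since \(\frac{e^{-\lambda(s+t)}}{e^{-\lambda s}}=e^{-\lambda t}\). A quick numerical check, assuming a rate of \(\lambda=0.5\) and times \(s=2\), \(t=3\) chosen for illustration:

```python
from math import exp

lam = 0.5                            # assumed rate parameter

def survival(x):
    """P(X > x) for an exponential waiting time with rate lam."""
    return exp(-lam * x)

def cdf(x):
    """F(x) = P(X <= x) = 1 - e^(-lam * x), for x >= 0."""
    return 1 - exp(-lam * x)

s, t = 2.0, 3.0

# Conditional probability P(X > s + t | X > s) = P(X > s + t) / P(X > s).
conditional = survival(s + t) / survival(s)
unconditional = survival(t)

mean_wait = 1 / lam                  # E(X)
var_wait = 1 / lam ** 2              # Var(X)

print(conditional, unconditional, mean_wait, var_wait)
```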

#350 11–12

Gamma density

\(f(x)=\frac{\lambda^\alpha x^{\alpha-1}e^{-\lambda x}}{\Gamma(\alpha)},\;x>0\)

Variables\(f(x)\)Gamma density\(x\)value of the variable\(\alpha\)shape parameter\(\lambda\)rate parameter\(\Gamma\)gamma function

Advanced K–12/AP extension.

#351 11–12

Gamma mean

\(E(X)=\frac{\alpha}{\lambda}\)

Variables\(E(X)\)Gamma mean\(X,Y,Z\)random variables or standardized variables used in the formula\(\alpha\)significance level or distribution parameter, depending on context\(\lambda\)rate parameter or Poisson mean

Rate parameterization.

#352 11–12

Gamma variance

\(Var(X)=\frac{\alpha}{\lambda^2}\)

Variables\(Var(X)\)Gamma variance\(X,Y,Z\)random variables or standardized variables used in the formula\(\alpha\)significance level or distribution parameter, depending on context\(\lambda\)rate parameter or Poisson mean

Rate parameterization.

#353 11–12

Chi-square distribution

\(\chi^2_\nu\sim \Gamma\left(\frac{\nu}{2},\frac{1}{2}\right)\)

Variables\(\nu\)degrees-of-freedom parameter\(\chi^2\)chi-square statistic or critical value\(\Gamma\)gamma function

Shape-rate view: shape \(\frac{\nu}{2}\), rate \(\frac{1}{2}\) (equivalently, scale 2).

#354 11–12

Student t definition

\(T=\frac{Z}{\sqrt{V/\nu}}\)

Variables\(T\)Student t definition\(Z\)standard normal random variable\(V\)chi-square random variable with \(\nu\) degrees of freedom\(\nu\)degrees-of-freedom parameter\(\chi^2\)chi-square distribution

\(Z\sim N(0,1)\), \(V\sim\chi^2_\nu\), with \(Z\) and \(V\) independent.

#355 11–12

F distribution definition

\(F=\frac{U/\nu_1}{V/\nu_2}\)

Variables\(F\)F distribution definition\(U,V\)independent chi-square random variables\(\nu_1,\nu_2\)degrees of freedom of \(U\) and \(V\)

\(U\sim\chi^2_{\nu_1}\) and \(V\sim\chi^2_{\nu_2}\), independent.

Unit 11

Sampling, Sampling Distributions, and Standard Error

34 / 34 formulas

#356 6–8

Sample proportion

\(\hat{p}=\frac{x}{n}\)

Variables\(\hat{p}\)Sample proportion\(x\)number of successes in the sample\(n\)sample size

\(x\)=success count.

#357 6–8

Sample percentage

\(\hat{p}\times100\%\)

Variables\(\hat{p}\)sample proportion\(p\)probability, population proportion, or success probability

Percent form.

#358 6–8

Population proportion

\(p=\frac{X}{N}\)

Variables\(p\)Population proportion\(X\)number of successes in the population\(N\)population size

Population success fraction.

#359 6–8

Sampling fraction

\(f=\frac{n}{N}\)

Variables\(f\)Sampling fraction\(n\)sample size, number of observations, or number of trials\(N\)total count, population size, or total frequency

Sample size over population size.

#360 6–8

Finite population correction

\(FPC=\sqrt{\frac{N-n}{N-1}}\)

Variables\(FPC\)Finite population correction\(n\)sample size\(N\)population size

For sampling without replacement.

#361 9–10

Mean of sample mean

\(E(\bar{X})=\mu\)

Variables\(E(\bar{X})\)Mean of sample mean\(\mu\)population mean\(X,Y,Z\)random variables or standardized variables used in the formula

Unbiased estimator.

#362 9–10

Standard error of sample mean

\(SE_{\bar{X}}=\frac{\sigma}{\sqrt{n}}\)

Variables\(SE_{\bar{X}}\)Standard error of sample mean\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials

Known population SD.

#363 9–10

Estimated standard error of sample mean

\(SE_{\bar{X}}\approx\frac{s}{\sqrt{n}}\)

Variables\(SE_{\bar{X}}\)Estimated standard error of sample mean\(s\)sample standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials

Unknown population SD.

#364 9–10

Sample mean distribution

\(\bar{X}\approx N\left(\mu,\frac{\sigma^2}{n}\right)\)

Variables\(\bar{X}\)Sample mean distribution\(\mu\)population mean\(\sigma\)population standard deviation\(\sigma^2\)population variance\(n\)sample size\(N(\mu,\sigma^2/n)\)normal distribution with mean \(\mu\) and variance \(\sigma^2/n\)

For normal population or large \(n\).

#365 9–10

Central limit theorem standardization

\(Z=\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\)

Variables\(Z\)Central limit theorem standardization\(\mu\)population mean\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials

Known \(\sigma\).
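A brief sketch of how the standard error and the standardization above combine. The population values (\(\mu=50\), \(\sigma=12\)), the sample size \(n=36\), and the observed mean 54 are assumed for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Assumed example: population mean 50, population SD 12, sample of 36.
mu, sigma, n = 50, 12, 36
x_bar = 54                            # assumed observed sample mean

se = sigma / sqrt(n)                  # standard error of the sample mean
z = (x_bar - mu) / se                 # standardized sample mean

# Approximate probability of a sample mean at least this large.
p_above = 1 - NormalDist().cdf(z)

print(se, z, p_above)
```

Quadrupling \(n\) halves the standard error, because \(n\) enters through \(\sqrt{n}\).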

#366 9–10

Mean of sample proportion

\(E(\hat{p})=p\)

Variables\(E(\hat{p})\)Mean of sample proportion\(\hat{p}\)sample proportion\(p\)probability, population proportion, or success probability

Unbiased estimator.

#367 9–10

Standard error of sample proportion

\(SE_{\hat{p}}=\sqrt{\frac{p(1-p)}{n}}\)

Variables\(SE_{\hat{p}}\)Standard error of sample proportion\(\hat{p}\)sample proportion\(n\)sample size, number of observations, or number of trials\(p\)probability, population proportion, or success probability

Uses true \(p\).

#368 9–10

Estimated SE of sample proportion

\(SE_{\hat{p}}\approx\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\)

Variables\(SE_{\hat{p}}\)Estimated SE of sample proportion\(\hat{p}\)sample proportion\(n\)sample size, number of observations, or number of trials\(p\)probability, population proportion, or success probability

Uses sample \(\hat{p}\).

#369 9–10

Sample proportion distribution

\(\hat{p}\approx N\left(p,\frac{p(1-p)}{n}\right)\)

Variables\(\hat{p}\)Sample proportion distribution\(n\)sample size\(N(p,p(1-p)/n)\)normal distribution with mean \(p\) and variance \(p(1-p)/n\)\(p\)population proportion

Large-sample approximation.

#370 9–10

Proportion z statistic

\(Z=\frac{\hat{p}-p}{\sqrt{p(1-p)/n}}\)

Variables\(Z\)Proportion z statistic\(\hat{p}\)sample proportion\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials\(p\)probability, population proportion, or success probability

For tests.

#371 9–10

SE for count in binomial

\(SE_X=\sqrt{np(1-p)}\)

Variables\(SE_X\)SE for count in binomial\(X\)number of successes\(n\)number of trials\(B(n,p)\)binomial distribution\(p\)success probability

For \(X\sim B(n,p)\).

#372 11–12

Mean of sum

\(E\left(\sum X_i\right)=\sum E(X_i)\)

Variables\(X_i\)the \(i\)th random variable in the sum

Linearity.

#373 11–12

Variance of independent sum

\(Var\left(\sum X_i\right)=\sum Var(X_i)\)

Variables\(X_i\)the \(i\)th random variable in the sum

Independent variables.

#374 11–12

Standardized sample sum

\(Z=\frac{\sum X_i-n\mu}{\sigma\sqrt{n}}\)

Variables\(Z\)Standardized sample sum\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula\(n\)sample size, number of observations, or number of trials

Known \(\sigma\).

#375 11–12

Difference of sample means mean

\(E(\bar{X}_1-\bar{X}_2)=\mu_1-\mu_2\)

Variables\(E(\bar{X}_1-\bar{X}_2)\)Difference of sample means mean\(\mu\)population mean\(X,Y,Z\)random variables or standardized variables used in the formula

Two independent samples.

#376 11–12

SE of difference of means

\(SE_{\bar{X}_1-\bar{X}_2}=\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}\)

Variables\(SE_{\bar{X}_1-\bar{X}_2}\)SE of difference of means\(\sigma\)population standard deviation\(X,Y,Z\)random variables or standardized variables used in the formula

Known population SDs.

#377 11–12

Estimated SE difference of means

\(SE_{\bar{X}_1-\bar{X}_2}\approx\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}\)

Variables\(SE_{\bar{X}_1-\bar{X}_2}\)Estimated SE difference of means\(X,Y,Z\)random variables or standardized variables used in the formula

Unknown population SDs.

#378 11–12

Difference of proportions mean

\(E(\hat{p}_1-\hat{p}_2)=p_1-p_2\)

Variables\(E(\hat{p}_1-\hat{p}_2)\)Difference of proportions mean\(\hat{p}_1,\hat{p}_2\)sample proportions\(p_1,p_2\)population proportions

Two independent samples.

#379 11–12

SE difference of proportions

\(SE_{\hat{p}_1-\hat{p}_2}=\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}\)

Variables\(SE_{\hat{p}_1-\hat{p}_2}\)SE difference of proportions\(\hat{p}_1,\hat{p}_2\)sample proportions\(p_1,p_2\)population proportions\(n_1,n_2\)sample sizes

Uses true \(p_1\) and \(p_2\).

#380 11–12

Estimated SE difference of proportions

\(SE\approx\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\)

Variables\(SE\)Estimated SE difference of proportions\(\hat{p}\)sample proportion\(p\)probability, population proportion, or success probability

For confidence intervals.

#381 11–12

Pooled proportion

\(\hat{p}_c=\frac{x_1+x_2}{n_1+n_2}\)

Variables\(\hat{p}_c\)Pooled proportion\(x_1,x_2\)success counts in the two samples\(n_1,n_2\)sample sizes

For two-proportion test under \(H_0:p_1=p_2\).

#382 11–12

Pooled SE difference of proportions

\(SE_{pooled}=\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}\)

Variables\(SE_{pooled}\)Pooled SE difference of proportions\(\hat{p}_c\)pooled sample proportion\(n_1,n_2\)sample sizes

For two-proportion z test.
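A sketch contrasting the pooled and unpooled standard errors for two proportions. The counts (45 of 100 and 30 of 100) are assumed. The final line applies the generic "statistic over standard error" structure with a null difference of 0, as one plausible use of the pooled SE.

```python
from math import sqrt

# Assumed example counts for two independent samples.
x1, n1 = 45, 100
x2, n2 = 30, 100

p1_hat, p2_hat = x1 / n1, x2 / n2

# Unpooled (estimated) SE, as used for confidence intervals.
se_unpooled = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

# Pooled proportion and pooled SE, as used under H0: p1 = p2.
p_c = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))

# Test statistic with null difference 0.
z = (p1_hat - p2_hat) / se_pooled

print(p_c, se_unpooled, se_pooled, z)
```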

#383 11–12

Standard error with finite population correction

\(SE_{\bar{X},FPC}=\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}\)

Variables\(SE_{\bar{X},FPC}\)Standard error with finite population correction\(\sigma\)population standard deviation\(n\)sample size\(N\)population size

Sampling without replacement.

#384 11–12

Sample size for mean margin of error

\(n=\left(\frac{z^*\sigma}{ME}\right)^2\)

Variables\(n\)Sample size for mean margin of error\(\sigma\)population standard deviation\(z\)standard score or normal critical value\(z^*\)critical z-value\(ME\)margin of error

Round up.

#385 11–12

Sample size for proportion margin of error

\(n=\frac{(z^*)^2p(1-p)}{ME^2}\)

Variables\(n\)Sample size for proportion margin of error\(p\)probability, population proportion, or success probability\(z\)standard score or normal critical value\(z^*\)critical z-value\(ME\)margin of error

Use \(p=0.5\) if unknown.
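The two sample-size formulas can be wrapped in small helpers that round up, as the notes advise. The function names and the example inputs (\(\sigma=15\), margins of 2 and 0.03, 95% confidence) are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist

def z_star(confidence):
    """Two-sided critical z-value for a confidence level given as a decimal."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def n_for_mean(sigma, me, confidence=0.95):
    """Smallest n with margin of error at most me for a mean (known sigma)."""
    return ceil((z_star(confidence) * sigma / me) ** 2)

def n_for_proportion(me, confidence=0.95, p=0.5):
    """Smallest n with margin of error at most me for a proportion.

    p = 0.5 is the conservative default when no estimate is available.
    """
    return ceil(z_star(confidence) ** 2 * p * (1 - p) / me ** 2)

n_mean = n_for_mean(sigma=15, me=2)
n_prop = n_for_proportion(me=0.03)
print(n_mean, n_prop)
```

Using \(p=0.5\) maximizes \(p(1-p)\), so the resulting \(n\) is large enough for any true proportion.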

#386 11–12

Design effect

\(DEFF=\frac{Var_{\text{actual}}}{Var_{\text{SRS}}}\)

Variables\(DEFF\)Design effect\(\text{actual}\)the named quantity shown in the formula\(\text{SRS}\)the named quantity shown in the formula\(DEFF,n_{eff}\)design effect and effective sample size

Survey sampling.

#387 11–12

Effective sample size

\(n_{\text{eff}}=\frac{n}{DEFF}\)

Variables\(n_{\text{eff}}\)Effective sample size\(\text{eff}\)the named quantity shown in the formula\(n\)sample size, number of observations, or number of trials\(DEFF,n_{eff}\)design effect and effective sample size

Survey sampling.

#388 11–12

Response rate

\(RR=\frac{\text{responses}}{\text{eligible sample}}\times100\%\)

Variables\(RR\)Response rate\(\text{responses}\)number of completed responses\(\text{eligible sample}\)number of eligible sampled units

Survey quality metric.

#389 11–12

Nonresponse rate

\(NR=1-RR\)

Variables\(NR\)Nonresponse rate\(RR\)response rate written as a decimal

Use decimal response rate.

Unit 12

Confidence Intervals and Margins of Error

29 / 29 formulas

#390 9–10

Generic confidence interval

\(\text{estimate}\pm(\text{critical value})(\text{standard error})\)

Variables\(\text{estimate}\)the named quantity shown in the formula\(\text{critical value}\)the named quantity shown in the formula\(\text{standard error}\)the named quantity shown in the formula

Core inference structure.

#391 9–10

Margin of error

\(ME=(\text{critical value})(SE)\)

Variables\(ME\)Margin of error\(\text{critical value}\)the named quantity shown in the formula\(SE\)standard error

Half-width of interval.

#392 9–10

Confidence interval lower bound

\(L=\text{estimate}-ME\)

Variables\(L\)Confidence interval lower bound\(\text{estimate}\)the named quantity shown in the formula\(ME\)margin of error

Lower endpoint.

#393 9–10

Confidence interval upper bound

\(U=\text{estimate}+ME\)

Variables\(U\)Confidence interval upper bound\(\text{estimate}\)the named quantity shown in the formula\(ME\)margin of error

Upper endpoint.

#394 11–12

One-sample z interval for mean

\(\bar{x}\pm z^*\frac{\sigma}{\sqrt{n}}\)

Variables\(\bar{x}\)sample mean or average of x-values\(\sigma\)population standard deviation\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials\(z\)standard score or normal critical value\(z^*\)critical z-value

Known \(\sigma\).

#395 11–12

One-sample t interval for mean

\(\bar{x}\pm t^*\frac{s}{\sqrt{n}}\)

Variables\(\bar{x}\)sample mean or average of x-values\(\sigma\)population standard deviation\(s\)sample standard deviation\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials\(t\)t statistic or time period\(t^*\)critical t-value

Unknown \(\sigma\).

#396 11–12

One-sample t interval degrees of freedom

\(df=n-1\)

Variables\(df\)One-sample t interval degrees of freedom\(n\)sample size, number of observations, or number of trials

For one mean.

#397 11–12

One-proportion z interval

\(\hat{p}\pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\)

Variables\(\hat{p}\)sample proportion\(n\)sample size, number of observations, or number of trials\(p\)probability, population proportion, or success probability\(z\)standard score or normal critical value\(z^*\)critical z-value

Large-sample interval.

#398 11–12

Plus-four proportion estimate

\(\tilde{p}=\frac{x+2}{n+4}\)

Variables\(\tilde{p}\)Plus-four proportion estimate\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials\(p\)probability, population proportion, or success probability

Approximate interval adjustment.

#399 11–12

Plus-four proportion interval

\(\tilde{p}\pm z^*\sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}}\)

Variables\(\tilde{p}\)plus-four adjusted sample proportion\(n\)sample size, number of observations, or number of trials\(p\)probability, population proportion, or success probability\(z\)standard score or normal critical value\(z^*\)critical z-value

Approximate adjusted interval.
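A sketch comparing the large-sample one-proportion interval with the plus-four interval. The data (18 successes in 40 trials) and the 95% confidence level are assumed for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Assumed example: x = 18 successes in n = 40 trials, 95% confidence.
x, n, C = 18, 40, 0.95
z_crit = NormalDist().inv_cdf(1 - (1 - C) / 2)   # critical z-value z*

# Large-sample one-proportion z interval.
p_hat = x / n
me = z_crit * sqrt(p_hat * (1 - p_hat) / n)
z_interval = (p_hat - me, p_hat + me)

# Plus-four interval: add 2 successes and 2 failures before estimating.
p_tilde = (x + 2) / (n + 4)
me_plus4 = z_crit * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
plus_four_interval = (p_tilde - me_plus4, p_tilde + me_plus4)

print(z_interval, plus_four_interval)
```

The plus-four estimate pulls \(\hat{p}\) slightly toward 0.5, which matters most for small samples or proportions near 0 or 1.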

#400 11–12

Two-sample z interval for means

\((\bar{x}_1-\bar{x}_2)\pm z^*\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}\)

Variables\(\bar{x}\)sample mean or average of x-values\(\sigma\)population standard deviation\(x\)data value, outcome, or input value\(z\)standard score or normal critical value\(z^*\)critical z-value

Known SDs.

#401 11–12

Two-sample t interval for means

\((\bar{x}_1-\bar{x}_2)\pm t^*\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}\)

Variables\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value\(t\)t statistic or time period\(t^*\)critical t-value

Unknown SDs.

#402 11–12

Welch-Satterthwaite degrees of freedom

\(df=\frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1}+\frac{(s_2^2/n_2)^2}{n_2-1}}\)

Variables\(df\)Welch-Satterthwaite degrees of freedom

Approximate \(df\).
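The Welch-Satterthwaite expression is easy to mistype by hand, so a small helper can be useful. The function name and the sample values below are assumptions for illustration.

```python
def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the unpooled two-sample t procedures."""
    v1 = s1 ** 2 / n1                 # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2                 # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Assumed example: unequal SDs and unequal sample sizes.
df_unequal = welch_df(s1=3, n1=12, s2=5, n2=15)

# With equal SDs and equal sizes the result matches the pooled df, n1 + n2 - 2.
df_equal = welch_df(s1=2, n1=10, s2=2, n2=10)

print(df_unequal, df_equal)
```

The result is usually not a whole number and lies between \(\min(n_1,n_2)-1\) and \(n_1+n_2-2\).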

#403 11–12

Pooled two-sample t interval

\((\bar{x}_1-\bar{x}_2)\pm t^*s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}\)

Variables\(\bar{x}\)sample mean or average of x-values\(s_p\)pooled sample standard deviation\(x\)data value, outcome, or input value\(t\)t statistic or time period\(t^*\)critical t-value

Assumes equal variances.

#404 11–12

Pooled standard deviation

\(s_p=\sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}}\)

Variables\(s_p\)Pooled standard deviation

Equal-variance methods.

#405 11–12

Pooled two-sample t degrees of freedom

\(df=n_1+n_2-2\)

Variables\(df\)Pooled two-sample t degrees of freedom

Equal-variance method.

#406 11–12

Paired t interval

\(\bar{d}\pm t^*\frac{s_d}{\sqrt{n}}\)

Variables\(\bar{d}\)mean of paired differences\(s_d\)standard deviation of paired differences\(n\)number of pairs\(t^*\)critical t-value

Differences within pairs.

#407 11–12

Paired t degrees of freedom

\(df=n-1\)

Variables\(df\)Paired t degrees of freedom\(n\)sample size, number of observations, or number of trials

Number of pairs minus 1.

#408 11–12

Two-proportion z interval

\((\hat{p}_1-\hat{p}_2)\pm z^*\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\)

Variables\(\hat{p}\)sample proportion\(p\)probability, population proportion, or success probability\(z\)standard score or normal critical value\(z^*\)critical z-value

Independent samples.

#409 11–12

Confidence interval for variance

\(\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2}},\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right)\)

Variables\(s\)sample standard deviation\(s^2\)sample variance\(n\)sample size, number of observations, or number of trials\(\alpha\)significance level or distribution parameter, depending on context\(\chi^2\)chi-square statistic or critical value

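For normal population. Here \(\chi^2_{\alpha/2}\) and \(\chi^2_{1-\alpha/2}\) are upper-tail critical values with \(n-1\) degrees of freedom.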

#410 11–12

Confidence interval for standard deviation

\(\left(\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}},\sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}\right)\)

Square-root variance interval.

#411 11–12

Confidence interval for correlation, Fisher z

\(z_r=\frac{1}{2}\ln\left(\frac{1+r}{1-r}\right)\)

Variables\(z_r\)Confidence interval for correlation, Fisher z\(r\)correlation coefficient, rate, rank, or period count named by the formula

Transform \(r\).

#412 11–12

SE for Fisher z

\(SE_{z_r}=\frac{1}{\sqrt{n-3}}\)

Variables\(SE_{z_r}\)SE for Fisher z\(n\)sample size, number of observations, or number of trials

Correlation interval.

#413 11–12

Fisher z interval

\(z_r\pm z^*\frac{1}{\sqrt{n-3}}\)

Variables\(n\)sample size, number of observations, or number of trials\(r\)correlation coefficient, rate, rank, or period count named by the formula\(z\)standard score or normal critical value\(z^*\)critical z-value

Convert back to \(r\).

#414 11–12

Back-transform Fisher z

\(r=\frac{e^{2z}-1}{e^{2z}+1}\)

Variables\(r\)Back-transform Fisher z\(z\)standard score or normal critical value

From \(z\) scale to correlation.
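The four Fisher z steps (transform, standard error, interval, back-transform) can be chained in one function. The function name and the example \(r=0.6\), \(n=28\) are assumptions for illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def fisher_z(r):
    """Fisher transform of a correlation r."""
    return 0.5 * log((1 + r) / (1 - r))

def back_transform(z):
    """Inverse Fisher transform, from the z scale back to a correlation."""
    return (exp(2 * z) - 1) / (exp(2 * z) + 1)

def correlation_interval(r, n, confidence=0.95):
    """Approximate confidence interval for a correlation via Fisher z."""
    z_r = fisher_z(r)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit / sqrt(n - 3)             # z* times SE = 1 / sqrt(n - 3)
    return back_transform(z_r - half_width), back_transform(z_r + half_width)

# Assumed example: r = 0.6 from n = 28 pairs.
lo, hi = correlation_interval(0.6, 28)
print(lo, hi)
```

The interval is symmetric on the z scale but not around \(r\), and it always stays inside \((-1,1)\).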

#415 11–12

Bootstrap percentile interval

\([Q_{\alpha/2},Q_{1-\alpha/2}]\)

Variables\(\alpha\)significance level or distribution parameter, depending on context

Approximate simulation-based interval.
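A minimal sketch of a bootstrap percentile interval for a mean. The data, the number of resamples, the fixed seed, and the simple sorted-index rule for picking \(Q_{\alpha/2}\) and \(Q_{1-\alpha/2}\) are all assumptions. Software packages may use slightly different quantile conventions.

```python
import random
from statistics import mean

def bootstrap_percentile_interval(data, stat=mean, B=2000, alpha=0.05, seed=0):
    """Percentile interval [Q_(alpha/2), Q_(1 - alpha/2)] from B bootstrap resamples."""
    rng = random.Random(seed)                     # fixed seed for repeatable results
    n = len(data)
    # Resample with replacement, same size as the original sample.
    stats = sorted(stat(rng.choices(data, k=n)) for _ in range(B))
    lower = stats[int((alpha / 2) * B)]
    upper = stats[int((1 - alpha / 2) * B) - 1]
    return lower, upper

# Assumed example data.
data = [12, 15, 9, 20, 14, 17, 11, 13, 16, 18]
lo, hi = bootstrap_percentile_interval(data)
print(lo, hi)
```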

#416 11–12

Critical value confidence relation

\(\alpha=1-C\)

Variables\(\alpha\)Critical value confidence relation\(C\)confidence level written as a decimal

\(C\)=confidence level.

#417 11–12

Two-sided z critical probability

\(P(-z^*<Z<z^*)=C\)

Variables\(P(-z^*<Z<z^*)\)Two-sided z critical probability\(Z\)standard normal random variable\(C\)confidence level\(z^*\)critical z-value

Standard normal.

#418 11–12

One-sided z critical probability

\(P(Z<z^*)=C\)

Variables\(P(Z<z^*)\)One-sided z critical probability\(Z\)standard normal random variable\(C\)confidence level\(z^*\)critical z-value

Upper one-sided interval context.

Unit 13

Hypothesis Testing and Test Statistics

41 / 41 formulas

#419 9–10

Generic test statistic

\(\text{test statistic}=\frac{\text{statistic}-\text{null value}}{\text{standard error}}\)

Variables\(\text{test statistic}\)Generic test statistic\(\text{statistic}\)the named quantity shown in the formula\(\text{null value}\)the named quantity shown in the formula\(\text{standard error}\)the named quantity shown in the formula

Core idea.

#420 9–10

p-value, right-tailed

\(p=P(T\ge t_{\text{obs}}\mid H_0)\)

Variables\(p\)p-value, right-tailed\(T\)test statistic\(t_{\text{obs}}\)observed value of the test statistic\(H_0\)null hypothesis\(\mid\)given condition in conditional probability

Right-tail test.

#421 9–10

p-value, left-tailed

\(p=P(T\le t_{\text{obs}}\mid H_0)\)

Variables\(p\)p-value, left-tailed\(T\)test statistic\(t_{\text{obs}}\)observed value of the test statistic\(H_0\)null hypothesis\(\mid\)given condition in conditional probability

Left-tail test.

#422 9–10

p-value, two-tailed symmetric

\(p=2P(T\ge\lvert t_{\text{obs}}\rvert\mid H_0)\)

Variables\(p\)p-value, two-tailed symmetric\(T\)test statistic\(t_{\text{obs}}\)observed value of the test statistic\(H_0\)null hypothesis\(\mid\)given condition in conditional probability

For symmetric test statistics.

#423 9–10

Type I error probability

\(\alpha=P(\text{reject }H_0\mid H_0\text{ true})\)

Variables\(\alpha\)Type I error probability\(\text{reject}\)the named quantity shown in the formula\(\text{true}\)the named quantity shown in the formula\(\mid\)given condition in conditional probability

Significance level.

#424 9–10

Type II error probability

\(\beta=P(\text{fail to reject }H_0\mid H_0\text{ false})\)

Variables: \(\beta\) Type II error probability; \(H_0\) null hypothesis; \(\mid\) "given", the condition in a conditional probability.

Missed detection.

#425 9–10

Power

\(\text{Power}=1-\beta\)

Variables: \(\text{Power}\) probability of rejecting \(H_0\) when it is false; \(\beta\) Type II error probability.

Probability of correctly rejecting false \(H_0\).

#426 11–12

One-sample z test for mean

\(z=\frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}\)

Variables\(z\)One-sample z test for mean\(\bar{x}\)sample mean or average of x-values\(\mu\)population mean\(\mu_0\)hypothesized population mean\(\sigma\)population standard deviation\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

Known \(\sigma\).
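As a rough illustration of the one-sample z statistic above combined with the two-tailed p-value formula, here is a minimal Python sketch. The function name `one_sample_z` and the sample numbers are invented for the example. It assumes the population standard deviation is known and uses the standard library's `statistics.NormalDist` for the normal tail area.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(x_bar, mu_0, sigma, n):
    """Return (z, two-tailed p-value) for a one-sample z test with known sigma."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    # Two-tailed p-value: twice the upper-tail area beyond |z|.
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

# Hypothetical data: sample mean 52 from n = 64, testing mu_0 = 50 with sigma = 8.
z, p = one_sample_z(x_bar=52.0, mu_0=50.0, sigma=8.0, n=64)
print(round(z, 3), round(p, 4))  # 2.0 0.0455
```

For a one-tailed test, only the relevant tail area would be used instead of doubling.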

#427 11–12

One-sample t test for mean

\(t=\frac{\bar{x}-\mu_0}{s/\sqrt{n}}\)

Variables: \(t\) one-sample t statistic; \(\bar{x}\) sample mean; \(\mu_0\) hypothesized population mean; \(s\) sample standard deviation; \(n\) sample size.

Unknown \(\sigma\).

#428 11–12

One-sample t degrees of freedom

\(df=n-1\)

Variables\(df\)One-sample t degrees of freedom\(n\)sample size, number of observations, or number of trials

One mean.

#429 11–12

One-proportion z test

\(z=\frac{\hat{p}-p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}\)

Variables: \(z\) one-proportion z statistic; \(\hat{p}\) sample proportion; \(p_0\) hypothesized population proportion; \(n\) sample size.

Use null value in SE.

#430 11–12

Two-sample z test for means

\(z=\frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)_0}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}\)

Variables: \(z\) two-sample z statistic; \(\bar{x}_1,\bar{x}_2\) sample means; \((\mu_1-\mu_2)_0\) hypothesized difference in population means (often 0); \(\sigma_1,\sigma_2\) population standard deviations; \(n_1,n_2\) sample sizes.

Known SDs.

#431 11–12

Two-sample t test for means

\(t=\frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)_0}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\)

Variables: \(t\) two-sample t statistic; \(\bar{x}_1,\bar{x}_2\) sample means; \((\mu_1-\mu_2)_0\) hypothesized difference in population means (often 0); \(s_1,s_2\) sample standard deviations; \(n_1,n_2\) sample sizes.

Welch version.

#432 11–12

Pooled two-sample t test

\(t=\frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)_0}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}\)

Variables: \(t\) pooled two-sample t statistic; \(\bar{x}_1,\bar{x}_2\) sample means; \((\mu_1-\mu_2)_0\) hypothesized difference in population means (often 0); \(s_p\) pooled sample standard deviation; \(n_1,n_2\) sample sizes.

Equal variances.

#433 11–12

Paired t test

\(t=\frac{\bar{d}-\mu_{d,0}}{s_d/\sqrt{n}}\)

Variables: \(t\) paired t statistic; \(\bar{d}\) mean of paired differences; \(\mu_{d,0}\) hypothesized mean difference (often 0); \(s_d\) standard deviation of paired differences; \(n\) number of pairs.

Matched pairs.

#434 11–12

Two-proportion z test

\(z=\frac{(\hat{p}_1-\hat{p}_2)-(p_1-p_2)_0}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}\)

Variables: \(z\) two-proportion z statistic; \(\hat{p}_1,\hat{p}_2\) sample proportions; \((p_1-p_2)_0\) hypothesized difference in population proportions (0 under \(H_0:p_1=p_2\)); \(\hat{p}_c\) pooled (combined) sample proportion; \(n_1,n_2\) sample sizes.

Uses pooled estimate for \(H_0:p_1=p_2\).
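The pooled statistic above can be sketched in Python as follows. This is only an illustration: the function name and counts are hypothetical, the hypothesized difference is taken as 0, and the pooled proportion is assumed to be total successes divided by total sample size, which the entry itself does not spell out.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion in the SE."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Assumption: pooled estimate = total successes / total sample size.
    p_c = (x1 + x2) / (n1 + n2)
    se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Hypothetical counts: 45 of 100 successes versus 30 of 100.
z = two_proportion_z(x1=45, n1=100, x2=30, n2=100)
print(round(z, 3))  # 2.191
```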

#435 11–12

Chi-square goodness-of-fit statistic

\(\chi^2=\sum\frac{(O_i-E_i)^2}{E_i}\)

Variables\(\chi^2\)Chi-square goodness-of-fit statistic\(O_i,E_i\)observed and expected counts

Categorical distribution test.

#436 11–12

Goodness-of-fit degrees of freedom

\(df=k-1-m\)

Variables: \(df\) goodness-of-fit degrees of freedom; \(k\) number of categories; \(m\) number of parameters estimated from the data.

\(m\)=estimated parameters.

#437 11–12

Expected count in goodness-of-fit

\(E_i=np_i\)

Variables\(E_i\)Expected count in goodness-of-fit\(p_i\)probability or proportion for category i\(O_i,E_i\)observed and expected counts

Expected count under \(H_0\).

#438 11–12

Chi-square independence statistic

\(\chi^2=\sum\frac{(O_{ij}-E_{ij})^2}{E_{ij}}\)

Variables\(\chi^2\)Chi-square independence statistic\(O_{ij},E_{ij}\)observed and expected counts in row i, column j

Two-way table.

#439 11–12

Expected cell count

\(E_{ij}=\frac{(\text{row total})(\text{column total})}{\text{grand total}}\)

Variables\(E_{ij}\)Expected cell count\(\text{row total}\)sum of counts in that row\(\text{column total}\)sum of counts in that column\(\text{grand total}\)sum of all table counts\(O_{ij},E_{ij}\)observed and expected counts in row i, column j

Independence test.

#440 11–12

Independence test degrees of freedom

\(df=(r-1)(c-1)\)

Variables: \(df\) independence test degrees of freedom; \(r\) number of rows; \(c\) number of columns.

\(r\) rows, \(c\) columns.
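A small Python sketch may help tie together the expected cell count, the chi-square independence statistic, and its degrees of freedom. The function name and the 2×2 table are made up for illustration, and the sketch assumes every expected count is positive.

```python
def chi_square_independence(observed):
    """Return (expected counts, chi-square statistic, df) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # E_ij = (row total)(column total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, df

# Hypothetical observed counts.
expected, chi2, df = chi_square_independence([[10, 20], [30, 40]])
print(expected)             # [[12.0, 18.0], [28.0, 42.0]]
print(round(chi2, 4), df)   # 0.7937 1
```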

#441 11–12

Chi-square homogeneity degrees of freedom

\(df=(r-1)(c-1)\)

Variables: \(df\) chi-square homogeneity degrees of freedom; \(r\) number of rows; \(c\) number of columns.

Same as independence.

#442 11–12

One-way ANOVA F statistic

\(F=\frac{MSB}{MSW}\)

Variables\(F\)One-way ANOVA F statistic\(SSB,SSW,MSB,MSW\)ANOVA sums of squares and mean squares

Between-group over within-group variation.

#443 11–12

Between-group sum of squares

\(SSB=\sum n_j(\bar{x}_j-\bar{x}_{..})^2\)

Variables: \(SSB\) between-group sum of squares; \(n_j\) number of observations in group \(j\); \(\bar{x}_j\) mean of group \(j\); \(\bar{x}_{..}\) grand mean of all observations.

ANOVA.

#444 11–12

Within-group sum of squares

\(SSW=\sum\sum(x_{ij}-\bar{x}_j)^2\)

Variables: \(SSW\) within-group sum of squares; \(x_{ij}\) observation \(i\) in group \(j\); \(\bar{x}_j\) mean of group \(j\).

ANOVA.

#445 11–12

Total sum of squares ANOVA

\(SST=\sum\sum(x_{ij}-\bar{x}_{..})^2\)

Variables: \(SST\) total sum of squares; \(x_{ij}\) observation \(i\) in group \(j\); \(\bar{x}_{..}\) grand mean of all observations.

ANOVA total variation.

#446 11–12

ANOVA decomposition

\(SST=SSB+SSW\)

Variables: \(SST\) total sum of squares; \(SSB\) between-group sum of squares; \(SSW\) within-group sum of squares.

One-way ANOVA.

#447 11–12

Between-group mean square

\(MSB=\frac{SSB}{k-1}\)

Variables: \(MSB\) between-group mean square; \(SSB\) between-group sum of squares; \(k\) number of groups.

\(k\)=number of groups.

#448 11–12

Within-group mean square

\(MSW=\frac{SSW}{N-k}\)

Variables: \(MSW\) within-group mean square; \(SSW\) within-group sum of squares; \(N\) total number of observations; \(k\) number of groups.

\(N\)=total observations.

#449 11–12

ANOVA degrees of freedom between

\(df_B=k-1\)

Variables: \(df_B\) between-group degrees of freedom; \(k\) number of groups.

Between groups.

#450 11–12

ANOVA degrees of freedom within

\(df_W=N-k\)

Variables: \(df_W\) within-group degrees of freedom; \(N\) total number of observations; \(k\) number of groups.

Within groups.
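The one-way ANOVA entries above (sums of squares, mean squares, and the F statistic) fit together as in the following Python sketch. The function name and the three small groups are hypothetical; the sketch simply follows the formulas as listed and also lets you check \(SST=SSB+SSW\) numerically.

```python
def one_way_anova(groups):
    """Return (SSB, SSW, SST, F) for a list of groups, each a list of numbers."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)          # N = total observations
    k = len(groups)                    # k = number of groups
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    sst = sum((x - grand_mean) ** 2 for x in all_values)

    msb = ssb / (k - 1)                # df between = k - 1
    msw = ssw / (n_total - k)          # df within = N - k
    return ssb, ssw, sst, msb / msw

# Hypothetical data: three groups of three observations.
ssb, ssw, sst, f_stat = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(ssb, 6), round(ssw, 6), round(sst, 6), round(f_stat, 6))
# 26.0 6.0 32.0 13.0
```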

#451 11–12

F test for two variances

\(F=\frac{s_1^2}{s_2^2}\)

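Variables: \(F\) variance-ratio F statistic; \(s_1^2,s_2^2\) sample variances of groups 1 and 2.

The larger sample variance is often placed in the numerator so that \(F\ge1\).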

#452 11–12

Degrees of freedom for variance ratio

\(df_1=n_1-1,\quad df_2=n_2-1\)

Variables: \(df_1,df_2\) numerator and denominator degrees of freedom for the variance ratio; \(n_1,n_2\) sample sizes.

F test.

#453 11–12

Test statistic for correlation

\(t=\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\)

Variables: \(t\) test statistic for correlation; \(r\) sample correlation coefficient; \(n\) number of paired observations; \(\rho\) population correlation coefficient.

Test \(H_0:\rho=0\).

#454 11–12

Correlation test degrees of freedom

\(df=n-2\)

Variables\(df\)Correlation test degrees of freedom\(n\)sample size, number of observations, or number of trials

Pearson correlation.

#455 11–12

Regression slope test statistic

\(t=\frac{b-\beta_0}{SE_b}\)

Variables: \(t\) regression slope test statistic; \(b\) sample slope; \(\beta_0\) hypothesized slope value (often 0), not the intercept; \(SE_b\) standard error of the slope.

Simple linear regression.

#456 11–12

Regression slope test degrees of freedom

\(df=n-2\)

Variables\(df\)Regression slope test degrees of freedom\(n\)sample size, number of observations, or number of trials

Simple linear regression.

#457 11–12

Sign test probability

\(P(X=k)=\binom{n}{k}(0.5)^n\)

Variables: \(P(X=k)\) sign test probability; \(X\) number of plus signs; \(n\) number of signed (non-tied) observations; \(k\) a possible count of plus signs.

Under median null, no ties.

#458 11–12

McNemar test statistic

\(\chi^2=\frac{(b-c)^2}{b+c}\)

Variables: \(\chi^2\) McNemar test statistic; \(b,c\) counts of the two kinds of discordant pairs in the paired 2×2 table.

Paired categorical data.

#459 11–12

Continuity-corrected McNemar

\(\chi^2=\frac{(\lvert b-c\rvert-1)^2}{b+c}\)

Variables: \(\chi^2\) continuity-corrected McNemar statistic; \(b,c\) counts of the two kinds of discordant pairs in the paired 2×2 table.

Approximate correction.
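Both McNemar forms above can be computed with one short helper. This is a sketch under the usual reading that \(b\) and \(c\) are the discordant-pair counts of a paired 2×2 table; the function name and counts are hypothetical, and the sketch assumes \(b+c>0\).

```python
def mcnemar(b, c, continuity=False):
    """McNemar chi-square from the two discordant cell counts b and c."""
    diff = abs(b - c)
    if continuity:
        diff -= 1          # continuity correction: (|b - c| - 1)^2
    return diff ** 2 / (b + c)

# Hypothetical discordant counts.
plain = mcnemar(15, 5)                        # (15 - 5)^2 / 20
corrected = mcnemar(15, 5, continuity=True)   # (10 - 1)^2 / 20
print(plain, round(corrected, 2))  # 5.0 4.05
```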

Unit 14

Categorical Data, Rates, Risk, and Epidemiology-Style Statistics

24 / 24 formulas

#460 6–8

Rate

\(\text{Rate}=\frac{\text{number of events}}{\text{exposure amount}}\)

Variables\(\text{Rate}\)Rate\(\text{number of events}\)the named quantity shown in the formula\(\text{exposure amount}\)the named quantity shown in the formula

General rate formula.

#461 6–8

Percentage rate

\(\text{Rate \%}=\frac{\text{part}}{\text{whole}}\times100\%\)

Variables\(\text{Rate \%}\)Percentage rate\(\text{part}\)selected amount or subgroup\(\text{whole}\)total amount or full group

Basic categorical percentage.

#462 6–8

Incidence proportion

\(IP=\frac{\text{new cases}}{\text{population at risk}}\)

Variables\(IP\)Incidence proportion\(\text{new cases}\)the named quantity shown in the formula\(\text{population at risk}\)the named quantity shown in the formula

Applied school statistics.

#463 6–8

Prevalence

\(P=\frac{\text{existing cases}}{\text{population}}\)

Variables\(P\)Prevalence\(\text{existing cases}\)the named quantity shown in the formula\(\text{population}\)the named quantity shown in the formula

Applied school statistics.

#464 9–10

Sensitivity

\(\text{Sensitivity}=\frac{TP}{TP+FN}\)

Variables\(\text{Sensitivity}\)Sensitivity\(TP,TN,FP,FN\)true positives, true negatives, false positives, and false negatives

True positive rate.

#465 9–10

Specificity

\(\text{Specificity}=\frac{TN}{TN+FP}\)

Variables\(\text{Specificity}\)Specificity\(TP,TN,FP,FN\)true positives, true negatives, false positives, and false negatives

True negative rate.

#466 9–10

False positive rate

\(FPR=\frac{FP}{FP+TN}\)

Variables\(FPR\)False positive rate\(TP,TN,FP,FN\)true positives, true negatives, false positives, and false negatives

Equals \(1-\text{specificity}\).

#467 9–10

False negative rate

\(FNR=\frac{FN}{FN+TP}\)

Variables\(FNR\)False negative rate\(TP,TN,FP,FN\)true positives, true negatives, false positives, and false negatives

Equals \(1-\text{sensitivity}\).

#468 9–10

Positive predictive value

\(PPV=\frac{TP}{TP+FP}\)

Variables\(PPV\)Positive predictive value\(TP,TN,FP,FN\)true positives, true negatives, false positives, and false negatives\(PPV,NPV\)positive and negative predictive values

Precision.

#469 9–10

Negative predictive value

\(NPV=\frac{TN}{TN+FN}\)

Variables\(NPV\)Negative predictive value\(TP,TN,FP,FN\)true positives, true negatives, false positives, and false negatives\(PPV,NPV\)positive and negative predictive values

Negative prediction accuracy.

#470 9–10

Accuracy

\(\text{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN}\)

Variables\(\text{Accuracy}\)Accuracy\(TP,TN,FP,FN\)true positives, true negatives, false positives, and false negatives

Overall correct rate.

#471 9–10

Error rate

\(\text{Error rate}=\frac{FP+FN}{TP+TN+FP+FN}\)

Variables\(\text{Error rate}\)Error rate\(TP,TN,FP,FN\)true positives, true negatives, false positives, and false negatives

Overall incorrect rate.

#472 9–10

Precision

\(\text{Precision}=\frac{TP}{TP+FP}\)

Variables\(\text{Precision}\)Precision\(TP,TN,FP,FN\)true positives, true negatives, false positives, and false negatives\(PPV,NPV\)positive and negative predictive values

Same as PPV.

#473 9–10

Recall

\(\text{Recall}=\frac{TP}{TP+FN}\)

Variables\(\text{Recall}\)Recall\(TP,TN,FP,FN\)true positives, true negatives, false positives, and false negatives

Same as sensitivity.

#474 9–10

F1 score

\(F_1=\frac{2(\text{precision})(\text{recall})}{\text{precision}+\text{recall}}\)

Variables\(F_1\)F1 score\(\text{precision}\)the named quantity shown in the formula\(\text{recall}\)the named quantity shown in the formula

Classification metric.
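The classification formulas from sensitivity through the F1 score all come from the same four counts, so a single Python sketch can cover them. The function name, dictionary keys, and counts are illustrative only, and the sketch assumes no denominator is zero.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute common 2x2 classification metrics from the four cell counts."""
    sensitivity = tp / (tp + fn)      # recall, true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    ppv = tp / (tp + fp)              # precision
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fpr": 1 - specificity,
        "fnr": 1 - sensitivity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "f1": f1,
    }

# Hypothetical counts.
m = classification_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these counts, sensitivity is 0.8, specificity is 0.6, accuracy is 0.7, and F1 is about 0.727.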

#475 9–10

Risk difference

\(RD=P_1-P_0\)

Variables: \(RD\) risk difference; \(P_1,P_0\) risks (proportions with the outcome) in group 1 and in comparison group 0.

Difference in proportions.

#476 9–10

Relative risk

\(RR=\frac{P_1}{P_0}\)

Variables: \(RR\) relative risk; \(P_1,P_0\) risks (proportions with the outcome) in group 1 and in comparison group 0.

Ratio of risks.

#477 9–10

Odds

\(\text{Odds}=\frac{p}{1-p}\)

Variables\(\text{Odds}\)Odds\(p\)probability, population proportion, or success probability

Probability to odds.

#478 9–10

Probability from odds

\(p=\frac{\text{odds}}{1+\text{odds}}\)

Variables\(p\)Probability from odds\(\text{odds}\)the named quantity shown in the formula

Odds to probability.

#479 11–12

Odds ratio from probabilities

\(OR=\frac{p_1/(1-p_1)}{p_0/(1-p_0)}\)

Variables: \(OR\) odds ratio; \(p_1,p_0\) outcome probabilities in group 1 and in comparison group 0.

Comparing two groups.

#480 11–12

Log odds

\(\text{logit}(p)=\ln\left(\frac{p}{1-p}\right)\)

Variables\(\text{logit}(p)\)Log odds\(\text{logit}\)the named quantity shown in the formula\(p\)probability, population proportion, or success probability

Used in logistic models.

#481 11–12

Logistic probability

\(p=\frac{1}{1+e^{-(a+bx)}}\)

Variables: \(p\) logistic probability; \(a,b\) intercept and slope of the linear predictor; \(x\) predictor value; \(e\) base of the natural logarithm.

Basic logistic model.

#482 11–12

Number needed to treat

\(NNT=\frac{1}{\lvert RD\rvert}\)

Variables\(NNT\)Number needed to treat\(RR,RD,OR,NNT\)relative risk, risk difference, odds ratio, and number needed to treat

Applied inference.
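The risk difference, relative risk, odds ratio, and number needed to treat can all be computed from two group risks, as in this Python sketch. The function name and risks are hypothetical, and treating index 1 as the treatment (or exposed) group and index 0 as the comparison group is an assumption of the example.

```python
def risk_measures(p1, p0):
    """Return (RD, RR, OR, NNT) from the risk in group 1 and the risk in group 0."""
    rd = p1 - p0
    rr = p1 / p0
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
    nnt = 1 / abs(rd)      # undefined when the two risks are equal
    return rd, rr, odds_ratio, nnt

# Hypothetical risks: 10% with treatment, 20% without.
rd, rr, odds_ratio, nnt = risk_measures(p1=0.10, p0=0.20)
print(round(rd, 3), round(rr, 3), round(odds_ratio, 3), round(nnt, 1))
# -0.1 0.5 0.444 10.0
```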

#483 11–12

Prevalence odds

\(\text{Prevalence odds}=\frac{P}{1-P}\)

Variables: \(\text{Prevalence odds}\) odds of being an existing case; \(P\) prevalence.

Categorical applications.

Unit 15

Time Series, Growth, Index Numbers, and Forecasting

25 / 25 formulas

#484 6–8

Absolute change

\(\Delta x=x_{new}-x_{old}\)

Variables\(\Delta x\)Absolute change\(x\)data value, outcome, or input value

Change in value.

#485 6–8

Relative change

\(\frac{x_{new}-x_{old}}{x_{old}}\)

Variables: \(x_{new}\) new (later) value; \(x_{old}\) old (earlier) value.

Decimal change.

#486 6–8

Percentage increase

\(\frac{x_{new}-x_{old}}{x_{old}}\times100\%\)

Variables: \(x_{new}\) new (later) value; \(x_{old}\) old (earlier) value.

If new value is greater.

#487 6–8

Percentage decrease

\(\frac{x_{old}-x_{new}}{x_{old}}\times100\%\)

Variables: \(x_{new}\) new (later) value; \(x_{old}\) old (earlier) value.

If new value is smaller.

#488 6–8

Simple moving average

\(MA_t=\frac{x_t+x_{t-1}+\cdots+x_{t-k+1}}{k}\)

Variables: \(MA_t\) simple moving average at time \(t\); \(x_t\) time-series value at time \(t\); \(k\) number of periods averaged.

\(k\)-period moving average.

#489 9–10

Centered moving average, even period

\(CMA_t=\frac{MA_t+MA_{t+1}}{2}\)

Variables: \(CMA_t\) centered moving average at time \(t\); \(MA_t,MA_{t+1}\) consecutive moving averages.

Used for even seasonal periods.
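The two moving-average entries above can be sketched in Python as follows. The function names and the series are invented for illustration. List position \(j\) in the moving-average output corresponds to the average of the \(k\) values starting at position \(j\) of the series.

```python
def moving_averages(values, k):
    """k-period moving averages; entry j averages values[j : j + k]."""
    return [sum(values[j:j + k]) / k for j in range(len(values) - k + 1)]

def centered_moving_averages(values, k):
    """For even k, average each pair of consecutive k-period moving averages."""
    ma = moving_averages(values, k)
    return [(a + b) / 2 for a, b in zip(ma, ma[1:])]

# Hypothetical series with a 4-period (even) window.
series = [10, 12, 14, 16, 18, 20]
ma4 = moving_averages(series, 4)
cma4 = centered_moving_averages(series, 4)
print(ma4)   # [13.0, 15.0, 17.0]
print(cma4)  # [14.0, 16.0]
```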

#490 9–10

Growth factor

\(g=\frac{x_{new}}{x_{old}}\)

Variables\(g\)Growth factor

Ratio form.

#491 9–10

Compound annual growth rate

\(CAGR=\left(\frac{V_f}{V_i}\right)^{1/t}-1\)

Variables: \(CAGR\) compound annual growth rate; \(V_i\) initial value; \(V_f\) final value; \(t\) number of years.

Average compound growth.
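A quick numeric check of the CAGR formula, written as a small Python sketch with a hypothetical function name and values:

```python
def cagr(v_initial, v_final, years):
    """Compound annual growth rate as a decimal."""
    return (v_final / v_initial) ** (1 / years) - 1

# Hypothetical values: 1000 grows to 1210 over 2 years.
rate = cagr(1000, 1210, 2)
print(f"{rate:.2%}")  # 10.00%
```

Growing 1000 by 10% twice gives 1100 and then 1210, which matches the inputs.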

#492 9–10

Index number

\(I_t=\frac{x_t}{x_0}\times100\)

Variables\(I_t\)Index number\(x_t,x_0\)time-series value at time t and base-time value

Base period \(0\).

#493 9–10

Price relative

\(R_i=\frac{p_{i,t}}{p_{i,0}}\times100\)

Variables: \(R_i\) price relative for item \(i\); \(p_{i,t}\) price of item \(i\) at time \(t\); \(p_{i,0}\) price of item \(i\) in the base period.

Individual item index.

#494 9–10

Simple aggregate price index

\(P_{01}=\frac{\sum p_1}{\sum p_0}\times100\)

Variables: \(P_{01}\) simple aggregate price index; \(p_0,p_1\) base-period and current-period prices.

Unweighted.

#495 9–10

Weighted aggregate price index

\(P_{01}=\frac{\sum w_ip_{i,1}}{\sum w_ip_{i,0}}\times100\)

Variables: \(P_{01}\) weighted aggregate price index; \(w_i\) weight for item \(i\); \(p_{i,0},p_{i,1}\) base-period and current-period prices of item \(i\).

Weights \(w_i\).

#496 11–12

Laspeyres price index

\(L=\frac{\sum p_1q_0}{\sum p_0q_0}\times100\)

Variables: \(L\) Laspeyres price index; \(p_0,p_1\) base-period and current-period prices; \(q_0\) base-period quantities.

Base quantities.

#497 11–12

Paasche price index

\(P=\frac{\sum p_1q_1}{\sum p_0q_1}\times100\)

Variables: \(P\) Paasche price index; \(p_0,p_1\) base-period and current-period prices; \(q_1\) current-period quantities.

Current quantities.

#498 11–12

Fisher ideal index

\(F=\sqrt{LP}\)

Variables: \(F\) Fisher ideal index; \(L\) Laspeyres price index; \(P\) Paasche price index.

Geometric mean of Laspeyres and Paasche.
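The Laspeyres, Paasche, and Fisher price indexes above can be compared side by side with a short Python sketch. The function names and the two-item basket are hypothetical.

```python
from math import sqrt

def laspeyres(p0, p1, q0):
    """Price index weighted by base-period quantities."""
    return sum(a * q for a, q in zip(p1, q0)) / sum(b * q for b, q in zip(p0, q0)) * 100

def paasche(p0, p1, q1):
    """Price index weighted by current-period quantities."""
    return sum(a * q for a, q in zip(p1, q1)) / sum(b * q for b, q in zip(p0, q1)) * 100

def fisher(l_index, p_index):
    """Geometric mean of the Laspeyres and Paasche indexes."""
    return sqrt(l_index * p_index)

# Hypothetical basket of two items.
p0, p1 = [2, 5], [3, 6]     # base and current prices
q0, q1 = [10, 4], [8, 5]    # base and current quantities
L = laspeyres(p0, p1, q0)
P = paasche(p0, p1, q1)
F = fisher(L, P)
print(round(L, 2), round(P, 2), round(F, 2))  # 135.0 131.71 133.34
```

The Fisher index always lies between the other two because it is their geometric mean.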

#499 11–12

Quantity index Laspeyres

\(Q_L=\frac{\sum q_1p_0}{\sum q_0p_0}\times100\)

Variables: \(Q_L\) Laspeyres quantity index; \(q_0,q_1\) base-period and current-period quantities; \(p_0\) base-period prices.

Base prices.

#500 11–12

Quantity index Paasche

\(Q_P=\frac{\sum q_1p_1}{\sum q_0p_1}\times100\)

Variables: \(Q_P\) Paasche quantity index; \(q_0,q_1\) base-period and current-period quantities; \(p_1\) current-period prices.

Current prices.

#501 11–12

Seasonal index multiplicative

\(SI=\frac{\text{actual value}}{\text{trend value}}\times100\)

Variables\(SI\)Seasonal index multiplicative\(\text{actual value}\)the named quantity shown in the formula\(\text{trend value}\)the named quantity shown in the formula

Seasonality as percentage.

#502 11–12

Deseasonalized value

\(D=\frac{\text{actual value}}{SI/100}\)

Variables\(D\)Deseasonalized value\(\text{actual value}\)the named quantity shown in the formula

Multiplicative model.

#503 11–12

Additive time-series model

\(Y=T+S+C+I\)

Variables: \(Y\) observed time-series value; \(T\) trend component; \(S\) seasonal component; \(C\) cyclical component; \(I\) irregular component.

Trend, seasonal, cyclical, irregular.

#504 11–12

Multiplicative time-series model

\(Y=TSCI\)

Variables: \(Y\) observed time-series value; \(T\) trend component; \(S\) seasonal component; \(C\) cyclical component; \(I\) irregular component.

Product model.

#505 11–12

Exponential smoothing

\(S_t=\alpha x_t+(1-\alpha)S_{t-1}\)

Variables: \(S_t\) smoothed value at time \(t\); \(S_{t-1}\) previous smoothed value; \(x_t\) time-series value at time \(t\); \(\alpha\) smoothing constant.

\(0<\alpha<1\).
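The exponential smoothing recursion needs a starting value, which the formula itself does not specify. The Python sketch below assumes the first smoothed value equals the first observation, a common but not universal choice. The function name and data are hypothetical.

```python
def exponential_smoothing(values, alpha):
    """Return the smoothed series S_t.

    Assumption: the first smoothed value is set to the first observation.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    smoothed = [values[0]]
    for x in values[1:]:
        # S_t = alpha * x_t + (1 - alpha) * S_{t-1}
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

smoothed = exponential_smoothing([10, 20, 30], alpha=0.5)
print(smoothed)  # [10, 15.0, 22.5]
```

A larger \(\alpha\) makes the smoothed series follow recent values more closely.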

#506 11–12

Forecast error

\(e_t=x_t-\hat{x}_t\)

Variables: \(e_t\) forecast error at time \(t\); \(x_t\) actual value at time \(t\); \(\hat{x}_t\) forecast for time \(t\).

Actual minus forecast.

#507 11–12

Mean forecast error

\(MFE=\frac{1}{n}\sum e_t\)

Variables\(MFE\)Mean forecast error\(n\)sample size, number of observations, or number of trials

Bias metric.

#508 11–12

Mean squared forecast error

\(MSFE=\frac{1}{n}\sum e_t^2\)

Variables\(MSFE\)Mean squared forecast error\(n\)sample size, number of observations, or number of trials

Error metric.

Unit 16

Experimental Design, Sampling Design, and Simulation Metrics

18 / 18 formulas

#509 6–8

Sample-to-population fraction

\(\frac{n}{N}\)

Variables\(n\)sample size, number of observations, or number of trials\(N\)total count, population size, or total frequency

Sampling fraction.

#510 6–8

Allocation proportion

\(p_i=\frac{n_i}{n}\)

Variables\(p_i\)Allocation proportion\(n\)sample size, number of observations, or number of trials\(n_i\)count in group or category i

Share assigned to group \(i\).

#511 6–8

Expected group size

\(E(n_i)=np_i\)

Variables\(E(n_i)\)Expected group size\(p_i\)probability or proportion for category i\(n_i\)count in group or category i

Random allocation expectation.

#512 9–10

Stratified sample size

\(n_h=\frac{N_h}{N}n\)

Variables\(n_h\)Stratified sample size\(n\)sample size, number of observations, or number of trials\(N\)total count, population size, or total frequency\(N_h,n_h,S_h,W_h\)stratum population size, sample size, standard deviation, and weight

Proportional allocation.

#513 9–10

Systematic sampling interval

\(k=\frac{N}{n}\)

Variables\(k\)Systematic sampling interval\(n\)sample size, number of observations, or number of trials\(N\)total count, population size, or total frequency

Choose every \(k\)th member.

#514 9–10

Cluster sample total

\(n=\sum n_{\text{clusters selected}}\)

Variables\(n\)Cluster sample total\(\text{clusters selected}\)the named quantity shown in the formula

Total sampled individuals.

#515 9–10

Simulation estimate of probability

\(\hat{p}=\frac{\text{number of successful trials}}{\text{number of simulation trials}}\)

Variables\(\hat{p}\)Simulation estimate of probability\(\text{number of successful trials}\)the named quantity shown in the formula\(\text{number of simulation trials}\)the named quantity shown in the formula\(p\)probability, population proportion, or success probability

Monte Carlo estimate.

#516 9–10

Simulation standard error

\(SE\approx\sqrt{\frac{\hat{p}(1-\hat{p})}{R}}\)

Variables: \(SE\) simulation standard error; \(\hat{p}\) simulation estimate of the probability; \(R\) number of simulation repetitions.

\(R\)=simulation repetitions.
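To illustrate the simulation estimate and its standard error together, here is a minimal Monte Carlo sketch in Python. The event (two dice summing to 7), the function names, and the fixed seed are all choices made for the example. The exact probability is \(1/6\), so the estimate should land close to 0.167.

```python
import random
from math import sqrt

def simulate_probability(trial, repetitions, seed=0):
    """Estimate P(success) and its standard error from repeated random trials."""
    rng = random.Random(seed)      # fixed seed so the run is reproducible
    successes = sum(1 for _ in range(repetitions) if trial(rng))
    p_hat = successes / repetitions
    se = sqrt(p_hat * (1 - p_hat) / repetitions)
    return p_hat, se

def two_dice_sum_to_seven(rng):
    """One simulated trial: roll two fair dice and check whether they sum to 7."""
    return rng.randint(1, 6) + rng.randint(1, 6) == 7

p_hat, se = simulate_probability(two_dice_sum_to_seven, repetitions=10_000)
print(p_hat, se)
```

Quadrupling the number of repetitions roughly halves the standard error.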

#517 9–10

Random assignment balance difference

\(D=\bar{x}_{treatment}-\bar{x}_{control}\)

Variables\(D\)Random assignment balance difference\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value

Compare groups.

#518 11–12

Neyman allocation

\(n_h=n\frac{N_hS_h}{\sum N_hS_h}\)

Variables\(n_h\)Neyman allocation\(n\)sample size, number of observations, or number of trials\(N_h,n_h,S_h,W_h\)stratum population size, sample size, standard deviation, and weight

Advanced stratified allocation.

#519 11–12

Weighted survey mean

\(\bar{x}_w=\frac{\sum w_ix_i}{\sum w_i}\)

Variables\(\bar{x}_w\)Weighted survey mean\(\bar{x}\)sample mean or average of x-values\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(w_i\)weight for value i

Survey weights.

#520 11–12

Post-stratified estimate

\(\hat{\mu}_{post}=\sum_h W_h\bar{x}_h\)

Variables\(\hat{\mu}_{post}\)Post-stratified estimate\(\bar{x}\)sample mean or average of x-values\(\mu\)population mean\(x\)data value, outcome, or input value\(N\)total count, population size, or total frequency\(N_h,n_h,S_h,W_h\)stratum population size, sample size, standard deviation, and weight

\(W_h=N_h/N\).

#521 11–12

Relative bias

\(\text{Relative bias}=\frac{E(\hat{\theta})-\theta}{\theta}\)

Variables: \(\text{Relative bias}\) bias as a fraction of the true value; \(E(\hat{\theta})\) expected value of the estimator; \(\hat{\theta}\) estimator; \(\theta\) population parameter being estimated.

Estimator quality.

#522 11–12

Mean squared error

\(MSE(\hat{\theta})=Var(\hat{\theta})+[Bias(\hat{\theta})]^2\)

Variables: \(MSE(\hat{\theta})\) mean squared error of the estimator; \(Var(\hat{\theta})\) variance of the estimator; \(Bias(\hat{\theta})=E(\hat{\theta})-\theta\) bias of the estimator; \(\theta\) population parameter being estimated.

Estimator quality.

#523 11–12

Randomization p-value

\(p=\frac{\#\{\text{simulated statistics at least as extreme as observed}\}}{\#\{\text{simulations}\}}\)

Variables: \(p\) randomization p-value; \(\#\{\text{simulated statistics at least as extreme as observed}\}\) count of simulated statistics at least as extreme as the observed statistic; \(\#\{\text{simulations}\}\) total number of simulations.

Simulation inference.

#524 11–12

Permutation difference in means

\(D^*=\bar{x}_A^*-\bar{x}_B^*\)

Variables\(D^*\)Permutation difference in means\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value

Randomization distribution.
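The randomization p-value and the permutation difference in means can be combined in one Python sketch. Everything named here is hypothetical, including the data. The sketch uses a two-sided definition of "extreme" (absolute difference), which is one possible choice, and follows the simple proportion formula above without any added correction.

```python
import random

def mean(values):
    return sum(values) / len(values)

def randomization_p_value(group_a, group_b, simulations=5000, seed=0):
    """Two-sided randomization p-value for a difference in means."""
    rng = random.Random(seed)      # fixed seed so the run is reproducible
    d_obs = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(simulations):
        rng.shuffle(pooled)        # random re-assignment of group labels
        d_star = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(d_star) >= abs(d_obs):
            extreme += 1
    return d_obs, extreme / simulations

# Hypothetical measurements for groups A and B.
d_obs, p_value = randomization_p_value([12, 14, 15, 16, 18], [8, 9, 10, 11, 13])
print(round(d_obs, 2), p_value)
```

Here the observed difference is 4.8, and only a small share of shuffles produce a difference that large in either direction.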

#525 11–12

Bootstrap statistic

\(\hat{\theta}^*=\text{statistic from resample}\)

Variables: \(\hat{\theta}^*\) bootstrap statistic, computed from a resample drawn with replacement from the original sample; \(\hat{\theta}\) the same statistic computed from the original sample.

Bootstrap resampling.

#526 11–12

Bootstrap standard error

\(SE_{boot}=SD(\hat{\theta}^*)\)

Variables: \(SE_{boot}\) bootstrap standard error; \(\hat{\theta}^*\) bootstrap statistics from the resamples; \(SD\) standard deviation.

SD of bootstrap statistics.
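A bootstrap standard error can be sketched in a few lines of Python. The function name, the data, the number of resamples, and the seed are all assumptions of the example; the statistic defaults to the mean but any function of a sample could be passed in.

```python
import random
from statistics import mean, stdev

def bootstrap_se(data, statistic=mean, resamples=2000, seed=0):
    """Standard deviation of a statistic over resamples drawn with replacement."""
    rng = random.Random(seed)      # fixed seed so the run is reproducible
    n = len(data)
    boot_stats = [
        statistic(rng.choices(data, k=n))   # one resample of size n
        for _ in range(resamples)
    ]
    return stdev(boot_stats)

# Hypothetical sample.
se_boot = bootstrap_se([2, 4, 4, 5, 7, 9, 10, 12])
print(round(se_boot, 3))
```

For the mean, the result should be in the same range as \(s/\sqrt{n}\), which is about 1.2 for this sample.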

Unit 17

Errors, Accuracy, Rounding, and Measurement Statistics

20 / 20 formulas

#527 3–5

Absolute error

\(AE=\lvert\text{measured}-\text{true}\rvert\)

Variables\(AE\)Absolute error\(\text{measured}\)the named quantity shown in the formula\(\text{true}\)the named quantity shown in the formula\(AE,RE,PE\)absolute, relative, and percentage error

Measurement error.

#528 3–5

Relative error

\(RE=\frac{AE}{\lvert\text{true}\rvert}\)

Variables\(RE\)Relative error\(\text{true}\)the named quantity shown in the formula\(AE,RE,PE\)absolute, relative, and percentage error

Dimensionless error.

#529 3–5

Percentage error

\(PE=\frac{AE}{\lvert\text{true}\rvert}\times100\%\)

Variables\(PE\)Percentage error\(\text{true}\)the named quantity shown in the formula\(AE,RE,PE\)absolute, relative, and percentage error

Percent error.

#530 6–8

Lower bound from rounding

\(LB=x-\frac{u}{2}\)

Variables: \(LB\) lower bound; \(x\) rounded (reported) value; \(u\) rounding unit, such as 1, 0.1, or 10.

Rounded to nearest unit \(u\).

#531 6–8

Upper bound from rounding

\(UB=x+\frac{u}{2}\)

Variables: \(UB\) upper bound; \(x\) rounded (reported) value; \(u\) rounding unit, such as 1, 0.1, or 10.

Rounded to nearest unit \(u\).

#532 6–8

Tolerance interval for measurement

\(x\pm\frac{u}{2}\)

Variables: \(x\) rounded (reported) value; \(u\) rounding unit, such as 1, 0.1, or 10.

Rounded to nearest unit \(u\).

#533 6–8

Mean absolute percentage error

\(MAPE=\frac{100\%}{n}\sum\left\lvert\frac{A_i-F_i}{A_i}\right\rvert\)

Variables\(MAPE\)Mean absolute percentage error\(n\)sample size, number of observations, or number of trials\(MAE,MSE,RMSE,MAPE\)common prediction or error-size metrics\(A_i,F_i\)actual and forecast values for item i

Forecast/approximation error.

#534 6–8

Root mean square deviation

\(RMSD=\sqrt{\frac{\sum(x_i-y_i)^2}{n}}\)

Variables\(RMSD\)Root mean square deviation\(x_i\)ith data value or observation\(y_i\)ith y-value or response observation\(n\)sample size, number of observations, or number of trials

Difference between two series.

#535 9–10

Standard uncertainty of mean

\(u_{\bar{x}}=\frac{s}{\sqrt{n}}\)

Variables\(u_{\bar{x}}\)Standard uncertainty of mean\(\bar{x}\)sample mean or average of x-values\(s\)sample standard deviation\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials\(u,u_x,u_y,u_z\)measurement uncertainty values

Measurement repeated trials.

#536 9–10

Combined uncertainty for sum

\(u_{x+y}=\sqrt{u_x^2+u_y^2}\)

Variables\(u_{x+y}\)Combined uncertainty for sum\(x\)data value, outcome, or input value\(y\)response value or transformed value\(u,u_x,u_y,u_z\)measurement uncertainty values

Independent uncertainties.

#537 9–10

Combined uncertainty for difference

\(u_{x-y}=\sqrt{u_x^2+u_y^2}\)

Variables\(u_{x-y}\)Combined uncertainty for difference\(x\)data value, outcome, or input value\(y\)response value or transformed value\(u,u_x,u_y,u_z\)measurement uncertainty values

Independent uncertainties.

#538 9–10

Relative uncertainty product

\(\left(\frac{u_z}{z}\right)^2=\left(\frac{u_x}{x}\right)^2+\left(\frac{u_y}{y}\right)^2\)

Variables: \(x,y\) measured quantities; \(z\) their product; \(u_x,u_y,u_z\) standard uncertainties of \(x\), \(y\), and \(z\).

For \(z=xy\), independent.

#539 9–10

Relative uncertainty quotient

\(\left(\frac{u_z}{z}\right)^2=\left(\frac{u_x}{x}\right)^2+\left(\frac{u_y}{y}\right)^2\)

For \(z=x/y\), independent.
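The combined-uncertainty rules for sums, differences, products, and quotients can be checked numerically with a short Python sketch. The function names and measurements are hypothetical, and independence of the two uncertainties is assumed, as in the entries above.

```python
from math import sqrt

def combined_uncertainty_sum(u_x, u_y):
    """Uncertainty of x + y or x - y for independent uncertainties."""
    return sqrt(u_x ** 2 + u_y ** 2)

def combined_uncertainty_product(x, u_x, y, u_y):
    """Return (z, u_z) for z = x * y, combining relative uncertainties."""
    z = x * y
    relative = sqrt((u_x / x) ** 2 + (u_y / y) ** 2)
    return z, abs(z) * relative

# Hypothetical measurements: x = 10.0 with u_x = 0.3, y = 5.0 with u_y = 0.4.
u_sum = combined_uncertainty_sum(0.3, 0.4)
z, u_z = combined_uncertainty_product(10.0, 0.3, 5.0, 0.4)
print(round(u_sum, 3), z, round(u_z, 3))  # 0.5 50.0 4.272
```

The same relative-uncertainty step applies to a quotient; only the value of \(z\) changes.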

#540 11–12

Root mean square error

\(RMSE=\sqrt{\frac{1}{n}\sum e_i^2}\)

Variables\(RMSE\)Root mean square error\(n\)sample size, number of observations, or number of trials\(e_i\)residual or error for observation i\(MAE,MSE,RMSE,MAPE\)common prediction or error-size metrics

Prediction or measurement errors.

#541 11–12

Mean squared error

\(MSE=\frac{1}{n}\sum e_i^2\)

Variables\(MSE\)Mean squared error\(n\)sample size, number of observations, or number of trials\(e_i\)residual or error for observation i\(MAE,MSE,RMSE,MAPE\)common prediction or error-size metrics

Average squared error.

#542 11–12

Bias of measurements

\(Bias=\bar{x}-x_{\text{true}}\)

Variables\(Bias\)Bias of measurements\(\text{true}\)the named quantity shown in the formula\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value

Average measurement error.

#543 11–12

Precision as standard deviation

\(\text{Precision measure}=s\)

Variables\(\text{Precision measure}\)Precision as standard deviation\(s\)sample standard deviation

Lower \(s\) means higher precision.

#544 11–12

Signal-to-noise ratio

\(SNR=\frac{\mu}{\sigma}\)

Variables\(SNR\)Signal-to-noise ratio\(\mu\)population mean\(\sigma\)population standard deviation

Basic form.

#545 11–12

Coefficient of repeatability

\(CR=1.96\sqrt{2}s_w\)

Variables\(CR\)Coefficient of repeatability\(s_w\)within-subject or repeated-measurement standard deviation

Repeated-measurement agreement.

#546 11–12

Bland-Altman limit of agreement

\(\bar{d}\pm1.96s_d\)

Variables\(\bar{d}\)mean of paired differences\(s_d\)standard deviation of paired differences

Agreement analysis extension.
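
The limits of agreement can be sketched as follows, assuming paired readings from two measurement methods. The data and variable names are hypothetical.

```python
import statistics

# Hypothetical paired readings from two methods on the same four subjects
method_a = [10.0, 12.0, 14.0, 16.0]
method_b = [9.0, 12.5, 13.0, 16.5]

differences = [a - b for a, b in zip(method_a, method_b)]
d_bar = statistics.mean(differences)    # mean of paired differences
s_d = statistics.stdev(differences)     # sample SD of paired differences

lower_limit = d_bar - 1.96 * s_d
upper_limit = d_bar + 1.96 * s_d

print(round(lower_limit, 3), round(upper_limit, 3))
```

Roughly 95% of differences are expected to fall between the two limits if the differences are approximately normal.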

Unit 18

Advanced Senior Secondary Extensions

29 / 29 formulas

#547 11–12

Moment about origin

\(\mu'_r=E(X^r)\)

Variables\(\mu'_r\)Moment about origin\(X\)random variable\(r\)order of the moment (positive integer)\(\mu'_r,\mu_r\)raw and central moments of order r

Raw moment.

#548 11–12

Central moment

\(\mu_r=E[(X-\mu)^r]\)

Variables\(\mu_r\)Central moment\(\mu\)population mean\(X\)random variable\(r\)order of the moment (positive integer)\(\mu'_r,\mu_r\)raw and central moments of order r

Moment about mean.

#549 11–12

Skewness

\(\gamma_1=\frac{\mu_3}{\sigma^3}\)

Variables\(\gamma_1\)Skewness\(\mu_3\)third central moment\(\sigma\)population standard deviation\(\gamma_1,\gamma_2,\beta_2\)skewness, excess kurtosis, and kurtosis measures

Shape measure.

#550 11–12

Sample skewness simple

\(g_1=\frac{\frac{1}{n}\sum(x_i-\bar{x})^3}{s^3}\)

Variables\(g_1\)Sample skewness simple\(\bar{x}\)sample mean or average of x-values\(s\)sample standard deviation\(x_i\)ith data value or observation\(x\)data value, outcome, or input value\(n\)sample size, number of observations, or number of trials

Common computational form.

#551 11–12

Kurtosis

\(\beta_2=\frac{\mu_4}{\sigma^4}\)

Variables\(\beta_2\)Kurtosis\(\mu_4\)fourth central moment\(\sigma\)population standard deviation\(\gamma_1,\gamma_2,\beta_2\)skewness, excess kurtosis, and kurtosis measures

Shape measure.

#552 11–12

Excess kurtosis

\(\gamma_2=\frac{\mu_4}{\sigma^4}-3\)

Variables\(\gamma_2\)Excess kurtosis\(\mu_4\)fourth central moment\(\sigma\)population standard deviation\(\gamma_1,\gamma_2,\beta_2\)skewness, excess kurtosis, and kurtosis measures

Normal distribution has 0 excess kurtosis.
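
A sketch of the moment-based shape measures, treating a small data set as a complete population (so every moment divides by \(n\)). The helper `central_moment` and the data are hypothetical.

```python
def central_moment(values, r):
    """r-th central moment, dividing by n (population form)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** r for v in values) / len(values)

population = [1, 2, 3, 4, 10]   # hypothetical values with a long right tail

sigma = central_moment(population, 2) ** 0.5
skewness = central_moment(population, 3) / sigma ** 3    # gamma_1
kurtosis = central_moment(population, 4) / sigma ** 4    # beta_2
excess_kurtosis = kurtosis - 3                           # gamma_2

print(round(skewness, 4), round(kurtosis, 4), round(excess_kurtosis, 4))
```

The positive skewness reflects the single large value pulling the right tail out.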

#553 11–12

Moment generating function

\(M_X(t)=E(e^{tX})\)

Variables\(M_X(t)\)Moment generating function\(X\)random variable\(t\)real-valued argument of the function (not a t statistic)\(M_X(t),G_X(s)\)moment-generating and probability-generating functions

Advanced extension.

#554 11–12

Mean from MGF

\(E(X)=M'_X(0)\)

Variables\(E(X)\)Mean from MGF\(X,Y,Z\)random variables or standardized variables used in the formula

If MGF exists.

#555 11–12

Second moment from MGF

\(E(X^2)=M''_X(0)\)

Variables\(E(X^2)\)Second moment from MGF\(X,Y,Z\)random variables or standardized variables used in the formula

If MGF exists.

#556 11–12

Probability generating function

\(G_X(s)=E(s^X)\)

Variables\(G_X(s)\)Probability generating function\(s\)argument of the function (not a standard deviation)\(X\)random variable taking nonnegative integer values\(M_X(t),G_X(s)\)moment-generating and probability-generating functions

Discrete nonnegative integer variables.

#557 11–12

Mean from PGF

\(E(X)=G'_X(1)\)

Variables\(E(X)\)Mean from PGF\(X,Y,Z\)random variables or standardized variables used in the formula

If PGF exists.

#558 11–12

Variance from PGF

\(Var(X)=G''_X(1)+G'_X(1)-[G'_X(1)]^2\)

Variables\(Var(X)\)Variance from PGF\(X,Y,Z\)random variables or standardized variables used in the formula

If PGF exists.
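
For a discrete variable with probabilities \(p_k\), \(G'_X(1)=\sum k\,p_k\) and \(G''_X(1)=\sum k(k-1)\,p_k\). The sketch below checks the mean and variance formulas for a fair six-sided die, an example chosen here for illustration, using exact fractions.

```python
from fractions import Fraction

# PMF of a fair six-sided die (illustrative choice)
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

g1 = sum(k * p for k, p in pmf.items())             # G'(1)
g2 = sum(k * (k - 1) * p for k, p in pmf.items())   # G''(1)

mean = g1
variance = g2 + g1 - g1 ** 2

print(mean, variance)
```

The results agree with the familiar die values \(E(X)=7/2\) and \(Var(X)=35/12\).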

#559 11–12

Law of total expectation

\(E(X)=E(E(X\mid Y))\)

Variables\(E(X)\)Law of total expectation\(X,Y,Z\)random variables or standardized variables used in the formula\(\mid\)given condition in conditional probability

Advanced probability.

#560 11–12

Law of total variance

\(Var(X)=E(Var(X\mid Y))+Var(E(X\mid Y))\)

Variables\(Var(X)\)Law of total variance\(X,Y,Z\)random variables or standardized variables used in the formula\(\mid\)given condition in conditional probability

Advanced probability.
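
Both laws can be checked on a small two-stage model. The sketch assumes a hypothetical setup: group \(Y\) is A or B with probability 1/2 each, and \(X\) is equally likely to be any listed value within its group.

```python
from fractions import Fraction

p_y = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
x_given_y = {"A": [1, 3], "B": [4, 8]}   # equally likely values within each group

def mean(values):
    return Fraction(sum(values), len(values))

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

cond_mean = {y: mean(xs) for y, xs in x_given_y.items()}
cond_var = {y: variance(xs) for y, xs in x_given_y.items()}

e_x = sum(p_y[y] * cond_mean[y] for y in p_y)                  # E(E(X|Y))
e_var = sum(p_y[y] * cond_var[y] for y in p_y)                 # E(Var(X|Y))
var_e = sum(p_y[y] * (cond_mean[y] - e_x) ** 2 for y in p_y)   # Var(E(X|Y))
total_var = e_var + var_e

# Direct check: with equal group weights and sizes, all four values are equally likely.
all_values = [1, 3, 4, 8]
print(e_x, total_var, mean(all_values), variance(all_values))
```

The within-group part (5/2) and the between-group part (4) add up to the overall variance (13/2).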

#561 11–12

Markov inequality

\(P(X\ge a)\le\frac{E(X)}{a}\)

Variables\(X\)nonnegative random variable\(a\)positive constant threshold

For nonnegative \(X\) and \(a>0\).

#562 11–12

Normal approximation to binomial

\(X\sim B(n,p)\approx N(np,np(1-p))\)

Variables\(X\sim B(n,p)\)Normal approximation to binomial\(X\)binomial random variable\(n\)number of trials\(p\)success probability\(B(n,p)\)binomial distribution\(N(np,np(1-p))\)normal distribution with mean \(np\) and variance \(np(1-p)\)

When \(np\) and \(n(1-p)\) are both large enough; a common rule of thumb requires each to be at least 5 or 10, depending on the syllabus.

#563 11–12

Continuity correction lower

\(P(X\ge k)\approx P(Y>k-0.5)\)

Variables\(P(X\ge k)\)Continuity correction lower\(X\)discrete (for example binomial) random variable\(Y\)approximating normal random variable\(k\)integer cutoff value

For normal approximation.

#564 11–12

Continuity correction upper

\(P(X\le k)\approx P(Y<k+0.5)\)

Variables\(P(X\le k)\)Continuity correction upper\(X\)discrete (for example binomial) random variable\(Y\)approximating normal random variable\(k\)integer cutoff value

For normal approximation.
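
The effect of the continuity correction can be seen by comparing an exact binomial probability with its normal approximation. This is a sketch using only the standard library; the choice \(n=20\), \(p=0.5\), \(k=12\) is arbitrary.

```python
import math

def normal_cdf(x, mean, sd):
    """P(Y < x) for a normal variable, via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, p, k = 20, 0.5, 12
mean = n * p
sd = math.sqrt(n * p * (1 - p))

exact = binomial_cdf(k, n, p)                # P(X <= 12)
corrected = normal_cdf(k + 0.5, mean, sd)    # P(Y < 12.5)
uncorrected = normal_cdf(k, mean, sd)        # P(Y < 12), no correction

print(round(exact, 4), round(corrected, 4), round(uncorrected, 4))
```

In this example the corrected value lands much closer to the exact probability than the uncorrected one.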

#565 11–12

Poisson approximation to binomial

\(B(n,p)\approx Pois(np)\)

Variables\(B(n,p)\)binomial distribution\(Pois(np)\)Poisson distribution with mean \(np\)\(n\)number of trials\(p\)success probability

When \(n\) is large and \(p\) is small.
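
A rough numerical check of the approximation, assuming \(n=1000\) and \(p=0.002\) (so \(np=2\)). These values and the helper names are illustrative only.

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 1000, 0.002
lam = n * p

# Largest gap between the two probability functions for k = 0, ..., 10
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))

print(round(binomial_pmf(0, n, p), 4), round(poisson_pmf(0, lam), 4))
```

For these values the two probability functions differ only slightly at every \(k\) checked.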

#566 11–12

Normal approximation to Poisson

\(Pois(\lambda)\approx N(\lambda,\lambda)\)

Variables\(Pois(\lambda)\)Poisson distribution with mean \(\lambda\)\(N(\lambda,\lambda)\)normal distribution with mean \(\lambda\) and variance \(\lambda\)\(\lambda\)rate parameter or Poisson mean

When \(\lambda\) is large.

#567 11–12

Exponential-Poisson relation

\(P(T>t)=P(N(t)=0)=e^{-\lambda t}\)

Variables\(P(T>t)\)Exponential-Poisson relation\(T\)waiting time until the first event\(N(t)\)number of Poisson events in a time interval of length \(t\)\(t\)elapsed time\(\lambda\)event rate per unit time

Waiting time to first Poisson event.

#568 11–12

Beta density

\(f(x)=\frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)},\;0<x<1\)

Variables\(f(x)\)Beta density\(x\)value between 0 and 1\(B(\alpha,\beta)\)beta function (normalizing constant)\(\alpha,\beta\)positive shape parameters

Advanced extension.

#569 11–12

Beta mean

\(E(X)=\frac{\alpha}{\alpha+\beta}\)

Variables\(E(X)\)Beta mean\(X\)beta-distributed random variable\(\alpha,\beta\)positive shape parameters

Beta distribution.

#570 11–12

Beta variance

\(Var(X)=\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}\)

Variables\(Var(X)\)Beta variance\(X\)beta-distributed random variable\(\alpha,\beta\)positive shape parameters

Beta distribution.
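
A small sketch of the three beta formulas, using the identity \(B(\alpha,\beta)=\Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha+\beta)\) with `math.gamma`. The parameter choice \(\alpha=2\), \(\beta=3\) is arbitrary.

```python
import math

def beta_function(a, b):
    """B(a, b) expressed through the gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_pdf(x, a, b):
    """Beta density on 0 < x < 1; zero elsewhere."""
    if not 0 < x < 1:
        return 0.0
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_function(a, b)

a, b = 2, 3
beta_mean = a / (a + b)
beta_var = a * b / ((a + b) ** 2 * (a + b + 1))

print(beta_mean, beta_var, beta_pdf(0.5, a, b))
```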

#571 11–12

Normal likelihood, independent data

\(L(\mu,\sigma)=\prod_{i=1}^{n}\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}\)

Variables\(L(\mu,\sigma)\)Normal likelihood, independent data\(\mu\)population mean\(\sigma\)population standard deviation\(\sigma^2\)population variance\(x_i\)ith data value or observation\(n\)sample size

Advanced modeling extension.

#572 11–12

Maximum likelihood for Bernoulli p

\(\hat{p}=\frac{x}{n}\)

Variables\(\hat{p}\)Maximum likelihood for Bernoulli p\(x\)number of successes observed\(n\)number of trials\(p\)success probability

Sample proportion.

#573 11–12

Maximum likelihood for Poisson lambda

\(\hat{\lambda}=\bar{x}\)

Variables\(\hat{\lambda}\)Maximum likelihood for Poisson lambda\(\bar{x}\)sample mean or average of x-values\(x\)data value, outcome, or input value\(\lambda\)rate parameter or Poisson mean

Sample mean.
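
The closed-form estimates can be sanity-checked against a brute-force search over the log-likelihood. The sketch below does this for the Bernoulli case on a coarse grid; the data and function names are hypothetical.

```python
import math

def bernoulli_log_likelihood(p, successes, n):
    """Log-likelihood of p for a given number of successes in n trials."""
    return successes * math.log(p) + (n - successes) * math.log(1 - p)

successes, n = 7, 20      # hypothetical data: 7 successes in 20 trials
p_hat = successes / n     # closed-form MLE, x / n

# Brute-force check on candidate values 0.01, 0.02, ..., 0.99
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: bernoulli_log_likelihood(p, successes, n))

counts = [2, 0, 3, 1, 4]                 # hypothetical Poisson counts
lambda_hat = sum(counts) / len(counts)   # Poisson MLE is the sample mean

print(p_hat, best_p, lambda_hat)
```

The grid search lands on the same value as \(x/n\), which is what the formula predicts.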

#574 11–12

AIC

\(AIC=2k-2\ln(L)\)

Variables\(AIC\)Akaike information criterion\(L\)maximized value of the likelihood for the fitted model\(k\)number of estimated parameters\(AIC,BIC\)model comparison criteria

Model comparison extension. Among models fitted to the same data, a lower AIC is preferred.

#575 11–12

BIC

\(BIC=k\ln(n)-2\ln(L)\)

Variables\(BIC\)Bayesian information criterion\(n\)sample size\(L\)maximized value of the likelihood for the fitted model\(k\)number of estimated parameters\(AIC,BIC\)model comparison criteria

Model comparison extension. Among models fitted to the same data, a lower BIC is preferred; the \(k\ln(n)\) term penalizes extra parameters more heavily than AIC for all but very small samples.
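
A sketch comparing two models with AIC and BIC. The functions take the log-likelihood \(\ln(L)\) directly; the parameter counts, log-likelihood values, and sample size are made up for illustration.

```python
import math

def aic(k, log_likelihood):
    return 2 * k - 2 * log_likelihood

def bic(k, n, log_likelihood):
    return k * math.log(n) - 2 * log_likelihood

n = 50  # hypothetical sample size
# Hypothetical fitted models: (number of parameters, maximized log-likelihood)
models = {"A": (2, -120.0), "B": (4, -118.5)}

aic_scores = {name: aic(k, ll) for name, (k, ll) in models.items()}
bic_scores = {name: bic(k, n, ll) for name, (k, ll) in models.items()}

best_by_aic = min(aic_scores, key=aic_scores.get)
best_by_bic = min(bic_scores, key=bic_scores.get)

print(aic_scores, best_by_aic, best_by_bic)
```

Model B fits slightly better, but its two extra parameters cost more than the fit gains, so both criteria prefer model A here.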

Unit 19

Calculator, Spreadsheet, and Exam-Style Helper Formulas

22 / 22 formulas

#576 6–8

Total weighted marks

\(T=\sum w_is_i\)

Variables\(T\)Total weighted marks\(w_i\)weight for component i\(s_i\)score on component i

Weighted score calculation.

#577 6–8

Weighted grade percentage

\(G=\frac{\sum w_is_i}{\sum w_i}\times100\%\)

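Variables\(G\)Weighted grade percentage\(w_i\)weight for component i\(s_i\)score on component i, as a proportion

Use when each \(s_i\) is a proportion between 0 and 1; if scores are already percentages, omit the \(\times100\%\).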
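
A short sketch of both weighted-score formulas, assuming hypothetical component weights of 20, 30, and 50 and scores given as proportions.

```python
def weighted_total(weights, scores):
    """T = sum of w_i * s_i."""
    return sum(w * s for w, s in zip(weights, scores))

def weighted_grade_percent(weights, scores):
    """G = (sum of w_i * s_i) / (sum of w_i) * 100, with scores as proportions."""
    return weighted_total(weights, scores) / sum(weights) * 100

weights = [20, 30, 50]     # hypothetical: homework, test, final exam
scores = [0.9, 0.8, 0.7]   # proportions earned on each component

print(round(weighted_total(weights, scores), 2),
      round(weighted_grade_percent(weights, scores), 2))
```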

#578 6–8

Class average score

\(\bar{x}=\frac{\sum \text{student scores}}{\text{number of students}}\)

Variables\(\bar{x}\)Class average score\(\text{student scores}\)the named quantity shown in the formula\(\text{number of students}\)the named quantity shown in the formula\(x\)data value, outcome, or input value

School data use.

#579 6–8

Pass rate

\(\text{Pass rate}=\frac{\text{number passed}}{\text{number tested}}\times100\%\)

Variables\(\text{Pass rate}\)Pass rate\(\text{number passed}\)the named quantity shown in the formula\(\text{number tested}\)the named quantity shown in the formula

Education statistics.

#580 6–8

Failure rate

\(\text{Failure rate}=100\%-\text{pass rate}\)

Variables\(\text{Failure rate}\)Failure rate\(\text{pass rate}\)the named quantity shown in the formula

Percent form.

#581 6–8

Attendance rate

\(\text{Attendance rate}=\frac{\text{days present}}{\text{total school days}}\times100\%\)

Variables\(\text{Attendance rate}\)Attendance rate\(\text{days present}\)the named quantity shown in the formula\(\text{total school days}\)the named quantity shown in the formula

Education data.

#582 6–8

Absence rate

\(\text{Absence rate}=100\%-\text{attendance rate}\)

Variables\(\text{Absence rate}\)Absence rate\(\text{attendance rate}\)the named quantity shown in the formula

Education data.

#583 9–10

Growth in enrollment

\(\text{Growth \%}=\frac{E_2-E_1}{E_1}\times100\%\)

Variables\(\text{Growth \%}\)Growth in enrollment\(E_1\)enrollment in the earlier period\(E_2\)enrollment in the later period

School data.

#584 9–10

Standardized test score

\(S=\mu_S+\sigma_S\left(\frac{x-\mu_x}{\sigma_x}\right)\)

Variables\(S\)Standardized test score\(x\)original raw score\(\mu_x,\sigma_x\)mean and standard deviation of the original scores\(\mu_S,\sigma_S\)mean and standard deviation of the target scale

Convert to desired scale.
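
The conversion is a z-score followed by a rescale. A minimal sketch, assuming a raw-score scale with mean 50 and SD 10 and a target scale with mean 500 and SD 100 (both hypothetical):

```python
def rescale(x, mu_x, sigma_x, mu_s, sigma_s):
    """Convert raw score x to a scale with mean mu_s and SD sigma_s."""
    z = (x - mu_x) / sigma_x       # standardize on the original scale
    return mu_s + sigma_s * z      # place on the target scale

score = rescale(65, mu_x=50, sigma_x=10, mu_s=500, sigma_s=100)
print(score)
```

A raw score 1.5 standard deviations above the original mean stays 1.5 standard deviations above the mean on the new scale.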

#585 9–10

Composite score

\(C=\sum w_iS_i\)

Variables\(C\)Composite score\(w_i\)weight for component i\(S_i\)score on component i

Weighted components.

#586 9–10

Rank percentile simple

\(PR=\frac{\text{number below}}{n}\times100\%\)

Variables\(PR\)Rank percentile simple\(\text{number below}\)the named quantity shown in the formula\(n\)sample size, number of observations, or number of trials

Simple version.

#587 9–10

Class rank percentage

\(\text{Rank \%}=\frac{\text{rank}}{n}\times100\%\)

Variables\(\text{Rank \%}\)Class rank percentage\(\text{rank}\)the named quantity shown in the formula\(n\)sample size, number of observations, or number of trials

A lower percentage may be better, depending on context (for example, when rank 1 is the top of the class).

#588 9–10

Growth rate per period

\(r=\left(\frac{V_t}{V_0}\right)^{1/t}-1\)

Variables\(r\)Growth rate per period, as a decimal\(t\)number of periods\(V_0\)base (starting) value\(V_t\)value after \(t\) periods

Compound growth.

#589 9–10

Doubling time approximation

\(T_d\approx\frac{70}{r\%}\)

Variables\(T_d\)Doubling time approximation\(r\%\)growth rate per period, in percent (for example, 7 for 7%)

Rule of 70.

#590 9–10

Halving time approximation

\(T_h\approx\frac{70}{r\%}\)

Variables\(T_h\)Halving time approximation\(r\%\)decay rate per period, in percent (for example, 7 for 7%)

For exponential decay percent rate.

#591 11–12

Exact doubling time

\(T_d=\frac{\ln2}{\ln(1+r)}\)

Variables\(T_d\)Exact doubling time\(r\)growth rate per period, as a decimal (for example, 0.07 for 7%)

Compound growth.

#592 11–12

Exact halving time

\(T_h=\frac{\ln(0.5)}{\ln(1-r)}\)

Variables\(T_h\)Exact halving time\(r\)decay rate per period, as a decimal with \(0<r<1\)

Compound decay.
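
The sketch below compares the Rule of 70 with the exact doubling and halving times at a 7% rate, and checks the per-period growth formula against the doubling time. The rate and the values 100 and 200 are illustrative.

```python
import math

def growth_rate(v0, vt, t):
    """Compound growth rate per period from a base value and a later value."""
    return (vt / v0) ** (1 / t) - 1

def doubling_time(r):
    return math.log(2) / math.log(1 + r)

def halving_time(r):
    return math.log(0.5) / math.log(1 - r)

r = 0.07                          # 7% per period, as a decimal
rule_of_70 = 70 / (r * 100)       # approximation uses the rate in percent
exact_double = doubling_time(r)
exact_half = halving_time(r)

# A value that doubles from 100 to 200 in 10 periods should have doubling time 10.
r_back = growth_rate(100, 200, 10)

print(round(rule_of_70, 2), round(exact_double, 2), round(exact_half, 2),
      round(doubling_time(r_back), 2))
```

The approximation gives about 10 periods, while the exact doubling time is slightly longer and the exact halving time slightly shorter.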

#593 11–12

Log return

\(R=\ln\left(\frac{P_t}{P_{t-1}}\right)\)

Variables\(R\)Log return\(P_t\)price or value at time \(t\)\(P_{t-1}\)price or value one period earlier\(t\)time period

Advanced data applications.

#594 11–12

Arithmetic return

\(R=\frac{P_t-P_{t-1}}{P_{t-1}}\)

Variables\(R\)Arithmetic return\(P_t\)price or value at time \(t\)\(P_{t-1}\)price or value one period earlier\(t\)time period

Data applications.

#595 11–12

Mean return

\(\bar{R}=\frac{1}{n}\sum R_i\)

Variables\(\bar{R}\)Mean return\(n\)number of return observations\(R_i\)return in period i

Data applications.

#596 11–12

Return volatility

\(s_R=\sqrt{\frac{\sum(R_i-\bar{R})^2}{n-1}}\)

Variables\(s_R\)Return volatility\(n\)number of return observations\(R_i\)return in period i\(\bar{R}\)mean return

Standard deviation of returns.

#597 11–12

Annualized volatility

\(s_{annual}=s_{period}\sqrt{m}\)

Variables\(s_{annual}\)Annualized volatility\(s_{period}\)volatility of returns per period\(m\)number of periods per year

\(m\)=periods per year.
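
The return formulas above can be chained together as in this sketch. The prices are hypothetical, and \(m=252\) is an assumption (a commonly used count of trading days per year for daily data), not a value given by the formula bank.

```python
import math
import statistics

prices = [100.0, 102.0, 101.0, 105.0]   # hypothetical closing prices
pairs = list(zip(prices, prices[1:]))   # (P_{t-1}, P_t) for each period

arithmetic = [(p1 - p0) / p0 for p0, p1 in pairs]
log_returns = [math.log(p1 / p0) for p0, p1 in pairs]

mean_return = statistics.mean(arithmetic)
volatility = statistics.stdev(arithmetic)   # sample SD, divides by n - 1

periods_per_year = 252                      # assumption: daily data
annualized = volatility * math.sqrt(periods_per_year)

print([round(r, 4) for r in arithmetic], round(sum(log_returns), 4))
```

Log returns add across periods, so their sum equals the log of the overall price ratio; arithmetic returns do not have this property.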
