Cumulative standard normal distribution

July 07, 2019 2 minutes reading time Math go statistics

I’ve written about the use of the central limit theorem (CLT) to solve some problems in statistics before. It involves calculating a z-value and looking up the probability in a z-table. A z-table holds the cumulative probabilities for values that follow the standard normal distribution, i.e. the mean is 0 and the standard deviation is 1. For other means and standard deviations, the values can be normalized, so that mean/standard deviation of the standard normal distribution apply.

In programs like Excel or in code, we can easily calculate the z-table values. Some languages like Java require the use of 3rd party libraries if you don’t want to do all the work yourself, but Go has everything built-in already. Let’s see how to do it.

The cumulative distribution function for a function with normal distribution: $$\phi\left(x\right) = \frac{1}{2}\left(1+\textbf{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right)$$

Where $\textbf{erf}$ is the error function: $$\textbf{erf}\left(z\right) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2} dt$$

Go has the error function $\textbf{erf}$ in the math package, so it’s easy to write $\phi(x)$ as a higher-order-function, that takes the mean $\mu$ and the standard deviation $\sigma$ and gives us a function that only depends on the random variable x. With this we can write all the code necessary to solve the statistics problem mentioned above.

package main

import (
	"fmt"
	"math"
)

func phi(m float64, sd float64) func(float64) float64 {
	return func(x float64) float64 {
		z := (x - m) / sd / math.Sqrt(2)
		return (1 + math.Erf(z)) / 2
	}
}

func main() {
	maxWeight, boxes, m, sd := 2630.0, 36.0, 72.0, 3.0

	// the central limit theorem states that if we have a
	// sufficiently large sample of n boxes, we can expect
	// the mean of this sample to be equal to the statistical
	// mean, and the σ_sample of the sample to be σ / sqrt(n)

	zTable := phi(0, 1) // cumulative std. normal distribution
	boxCritical := maxWeight / boxes
	sdSample := sd / math.Sqrt(boxes)
	// normalize for z-table use
	z := (boxCritical - m) / sdSample
	p := zTable(z)

	// probability that boxes can be transported
	fmt.Printf("%f\n", p)
}

We could have used the $\phi$ function without normalizing, since it can handle arbitrary means and standard deviations, but I wanted to show a way that mimics how you would do it by hand using a z-table.