Most people know of π, or ‘pi’, as the number they learned in high school that has to do with circles: it is the ratio of a circle’s circumference to its diameter (π = C/d), the area of a circle is πr^{2} (especially hilarious because pie are round, not squared), etc. Some of us even remember that it is an irrational number, meaning you cannot write it down as a simple fraction, and maybe some people, certainly not me, still have it memorized as starting with 3.14159265. What is less appreciated, however, is that this number has utility far beyond allowing us to calculate the area of a circle.

In biophysics, and in science in general, we use statistics to compare our data with our hypotheses. Many of the phenomena we measure fall along (or can be manipulated to fall along) a normal distribution: a common continuous probability distribution characterized by the familiar “bell curve” shape of the Gaussian function, shown in the image below. In its simplest form, centered on zero, this function (the blue curve) is e^{-x^{2}}, and the area under the curve is **the square root of pi!** For other means, μ, and variances, σ^{2}, the curve can be described more fully with the equation:

f(x) = *a* e^{-(x - b)^{2} / (2c^{2})}

where *a* = 1 / (σ(2π)^{1/2}), *b* = μ, and *c* = σ.

Normalized Gaussian curves with expected value μ and variance σ^{2}. The corresponding parameters are *a* = 1 / (σ(2π)^{1/2}), *b* = μ, and *c* = σ.
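Both claims are easy to check numerically. The short Python sketch below (an illustration, not part of the original post) integrates e^{-x^{2}} with a simple trapezoid rule to recover √π, and confirms that with *a* = 1 / (σ(2π)^{1/2}) the full curve integrates to one, as a probability density must:

```python
import math

# Simple trapezoid-rule integration; the Gaussian's tails decay so fast
# that integrating over a wide finite window is effectively exact.
def trapezoid(f, lo, hi, n=100_000):
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h

# Area under the simplest Gaussian, e^(-x^2): the square root of pi.
area = trapezoid(lambda x: math.exp(-x * x), -10.0, 10.0)
print(area, math.sqrt(math.pi))  # the two values agree closely

# The general curve a*exp(-(x-b)^2 / (2 c^2)) with a = 1/(sigma*sqrt(2*pi)),
# b = mu, and c = sigma integrates to 1 for any mu and sigma (values here
# are arbitrary examples).
mu, sigma = 1.5, 0.7
a = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
density = lambda x: a * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
print(trapezoid(density, mu - 10 * sigma, mu + 10 * sigma))  # ~1.0
```

That factor of (2π)^{1/2} in *a* is exactly where pi sneaks into every normal distribution: it is the normalization that makes the total probability equal one.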

How was the Gaussian distribution first determined, you may ask? While pi itself is thought to have been first measured by the ancient Babylonians between 1900 and 1680 B.C., the Gaussian distribution originated in the 18^{th} century, when Abraham de Moivre started calculating gambling odds extremely precisely. De Moivre studied a very simple system at first: flipping a coin. He would calculate the probability of getting a certain number of heads from a certain number of coin flips, and he found that as the number of coin flips increased, his probability distribution approached a smooth curve. He then went about finding a mathematical expression for this curve, which resulted in the “normal curve”.
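De Moivre's observation is easy to reproduce today. The sketch below (a modern illustration, not de Moivre's own calculation) compares the exact probability of getting k heads in n fair coin flips with the normal curve of matching mean (n/2) and variance (n/4); for n = 100 the two already agree closely:

```python
import math

# Exact probability of k heads in n fair coin flips: C(n, k) / 2^n.
def binom_prob(n, k):
    return math.comb(n, k) / 2 ** n

# The smooth curve de Moivre found: a normal density with the binomial's
# mean n/2 and variance n/4.
def normal_approx(n, k):
    mean, var = n / 2, n / 4
    return math.exp(-((k - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

n = 100
for k in (40, 50, 60):
    print(k, binom_prob(n, k), normal_approx(n, k))  # pairs agree to ~3 decimals
```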

Independently, two mathematicians, Adrain and Gauss, developed the formula for the normal distribution in 1808 and 1809, respectively, and showed that errors observed in astronomical data fell along this distribution: small errors in measurements occurred more frequently than large ones. The distribution was also independently discovered by Laplace, who elegantly showed how pi enters into the Gaussian distribution (summarized nicely here: http://www.umich.edu/~chem461/Gaussian%20Integrals.pdf). Laplace also introduced the Central Limit Theorem, which proves that, with a large enough number of samples, the mean will be normally distributed regardless of the underlying original distribution. This is why the normal distribution ends up popping up in so many places.
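The Central Limit Theorem is easy to see in simulation. The sketch below (sample sizes chosen arbitrarily for illustration) draws samples from a decidedly non-normal distribution, the uniform on [0, 1], and shows that the sample means nevertheless behave the way a normal distribution predicts:

```python
import random
import statistics

# Means of samples from a uniform distribution on [0, 1]. The uniform has
# mean 1/2 and variance 1/12, so the CLT predicts the sample means are
# approximately normal with mean 0.5 and standard deviation sqrt(1/(12*n)).
random.seed(0)
n, trials = 50, 20_000
means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(trials)]

print(statistics.fmean(means))  # close to 0.5
print(statistics.stdev(means))  # close to sqrt(1/(12*50)) ~ 0.0408

# A normal distribution puts ~68% of its mass within one standard
# deviation of the mean; the simulated sample means do the same.
sd = (1 / (12 * n)) ** 0.5
within = sum(abs(m - 0.5) < sd for m in means) / trials
print(within)  # roughly 0.68
```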

In biophysics, every time we think about mean and variance, calculate a p value (which often assumes a normal distribution), do image processing, or try to understand the probability of a particular event, we owe a debt to pi. Not only do we use the Gaussian for statistics, but we also often use it in fields where we need to apply a potential or some external force, either experimentally or in simulation. Basically, pi underlies all of the fundamental biological processes we study on a daily basis. Thanks, pi!

By Sonya Hanson, postdoc at Memorial Sloan Kettering Cancer Center

References:

https://en.wikipedia.org/wiki/Gaussian_function (Including public domain figure)

http://onlinestatbook.com/2/normal_distribution/history_normal.html

https://www.amazon.com/Cartoon-Guide-Statistics-Larry-Gonick/dp/0062731025