Cumulative Distribution Functions

Another useful way to characterize a distribution is with a cumulative distribution function, or cdf. The normalized cdf gives, as a function of x, the fraction of values below x. In other words, if the pdf of a distribution is P(x), so that P(x) dx is the probability of finding a value between x and x + dx, then the cdf is

$\mathrm{cdf}(x) = \int_{-\infty}^{x} P(x') \, dx'$
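
As a quick check of the definition, consider an exponential distribution. This choice is purely illustrative, and the rate $\lambda$ is an assumed parameter that does not appear elsewhere in these notes. Its pdf is $P(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ and zero otherwise, so for $x \ge 0$

$\mathrm{cdf}(x) = \int_{0}^{x} \lambda e^{-\lambda x'} \, dx' = 1 - e^{-\lambda x}$

which rises from 0 at $x = 0$ toward 1 as $x$ grows, as a normalized cdf should.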

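The main advantage of the cdf with real data is that it does not depend on binning: it can be built directly from the sorted values, with no choice of bin width or bin edges. It also smooths over noise, because each point of the cdf accumulates all of the data below it rather than the counts in a single bin.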
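
To illustrate how a cdf can be estimated from data without binning, here is a minimal sketch in Python using only the standard library. The function name `empirical_cdf` and the sample values in `data` are hypothetical, chosen for illustration. The sketch also assumes the common convention of counting values less than or equal to x, which differs from a strict "below x" only at the data points themselves.

```python
from bisect import bisect_right


def empirical_cdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x.

    Hypothetical helper for illustration; no bins are involved.
    """
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")

    def cdf(x):
        # bisect_right gives the number of sorted values that are <= x
        return bisect_right(ordered, x) / n

    return cdf


# Made-up sample values, not taken from any real data set
data = [2.0, 3.5, 1.0, 3.5, 5.0]
F = empirical_cdf(data)

print(F(0.0))   # below every sample
print(F(3.5))   # four of the five samples are <= 3.5
print(F(10.0))  # above every sample
```

The resulting function is a staircase that steps up by 1/n at each data value, so the only input is the data itself; there is no bin width to tune.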

Readings:

think-stats: cumulative distribution functions