Selecting Distributions for Risk, Reliability and Quality Simulations

There are two basic steps to choosing the best distribution of an input variable.
The first is to work out how well you actually need to define the distribution.
The second is to define a distribution with as little bias as possible.

Definition requirements

Not all inputs need to be fully defined.
If the inputs interact with each other in certain ways, then only the mean and the standard deviation of each input need to be known.
The key to this step is the central limit theorem.
If the system is one that adds or multiplies the inputs, then the central limit theorem says that the output will be approximately a Normal or Lognormal distribution respectively (a product of positive inputs becomes a sum once logarithms are taken).
Both of these distributions are defined only by two parameters.
This means that only the mean and the standard deviation are needed to define each of them.
Additionally, adding and multiplying independent inputs carries the mean and the standard deviation through to the output without reference to the rest of each input's shape.
If the inputs are added or multiplied, then only the mean and standard deviation of each input are needed to calculate the mean and the standard deviation of the output.
Therefore, because the form of the output is fixed (Normal for additions, Lognormal for multiplications) and that form is fully defined by its mean and standard deviation, only the mean and standard deviation need to be known for input variables that are solely added together or solely multiplied together.
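For reference, the standard results behind this claim can be written out as follows. This is a sketch that assumes the inputs are independent, which the text above does not state explicitly; the symbols (X_i for the inputs, S for their sum, P for their product) are introduced here for illustration only.

```latex
% Independent inputs X_1, ..., X_n with means \mu_i and standard deviations \sigma_i

% Sum: S = X_1 + X_2 + ... + X_n
\mu_S = \sum_{i=1}^{n} \mu_i,
\qquad
\sigma_S^2 = \sum_{i=1}^{n} \sigma_i^2

% Product: P = X_1 X_2 ... X_n
\mu_P = \prod_{i=1}^{n} \mu_i,
\qquad
\sigma_P^2 = \prod_{i=1}^{n} \left( \sigma_i^2 + \mu_i^2 \right) - \prod_{i=1}^{n} \mu_i^2
```

In both cases nothing other than the means and standard deviations of the inputs appears on the right-hand side.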
Note that this assumes that the requirements of the central limit theorem are met: broadly, that there are many independent inputs and that no single input dominates the result.
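The argument above can be checked with a small simulation. The following is a minimal sketch, not a definitive implementation: the choice of twelve independent uniform inputs on 0.8 to 1.2, the sample count, the seed and the helper names `propagate_sum` and `propagate_product` are all assumptions made for illustration. It predicts the output mean and standard deviation from the input means and standard deviations alone, then compares the prediction with simulated sums and products.

```python
import math
import random
import statistics


def propagate_sum(means, sds):
    """Mean and SD of a sum of independent inputs."""
    return sum(means), math.sqrt(sum(s ** 2 for s in sds))


def propagate_product(means, sds):
    """Mean and SD of a product of independent inputs."""
    mean = math.prod(means)
    second_moment = math.prod(s ** 2 + m ** 2 for m, s in zip(means, sds))
    return mean, math.sqrt(second_moment - mean ** 2)


# Hypothetical inputs: 12 independent uniform variables on [0.8, 1.2].
LOW, HIGH, N_INPUTS = 0.8, 1.2, 12
means = [(LOW + HIGH) / 2] * N_INPUTS
sds = [(HIGH - LOW) / math.sqrt(12)] * N_INPUTS  # SD of a uniform variable

# Predictions that use only the input means and standard deviations.
sum_mean, sum_sd = propagate_sum(means, sds)
prod_mean, prod_sd = propagate_product(means, sds)

# Simulation that uses the full (uniform) input distributions.
rng = random.Random(1)
sums, products = [], []
for _ in range(50_000):
    draws = [rng.uniform(LOW, HIGH) for _ in range(N_INPUTS)]
    sums.append(sum(draws))
    products.append(math.prod(draws))

# Shape check: a Normal distribution has about 68% of values within one SD.
share_within_1sd = sum(abs(s - sum_mean) <= sum_sd for s in sums) / len(sums)

print("sum:     predicted", sum_mean, sum_sd,
      "simulated", statistics.fmean(sums), statistics.stdev(sums))
print("product: predicted", prod_mean, prod_sd,
      "simulated", statistics.fmean(products), statistics.stdev(products))
print("share of sums within one SD of the mean:", share_within_1sd)
```

Swapping the uniform inputs for another shape with the same mean and standard deviation should leave the predicted values unchanged, which is the point the section is making.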

Unbiased definition

When only a limited amount of information is available, there are limits to the distributions that can be selected.
For example, if only the maximum and minimum possible values are known, then a uniform distribution should be chosen.
This is the least biased distribution that can be used in this case.
It is a distribution that maximises the uncertainty given what is known.
This is called a maximum entropy distribution, and it is the type that should be used if bias is to be removed.
As another example, if only the mean and standard deviation are known, then the least biased (maximum entropy) distribution is the Normal distribution.
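The two examples above can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than a general method: the bounds (0 and 10), the mean and standard deviation (0 and 1), the comparison distributions (triangular and Laplace) and the helper name `entropy` are all chosen here for demonstration. It integrates the differential entropy of each candidate with the midpoint rule and shows that the uniform and Normal distributions score highest in their respective cases.

```python
import math


def entropy(pdf, lo, hi, steps=100_000):
    """Differential entropy, -integral of p*ln(p), by the midpoint rule."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        p = pdf(lo + (i + 0.5) * dx)
        if p > 0.0:  # p*ln(p) tends to 0 as p tends to 0
            total -= p * math.log(p) * dx
    return total


# Case 1: only the minimum and maximum are known (hypothetical bounds).
A, B = 0.0, 10.0
MID = (A + B) / 2


def uniform_pdf(x):
    return 1.0 / (B - A)


def triangular_pdf(x):
    # Symmetric triangle on [A, B] with its peak at MID.
    peak = 2.0 / (B - A)
    return peak * (1.0 - abs(x - MID) / (MID - A))


h_uniform = entropy(uniform_pdf, A, B)
h_triangular = entropy(triangular_pdf, A, B)

# Case 2: only the mean and standard deviation are known (hypothetical values).
MU, SIGMA = 0.0, 1.0


def normal_pdf(x):
    z = (x - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))


def laplace_pdf(x):
    scale = SIGMA / math.sqrt(2.0)  # this scale gives standard deviation SIGMA
    return math.exp(-abs(x - MU) / scale) / (2.0 * scale)


# Integrate far enough into the tails that the truncation is negligible.
h_normal = entropy(normal_pdf, MU - 20 * SIGMA, MU + 20 * SIGMA)
h_laplace = entropy(laplace_pdf, MU - 20 * SIGMA, MU + 20 * SIGMA)

print("same bounds:      uniform", h_uniform, "triangular", h_triangular)
print("same mean and SD: normal ", h_normal, "laplace   ", h_laplace)
```

Choosing the triangular or Laplace distribution instead would imply knowledge (a most likely value, or a sharp central peak) that was not actually available.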
The more information that is available, the more accurately the distribution can be defined.
The best way to ensure that an unbiased distribution is used is to collect as much data as possible, calculate the moments that can be reasonably calculated and then find the maximum entropy distribution.
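As a rough sketch of that workflow for the simplest case, suppose the data only support a reliable mean and standard deviation. The measurements below are made-up values and the variable names are hypothetical; the sketch estimates the two moments and then uses the Normal distribution, which is the maximum entropy choice for that information. If higher moments could be estimated reliably, a numerical maximum entropy fit would be needed instead, and that is not shown here.

```python
import statistics

# Hypothetical measurements of one input variable (made-up values).
samples = [9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.5, 10.0, 9.7]

# Step 1: calculate the moments that the data can reasonably support.
mean = statistics.fmean(samples)
sd = statistics.stdev(samples)

# Step 2: with only a mean and an SD, the maximum entropy choice is Normal.
input_dist = statistics.NormalDist(mu=mean, sigma=sd)

# Step 3: draw values from the fitted distribution for use in the simulation.
draws = input_dist.samples(5, seed=42)

print("estimated mean and SD:", mean, sd)
print("example simulation draws:", draws)
```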

Summary

When defining the input distributions, first check whether the central limit theorem governs the way they interact.
If it does, then only put effort into defining the mean and standard deviation for each.
Otherwise, collect as much data on the variable as possible and create a maximum entropy distribution.