Statistics deals with many different possibilities at once, which can make it unintuitive to reason about. In this introduction, we discuss several concepts that are important for expressing knowledge about different possibilities, and show how we can draw information from them.
A very important concept is the random variable. Strictly speaking, random variables are neither random nor variables. Instead, they are conceptual objects that assign numbers to distinct real-world phenomena, outcomes, or properties that are in some way random and mutually exclusive. For instance, a random variable might map "a die roll of one" to 1, "a die roll of two" to 2, and so forth. A different random variable might map hair colors, from black to blond, to numbers between 0 and 1.
An important role of random variables is that they categorize and structure quantities of interest into a format to which we can later assign probabilities. These probabilities could represent the chance of an event occurring ("the chance of rolling a one") or the relative frequency of a phenomenon in a population ("the chance of having red hair").
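As a minimal sketch of this idea (all names here are illustrative, not taken from the text), a random variable can be represented in Python as a plain mapping from mutually exclusive outcomes to numbers:

```python
import random

# A random variable assigns a number to each mutually exclusive
# outcome; the randomness lies in which outcome occurs, not in
# the mapping itself.
die_roll = {"one": 1, "two": 2, "three": 3,
            "four": 4, "five": 5, "six": 6}

# A different random variable: hair colors mapped to numbers in [0, 1].
hair_color = {"black": 0.0, "brown": 1/3, "red": 2/3, "blond": 1.0}

# One realization: nature picks an outcome, the random variable
# reports the associated number.
outcome = random.choice(list(die_roll))
print(outcome, "->", die_roll[outcome])
```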
Joint probability distributions are an important ingredient of statistical inference. In a nutshell, these distributions describe the probabilities of all possible combinations of outcomes of two or more random variables. For discrete two-dimensional distributions, these probabilities are often represented as a so-called contingency table.
An example of this is provided in Figure 1, which shows the frequencies of different combinations of eye color and hair color for 592 students. The probability of a specific combination can be calculated by dividing the number of students in the corresponding cell by the total number of students:

P(hair = h, eye = e) = n(h, e) / 592,

where n(h, e) is the number of students with hair color h and eye color e.
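To make this concrete, here is a minimal sketch in Python. The counts assume the classic 592-student hair/eye color survey (the HairEyeColor dataset in R), which Figure 1 appears to be based on; if your table differs, substitute its cell counts:

```python
import numpy as np

# Contingency table of raw counts (rows: hair color, columns: eye color),
# assuming the classic 592-student dataset.
hair_colors = ["black", "brown", "red", "blond"]
eye_colors = ["brown", "blue", "hazel", "green"]
counts = np.array([
    [ 68, 20, 15,  5],   # black hair
    [119, 84, 54, 29],   # brown hair
    [ 26, 17, 14, 14],   # red hair
    [  7, 94, 10, 16],   # blond hair
])

# Joint probabilities: divide every cell by the total number of students.
joint = counts / counts.sum()   # counts.sum() == 592

# Example: P(hair = red, eye = green)
p = joint[hair_colors.index("red"), eye_colors.index("green")]
print(f"P(red hair, green eyes) = {p:.3f}")   # 14/592 ≈ 0.024
```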
Question: What is the probability of your combination of hair and eye color in this contingency table?
Marginal probability distributions remove the influence of one or more random variables (here: eye color or hair color) from a contingency table by summing over the dimensions we want to remove. For instance, if we are interested in the probability of having green eyes irrespective of hair color, we can calculate this probability by summing over all hair colors:

P(eye = green) = Σ_h P(hair = h, eye = green).
The name marginal probability distribution derives from the fact that these probability distributions are often visualized in the margins of the contingency table.
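Continuing the sketch above (reusing joint, hair_colors, and eye_colors), the marginal distributions fall out of the joint table by summing along one axis:

```python
# Sum out hair color (axis 0) to get the eye-color marginal,
# and sum out eye color (axis 1) to get the hair-color marginal.
p_eye = joint.sum(axis=0)    # P(eye), irrespective of hair color
p_hair = joint.sum(axis=1)   # P(hair), irrespective of eye color

# Example: P(eye = green) = sum over all hair colors of P(hair, green)
print(f"P(green eyes) = {p_eye[eye_colors.index('green')]:.3f}")   # 64/592 ≈ 0.108
```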
Question: What are the marginal probabilities of your hair color and your eye color?
Conditional probability distributions describe "if this, then what?" scenarios. For instance, if we assume that a person's eye color is green, what is the probability of them also having red hair?
More generally, estimating conditional distributions is highly important in research and industry. They allow us to study scenarios such as "If the global temperature rises by 5°C, what is the probability of the Netherlands flooding?", and they play an integral part in statistical inference, for example in questions such as "If we observe a water table rise of 1 m, how much has it rained?".
In a contingency table, conditional probabilities are calculated by dividing the joint probability of a table entry (here: a combination of green eyes and a specific hair color) by the marginal probability of the conditioning variable (here: green eyes):

P(hair = h | eye = green) = P(hair = h, eye = green) / P(eye = green).
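In the running sketch, this division is a single renormalization of one column of the joint table:

```python
# Condition on green eyes: take the green-eye column of the joint
# table and divide by the marginal probability of green eyes.
green = eye_colors.index("green")
p_hair_given_green = joint[:, green] / p_eye[green]

for hair, p in zip(hair_colors, p_hair_given_green):
    print(f"P({hair} hair | green eyes) = {p:.3f}")

# Sanity check: a conditional distribution still sums to 1.
print(p_hair_given_green.sum())
```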
Question: What are the conditional probabilities of (1) your hair color given your eye color, and (2) your eye color given your hair color? Which of the two is smaller?
So far, we have considered only discrete contingency tables with four different eye colors and four different hair colors (Figure 4, left). In practice, we are often free to choose the resolution at which we want to represent a system. For instance, we could resolve the contingency table into nine different eye colors and hair colors instead (Figure 4, center). Observe that as we increase the resolution of the random variables, we require increasingly many samples to retain the level of detail of the probability distribution underlying the contingency table.
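A quick simulation illustrates this trade-off. The bivariate normal distribution below is an arbitrary stand-in for the distribution underlying Figure 4; the point is only what happens when the same 592 samples are binned at two resolutions:

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=592)

# Bin the same samples into a coarse and a fine contingency table.
for bins in (4, 16):
    table, _, _ = np.histogram2d(samples[:, 0], samples[:, 1],
                                 bins=bins, range=[[-3, 3], [-3, 3]])
    empty = int((table == 0).sum())
    print(f"{bins}x{bins} table: {empty} of {bins * bins} cells are empty")

# At 16x16, many cells contain few or no samples: the finer table is
# too sparse to resolve the underlying distribution at this sample size.
```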
As we take this increase in resolution to its extreme, the discrete contingency table gains an infinite number of rows and columns. Populating it would require an infinite number of samples, so we would never finish filling it in. Instead, we replace it with a continuous probability density function (pdf). Rather than storing the discrete probability of every possible combination of values, a pdf describes the shape of the underlying probability distribution directly (Figure 4, right). Integrating a pdf over its entire domain always yields 1, and integrating it over a finite volume of parameter space (for instance, one cell of a contingency table) returns a discrete probability.
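The sketch below checks both claims numerically for a simple two-dimensional pdf (an independent standard normal in each dimension, chosen purely for illustration):

```python
import numpy as np

# Joint pdf of two independent standard normal random variables.
def pdf(x, y):
    return np.exp(-(x**2 + y**2) / 2) / (2 * np.pi)

# Evaluate the pdf on a fine grid covering (effectively) its whole domain.
x = np.linspace(-6, 6, 601)
y = np.linspace(-6, 6, 601)
X, Y = np.meshgrid(x, y)
dx = dy = x[1] - x[0]

# Integrating over the whole domain yields 1 (up to grid error).
print(pdf(X, Y).sum() * dx * dy)

# Integrating over one "cell" of parameter space returns a discrete
# probability, just like a single entry of a contingency table.
cell = (X >= 0) & (X < 1) & (Y >= 0) & (Y < 1)
print(pdf(X, Y)[cell].sum() * dx * dy)   # P(0 <= x < 1, 0 <= y < 1)
```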
The marginal and conditional distributions above are still defined the same way, but now they are also continuous pdfs, and the summations above are replaced with integrals: the marginal density is p(x) = ∫ p(x, y) dy, and the conditional density is p(x | y) = p(x, y) / p(y).
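Reusing pdf, X, Y, x, and dy from the previous snippet, marginalization becomes a numerical integral along one axis, and conditioning is again a renormalized slice:

```python
# Marginal density of x: integrate the joint pdf over y.
p_x = pdf(X, Y).sum(axis=0) * dy      # one density value per x grid point

# Conditional density of y given x = 0: slice the joint pdf at the grid
# point closest to x = 0 and renormalize by the marginal density there.
i0 = np.argmin(np.abs(x))
p_y_given_x0 = pdf(X, Y)[:, i0] / p_x[i0]

# Sanity check: the conditional density integrates to 1 over y.
print(p_y_given_x0.sum() * dy)
```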