
Section 10.3 Discrete random variables

Probability is concerned with the study of random variables, which are numerical values arising from random experiments. In this section we will study discrete random variables.

Definition 10.3.1.
A random variable \(X\) is a function \(X:\Omega \to \mathbf{R}\) whose inputs are the outcomes in a sample space \(\Omega\) and whose outputs are real numbers.

For example, the card drawn from a deck and the result of a coin flip are not random variables, since they are not numbers. However, the rank of a card and the number of heads in multiple coin flips are, since each assigns a number to every outcome.

When the values of \(X\) are members of a countable set, almost always \(\mathbf{N}\text{,}\) we say that \(X\) is discrete. Since all of our sample spaces are finite and all finite sets are countable, we will always work with discrete random variables.

To each random variable we associate a probability distribution. In the case of discrete variables we attach a probability mass function (pmf), a function \(f(x)\) where

\begin{equation*} f(x) = P[X=x]. \end{equation*}

In other words, the pmf \(f(x)\) gives the probability that the random variable \(X\) takes on the value \(x\text{.}\) Capital letters denote random variables, while lowercase letters denote their possible values. Finally, \(P[X=x]\) is the probability of the event that the random variable \(X\) takes the value \(x\text{.}\)

Some examples will clarify this unusual notation.

Example 10.3.2.

A single card is drawn from a deck. Let \(X\) be the rank of the card; remember that face cards count as \(10\text{,}\) and we'll say aces are high (\(11\)). The pmf of \(X\) is given in the table below.

Table 10.3.3.
\(x\) \(2\) \(3\) \(4\) \(5\) \(6\) \(7\) \(8\) \(9\) \(10\) \(11\)
\(P[X=x]\) \(1/13\) \(1/13\) \(1/13\) \(1/13\) \(1/13\) \(1/13\) \(1/13\) \(1/13\) \(4/13\) \(1/13\)

For example, \(P[X=10]=4/13\text{;}\) in other words, the probability that a card drawn has a rank of \(10\) is \(4/13\text{:}\) counting the tens and the face cards, there are \(16\) such cards out of the \(52\) total, and \(16/52 = 4/13\text{.}\)

We can ask other questions, like: What is the probability that a card drawn has a rank greater than \(8\text{?}\) The events \([X=9]\text{,}\) \([X=10]\text{,}\) and \([X=11]\) are disjoint (in other words, a card can't have two different ranks at once), so we can use the addition rule to calculate

\begin{equation*} P[X > 8] = P[X=9] + P[X=10] + P[X=11] = 6/13. \end{equation*}
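
If you want to check a table like this by computer, here is a minimal Python sketch; it encodes the rank conventions above (face cards count as \(10\text{,}\) aces as \(11\)), and the names suit, deck, and pmf are just our own choices.

```python
from fractions import Fraction
from collections import Counter

# One suit of ranks: 2 through 9 count as themselves, the ten and the
# three face cards all count as 10, and the ace counts as 11 (aces high).
suit = list(range(2, 10)) + [10, 10, 10, 10] + [11]
deck = suit * 4  # 52 cards

# pmf of X: f(x) = P[X = x] = (number of cards of rank x) / 52
pmf = {x: Fraction(c, 52) for x, c in Counter(deck).items()}

print(pmf[10])                     # 4/13
print(pmf[9] + pmf[10] + pmf[11])  # P[X > 8] = 6/13
```
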
Example 10.3.4.

Three coins are flipped. Let \(Y\) be the number of heads showing. The pmf of \(Y\) is given below.

Table 10.3.5.
\(y\) \(0\) \(1\) \(2\) \(3\)
\(P[Y=y]\) \(1/8\) \(3/8\) \(3/8\) \(1/8\)

You know how to count the number of sequences of three coin flips that have exactly \(r\) heads: recall it is \({{3}\choose{r}}\text{.}\) These counts are the numerators of each probability. The total number of sequences of three coin flips is \(2^3=8\text{,}\) which gives the denominators.

For instance, the probability that we get at least \(2\) heads in \(3\) flips is

\begin{equation*} P[Y \ge 2] = P[Y=2] + P[Y=3] = 4/8 = 1/2. \end{equation*}
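
The same counting argument can be carried out in a short Python sketch, assuming a fair coin; it rebuilds the table and recomputes \(P[Y \ge 2]\text{.}\)

```python
from fractions import Fraction
from math import comb

# f(y) = C(3, y) / 2^3: choose which y of the 3 flips come up heads,
# out of 2^3 = 8 equally likely sequences (assumes a fair coin).
pmf = {y: Fraction(comb(3, y), 2**3) for y in range(4)}

print(pmf)              # {0: Fraction(1, 8), 1: Fraction(3, 8), ...}
print(pmf[2] + pmf[3])  # P[Y >= 2] = 1/2
```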

The pmf has certain properties you may have noticed. First of all, because it represents a probability, it must be positive for all values that \(X\) can take (this set is called the support of \(X\)). Also, the sum of the pmf over all values in the support must be \(1\text{.}\) The following theorem summarizes these properties.

Theorem 10.3.6.

Let \(X\) be a discrete random variable with pmf \(f(x)\) and support \(S\text{.}\) Then \(f(x) > 0\) for every \(x \in S\text{,}\) and

\begin{equation*} \sum_{x \in S} f(x) = 1. \end{equation*}

The notation \(\displaystyle \sum_{x\in A}\) represents the sum over all \(x\) in the set \(A\text{.}\)
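
Both properties are easy to verify by computer. Here is a quick Python sketch that checks them for the pmf of Table 10.3.3.

```python
from fractions import Fraction

# The pmf of X from Table 10.3.3: each rank 2-9 and 11 has probability
# 1/13, while 10 has probability 4/13.
pmf = {x: Fraction(1, 13) for x in range(2, 12)}
pmf[10] = Fraction(4, 13)

assert all(p > 0 for p in pmf.values())  # positive on the support
print(sum(pmf.values()))                 # total probability: 1
```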

It is natural to ask what the average of a discrete random variable is. For instance, if I roll two dice, what value do I expect to get? If I roll two dice many times, what value do I expect to get on average? (Players of certain board games, such as Settlers of Catan, already know.) The mean, or average, of a random variable is called its expected value.

Definition 10.3.7.

Let \(X\) be a discrete random variable with pmf \(f(x)\) and support \(S\text{.}\) Then its expected value, denoted \(E[X]\) or \(\mu_X\text{,}\) is

\begin{equation*} E[X] = \sum_{x \in S} xf(x). \end{equation*}

The symbol \(\mu\) is the Greek letter “mu.” When there is only one random variable in play we can just write \(\mu\) without the subscript.

We can think of the expected value as a weighted average of all the possible values of the variable, with more likely values weighted more heavily. Sometimes the expected value is literally the value we expect. Other times, such as with uniform distributions where every value is equally likely, it instead marks the center of the distribution.
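
The definition translates directly into code as a sum over the support. Here is a minimal Python sketch; the helper name expected_value is our own.

```python
from fractions import Fraction

def expected_value(pmf):
    """E[X] = sum of x * f(x) over the support of X."""
    return sum(x * p for x, p in pmf.items())

# A fair six-sided die has a uniform pmf on {1, ..., 6}, so the expected
# value is the center of the support rather than a possible roll.
die = {x: Fraction(1, 6) for x in range(1, 7)}
print(expected_value(die))  # 7/2, i.e. 3.5
```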

Let's calculate the expected values of the two variables whose pmfs we wrote down in prior examples.

Example 10.3.8.

Remember that \(X\) was the rank of a card drawn from a standard deck. Here is the pmf of \(X\text{:}\)

Table 10.3.9.
\(x\) \(2\) \(3\) \(4\) \(5\) \(6\) \(7\) \(8\) \(9\) \(10\) \(11\)
\(P[X=x]\) \(1/13\) \(1/13\) \(1/13\) \(1/13\) \(1/13\) \(1/13\) \(1/13\) \(1/13\) \(4/13\) \(1/13\)

We can calculate the expected rank of a card by performing the calculation in the definition of expected value. Let \(\mu_X\) denote the average rank of a card, and let \(S=\{2, 3, 4, \ldots, 10, 11\}\) be the support of \(X\text{.}\)

\begin{align*} \mu_X = \sum_{x \in S} xP[X=x] \amp = 2\left(\frac{1}{13}\right)+ 3\left(\frac{1}{13}\right) + 4\left(\frac{1}{13}\right) + 5\left(\frac{1}{13}\right) + 6\left(\frac{1}{13}\right)\\ \amp + 7\left(\frac{1}{13}\right) + 8\left(\frac{1}{13}\right) + 9\left(\frac{1}{13}\right) + 10\left(\frac{4}{13}\right) + 11\left(\frac{1}{13}\right)\\ \amp = \frac{95}{13} \approx 7.3. \end{align*}

The average rank of a card is about \(7.3\text{.}\) The most common rank is \(10\text{,}\) but the average is “pulled down” by the many lower ranks.

No card itself has a rank of \(7.3\text{.}\) We should interpret this number in the following way: If we drew many cards (replacing and shuffling each time) and then averaged the ranks of those cards, we would expect the average to get closer to \(7.3\) as we drew more and more cards.
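
A simulation makes this interpretation concrete. Here is a rough Python sketch, assuming draws with replacement from a standard deck; the running averages it prints drift toward \(95/13 \approx 7.31\text{.}\)

```python
import random

# Ranks in a standard deck: 2 through 9 four times each, sixteen cards
# worth 10 (tens and face cards), and four aces worth 11.
ranks = list(range(2, 10)) * 4 + [10] * 16 + [11] * 4

random.seed(1)  # fixed seed for reproducibility
for n in (100, 10_000, 1_000_000):
    draws = random.choices(ranks, k=n)  # n draws with replacement
    print(n, sum(draws) / n)            # running average -> 95/13 ~ 7.31
```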

Example 10.3.10.

Remember that \(Y\) was the number of heads in \(3\) coin flips, with pmf

Table 10.3.11.
\(y\) \(0\) \(1\) \(2\) \(3\)
\(P[Y=y]\) \(1/8\) \(3/8\) \(3/8\) \(1/8\)

The expected number of heads in \(3\) flips is

\begin{equation*} E[Y] = 0\left(\frac{1}{8}\right) + 1\left(\frac{3}{8}\right) + 2\left(\frac{3}{8}\right) + 3\left(\frac{1}{8}\right) = 1.5. \end{equation*}

Convince yourself that \(1.5\) is an extremely reasonable value for the expected number of heads in \(3\) flips of a fair coin.

The following theorem follows from the linearity of summation.

Theorem 10.3.12.

Let \(X\) and \(Y\) be discrete random variables, and let \(a\) and \(b\) be real numbers. Then

\begin{equation*} E[aX + bY] = aE[X] + bE[Y]. \end{equation*}

Verify for yourself that if \(Z\) is the value of a single six-sided die, then \(E[Z]=3.5\text{.}\) Now we have an easy way to determine the expected value of the sum of two dice: if \(Z_1\) and \(Z_2\) are the values of the two dice, then

\begin{equation*} E[Z_1 + Z_2] = E[Z_1] + E[Z_2] = 3.5 + 3.5 = 7. \end{equation*}
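
We can confirm this without linearity by brute force: average the sum over all \(36\) equally likely outcomes of the two dice. A quick Python sketch:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes, each with
# probability 1/36; average the sums directly.
e = sum(Fraction(a + b, 36) for a, b in product(range(1, 7), repeat=2))
print(e)  # 7
```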

Example 10.3.13.

Suppose we are playing a weird game: you draw a card from a standard deck and then flip \(3\) coins. Your score is the rank of the card minus the number of heads flipped. If your score is at least \(7\text{,}\) you win a prize. Will the average player win?

The variables \(X\text{,}\) the rank of the card, and \(Y\text{,}\) the number of heads, were studied in prior examples. We know \(E[X] \approx 7.3\) and \(E[Y]=1.5\text{.}\) Therefore, the average score in this game is

\begin{equation*} E[X-Y] = E[X] - E[Y] \approx 7.3 - 1.5 = 5.8. \end{equation*}

Therefore, the average player will lose. (This is, of course, exactly what the individual buying the prizes wants.)
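
A simulation of the game tells the same story. Here is a rough Python sketch, assuming a fair coin and card draws with replacement; it also estimates how often a player actually wins.

```python
import random

ranks = list(range(2, 10)) * 4 + [10] * 16 + [11] * 4

random.seed(1)
n = 100_000
# Score: rank of a drawn card minus the number of heads in 3 fair flips.
scores = [random.choice(ranks) - sum(random.random() < 0.5 for _ in range(3))
          for _ in range(n)]

print(sum(scores) / n)                  # average score: about 5.8
print(sum(s >= 7 for s in scores) / n)  # fraction of plays that win a prize
```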