\( \definecolor{colordef}{RGB}{249,49,84} \definecolor{colorprop}{RGB}{18,102,241} \)

Discrete Random Variables

Random Variables

Definitions

Definition Random Variable
A random variable, denoted \(X\), is a function that assigns a numerical value to each outcome \(\omega\) in a random experiment. We write this value as \(X(\omega)\).
The possible values of \(X\) are the numbers \(X(\omega)\) obtained as \(\omega\) ranges over all outcomes.
Example
Let \(X\) be the number of heads when tossing 2 fair coins: a red coin and a blue coin. Find \(X(\textcolor{colordef}{H},\textcolor{colorprop}{T})\).

The outcome \((\textcolor{colordef}{H},\textcolor{colorprop}{T})\) means the red coin shows heads (H) and the blue coin shows tails (T). Since \(X\) counts heads, there’s 1 head. Thus, \(X(\textcolor{colordef}{H},\textcolor{colorprop}{T}) = 1\).

Definition Discrete Random Variable
A random variable is discrete if its set of possible values is finite or countably infinite. This means we can list all possible values.
Definition Events Involving a Random Variable
For a random variable \(X\):
  • \((X = x)\): The set of outcomes where \(X\) takes the value \(x\).
  • \((X \leq x)\): The set of outcomes where \(X\) is less than or equal to \(x\).
  • \((X \geq x)\): The set of outcomes where \(X\) is greater than or equal to \(x\).
Example
Let \(X\) be the number of heads when tossing 2 fair coins: a red coin and a blue coin. List the outcomes for \((X = 0)\), \((X = 1)\), \((X = 2)\), \((X \leq 1)\), and \((X \geq 1)\).

  • \((X = 0) = \{(\textcolor{colordef}{T},\textcolor{colorprop}{T})\}\) (no heads).
  • \((X = 1) = \{(\textcolor{colordef}{T},\textcolor{colorprop}{H}), (\textcolor{colordef}{H},\textcolor{colorprop}{T})\}\) (one head).
  • \((X = 2) = \{(\textcolor{colordef}{H},\textcolor{colorprop}{H})\}\) (two heads).
  • \((X \leq 1) = (X = 0) \cup (X = 1) = \{(\textcolor{colordef}{T},\textcolor{colorprop}{T}), (\textcolor{colordef}{T},\textcolor{colorprop}{H}), (\textcolor{colordef}{H},\textcolor{colorprop}{T})\}\) (at most one head).
  • \((X \geq 1) = (X = 1) \cup (X = 2) = \{(\textcolor{colordef}{T},\textcolor{colorprop}{H}), (\textcolor{colordef}{H},\textcolor{colorprop}{T}), (\textcolor{colordef}{H},\textcolor{colorprop}{H})\}\) (at least one head).
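These events can be verified by enumerating the sample space. Below is a minimal Python sketch; the tuple encoding, with the red coin as the first entry, is our own convention:

```python
from itertools import product

# Sample space for two coins: first entry = red coin, second = blue coin
sample_space = list(product("HT", repeat=2))  # [('H','H'), ('H','T'), ('T','H'), ('T','T')]

def X(outcome):
    """Number of heads in an outcome, e.g. X(('H', 'T')) == 1."""
    return outcome.count("H")

# Events as sets of outcomes
X_eq_1 = {w for w in sample_space if X(w) == 1}
X_le_1 = {w for w in sample_space if X(w) <= 1}
X_ge_1 = {w for w in sample_space if X(w) >= 1}

print(sorted(X_eq_1))  # [('H', 'T'), ('T', 'H')]
```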

Probability Distribution

Definition Probability Distribution
The probability distribution of a random variable \(X\) lists the probability \(P(X = x_i)\) for each possible value \(x_1,x_2,\dots,x_n\). It can be shown as a table or formula.
Proposition Characteristic of a Probability Distribution
For a random variable \(X\) with possible values \(x_1,x_2,\dots,x_n\), we have
  • \(0 \leq P(X=x_i) \leq 1\) for all \(i=1,\dots,n\),
  • \(\displaystyle\sum_{i=1}^n P(X=x_i) =P(X=x_1)+P(X=x_2)+\dots+P(X=x_n)= 1 \).
Example
Let \(X\) be the number of heads when tossing 2 fair coins: a red coin and a blue coin.
  1. List the possible values of \(X\).
  2. Find the probability distribution.
  3. Create the probability table.
  4. Draw the probability distribution graph.

  1. Possible values: \(0\) (no heads), \(1\) (one head), \(2\) (two heads).
  2. Probability distribution:
    • \(P(X = 0) = P(\{(\textcolor{colordef}{T},\textcolor{colorprop}{T})\}) = \frac{1}{4}\),
    • \(P(X = 1) = P(\{(\textcolor{colordef}{T},\textcolor{colorprop}{H}), (\textcolor{colordef}{H},\textcolor{colorprop}{T})\}) = \frac{2}{4} = \frac{1}{2}\),
    • \(P(X = 2) = P(\{(\textcolor{colordef}{H},\textcolor{colorprop}{H})\}) = \frac{1}{4}\).
  3. Probability table:
    $$\begin{array}{c|ccc} x & 0 & 1 & 2 \\ \hline P(X = x) & \frac{1}{4} & \frac{1}{2} & \frac{1}{4} \end{array}$$
  4. Graph: a bar chart with bars of heights \(\frac{1}{4}\), \(\frac{1}{2}\), and \(\frac{1}{4}\) at \(x = 0, 1, 2\).
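The same distribution can be obtained by counting outcomes programmatically; a short sketch using exact fractions (the outcome encoding is ours):

```python
from itertools import product
from fractions import Fraction

outcomes = list(product("HT", repeat=2))  # 4 equally likely outcomes
# P(X = x) = (number of outcomes with x heads) / (total number of outcomes)
dist = {x: Fraction(sum(1 for w in outcomes if w.count("H") == x), len(outcomes))
        for x in (0, 1, 2)}
print(dist[1])  # 1/2
```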

Existence of a Random Variable with a Given Probability Distribution

Usually, defining a random variable begins by establishing:
  1. a sample space, that is, the set of all possible outcomes,
  2. a probability associated with this sample space,
  3. a function \(X\) that assigns a number to each outcome in the sample space.
This is quite a lengthy task, so we often prefer to define a random variable \(X\) directly by a given probability distribution, relying on the context of the situation being studied. For example, imagine we survey a class of 30 students about their siblings and obtain these results: 10 students have 0 siblings, 12 have 1 sibling, 5 have 2 siblings, and 3 have 3 siblings. We can then define the random variable \(X\) as the number of siblings of a randomly chosen student, with this probability distribution:
$$\begin{array}{c|cccc} x & 0 & 1 & 2 & 3 \\ \hline P(X = x) & \frac{10}{30} & \frac{12}{30} & \frac{5}{30} & \frac{3}{30} \end{array}$$
The theorem below shows that it is always possible to construct a sample space, a probability, and a function \(X\) to obtain a random variable with this probability distribution.
Theorem Existence of a Random Variable with a Given Probability Distribution
Suppose you have possible values \(x_1, x_2, \ldots, x_n\) and probabilities \(p_1, p_2, \ldots, p_n\).
If:
  • \(0 \leq p_i \leq 1\) for each \(i = 1, 2, \ldots, n\),
  • \(\displaystyle\sum_{i=1}^n p_i = p_1 + p_2 + \cdots + p_n = 1\),
then there exists a random variable \(X\) with the probability distribution \(P(X = x_i) = p_i\) for each \(i = 1, 2, \ldots, n\).
Method Defining a Random Variable \(X\) with a Valid Probability Distribution
In practice, we often define a random variable \(X\) directly by specifying its probability distribution. The key is to ensure that this distribution is valid, meaning it satisfies the conditions for a probability distribution: all probabilities must be non-negative and sum to 1.
Example
We survey a class of 30 students about their siblings and obtain these results: 10 students have 0 siblings, 12 have 1 sibling, 5 have 2 siblings, and 3 have 3 siblings. We define a random variable \(X\) as the number of siblings of a randomly chosen student, with this probability distribution:
$$\begin{array}{c|cccc} x & 0 & 1 & 2 & 3 \\ \hline P(X = x) & \frac{10}{30} & \frac{12}{30} & \frac{5}{30} & \frac{3}{30} \end{array}$$
Determine if this probability distribution is valid.

  • \(P(X = x) \geq 0\) for all \(x = 0, 1, 2, 3\) (true: \(\frac{10}{30}\), \(\frac{12}{30}\), \(\frac{5}{30}\), and \(\frac{3}{30}\) are all non-negative),
  • \(P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = \frac{10}{30} + \frac{12}{30} + \frac{5}{30} + \frac{3}{30} = \frac{30}{30} = 1\) (true: the sum equals 1).
Since both conditions are satisfied, the probability distribution is valid.
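The two validity conditions translate directly into a short check; a sketch with exact fractions:

```python
from fractions import Fraction

# The surveyed distribution: P(X=0), P(X=1), P(X=2), P(X=3)
probs = [Fraction(10, 30), Fraction(12, 30), Fraction(5, 30), Fraction(3, 30)]

# Valid iff every probability lies in [0, 1] and the total is exactly 1
is_valid = all(0 <= p <= 1 for p in probs) and sum(probs) == 1
print(is_valid)  # True
```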

Measures of Center and Spread

Expectation

The expected value of a random variable \(X\) is the "average you’d expect if you repeated the experiment many times". It’s found by taking all possible values, multiplying each by its probability, and adding them up — essentially a weighted average where the probabilities act as the weights.
Definition Expected Value
For a random variable \(X\) with possible values \(x_1, x_2, \ldots, x_n\), the expected value, \(E(X)\), also called the mean, is:$$\begin{aligned}E(X) &= \sum_{i=1}^{n} x_i P(X = x_i)\\ &= x_1 P(X = x_1) + x_2 P(X = x_2) + \cdots + x_n P(X = x_n)\\ \end{aligned}$$
Example
You toss 2 fair coins, and \(X\) is the number of heads. The probability distribution is:
$$\begin{array}{c|ccc} x & 0 & 1 & 2 \\ \hline P(X = x) & \frac{1}{4} & \frac{1}{2} & \frac{1}{4} \end{array}$$
Find the expected value of \(X\).

Calculate \(E(X)\) using the formula:$$\begin{aligned}E(X) &= 0 \times \frac{1}{4} + 1 \times \frac{1}{2} + 2 \times \frac{1}{4} \\ &= \frac{1}{2} + \frac{2}{4} \\ &= 1\end{aligned}$$So, on average, you expect 1 head when tossing 2 coins.
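The weighted average above is a one-line computation; a sketch with exact fractions:

```python
from fractions import Fraction

dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
# E(X) = sum of x * P(X = x) over all possible values
E = sum(x * p for x, p in dist.items())
print(E)  # 1
```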

Proposition Linearity of Expectation
For any random variable \(X\) and constants \(a\) and \(b\), the expected value of a linear transformation of \(X\) is:$$ E(aX + b) = aE(X) + b $$This property is derived from two simpler rules:
  • \(E(aX) = aE(X)\) (The expectation of a scaled variable is the scaled expectation).
  • \(E(X+b) = E(X) + b\) (The expectation of a shifted variable is the shifted expectation).

The following derivation relies on the formula for the expectation of a function of a discrete random variable, \(g(X)\), which is given by \(E(g(X)) = \sum g(x_i)P(X=x_i)\).
Let the function be \(g(X) = aX + b\).$$\begin{aligned}E(aX+b) &= \sum_{i} (ax_i + b) P(X=x_i) && \text{(by the formula for } E(g(X))\text{)} \\ &= \sum_{i} (ax_i P(X=x_i) + b P(X=x_i)) && \text{(distribute the probability)} \\ &= \sum_{i} ax_i P(X=x_i) + \sum_{i} b P(X=x_i) && \text{(split the summation)} \\ &= a \sum_{i} x_i P(X=x_i) + b \sum_{i} P(X=x_i) && \text{(factor out constants } a \text{ and } b\text{)} \\ &= a E(X) + b(1) && \text{(using } E(X) \text{ definition and } \sum P(X=x_i)=1\text{)} \\ &= aE(X) + b\end{aligned}$$
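A quick numerical check of linearity on the two-coin distribution, with arbitrary constants \(a\) and \(b\) of our choosing:

```python
from fractions import Fraction

dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
a, b = 2, 3  # arbitrary constants for the check

E_X = sum(x * p for x, p in dist.items())
# E(aX + b) computed directly via the E(g(X)) formula with g(x) = ax + b
E_aXb = sum((a * x + b) * p for x, p in dist.items())

print(E_aXb, a * E_X + b)  # 5 5
```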

Variance and Standard Deviation

The variance measures how spread out the values of a random variable are from its expected value. The standard deviation is the square root of the variance, giving a sense of typical deviation in the same units as \(X\).
Definition Variance and Standard Deviation
The variance, denoted \(V(X)\), is:$$\begin{aligned}V(X) &= \sum_{i=1}^{n} (x_i - E(X))^2 P(X = x_i)\\ &= \left(x_1-E(X)\right)^2 P(X = x_1) + \left(x_2-E(X)\right)^2 P(X = x_2) + \cdots + \left(x_n-E(X)\right)^2 P(X = x_n)\\ \end{aligned}$$The standard deviation, denoted \(\sigma(X)\), is \(\sigma(X) = \sqrt{V(X)}\).
Example
You toss 2 fair coins, and \(X\) is the number of heads. The probability table is:
$$\begin{array}{c|ccc} x & 0 & 1 & 2 \\ \hline P(X = x) & \frac{1}{4} & \frac{1}{2} & \frac{1}{4} \end{array}$$
Given \(E(X) = 1\), find the variance.

Calculate \(V(X)\):$$\begin{aligned}V(X) &= (0 - 1)^2 \times \frac{1}{4} + (1 - 1)^2 \times \frac{1}{2} + (2 - 1)^2 \times \frac{1}{4} \\ &= 1 \times \frac{1}{4} + 0 \times \frac{1}{2} + 1 \times \frac{1}{4} \\ &= \frac{1}{4} + 0 + \frac{1}{4} \\ &= \frac{1}{2} \\ \end{aligned}$$The variance is \(\frac{1}{2}\).
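The same variance computation as a sketch in exact arithmetic:

```python
from fractions import Fraction

dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
E = sum(x * p for x, p in dist.items())
# V(X) = sum of (x - E(X))^2 * P(X = x)
V = sum((x - E) ** 2 * p for x, p in dist.items())
print(V)  # 1/2
```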

Proposition Computational Formula for Variance
A more convenient formula for computation is:$$V(X) = E(X^2) - [E(X)]^2$$

Let \(\mu = E(X)\).$$\begin{aligned}V(X) &= E[(X - \mu)^2] \\ &= E[X^2 - 2\mu X + \mu^2] \\ &= E(X^2) - E(2\mu X) + E(\mu^2) && \text{(by linearity of expectation)} \\ &= E(X^2) - 2\mu E(X) + \mu^2 && \text{(since } \mu \text{ and } \mu^2 \text{ are constants)} \\ &= E(X^2) - 2\mu(\mu) + \mu^2 \\ &= E(X^2) - 2\mu^2 + \mu^2 \\ &= E(X^2) - \mu^2 \\ &= E(X^2) - [E(X)]^2\end{aligned}$$
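On the two-coin example, both formulas can be checked to give the same value:

```python
from fractions import Fraction

dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
E = sum(x * p for x, p in dist.items())

E_X2 = sum(x ** 2 * p for x, p in dist.items())          # E(X^2)
V_def = sum((x - E) ** 2 * p for x, p in dist.items())   # definition of V(X)
V_comp = E_X2 - E ** 2                                   # computational formula

print(V_def, V_comp)  # 1/2 1/2
```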

Classical Distributions

Uniform Distribution

Definition Uniform Distribution
A random variable \(X\) follows a uniform distribution if each possible value has the same probability:$$P(X = x) = \frac{1}{\text{Number of possible values}}, \quad \text{for any possible value }x$$
Example
Let \(X\) be the result of rolling a fair six-sided die.
  1. List the possible values of \(X\).
  2. Create the probability table.
  3. Draw the probability distribution graph.

  1. Possible values: \(1, 2, 3, 4, 5, 6\).
  2. Probability table:
    $$\begin{array}{c|cccccc} x & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline P(X = x) & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} & \frac{1}{6} \end{array}$$
  3. Graph: a bar chart with six bars of equal height \(\frac{1}{6}\) at \(x = 1, 2, \ldots, 6\).

Proposition Expectation and Variance of a Uniform Distribution
For a random variable \(X\) that follows a uniform distribution on the set of integers \(\{1, 2, \ldots, n\}\):
  • The expected value is \(E(X) = \frac{n+1}{2}\).
  • The variance is \(V(X) = \frac{n^2-1}{12}\).

Proof of the Expected Value \(E(X)\): For a uniform distribution on \(\{1, 2, \dots, n\}\), the probability of any outcome is \(P(X=i) = \frac{1}{n}\). $$ \begin{aligned} E(X) &= \sum_{i=1}^n i \cdot P(X=i) \\ &= \sum_{i=1}^n i \cdot \frac{1}{n} \\ &= \frac{1}{n} \sum_{i=1}^n i \quad \text{(factoring out the constant } 1/n\text{)} \\ &= \frac{1}{n} \left( \frac{n(n+1)}{2} \right) \quad \text{(using the formula for the sum of integers)} \\ &= \frac{n+1}{2} \end{aligned} $$
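A similar computation gives the variance, using the sum of squares formula \(\sum_{i=1}^n i^2 = \frac{n(n+1)(2n+1)}{6}\) and the identity \(V(X) = E(X^2) - [E(X)]^2\):$$\begin{aligned} V(X) &= E(X^2) - [E(X)]^2 \\ &= \frac{1}{n}\sum_{i=1}^n i^2 - \left(\frac{n+1}{2}\right)^2 \\ &= \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} \\ &= \frac{(n+1)\big(2(2n+1) - 3(n+1)\big)}{12} \\ &= \frac{(n+1)(n-1)}{12} = \frac{n^2-1}{12} \end{aligned}$$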

Example
Let \(X\) be the random variable for the score on a roll of a fair six-sided die. Find the mean and variance of \(X\).

The random variable \(X\) follows a uniform distribution on \(\{1, 2, 3, 4, 5, 6\}\).
  • \( E(X) = \frac{6+1}{2} = 3.5 \)
  • \( V(X) = \frac{6^2-1}{12} = \frac{35}{12} \approx 2.92 \)
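The closed-form values can be checked against the raw definitions; a sketch in exact arithmetic:

```python
from fractions import Fraction

n = 6
p = Fraction(1, n)  # each face has probability 1/6
E = sum(x * p for x in range(1, n + 1))
V = sum(x ** 2 * p for x in range(1, n + 1)) - E ** 2  # E(X^2) - E(X)^2
print(E, V)  # 7/2 35/12
```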

Bernoulli Distribution

A Bernoulli distribution models an experiment with two outcomes: success (1) or failure (0), like flipping a coin where heads is 1 and tails is 0. The probability of success is \(p\).
Definition Bernoulli Distribution
A random variable \(X\) follows a Bernoulli distribution if:
  • Possible values are 0 and 1.
  • \(P(X = 1) = p\) and \(P(X = 0) = 1 - p\).
We write \(X \sim B(p)\).
Example
A basketball player has an 80\% chance of making a free throw. Let \(X = 1\) if the shot is made, and \(X = 0\) if it’s missed.
  1. Is \(X\) a Bernoulli random variable?
  2. Find the probability of success.

  1. Yes, \(X\) has values 0 or 1, so it follows a Bernoulli distribution.
  2. Probability of success: \(P(X = 1) = 80\% = 0.8\).

Proposition Expectation and Variance of a Bernoulli Distribution
For a Bernoulli random variable \(X\) with a probability of success \(p\), the following hold:
  • The expected value is \(E(X) = p\),
  • The variance is \(V(X) = p(1 - p)\),
  • The standard deviation is \(\sigma(X) = \sqrt{p(1 - p)}\).

  • \(\begin{aligned}[t]E(X)&=0\times P(X=0)+1\times P(X=1)\\&=0\times(1-p)+1\times p\\&=p\end{aligned}\)
  • \(\begin{aligned}[t] V(X) &= (0-p)^2(1-p) + (1-p)^2 p \\ &= p^2(1-p) + p(1-p)^2 \\ &= p(1-p) [p + (1-p)] \\ &= p(1-p) \\ \end{aligned} \)
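A numerical check of these formulas for the free-throw example (\(p = 0.8\)):

```python
p = 0.8
# Expectation and variance straight from the definitions
E = 0 * (1 - p) + 1 * p
V = (0 - E) ** 2 * (1 - p) + (1 - E) ** 2 * p
print(E, round(V, 2))  # 0.8 0.16
```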

Bernoulli Scheme

Definition Bernoulli Scheme
The repetition of \(n\) identical and independent Bernoulli trials is called a Bernoulli scheme of size \(n\).
Example
We toss an unfair coin twice where the probability of success (landing on HEADS) is \(0.4\).
Let \(X\) be the random variable that counts the number of HEADS. Find \(P(X=1)\).

The tosses are independent and identical Bernoulli trials, so this is a Bernoulli scheme of size \(2\). Representing the situation as a tree and recording the value of \(X\) at the end of each path, we find two paths leading to \(X=1\): \((H,T)\) and \((T,H)\).
Each path has probability \(0.4 \times 0.6 = 0.24\). Hence$$P(X=1)=2\times 0.24=0.48.$$
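The tree argument amounts to summing a product of probabilities over all paths with exactly one head; a brute-force sketch (the boolean path encoding is ours):

```python
import math
from itertools import product

p = 0.4  # P(heads) on each toss
paths = list(product((True, False), repeat=2))  # the four paths of the tree
# Sum the probability of every path with exactly one head
prob = sum(math.prod(p if heads else 1 - p for heads in path)
           for path in paths if sum(path) == 1)
print(round(prob, 2))  # 0.48
```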

Definition Binomial Coefficient
In a binomial experiment (Bernoulli scheme) of \(n\) trials, the binomial coefficient \(\binom{n}{k}\), read as "\(n\) choose \(k\)", represents the number of different ways (or paths) to obtain exactly \(k\) successes.
Proposition Particular Cases
Choosing 0 successes or choosing \(n\) successes can only be done in a single way. So$$ \binom{n}{0} = 1 \quad \quad \binom{n}{n} = 1 $$
Proposition Symmetry Property
Choosing \(k\) successes is the same as choosing \(n-k\) failures. Therefore:$$ \binom{n}{k} = \binom{n}{n-k} $$
Proposition Pascal's Identity
For any integers \(1 \leq k \leq n-1\):$$ \binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k} $$
This relationship allows us to construct binomial coefficients step-by-step using Pascal's Triangle.
Method Pascal's Triangle
To find \(\binom{n}{k}\), we sum the two coefficients directly above it in the previous row.
$$\begin{array}{c|cccccc} n \backslash k & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline 0 & 1 & & & & & \\ 1 & 1 & 1 & & & & \\ 2 & 1 & 2 & 1 & & & \\ 3 & 1 & 3 & 3 & 1 & & \\ 4 & 1 & 4 & 6 & 4 & 1 & \\ 5 & 1 & 5 & 10 & 10 & 5 & 1 \end{array}$$
Example
Using Pascal's Triangle, find \(\binom{5}{2}\).

From the row \(n=5\) and column \(k=2\), we see that \(\binom{5}{2} = 10\).
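Pascal's identity translates directly into code: each row is built from the previous one by summing adjacent entries. A sketch (the function name is ours):

```python
def pascal_row(n):
    """Row n of Pascal's triangle, built row by row with Pascal's identity."""
    row = [1]  # row 0
    for _ in range(n):
        # each inner entry is the sum of the two entries above it
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return row

print(pascal_row(5))     # [1, 5, 10, 10, 5, 1]
print(pascal_row(5)[2])  # 10
```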

Binomial Distribution

Proposition Distribution of a Binomial Random Variable
Let \(X\) be a binomial random variable with \(n\) independent trials and a probability of success \(p\). The probability distribution of \(X\) is:$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \ldots, n$$This is called the binomial distribution, and we write \(X \sim B(n, p)\).
Example
A basketball player has an 80\% chance of making a free throw and takes 5 shots. Let \(X\) be the number of shots made.
  1. Is \(X\) a binomial random variable?
  2. Find the probability of making 4 shots.

  1. Yes, \(X\) is a binomial random variable because it counts the number of successes (shots made) in 5 independent trials (free throws), each with a constant success probability of 0.8.
  2. As \(X \sim B(5, 0.8)\), $$ \begin{aligned} P(X = 4) &= \binom{5}{4} (0.8)^4 (1-0.8)^1 \\ &= 5 \times 0.4096 \times 0.2 \\ &= 0.4096 \end{aligned} $$ The probability of making 4 shots is 0.4096.
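The binomial probability can be computed directly with the standard library's `math.comb`:

```python
import math

n, p = 5, 0.8
# P(X = 4) = C(5, 4) * 0.8^4 * 0.2^1
P4 = math.comb(n, 4) * p ** 4 * (1 - p) ** (n - 4)
print(round(P4, 4))  # 0.4096
```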

Proposition Expectation and Variance of a Binomial Random Variable
For \(X \sim B(n, p)\):
  • \(E(X) = n p\) (expected value),
  • \(V(X) = n p (1 - p)\) (variance),
  • \(\sigma(X) = \sqrt{n p (1 - p)}\) (standard deviation).
Example
A basketball player has an 80\% chance of making a free throw and takes 5 shots. Find the mean and standard deviation of the number of successful shots.

Let \(X\) be the number of successful shots. Since each shot is independent and has a success probability of 0.8, we have \(X \sim B(5, 0.8)\).$$\begin{aligned}E(X) &= 5 \times 0.8 = 4, \\ V(X) &= 5 \times 0.8 \times (1 - 0.8) = 5 \times 0.8 \times 0.2 = 0.8, \\ \sigma(X) &= \sqrt{0.8} \approx 0.89.\end{aligned}$$Mean is 4 successful shots, standard deviation is about 0.89.

Method Cumulative Binomial Distribution
To calculate probabilities of the form \(P(X \leq k)\), we use the Cumulative Distribution Function (CDF) on a calculator.
  • On TI: Use \texttt{binomcdf(n, p, k)}.
  • On Casio: Use \texttt{BinomialCD(k, n, p)}.
  • On NumWorks: Use the \texttt{Probability} application, select \texttt{Binomial}.
To handle other inequalities, use these logic rules:
  • \(P(X < k) = P(X \leq k-1)\)
  • \(P(X \geq k) = 1 - P(X \leq k-1)\)
  • \(P(X > k) = 1 - P(X \leq k)\)
  • \(P(a \leq X \leq b) = P(X \leq b) - P(X \leq a-1)\)
Example

Consider a random variable \(X\) that follows the binomial distribution with parameters \(n = 100\) and \(p = 0.78\). Calculate the following probabilities. Round your results to three decimal places. $$P(X < 75), P(X > 79), P(X \geq 74), P(73 < X \leq 81)$$

With \(X \sim \mathcal{B}(100, 0.78)\), we use the calculator's cumulative distribution function:
  1. \(P(X<75) = P(X\leq 74) \approx \mathbf{0.197}\).
  2. \(P(X>79) = 1 - P(X\leq 79) \approx \mathbf{0.366}\).
  3. \(P(X\geq 74) = 1 - P(X\leq 73) \approx \mathbf{0.861}\).
  4. \(P(73 < X \leq 81) = P(X\leq 81) - P(X\leq 73) \approx \mathbf{0.660}\).
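Without a calculator, the same cumulative probabilities can be computed by summing the binomial formula; a sketch (the helper name `binom_cdf` is ours, mirroring the calculator's CDF):

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), obtained by summing the binomial pmf."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

n, p = 100, 0.78
p1 = binom_cdf(74, n, p)                        # P(X < 75)  = P(X <= 74)
p2 = 1 - binom_cdf(79, n, p)                    # P(X > 79)  = 1 - P(X <= 79)
p3 = 1 - binom_cdf(73, n, p)                    # P(X >= 74) = 1 - P(X <= 73)
p4 = binom_cdf(81, n, p) - binom_cdf(73, n, p)  # P(73 < X <= 81)
print(round(p1, 3), round(p2, 3), round(p3, 3))
```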

Method Finding a Probability Interval
To find an interval \(I=[a ; b]\) such that \(P(X \in I) \geq 1-\alpha\):
  1. Find the smallest integer \(a\) such that \(P(X \leq a) > \frac{\alpha}{2}\).
  2. Find the smallest integer \(b\) such that \(P(X \leq b) \geq 1 - \frac{\alpha}{2}\).
Example
Let \(X \sim \mathcal{B}(50, 0.4)\). Find the smallest interval \([a, b]\) such that \(P(a \leq X \leq b) \geq 0.95\).

Here, \(1-\alpha = 0.95\), so \(\alpha = 0.05\) and \(\frac{\alpha}{2} = 0.025\).
  1. We look for \(a\) such that \(P(X \leq a) > 0.025\).
    From the table: \(P(X \leq 12) \approx 0.013\) and \(P(X \leq 13) \approx 0.028\). So \(\mathbf{a = 13}\).
  2. We look for \(b\) such that \(P(X \leq b) \geq 0.975\).
    From the table: \(P(X \leq 26) \approx 0.967\) and \(P(X \leq 27) \approx 0.984\). Thus \(\mathbf{b = 27}\).
The interval is \([13 ; 27]\).
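The two-step search above can be automated: scan the CDF for the first value crossing each threshold. A sketch (the CDF helper is ours, standing in for the calculator's table):

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

n, p, alpha = 50, 0.4, 0.05
# Smallest a with P(X <= a) > alpha/2, smallest b with P(X <= b) >= 1 - alpha/2
a = next(k for k in range(n + 1) if binom_cdf(k, n, p) > alpha / 2)
b = next(k for k in range(n + 1) if binom_cdf(k, n, p) >= 1 - alpha / 2)
print(a, b)  # 13 27
```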

Geometric Distribution

Definition Geometric Distribution
Consider a sequence of independent Bernoulli trials with a probability of success \(p\). The geometric distribution models the number of trials \(X\) required to obtain the first success. We write \(X \sim \mathcal{G}(p)\).
Proposition Probabilities and Expectation
Let \(X \sim \mathcal{G}(p)\). For any integer \(k \geq 1\):
  • \(P(X = k) = (1-p)^{k-1}p\)
  • \(P(X \leq k) = 1 - (1-p)^k\)
  • \(P(X > k) = (1-p)^k\)
  • \(E(X) = \frac{1}{p}\)
Example
We roll a fair four-sided die (numbered 1 to 4) until we get a 2. Let \(D\) be the number of trials required.
  1. Identify the distribution of \(D\).
  2. Find the probability that the first 2 appears on the 5th trial.
  3. Find the expected number of rolls to get a 2.

  1. \(D\) follows a geometric distribution with \(p = 0.25\) (since \(P(\text{rolling a 2}) = 1/4\)).
  2. \(P(D = 5) = (1 - 0.25)^{5-1} \times 0.25 = 0.75^4 \times 0.25 \approx 0.08\).
  3. \(E(D) = \frac{1}{0.25} = 4\). On average, it takes 4 rolls to get a 2.
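The two computations above, written out directly:

```python
p = 0.25
P5 = (1 - p) ** 4 * p  # P(D = 5): four failures, then the first success
E = 1 / p              # expected number of rolls to get a 2
print(round(P5, 3), E)  # 0.079 4.0
```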

Proposition Graphical Representation
The bar chart of a geometric distribution shows an exponential decrease. The height of the first bar (at \(k=1\)) is equal to \(p\).
Example
The probability distribution graph of \(X \sim \mathcal{G}(0.25)\) is a bar chart whose bar heights \(0.25,\ 0.1875,\ 0.1406,\ \ldots\) decrease geometrically by a factor of \(0.75\) from \(k = 1\) onward.
Proposition Memoryless Property
For a random variable \(X\) following a geometric distribution, the probability of success in future trials does not depend on the number of past failures:$$ P_{X>s}(X > s+t) = P(X > t) \quad \text{for all } s, t \in \mathbb{N} $$
Example
Following the previous die example (\(p=0.25\)), find the probability that it takes more than 10 trials to get a 2, given that after 7 trials, no 2 has appeared.

By the memoryless property:$$P_{D>7}(D > 10) = P_{D>7}(D > 7+3) = P(D > 3)=(1 - 0.25)^3 = 0.75^3 \approx 0.42$$This calculation relies on the fact that the probability of succeeding in more than ten trials given that seven have already failed—meaning more than three additional trials are needed—is the same as the probability of succeeding in more than three trials from the very beginning: the first seven trials are effectively "forgotten."
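The memoryless property can be checked numerically, since \(P_{X>s}(X > s+t) = \frac{P(X > s+t)}{P(X > s)}\) for a geometric variable:

```python
p = 0.25

def P_gt(k):
    """P(X > k) = (1-p)^k for X ~ G(p): the first k trials all fail."""
    return (1 - p) ** k

s, t = 7, 3
conditional = P_gt(s + t) / P_gt(s)  # P(X > s+t | X > s)
print(round(conditional, 4), round(P_gt(t), 4))  # 0.4219 0.4219
```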