PSY 8814 - Lab 04: Probability Distributions
I. Probability Distributions
R has many built-in functions for working with different probability distributions. For each possible distribution (e.g., normal, \(\chi^2\), \(F\)), there are four possible functions:
- d___ gives the density of the distribution at a value \(x\)
- p___ gives \(P(X \leq x)\) with lower.tail = TRUE (the default), or \(P(X > x)\) with lower.tail = FALSE
- q___ gives quantiles, so the value \(x\) such that \(F(x) = p\)
- r___ gives random deviates of the specified distribution
Importantly, the p___ and q___ functions default to lower.tail = TRUE unless you change this argument explicitly. This means we are looking at the left tail of the distribution: the probability of obtaining a given value or anything lower. More on this later.
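For a quick illustration, here is a minimal sketch using the normal-distribution versions of these four functions:

```r
dnorm(0)                          # density at x = 0, about 0.399
pnorm(1.96)                       # P(X <= 1.96), about 0.975 (lower tail, the default)
pnorm(1.96, lower.tail = FALSE)   # P(X > 1.96), about 0.025 (upper tail)
qnorm(0.975)                      # quantile: the x such that F(x) = 0.975, about 1.96
rnorm(3)                          # three random deviates from the standard normal
```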
1. Bernoulli and Binomial Distributions
For a random variable \(X\) that follows the Bernoulli distribution with probability \(p\), each realization of the random variable has two possible outcomes (e.g., Success or Failure, 1 or 0, Heads or Tails, etc.). More specifically, the Bernoulli distribution is a discrete probability distribution such that \(P(X = 1) = p\) and \(P(X = 0) = 1 - p = q\).
The Bernoulli and binomial distributions are closely related. If \(X\) is a random variable that follows the binomial distribution with parameters \(n\) and \(p\), we write \(X \sim B(n, p)\), and the probability mass function (PMF) is
\[ P(X = x) = \binom{n}{x}p^x(1-p)^{n-x} \] where \(\binom{n}{x} = \frac{n!}{x!(n-x)!}\). Here, \(X\) can be thought of as the sum of independent Bernoulli trials. Moreover, if \(n = 1\), then we have
\[ P(X=x) = \binom{1}{x}p^x(1-p)^{1-x}. \]
Given that \(X\) can only take on values of 0 or 1, we can see that
\[ P(X = 1) = \frac{1!}{1!(1-1)!}p^1 (1-p)^0 = 1\times p^1 = p \]
and
\[ P(X = 0) = \frac{1!}{0!(1-0)!}p^0 (1-p)^1 = 1\times (1-p)^1 = 1-p \]
which brings us to the PMF for the Bernoulli distribution!
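In R, this connection means we can work with Bernoulli probabilities through dbinom() with size = 1; a minimal sketch (the value \(p = 0.3\) is an arbitrary choice for illustration):

```r
dbinom(1, size = 1, prob = 0.3)   # P(X = 1) = p     = 0.3
dbinom(0, size = 1, prob = 0.3)   # P(X = 0) = 1 - p = 0.7
rbinom(10, size = 1, prob = 0.3)  # 10 Bernoulli(0.3) draws (a vector of 0s and 1s)
```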
2. Normal Distribution
The normal distribution is perhaps the most commonly used probability distribution in social science research. The normal distribution is characterized by two parameters, \(\mu\) and \(\sigma^2\). We write a normally-distributed random variable as \(X \sim N(\mu, \sigma^2)\). Important: \(\sigma^2\) is the variance, not the standard deviation.
The probability density function (PDF) for the normal distribution is:
\[ f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{\big[-\frac{(x-\mu)^2}{2\sigma^2}\big]} \]
We've already seen how we can generate random deviates from the normal distribution using rnorm(). The default values are \(\mu = 0\) and \(\sigma^2 = 1\), which is the standard normal distribution.
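A minimal sketch of rnorm(); note that R's normal-distribution functions are parameterized by the standard deviation (the sd argument), not the variance:

```r
set.seed(8814)                  # set a seed so the draws are reproducible
rnorm(5)                        # 5 draws from the standard normal, N(0, 1)
rnorm(5, mean = 100, sd = 15)   # 5 draws from N(100, 225), i.e., sd = 15, variance = 225
```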
3. Additional Distributions
Here are functions for additional probability distributions that are available in R.
- Continuous uniform distribution: dunif(), punif(), qunif(), runif()
- \(\chi^2\) distribution: dchisq(), pchisq(), qchisq(), rchisq()
- \(t\) distribution: dt(), pt(), qt(), rt()
- \(F\) distribution: df(), pf(), qf(), rf()
II. Probability Functions
1. Probability Mass Function (PMF)
A PMF is for a discrete random variable: \(X\) takes on a finite or countable set of values. For example, the number of children is a discrete variable. Although an average may tell us something like 2.5 children, we cannot actually have 2.5 children.
The PMF tells us the probability of obtaining any value \(x\), given the parameters of the distribution. In other words, the PMF tells us \(P(X = x)\). For example, how likely is it to have 1 child, 2 children, …, 10 children?
Defining characteristics of PMF:
- The sum of probabilities for all possible values of \(x\) is 1
- For any possible value of \(x\), the probability \(P(X = x)\) is greater than zero
PMF plot:
- A set of vertical lines at each value of \(x\)
- The line height: the associated probability of that value.
- If a value of \(x\) is not in the support of \(X\), that probability is zero. For example, it is impossible to get 11 heads out of 10 coin tosses.
Here is an example of a PMF for \(X \sim \text{Binomial}(10, 0.5)\), where \(10\) is the parameter \(n\) and \(0.5\) is the parameter \(p\) in the binomial distribution. For example, tossing a fair coin 10 times and counting the number of heads.
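One way to sketch this PMF in base R, using dbinom() to compute \(P(X = x)\) at each value in the support:

```r
x <- 0:10                               # support of X
px <- dbinom(x, size = 10, prob = 0.5)  # P(X = x) for each value
plot(x, px, type = "h", lwd = 2,
     xlab = "Number of heads", ylab = "P(X = x)",
     main = "PMF of Binomial(10, 0.5)")
points(x, px, pch = 16)                 # mark the top of each vertical line
```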
2. Probability Density Function (PDF)
A PDF is for a continuous random variable: \(X\) takes on an infinite, non-countable set of values. For example, height is a continuous variable because it can theoretically take on infinite decimal places: 5 foot 4.3314132… inches and so on.
Defining characteristics of PDF:
- The height of the PDF plot: the density rather than the probability.
- The area under the curve or integral: the probability of a range.
- The entire area under the curve of a PDF is 1. The PMF uses the sum instead.
- For any specific possible value of \(x\), the probability \(P(X = x) = 0\). This is because there is an infinite list of possible values.
- We instead calculate a probability range, or \(P(a < X < b)\), using integration (or R). We can't calculate the exact probability that someone is 5 foot 4.3312… inches, but we can calculate the probability that they are between 5′4″ and 5′5″.
Here is an example of the normal distribution of height in inches:
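One way to sketch such a density plot with dnorm(); the mean of 66 inches and standard deviation of 4 inches are illustrative assumptions, not values fixed by the lab:

```r
# Assumed parameters for illustration: mean = 66 in, sd = 4 in
height <- seq(50, 82, by = 0.1)
dens <- dnorm(height, mean = 66, sd = 4)
plot(height, dens, type = "l",
     xlab = "Height (inches)", ylab = "Density",
     main = "Normal PDF of height (assumed mean = 66, sd = 4)")
```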
3. Cumulative Distribution Function (CDF)
A CDF tells us the probability that the random variable takes on a value less than or equal to some number, \(P(X \leq x)\). So as we increase \(x\), \(P(X \leq x)\) gets closer and closer to one. The CDF will look different for a discrete versus a continuous \(X\). We use \(F(x)\) to denote a CDF.
The CDF is related to both PMF and PDF. For a PMF \(f(x)\), we can calculate \(F(x)\) as:
\[ F(x)=\sum_{t\leq x}f(t). \]
For a PDF \(f(x)\) where \(-\infty < x < \infty\), we have:
\[ F(x)=\int_{-\infty}^{x} f(t)\,dt. \]
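A minimal sketch of this relationship in R: summing the PMF (discrete case) or integrating the PDF (continuous case) up to \(x\) reproduces the corresponding p___ function:

```r
# Discrete: P(X <= 5) for X ~ Binomial(10, 0.5)
sum(dbinom(0:5, size = 10, prob = 0.5))   # sum of the PMF up to 5
pbinom(5, size = 10, prob = 0.5)          # same value from the CDF

# Continuous: P(X <= 1) for the standard normal
integrate(dnorm, lower = -Inf, upper = 1) # numerical integral of the PDF, about 0.841
pnorm(1)                                  # same value from the CDF
```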
a. Discrete CDF example:
Using the coin toss example above, let's calculate some probabilities! Here are some handy functions for the binomial distribution:
- dbinom(): find the probability of a given value for the binomial distribution
- pbinom(): find the cumulative probability \(P(X \leq x)\), i.e., the probability across a range of values
- qbinom(): find the smallest number of successes \(q\) given a specific probability \(p\) such that \(P(X \leq q) \geq p\)
What is the probability of getting exactly 8 heads out of 10 fair coin tosses, \(P(X = 8)\)?
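A sketch of this calculation with dbinom():

```r
dbinom(8, size = 10, prob = 0.5)   # P(X = 8), about 0.044
```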
What is the probability of getting between 3 and 5 heads out of 10 coin tosses, \(P(3 \leq X \leq 5)\)? We can do this with both dbinom() and pbinom():
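A sketch of both approaches:

```r
# Sum the PMF over 3, 4, and 5 ...
sum(dbinom(3:5, size = 10, prob = 0.5))                              # about 0.568
# ... or take a difference of CDF values, P(X <= 5) - P(X <= 2)
pbinom(5, size = 10, prob = 0.5) - pbinom(2, size = 10, prob = 0.5)
```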
Conversely, what is the smallest number of heads \(q\) such that \(P(X \leq q) \geq 0.25\), that is, the smallest count associated with at least 25% cumulative probability?
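This is exactly what qbinom() returns; a sketch:

```r
qbinom(0.25, size = 10, prob = 0.5)   # smallest q with P(X <= q) >= 0.25, which is 4
```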
Try it yourself!
Say we are curious about lottery winnings and wonder what our chances are after buying 50 lottery tickets, each with a 1% chance of winning. Let the number of winning tickets be \(X \sim B(50, 0.01)\).
- What is the probability of not winning at all?
- What is the probability of winning on all 50 tickets?
- What is the probability of winning at least once?
- What is the smallest number of winning tickets \(q\) such that \(P(X \leq q) \geq 0.10\)?
- Suppose we sample 2 gamblers. What is the probability that either of them wins at least once?
- What is the probability that they both win at least once?
b. Continuous CDF example:
Using the height example, let's calculate some probabilities! Similar to before, here are some handy functions for the normal distribution:
- dnorm(): find the density at a given value for the normal distribution
- pnorm(): find the cumulative probability \(P(X \leq x)\), i.e., the probability across a range of values
- qnorm(): find the score that is associated with a specific cumulative probability
As previously mentioned, the probability of obtaining an exact continuous value is zero: \(P(X = x) = 0\). We can instead calculate the probability for a range of values:
What is the probability of being between 70 and 80 inches (roughly 5′8″ and 6′7″)?
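A sketch with pnorm(); as in the earlier height plot, the mean of 66 inches and standard deviation of 4 inches are illustrative assumptions:

```r
# P(70 < X < 80) under an assumed N(66, 4^2) height distribution
pnorm(80, mean = 66, sd = 4) - pnorm(70, mean = 66, sd = 4)   # about 0.16
```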
We can also visualize this range on the density plot. The probability above is the area under the curve between the two vertical lines:
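One way to sketch this shaded-area plot in base R, again using the assumed N(66, 4^2) height distribution:

```r
height <- seq(50, 82, by = 0.1)
dens <- dnorm(height, mean = 66, sd = 4)
plot(height, dens, type = "l", xlab = "Height (inches)", ylab = "Density")
region <- seq(70, 80, by = 0.1)                 # the range whose probability we want
polygon(c(70, region, 80), c(0, dnorm(region, mean = 66, sd = 4), 0),
        col = "grey80", border = NA)            # shade the area under the curve
abline(v = c(70, 80), lty = 2)                  # vertical lines at the bounds
lines(height, dens)                             # redraw the curve over the shading
```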
Try it yourself!
Let IQ scores be normally distributed: \(X \sim N(100, 225)\).
- What is the probability of scoring 70 or below?
- What is the probability of scoring at least 180?
- What is the probability of scoring between 85 and 115?
- What score is considered the 99th percentile?
III. References
- This document is adapted from materials by Justin Kracht and Allie Cooperman.
- The interactive document is built using quarto-webr.
- Penn State University STAT 414 Section 2: Discrete Distributions
- Penn State University STAT 414 Section 3: Continuous Distributions