CLT-based inference

Grayson White

Math 141
Week 11 | Fall 2025

Logistics

Final exam format discussion.

Goals for Today

Learn theory-based statistical inference methods.
Learn about sample size calculations.

Introduce a new group of test statistics based on z-scores.
Generalize the SE method confidence interval formula.

Central Limit Theorem – Sampling Distributions

There is a version of the CLT for many of our sample statistics.

For Sample Proportion:

CLT: For large \(n\) (At least 10 successes and 10 failures),

\[ \hat{p} \sim N \left(p,~ \sqrt{\frac{p(1-p)}{n}} \right) \]

For Sample Mean:

CLT: For large \(n\) (At least 30 observations),

\[ \bar{x} \sim N \left(\mu,~ \frac{\sigma}{\sqrt{n}} \right) \]

Power of the CLT: Provides formula for the SE!

Statistical Inference Zoom Out – Estimation

Statistical Inference Zoom Out – Testing

Sample Statistics as Random Variables

Sample statistics can be recast as random variables.
Need to figure out what random variable is a good approximation for our sample statistic.
- Then use the properties of that random variable to do inference.
Sometimes it is easier to find a good random variable approximation if we standardize our sample statistic first.

Z-scores

All of our test statistics so far have been sample statistics.
Another commonly used test statistic takes the form of a z-score:

\[ \mbox{Z-score} = \frac{X - \mu}{\sigma} \]

Standardized version of the sample statistic.
Z-score measures how many standard deviations the sample statistic is away from its mean.

Z-score Example

Z-score Example

\(\hat{p}\) = proportion of Japanese Flowering Cherry trees in a sample of 100 trees

\[ \hat{p} \sim N \left(0.231, 0.042 \right) \]

Suppose we have a sample where \(\hat{p} = 0.15\). Then the z-score would be:

\[ \mbox{Z-score} = \frac{0.15 - 0.231}{0.042} = -1.93 \]

Z-score Test Statistics

A Z-score test statistic is one where we take our original sample statistic and convert it to a Z-score:

\[ \mbox{Z-score test statistic} = \frac{\mbox{statistic} - \mu}{\sigma} \]

Allows us to quickly (but roughly) classify results as unusual or not.
- \(|\) Z-score \(|\) > 2 → results are unusual/p-value will be smallish
Commonly used because if the sample statistic \(\sim N(\mu, \sigma)\), then

\[ \mbox{Z-score test statistic} = \frac{\mbox{statistic} - \mu}{\sigma} \sim N(0, 1) \]

Let’s consider theory-based inference for a population proportion.

Inference for a Single Proportion – Testing

Let’s consider conducting a hypothesis test for a single proportion: \(p\)

Need:

Hypotheses
- Same as with the simulation-based methods
Test statistic and its null distribution
- Use a z-score test statistic and a standard normal distribution
P-value
- Compute from the standard normal distribution directly

Inference for a Single Proportion – Testing

Let’s consider conducting a hypothesis test for a single proportion: \(p\)

\(H_o: p = p_o\) where \(p_o\) = null value and \(H_a: p > p_o\) or \(H_a: p < p_o\) or \(H_a: p \neq p_o\)

By the CLT, under \(H_o\):

\[ \hat{p} \sim N \left(p_o, \sqrt{\frac{p_o(1-p_o)}{n}} \right) \]

Z-score test statistic:

\[ Z = \frac{\hat{p} - p_o}{\sqrt{\frac{p_o(1-p_o)}{n}}} \]

Use \(N(0, 1)\) to find the p-value once you have computed the test statistic.

Inference for a Single Proportion – Testing

Let’s consider conducting a hypothesis test for a single proportion: \(p\)

Example: Bern and Honorton’s (1994) extrasensory perception (ESP) studies

# Construct data frame of sample results
esp <- data.frame(guess = c(rep("correct", 106),
                            rep("incorrect", 329 - 106)))

Inference for a Single Proportion – Testing

Let’s consider conducting a hypothesis test for a single proportion: \(p\)

Example: Bern and Honorton’s (1994) extrasensory perception (ESP) studies

library(infer)
# Compute observed test statistic
test_stat <- esp %>%
  specify(response = guess,
          success = "correct") %>%
  hypothesize(null = "point", p = 0.25) %>%  
  calculate(stat = "z")
test_stat

Response: guess (factor)
Null Hypothesis: point
# A tibble: 1 × 1
   stat
  <dbl>
1  3.02

# Use N(0,1) to find p-value
pnorm(q = test_stat$stat, mean = 0, sd = 1,
      lower.tail = FALSE)

[1] 0.001247763

# Or 
1 - pnorm(q = test_stat$stat,
          mean = 0, sd = 1)

[1] 0.001247763

prop_test(esp, response = guess, success = "correct", p = 0.25,
          z = TRUE, alternative = "greater")

# A tibble: 1 × 3
  statistic p_value alternative
      <dbl>   <dbl> <chr>      
1      3.02 0.00125 greater

Note: There is also a base R function called prop.test() but its arguments are different.

Theory-Based Confidence Intervals

Suppose statistic \(\sim N(\mu = \mbox{parameter}, \sigma = SE)\).

95% CI for parameter:

\[ \mbox{statistic} \pm 1.96 SE \]

Can generalize this formula!

P% CI for parameter:

\[ \mbox{statistic} \pm z^* SE \]

# Find z-star
qnorm(p = 0.975, mean = 0, sd = 1)

[1] 1.959964

qnorm(p = 0.95, mean = 0, sd = 1)

[1] 1.644854

Theory-Based CIs in Action

Let’s consider constructing a confidence interval for a single proportion: \(p\)

By the CLT,

\[ \hat{p} \sim N \left(p,~ \sqrt{\frac{p(1-p)}{n}} \right) \]

P% CI for parameter:

\[\begin{align*} \mbox{statistic} \pm z^* SE \end{align*}\]

Theory-Based CIs in Action

Example: Bern and Honorton’s (1994) extrasensory perception (ESP) studies

# Use probability model to approximate null distribution
prop_test(esp, response = guess, success = "correct", 
          z = TRUE, conf_int = TRUE, conf_level = 0.95)

# A tibble: 1 × 5
  statistic  p_value alternative lower_ci upper_ci
      <dbl>    <dbl> <chr>          <dbl>    <dbl>
1     -6.45 1.12e-10 two.sided      0.274    0.374

Don’t use the reported test statistic and p-value!

Theory-Based CIs

P% CI for parameter:

\[ \mbox{statistic} \pm z^* SE \]

Notes:

Didn’t construct the bootstrap distribution.
Need to check that \(n\) is large and that the sample is random/representative.
- Condition depends on what parameter you are conducting inference for.

count(esp, guess)

      guess   n
1   correct 106
2 incorrect 223

Interpretation of the CI doesn’t change.
For some parameters, the critical value comes from a \(t\) distribution.
Now we have a formula for the Margin of Error.

How do we perform probability model calculations in `R`?

Probability Calculations in R

Question: How do I compute probabilities in R?

pnorm(q = 1, mean = 0, sd = 1)

[1] 0.8413447

pt(q = 1, df = 52)

[1] 0.8390293

Doesn’t seem quite right…

Probability Calculations in R

Question: How do I compute probabilities in R?

1 - pnorm(q = 1, mean = 0, sd = 1)

[1] 0.1586553

# OR: use the lower.tail argument:
pnorm(q = 1, mean = 0, sd = 1,
      lower.tail = FALSE)

[1] 0.1586553

1 - pt(q = 1, df = 52)

[1] 0.1609707

# OR: use the lower.tail argument:
pt(q = 1, df = 52, lower.tail = FALSE)

[1] 0.1609707

P*100% CI for parameter:

\[ \mbox{statistic} \pm z^* SE \]

Question: How do I find the correct critical values \((z^* \mbox{ or } t^*)\) for the confidence interval?

qnorm(p = 0.975, mean = 0, sd = 1)

[1] 1.959964

qt(p = 0.975, df = 52)

[1] 2.006647

P*100% CI for parameter:

\[ \mbox{statistic} \pm z^* SE \]

Question: What percentile/quantile do I need for a 90% CI?

qnorm(p = 0.95, mean = 0, sd = 1)

[1] 1.644854

qt(p = 0.95, df = 52)

[1] 1.674689

Probability Calculations in R

To help you remember:

Want a Probability?

→ use pnorm(), pt(), …

Want a Quantile (i.e. percentile)?

→ use qnorm(), qt(), …

Probability Calculations in R

Question: When might I want to do probability calculations in R?

Computed a test statistic that is approximated by a named random variable. Want to compute the p-value with p---()
Compute a confidence interval. Want to find the critical value with q---().
To do a Sample Size Calculation.

Sample Size Calculations

Very important part of the data analysis process!
Happens BEFORE you collect data.
You determine how large your sample size needs for a desired precision in your CI.
- The power calculations from hypothesis testing relate to this idea.

Sample Size Calculations

Question: Why do we need sample size calculations?

Example: Let’s return to the dolphins for treating depression example.

With a sample size of 30 and 95% confidence, we estimate that the improvement rate for depression is between 14.5 percentage points and 75 percentage points higher if you swim with a dolphin instead of swimming without a dolphin.
With a width of 60.5 percentage points, this 95% CI is a wide/very imprecise interval.

Question: How could we make it narrower? How could we decrease the Margin of Error (ME)?

Sample Size Calculations – Single Proportion

Let’s focus on estimating a single proportion. Suppose we want to estimate the current proportion of Reedies with COVID with 95% confidence and we want the margin of error on our interval to be less than or equal to 0.02. How large does our sample size need to be?

Want

\[ z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \leq B \]

Need to derive a formula that looks like

\[ n \geq \quad ... \]

Question: How can we isolate \(n\) to be on a side by itself?

Sample Size Calculations – Single Proportion

Sample size calculation:

\[ n \geq \frac{\hat{p}(1 - \hat{p})z^{*2}}{B^2} \]

What do we plug in for, \(\hat{p}\), \(z^{*}\), \(B\)?
Consider sample size calculations when estimating a mean on this week’s lab!

Statistical Inference using Probability Models

We went through theory-based inference for \(p\).
There are similar results for other parameters. But the specific named random variable may change!
- Will extend beyond inference for \(p\) next time.

Logistics

Goals for Today

Central Limit Theorem – Sampling Distributions

Statistical Inference Zoom Out – Estimation

Statistical Inference Zoom Out – Testing

Sample Statistics as Random Variables

Z-scores

Z-score Example

Z-score Example

Z-score Test Statistics

Let’s consider theory-based inference for a population proportion.

Inference for a Single Proportion – Testing

Inference for a Single Proportion – Testing

Inference for a Single Proportion – Testing

Inference for a Single Proportion – Testing

Theory-Based Confidence Intervals

Theory-Based CIs in Action

Theory-Based CIs in Action

Theory-Based CIs

How do we perform probability model calculations in R?

Probability Calculations in R

Probability Calculations in R

P*100% CI for parameter:

P*100% CI for parameter:

Probability Calculations in R

Probability Calculations in R

Sample Size Calculations

Sample Size Calculations

Sample Size Calculations – Single Proportion

Sample Size Calculations – Single Proportion

Statistical Inference using Probability Models

How do we perform probability model calculations in `R`?