Lecture 8
NC State University
ST 511 - Fall 2025
2025-09-15
- HW-1 grades are released
> email Nick and me if you have any questions
- HW-2 released today (due Sunday at 11:59pm)
> your repo is called homework-2
> we will look at the number of commits you have
- Quiz released Wednesday (due Sunday at 11:59pm)
- I wrote up a resource on random variables / probability distributions on our website! Check it out.
What do we mean by population?
What do we mean by sample?
Population - "complete set", or every possible observational unit of interest
Sample - a subset of data collected from a larger population
What's a random sample?
Why do we care?
Random sample - subgroup of observational units selected from a larger population where every unit has an equal chance of being chosen
- Helps ensure observations are independent
- Helps ensure observations are representative of the larger population
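A minimal sketch of drawing a simple random sample, assuming a made-up population of 500 student ID numbers (the population, the sample size of 25, and the seed are all illustrative, not from class):

```python
import random

# Hypothetical population: ID numbers for 500 students (an assumption for illustration).
population = list(range(1, 501))

rng = random.Random(511)  # fixed seed so the example is reproducible

# random.sample draws without replacement, and every unit is equally likely to be chosen.
srs = rng.sample(population, k=25)

print(sorted(srs))
```

Because the draw is without replacement, no unit can show up twice in the sample.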
What's the difference between a probability distribution and a sampling distribution?
A probability distribution describes the set of all possible values a random variable can take and the probability of each value occurring
A sampling distribution is essentially the probability distribution of a statistic (like the sample mean or sample proportion) from all possible samples of a given size.
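One way to see a sampling distribution is to simulate it. The sketch below takes many random samples instead of all possible samples, so it only approximates the sampling distribution of the sample mean. The population, the sample size `n`, and the number of repeated samples are assumed values chosen for illustration.

```python
import random
import statistics

rng = random.Random(8)  # fixed seed so the example is reproducible

# Hypothetical population: 10,000 values from a normal distribution (mean 60, SD 20).
population = [rng.gauss(60, 20) for _ in range(10_000)]

n = 25               # size of each sample (assumed)
num_samples = 2_000  # number of repeated samples (assumed)

# Each entry is the statistic (the sample mean) from one random sample of size n.
sample_means = [
    statistics.mean(rng.sample(population, n)) for _ in range(num_samples)
]

pop_sd = statistics.pstdev(population)
print("mean of the sample means:", round(statistics.mean(sample_means), 2))
print("SD of the sample means:  ", round(statistics.stdev(sample_means), 2))
print("population SD / sqrt(n): ", round(pop_sd / n ** 0.5, 2))
```

The sample means center near the population mean, and their spread should be close to the population SD divided by the square root of `n`.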
- More with random variables, probability distributions, and sampling distributions
- Sampling schemes
- Inference (Hypothesis Testing)
Let X = the number of minutes a college student plays video games in a week. Assume that X is normally distributed with a mean of 60 and a standard deviation of 20.
- What does 60 mean? What does 20 mean?
- How do we write mathematically how X is distributed?
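One common way to write this is X ~ N(60, 20), where the second number is the standard deviation (some texts put the variance, 20² = 400, in that slot instead). A small sketch of computing probabilities for this X with Python's standard library follows. The cutoffs of 40 and 80 minutes are hypothetical and are not the values from the lecture.

```python
from statistics import NormalDist

# X ~ N(mean = 60, sd = 20), as stated in the example.
X = NormalDist(mu=60, sigma=20)

# Hypothetical cutoffs (40 and 80 minutes), chosen only for illustration.
p_below_80 = X.cdf(80)              # P(X < 80)
p_above_80 = 1 - X.cdf(80)          # P(X > 80)
p_between = X.cdf(80) - X.cdf(40)   # P(40 < X < 80)

print(round(p_below_80, 4), round(p_above_80, 4), round(p_between, 4))
```

Since 80 is one standard deviation above the mean and 40 is one below, `p_between` is the familiar "about 68%" from the empirical rule.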
What probability are we calculating? How can we write this out in mathematical notation?
- Pick 10 random words
- Take the mean
- Report it
Let's talk about different sampling schemes
- Simple random sample
- Systematic sampling
- Convenience sampling
(more complex techniques later)
Advantages
- Helps us assume independence
- Helps us generalize to a larger population
Disadvantages
- It's really hard... it could be difficult to actually target a truly large population of interest
- Random sampling ✔
- Systematic sampling
- Convenience sampling
A probability sampling technique where you select participants from a larger population by choosing a random starting point and then selecting every nth individual from a list or sampling frame
Advantages
- Helps us assume independence
- Helps us generalize to a larger population
Disadvantages
- Risks bias if the population has a hidden periodic pattern
- Need to know your entire sampling frame
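Systematic sampling as defined above (random starting point, then every nth unit) can be sketched as follows. The helper `systematic_sample`, the 100-unit sampling frame, and the step of 10 are hypothetical names and values used only for illustration.

```python
import random

def systematic_sample(frame, step, rng):
    """Pick a random start among the first `step` units, then take every `step`-th unit."""
    start = rng.randrange(step)  # random starting index: 0, 1, ..., step - 1
    return frame[start::step]

# Hypothetical sampling frame: 100 numbered units (an assumption for illustration).
frame = list(range(1, 101))
rng = random.Random(2025)  # fixed seed so the example is reproducible

sample = systematic_sample(frame, step=10, rng=rng)
print(sample)
```

Note that the whole frame has to exist as a list before any unit is selected, which is the "need to know your entire sampling frame" disadvantage.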
- Random sampling ✔
- Systematic sampling ✔
- Convenience sampling
Convenience sampling: a non-random sampling method where participants are selected based on their availability, willingness, or ease of access
Can be useful in certain situations (observational studies)... but
- non-representative
- sampling bias can occur
Set up a null and alternative hypothesis (let's talk about this)
Collect data
Check assumptions
Analyze data
Make decisions and conclusions
We are going to test to see if a coin is fair!
Let's collect data
For our hypothesis test, we need to check the following assumptions:
- Independence (do we satisfy this condition?)
- Normality
- \(n \pi_o > 10\)
- \(n (1 - \pi_o) > 10\)
Do we satisfy this condition?
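A quick way to check the two normality conditions is sketched below. The helper name `success_failure_ok` and the sample sizes of 50 and 15 flips are hypothetical; 0.5 is the null value for a fair coin.

```python
def success_failure_ok(n, pi_0, threshold=10):
    """Return True when both n * pi_0 and n * (1 - pi_0) exceed the threshold."""
    return n * pi_0 > threshold and n * (1 - pi_0) > threshold

pi_0 = 0.5  # null value for a fair coin

# Hypothetical sample sizes, not the class data.
print(success_failure_ok(50, pi_0))  # 50 * 0.5 = 25 on both sides
print(success_failure_ok(15, pi_0))  # 15 * 0.5 = 7.5 on both sides
```

With 50 flips both products are 25, so the condition holds; with 15 flips both are 7.5, so it does not.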
What is the standard error of the sampling distribution?
Note: The standard deviation of the sampling distribution is called the standard error!
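For a one-proportion test like the coin example, the standard error under the null hypothesis is usually written as below. Here \(\pi_o = 0.5\) is the null value for a fair coin, and \(n = 50\) is a hypothetical number of flips used only to show the arithmetic.

```latex
\[
SE = \sqrt{\frac{\pi_o (1 - \pi_o)}{n}}
\qquad \text{e.g.} \qquad
\sqrt{\frac{0.5 \times 0.5}{50}} = \sqrt{0.005} \approx 0.071
\]
```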
Replace test_stat, null_mean, etc. with the appropriate values
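The code this note refers to is not reproduced here, so the following is only a sketch of how such a calculation might look in Python. It reuses the placeholder names `test_stat` and `null_mean` from the note. The data (50 flips, 30 heads) and the two-sided alternative are assumptions, not the class results.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical data (not the class results): 50 flips, 30 heads.
n = 50
heads = 30

null_mean = 0.5                             # pi_o under the null: the coin is fair
p_hat = heads / n                           # sample proportion of heads
se = sqrt(null_mean * (1 - null_mean) / n)  # standard error under the null
test_stat = (p_hat - null_mean) / se        # z statistic

# Two-sided p-value (assumed alternative: the coin is not fair).
p_value = 2 * (1 - NormalDist().cdf(abs(test_stat)))

print(round(test_stat, 3), round(p_value, 3))
```

The test statistic measures how many standard errors the sample proportion falls from the null value.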
- Decisions are always in terms of the null
- Conclusions are always in terms of the alternative
Typically, researchers use fixed-level testing at level \(\alpha\).
What is \(\alpha\)?
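In fixed-level testing, \(\alpha\) is the significance level: a cutoff chosen before looking at the data, against which the p-value is compared. A minimal sketch of that decision rule follows, assuming the common (but not universal) choice of 0.05. The helper `decide` and the p-values passed to it are hypothetical.

```python
def decide(p_value, alpha=0.05):
    """Fixed-level decision, stated in terms of the null hypothesis."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# Hypothetical p-values, one below and one above alpha.
print(decide(0.03))
print(decide(0.20))
```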