Confidence Intervals: Sampling Distribution



STAT 101
Dr. Kari Lock Morgan
9/13/12

SECTIONS 3.1, 3.2
• Sampling Distributions (3.1)
• Confidence Intervals (3.2)

Statistics: Unlocking the Power of Data

Lock5

The Big Picture

[Diagram labels: Population, Sampling, Sample, Statistical Inference]

Statistical Inference

Statistical inference is the process of drawing conclusions about the entire population based on information in a sample.

Statistic and Parameter

• A parameter is a number that describes some aspect of a population.
• A statistic is a number that is computed from data in a sample.
• We usually have a sample statistic and want to use it to make inferences about the population parameter.

The Big Picture

[Diagram labels: Population, Sampling, Sample, Statistical Inference]

Parameter versus Statistic

PARAMETERS: mu, sigma, rho

STATISTICS: x-bar, p-hat

Election Polls

• Over the weekend (9/7/12 – 9/9/12), 1000 registered voters were asked who they plan to vote for in the 2012 presidential election.
• What proportion of voters plan to vote for Obama?

p-hat = 0.50

p = ???

http://www.politico.com/p/2012-election/polls/president

Point Estimate

We use the statistic from a sample as a point estimate for a population parameter.

• Point estimates will not match population parameters exactly, but they are our best guess, given the data.

Election Polls

• Actually, several polls were conducted over the weekend (9/7/12 – 9/9/12):

http://www.politico.com/p/2012-election/polls/president

IMPORTANT POINTS

• Sample statistics vary from sample to sample (they will not match the parameter exactly).
• KEY QUESTION: For a given sample statistic, what are plausible values for the population parameter? How much uncertainty surrounds the sample statistic?
• KEY ANSWER: It depends on how much the statistic varies from sample to sample!

Reese’s Pieces

• What proportion of Reese’s Pieces are orange?
• Take a random sample of 10 Reese’s Pieces.
• What is your sample proportion? (class dotplot)
• Give a range of plausible values for the population proportion.

Sampling Distribution

A sampling distribution is the distribution of sample statistics computed for different samples of the same size from the same population.

• A sampling distribution shows us how the sample statistic varies from sample to sample.
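The idea of a sampling distribution can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function names are hypothetical, and the population proportion of 0.5 is an assumed value, since the true proportion of orange Reese’s Pieces is not given here. Each entry of `p_hats` plays the role of one dot on the class dotplot.

```python
import random

def sample_proportion(p, n, rng):
    """Draw one random sample of size n and return its sample proportion."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n

def sampling_distribution(p, n, num_samples, seed=0):
    """Return the sample proportions from many samples of the same size."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [sample_proportion(p, n, rng) for _ in range(num_samples)]

# Assumed values for illustration only: p = 0.5, samples of 10 pieces.
p_hats = sampling_distribution(p=0.5, n=10, num_samples=1000)
```

Plotting `p_hats` as a dotplot or histogram would show how much the sample proportion varies from sample to sample.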

Reese’s Pieces

• www.lock5stat.com/statkey

Sampling Distribution

In the Reese’s Pieces sampling distribution, what does each dot represent?

a) One Reese’s piece
b) One sample statistic

Random Samples

• If you take random samples, the sampling distribution will be centered around the true population parameter.
• If sampling bias exists (if you do not take random samples), your sampling distribution may give you bad information about the true parameter.

Sample Size Matters!

As the sample size increases, the variability of the sample statistics tends to decrease and the sample statistics tend to be closer to the true value of the population parameter.

• For larger sample sizes, you get less variability in the statistics, so less uncertainty in your estimates.
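The effect of sample size can be checked with a small simulation. The sketch below is an illustration under assumed values (a population proportion of 0.5 and sample sizes of 10, 100, and 1000, none of which come from the slides); `simulated_spread` is a hypothetical helper that measures the variability of the sample proportions as their standard deviation.

```python
import random
import statistics

def simulated_spread(p, n, num_samples=2000, seed=1):
    """Standard deviation of sample proportions over many samples of size n."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return statistics.stdev(p_hats)

# Assumed population proportion of 0.5; compare three sample sizes.
spread_by_n = {n: simulated_spread(p=0.5, n=n) for n in (10, 100, 1000)}
for n, spread in spread_by_n.items():
    print(f"n = {n:4d}: spread of sample proportions = {spread:.3f}")
```

The printed spread should shrink as n grows, which is the sense in which larger samples give less uncertainty in the estimate.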

Lincoln’s Gettysburg Address

Interval Estimate

An interval estimate gives a range of plausible values for a population parameter.

Margin of Error

One common form for an interval estimate is

statistic ± margin of error

where the margin of error reflects the precision of the sample statistic as a point estimate for the parameter.

• How do we determine the margin of error???

Sampling Distribution

• We can use the spread of the sampling distribution to determine the margin of error for a statistic.

Margin of Error

The higher the standard deviation of the sampling distribution, the

a) higher
b) lower

the margin of error.

Election Polling

• Why is the margin of error smaller for the Gallup poll than the ABC news poll?

http://www.realclearpolitics.com/epolls/2012/president/us/general_election_romney_vs_obama-1171.html

Interval Estimate

Election Polling

• Using the Gallup poll, calculate an interval estimate for the proportion of registered voters who plan to vote for Obama.

Confidence Interval

A confidence interval for a parameter is an interval computed from sample data by a method that will capture the parameter for a specified proportion of all samples.

• The success rate (proportion of all samples whose intervals contain the parameter) is known as the confidence level.
• A 95% confidence interval will contain the true parameter for 95% of all samples.

Confidence Intervals

• www.lock5stat.com/StatKey
• The parameter is fixed
• The statistic is random (depends on the sample)
• The interval is random (depends on the statistic)

Sampling Distribution

If you had access to the sampling distribution, how would you find the margin of error to ensure that intervals of the form

statistic ± margin of error

would capture the parameter for 95% of all samples?

Standard Error

The standard error of a statistic, SE, is the standard deviation of the sample statistic.

• The standard error can be calculated as the standard deviation of the sampling distribution.

95% Confidence Interval

If the sampling distribution is relatively symmetric and bell-shaped, a 95% confidence interval can be estimated using

statistic ± 2 × SE

Economy

A survey of 1,502 Americans in January 2012 found that 86% consider the economy a “top priority” for the president and congress this year.

The standard error for this statistic is 0.01.

What is the 95% confidence interval for the true proportion of all Americans that considered the economy a “top priority” at that time?

(a) (0.85, 0.87)
(b) (0.84, 0.88)
(c) (0.82, 0.90)

statistic ± 2×SE
0.86 ± 2×0.01
0.86 ± 0.02
(0.84, 0.88)

http://www.people-press.org/2012/01/23/public-priorities-deficit-rising-terrorismslipping/
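The statistic ± 2 × SE rule is simple enough to write as a small helper. The function name below is hypothetical and the sketch assumes, as the slides do, that the sampling distribution is roughly symmetric and bell-shaped; the inputs are the survey's statistic (0.86) and standard error (0.01).

```python
def confidence_interval_95(statistic, standard_error):
    """Approximate 95% interval: statistic plus or minus 2 standard errors."""
    margin_of_error = 2 * standard_error
    return statistic - margin_of_error, statistic + margin_of_error

# Economy survey: statistic = 0.86, SE = 0.01.
low, high = confidence_interval_95(statistic=0.86, standard_error=0.01)
print(f"({low:.2f}, {high:.2f})")  # prints (0.84, 0.88)
```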


Interpreting a Confidence Interval

• 95% of all samples yield intervals that contain the true parameter, so we say we are “95% sure” or “95% confident” that one interval contains the truth.
• “We are 95% confident that the true proportion of all Americans that considered the economy a ‘top priority’ in January 2012 is between 0.84 and 0.88”

Reese’s Pieces

The standard error for p-hat, the proportion of orange Reese’s Pieces in a random sample of 10, is closest to

a) 0.05
b) 0.15
c) 0.25
d) 0.35

Reese’s Pieces

• Use StatKey to more precisely estimate the SE
• Use this estimated SE and your p-hat to create a 95% confidence interval based on your data.
• Come up to the board and draw your interval
• How many include (our best guess at) the truth?

Reese’s Pieces

Each of you will create a 95% confidence interval based off your sample. If you all sampled randomly, and all create your CI correctly, what percentage of your intervals do you expect to include the true p?

a) 95%
b) 5%
c) All of them
d) None of them
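The board exercise can be imitated by simulation: draw many samples, build a statistic ± 2 × SE interval from each, and count how many intervals contain the true proportion. The sketch below rests on assumptions that are not in the slides: a true proportion of 0.5, samples of size 100 (larger than the class samples of 10, so that the sampling distribution is closer to bell-shaped), and hypothetical function names. The SE is taken as the standard deviation of the simulated sample proportions.

```python
import random
import statistics

def draw_p_hat(p, n, rng):
    """Sample proportion from one random sample of size n."""
    return sum(1 for _ in range(n) if rng.random() < p) / n

def coverage_rate(p=0.5, n=100, num_samples=2000, seed=2):
    """Fraction of p-hat +/- 2*SE intervals that contain the true p."""
    rng = random.Random(seed)
    p_hats = [draw_p_hat(p, n, rng) for _ in range(num_samples)]
    se = statistics.stdev(p_hats)  # SE = SD of the sampling distribution
    hits = sum(1 for p_hat in p_hats if p_hat - 2 * se <= p <= p_hat + 2 * se)
    return hits / num_samples

rate = coverage_rate()
print(f"Proportion of intervals containing the true p: {rate:.3f}")
```

The printed rate should be close to 95%, though not exactly, because sample proportions take only a limited set of values and the factor of 2 is itself an approximation.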

Reese’s Pieces

Did your 95% confidence interval include the true p?

a) Yes
b) No

Confidence Intervals

If context were added, which of the following would be an appropriate interpretation for a 95% confidence interval:

a) “we are 95% sure the interval contains the parameter”
b) “there is a 95% chance the interval contains the parameter”
c) Both (a) and (b)
d) Neither (a) nor (b)

95% of all samples yield intervals that contain the true parameter, so we say we are “95% sure” or “95% confident” that one interval contains the truth. We can’t make probabilistic statements such as (b) because the interval either contains the truth or it doesn’t, and also the 95% pertains to all intervals that could be generated, not just the one you’ve created.

Summary

• To create a plausible range of values for a parameter:
  o Take many random samples from the population, and compute the sample statistic for each sample
  o Compute the standard error as the standard deviation of all these statistics
  o Use statistic ± 2SE

Confidence Intervals

[Diagram labels: Population; Sample, Sample, ..., Sample; Calculate statistic for each sample; Sampling Distribution; Standard Error (SE): standard deviation of sampling distribution; Margin of Error (ME) (95% CI: ME = 2×SE); statistic ± ME]

Reality

• One small problem…

… WE ONLY HAVE ONE SAMPLE!!!!

• How do we know how much sample statistics vary, if we only have one sample?!?

… to be continued

To Do

• Read Sections 3.1, 3.2
• Do Homework 2 (due Tuesday, 9/18)
