ST311
Midterm Exam 2
SP10 Name:___________________ Class time:___________________ Instructor:___________________
Instructions:
o Put your name, and the name of your instructor, in the space provided.
o Read each question carefully.
o Show all work to receive full credit.
o Make all work legible.
o Provide one answer to each question in the space provided.
o A copy of the normal/t distribution table is given on the last pages.
o Unless otherwise stated, use a significance level of 5% and a confidence level of 95%.
o For tests of hypothesis, be sure to state the hypotheses, test statistic, p-value and conclusion about H0.
o There are 7 pages of questions and 18 total questions; be sure you have all of them.
o The test is printed on the front and back of each page.
PLEASE SIGN HONOR PLEDGE AFTER LAST QUESTION
Version 2b
True/False: CIRCLE the best answer (3 points each).
1. True or False: In a hypothesis test, we first assume the null hypothesis is true.
2. True or False: The confidence interval will vary from sample to sample.
3. True or False: Decreasing sample size reduces the probability of Type 2 error.
4. True or False: If a two-sided hypothesis test is significant, then the corresponding one-sided hypothesis test will also be significant.
5. True or False: A confidence interval may not always capture the true parameter value.

Multiple Choice: CIRCLE the best answer (3 points each).
6. In confidence intervals, the margin of error will get larger
a. if the confidence level is decreased.
b. if the sample size is increased.
c. if the confidence level is increased.
d. None of the above.
7. We have conducted a survey where we took a random sample of 400 subjects. We would like to create an 85% confidence interval for a population proportion. Which of the following would be the appropriate confidence coefficient (z*) for this confidence level?
a. 1.44
b. 1.72
c. 1.04
d. It is impossible to tell without knowing the sample proportion.
8. A researcher conducted a study of homes in suburban Chicago. He randomly selected 100 homes and found that the average size was 1473 square feet. He calculated a 95% confidence interval and found it to be (1344 to 1602). This would imply
a. There is a 95% chance that the average for this sample is between 1344 and 1602 square feet.
b. The interval (1344 to 1602) was created with a process which if repeated would capture the true value of the population mean 95% of the time.
c. If we randomly select another sample, there is a 95% chance that the average would be between 1344 and 1602 square feet.
d. Of all homes in suburban Chicago 95% of them are between 1344 and 1602 square feet.
9. The t-distribution
a. Is always symmetric.
b. Is always centered at 100%
c. Is always skewed to the right
d. Both a and b but not c.
10. An engineering team has developed a new engine design that is intended to improve the fuel economy of an automobile. For a family sedan the new engine increased the fuel economy by 18.7 miles per gallon. The researchers conducted a hypothesis test based on a sample of n=40. The resulting p-value was 0.0007. From this we would conclude:
a. The result was statistically significant but not likely to be practically significant.
b. The result was likely to be both statistically and practically significant.
c. The result was likely to be practically significant but not statistically significant.
d. The result was neither statistically nor practically significant.
11. In hypothesis testing, the p-value is
a. calculated assuming the alternative hypothesis is true.
b. calculated assuming the null hypothesis is true.
c. always set at 0.05 or 0.01.
d. both b and c, but not a.
12. We create an 85% CI for the mean. This indicates
a. 85% of all sample means will be between the lower and upper limits of the interval.
b. There is an 85% probability that the procedure will produce an interval that covers the population mean.
c. 15% of all samples we take will produce confidence intervals that do not cover the population mean.
d. Both b and c, but not a.
Fill in the Blank: Select the best answer from the choices below (3 points each).
13. The sampling distribution allows us to quantify the ____________________________ in sample statistics, including how they differ from the ____________________________ and what type of variability would not be expected to happen by ____________________________.

Choices:
confidence interval     data
variability             predictability
sample mean             parameter
probability             random chance
statistic               hypothesis
Short Answer: Provide one answer to each question in the space provided. Show all work!
14. Lantus is an insulin medication for Type 1 diabetes. The makers of Lantus claim that only 7.5% of the patients who take Lantus experience a rash as a side effect.
a. (5 points) Determine the sampling distribution for the sample proportion of 200 patients who experience a rash as a side effect from Lantus. Be specific and describe the shape, center and spread (provide numerical values if possible). Justify your answer.
b. (5 points) Suppose a doctor takes a random sample of 200 patients on Lantus and asks if they have experienced a rash and 24 replied yes. What is the sample statistic? Give both a verbal and numerical description.
c. (5 points) Determine the probability that the proportion of patients that experience a rash as a side effect is greater than 0.12 in a sample of 200 patients (given the true proportion is 0.075).
Answer: _____________
15. A state representative in North Carolina wants to determine the proportion of parents of NC State students who have an income over $100,000 a year. He and his staff talk with 200 parents at an NC State basketball game. They found that 93 of these individuals made over $100,000 a year.
a. (5 points) What is the parameter of interest in this study?
b. (5 points) The representative would like to create a press release that presents the results of his study using a 95% confidence interval. Is this an appropriate inference to make based on the data collected? Explain why or why not.
16. To test the effectiveness of the Rosetta Stone Spanish language software, which promises to improve your language skills in 6 weeks, a randomly selected group of 30 subjects was given a Spanish comprehension test prior to using the software. After 6 weeks of using the software, each of the 30 subjects took the comprehension test again. The researchers then used a paired-difference procedure to compare the post-test (after using the software) with the pre-test (prior to using the software); each difference was calculated by subtracting the pre-test score from the post-test score (post − pre). The 95% confidence interval for the mean paired difference is (-5.3, 13.6).
a. (5 points) Based on the confidence interval, did the Rosetta Stone software improve the Spanish comprehension of the subjects? Explain how you know.
b. (5 points) Suppose instead of a confidence interval, we conducted a hypothesis test to determine if the software improved Spanish comprehension.
i. Circle the correct alternative hypothesis:    Ha: μd < 0    Ha: μd > 0    Ha: μd ≠ 0
ii. Circle the correct conclusion:    Reject the null hypothesis    Cannot reject the null hypothesis
17. The daily suggested amount of sleep for young adults is 8 hours a night. School health officials wanted to test the claim that NC State students get less than 8 hours of sleep per night. They randomly selected 200 students from the registrar's records and asked them to report how much sleep they get per night. The mean number of hours of sleep for the 200 students was 7.6 hours with a sample standard deviation of 2.4 hours.
a. (5 points) Determine an appropriate hypothesis test for the claim that the mean number of hours of sleep an NC State student gets per night is less than 8 hours. Write out the null and alternative hypotheses symbolically and verbally.
b. (5 points) Calculate the appropriate test statistic and corresponding p-value to test whether NC State students get less than 8 hours of sleep per night.
Test Statistic: _____________ p-value: _____________
c. (5 points) Given a significance level of 0.05, is there sufficient evidence to support the claim that NC State students get less than 8 hours of sleep? Do we reject the null hypothesis? Interpret the results in the context of the problem and be specific and detailed in your conclusions.
18. (5 points) Consider the following excerpt from a news article:

A study on Metformin XR has been published in the Archives of Pediatrics & Adolescent Medicine. Darrell M. Wilson, M.D., of Stanford University and the Lucile S. Packard Children's Hospital, Stanford, Calif., randomly assigned 77 obese adolescents (ages 13 to 18) to either one daily dose of metformin XR (2,000 milligrams) (39 patients) or placebo (38 patients) for 48 weeks. Participants were monitored for an additional 48 weeks. "Metformin XR had a small but statistically significant impact on BMI over the initial 52 weeks of the study," the authors write. The average BMI increased by 0.2 in the placebo group and decreased by 0.9 in the metformin XR group.

This article uses the term statistically significant. Explain to someone who has no knowledge of statistics what the term statistically significant means.
Honor Pledge: I certify that I have not received or given unauthorized aid in taking this exam. Signed:___________________________
Printed name:___________________________
Exploring Data: Distributions
• Evaluate overall pattern (shape, center, spread) and deviations (outliers).
• Mean (use a calculator): $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum x_i$
• Standard deviation (use a calculator): $s = \sqrt{\frac{1}{n-1}\sum (x_i - \bar{x})^2}$
• Median: Arrange all observations from smallest to largest.
- If n is odd, then the median (M) is the middle value.
- If n is even, then the median (M) is the average of the middle two values.
• Quartiles: The first quartile Q1 is the median of the observations less than the overall median in the ordered list. The third quartile Q3 is the median of the observations greater than the overall median in the ordered list.
• Interquartile range: IQR = Q3 − Q1
• Five-number summary: Minimum, Q1, M, Q3, Maximum
• Standardized value of x: $z = \frac{x - \mu}{\sigma}$
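
The summary formulas above can be sketched in a few lines of Python. This is only an illustration: the data values are made up, the function names are hypothetical, and the quartiles follow the rule stated on this sheet (median of the observations below or above the overall median), which may differ from the defaults used by statistical software.

```python
import math

def mean(xs):
    # x-bar = (1/n) * sum of x_i
    return sum(xs) / len(xs)

def sample_sd(xs):
    # s = sqrt( (1/(n-1)) * sum of (x_i - x-bar)^2 )
    xbar = mean(xs)
    return math.sqrt(sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1))

def median(sorted_xs):
    # Middle value if n is odd, average of the middle two if n is even.
    n = len(sorted_xs)
    mid = n // 2
    if n % 2 == 1:
        return sorted_xs[mid]
    return (sorted_xs[mid - 1] + sorted_xs[mid]) / 2

def five_number_summary(xs):
    # Q1 and Q3 are the medians of the observations below and above
    # the overall median's position in the ordered list.
    s = sorted(xs)
    n = len(s)
    half = n // 2
    lower = s[:half]
    upper = s[half + 1:] if n % 2 == 1 else s[half:]
    return s[0], median(lower), median(s), median(upper), s[-1]

def standardize(x, mu, sigma):
    # z = (x - mu) / sigma
    return (x - mu) / sigma

# Made-up data, for illustration only.
data = [2, 4, 4, 5, 7, 9, 11]
minimum, q1, m, q3, maximum = five_number_summary(data)
iqr = q3 - q1
print(mean(data), sample_sd(data), (minimum, q1, m, q3, maximum), iqr)
```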
Sampling Distributions
• Sampling distribution of a sample mean:
o $\bar{x}$ has mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$.
o $\bar{x}$ has a Normal distribution if the population distribution is Normal.
o Central Limit Theorem: $\bar{x}$ is approximately Normal when n is large (n ≥ 30) even if the population is not Normal.
o Standardized value of $\bar{x}$: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• Sampling distribution of a sample proportion: $\hat{p} = \frac{\text{number of yes}}{\text{total number}}$
o $\hat{p}$ has mean $p$ and standard deviation $\sqrt{\frac{p(1-p)}{n}}$.
o $\hat{p}$ is approximately Normal when n is large.
o Standardized value of $\hat{p}$: $z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
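
As a hedged illustration of the standardized values above, the following Python sketch computes z for a sample mean and a sample proportion, then uses the standard library's `statistics.NormalDist` for the Normal probability in place of the printed table. All numbers are hypothetical and unrelated to the exam questions; the helper names are assumptions made for this example.

```python
import math
from statistics import NormalDist

def sd_of_sample_mean(sigma, n):
    # Standard deviation of x-bar: sigma / sqrt(n)
    return sigma / math.sqrt(n)

def sd_of_sample_proportion(p, n):
    # Standard deviation of p-hat: sqrt(p(1 - p) / n)
    return math.sqrt(p * (1 - p) / n)

def z_score(statistic, center, spread):
    # Standardized value: (statistic - mean) / standard deviation
    return (statistic - center) / spread

# Hypothetical population: mu = 50, sigma = 10, sample of n = 25, x-bar = 53.
z_mean = z_score(53.0, 50.0, sd_of_sample_mean(10.0, 25))

# Hypothetical population proportion p = 0.30, sample of n = 100, p-hat = 0.36.
z_prop = z_score(0.36, 0.30, sd_of_sample_proportion(0.30, 100))

# P(p-hat > 0.36) under the Normal approximation (upper-tail area).
upper_tail = 1 - NormalDist().cdf(z_prop)
print(z_mean, z_prop, upper_tail)
```

With the printed table, the same upper-tail probability would be found by rounding z to two decimals and subtracting the table probability from 1.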
Inference About Proportions
• Standard error: $\text{s.e.}(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
• Large-sample z confidence interval for p:
sample statistic ± margin of error
sample statistic ± multiplier × standard error
$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, where $z^*$ is from N(0,1)
• z test statistic for H0: p = p0 if we have a large simple random sample (SRS):
$z = \frac{\text{Sample statistic} - \text{Null value}}{\text{Standard error}} = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
Get p-values from N(0,1)

Inference About Means
• Standard error: $\text{s.e.}(\bar{x}) = \frac{s}{\sqrt{n}}$
• t confidence interval for a population mean if we have a SRS from a Normal population:
sample statistic ± margin of error
sample statistic ± multiplier × standard error
$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$, where $t^*$ is from t(n − 1)
• t test statistic for H0: μ = μ0 if we have a SRS from a Normal population:
$t = \frac{\text{Sample statistic} - \text{Null value}}{\text{Standard error}} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Get p-values from t(n − 1)

Exploring Data: Relationships
• Evaluate overall pattern (form, direction, strength) and deviations (outliers, influential observations).
• Least-squares regression line (found using a computer output): $\hat{y} = b_0 + b_1 x$
• Residuals: $e_i$ = residual = observed y − predicted y = $y - \hat{y}$

Inference for Regression
• The regression model: We have n observations on x and y. The response y for any fixed x has a Normal distribution with mean given by the true regression line $y = \beta_0 + \beta_1 x$ and standard deviation σ. Parameters are β0, β1, σ.
• t test statistic for no linear relationship, H0: β1 = 0:
$t = \frac{\text{Sample statistic} - \text{Null value}}{\text{Standard error}} = \frac{b_1 - 0}{\text{s.e.}(b_1)}$
Get p-values from t(n − 2)
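
The proportion and mean formulas on this sheet can be tried out with a short Python sketch. It is a minimal illustration under stated assumptions: the counts and summary statistics are invented, the function names are hypothetical, and the Normal multiplier and p-value come from `statistics.NormalDist` instead of Table A.1. The standard library has no t distribution, so the t statistic is computed here and its p-value would still be read from a t table with n − 1 degrees of freedom.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)

def proportion_ci(successes, n, confidence=0.95):
    # p-hat +/- z* * sqrt(p-hat(1 - p-hat) / n)
    p_hat = successes / n
    z_star = STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def proportion_z_test(successes, n, p0):
    # z = (p-hat - p0) / sqrt(p0(1 - p0) / n); two-sided p-value from N(0, 1)
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 2 * STD_NORMAL.cdf(-abs(z))
    return z, p_value

def one_sample_t_statistic(xbar, mu0, s, n):
    # t = (x-bar - mu0) / (s / sqrt(n)), compared against t(n - 1)
    return (xbar - mu0) / (s / math.sqrt(n))

# Hypothetical data: 60 "yes" responses out of 150.
low, high = proportion_ci(60, 150)
z, p_value = proportion_z_test(60, 150, p0=0.5)

# Hypothetical summary statistics: x-bar = 10.5, s = 2, n = 16, H0: mu = 10.
t = one_sample_t_statistic(10.5, 10.0, 2.0, 16)
print((low, high), z, p_value, t)
```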
Appendix of Tables
Table A.1 Standard Normal Probabilities (for z ≤ 0)
Table probability = area under the standard Normal curve to the left of z.

   z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
-3.4   .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0002
-3.3   .0005  .0005  .0005  .0004  .0004  .0004  .0004  .0004  .0004  .0003
-3.2   .0007  .0007  .0006  .0006  .0006  .0006  .0006  .0005  .0005  .0005
-3.1   .0010  .0009  .0009  .0009  .0008  .0008  .0008  .0008  .0007  .0007
-3.0   .0013  .0013  .0013  .0012  .0012  .0011  .0011  .0011  .0010  .0010
-2.9   .0019  .0018  .0018  .0017  .0016  .0016  .0015  .0015  .0014  .0014
-2.8   .0026  .0025  .0024  .0023  .0023  .0022  .0021  .0021  .0020  .0019
-2.7   .0035  .0034  .0033  .0032  .0031  .0030  .0029  .0028  .0027  .0026
-2.6   .0047  .0045  .0044  .0043  .0041  .0040  .0039  .0038  .0037  .0036
-2.5   .0062  .0060  .0059  .0057  .0055  .0054  .0052  .0051  .0049  .0048
-2.4   .0082  .0080  .0078  .0075  .0073  .0071  .0069  .0068  .0066  .0064
-2.3   .0107  .0104  .0102  .0099  .0096  .0094  .0091  .0089  .0087  .0084
-2.2   .0139  .0136  .0132  .0129  .0125  .0122  .0119  .0116  .0113  .0110
-2.1   .0179  .0174  .0170  .0166  .0162  .0158  .0154  .0150  .0146  .0143
-2.0   .0228  .0222  .0217  .0212  .0207  .0202  .0197  .0192  .0188  .0183
-1.9   .0287  .0281  .0274  .0268  .0262  .0256  .0250  .0244  .0239  .0233
-1.8   .0359  .0351  .0344  .0336  .0329  .0322  .0314  .0307  .0301  .0294
-1.7   .0446  .0436  .0427  .0418  .0409  .0401  .0392  .0384  .0375  .0367
-1.6   .0548  .0537  .0526  .0516  .0505  .0495  .0485  .0475  .0465  .0455
-1.5   .0668  .0655  .0643  .0630  .0618  .0606  .0594  .0582  .0571  .0559
-1.4   .0808  .0793  .0778  .0764  .0749  .0735  .0721  .0708  .0694  .0681
-1.3   .0968  .0951  .0934  .0918  .0901  .0885  .0869  .0853  .0838  .0823
-1.2   .1151  .1131  .1112  .1093  .1075  .1056  .1038  .1020  .1003  .0985
-1.1   .1357  .1335  .1314  .1292  .1271  .1251  .1230  .1210  .1190  .1170
-1.0   .1587  .1562  .1539  .1515  .1492  .1469  .1446  .1423  .1401  .1379
-0.9   .1841  .1814  .1788  .1762  .1736  .1711  .1685  .1660  .1635  .1611
-0.8   .2119  .2090  .2061  .2033  .2005  .1977  .1949  .1922  .1894  .1867
-0.7   .2420  .2389  .2358  .2327  .2296  .2266  .2236  .2206  .2177  .2148
-0.6   .2743  .2709  .2676  .2643  .2611  .2578  .2546  .2514  .2483  .2451
-0.5   .3085  .3050  .3015  .2981  .2946  .2912  .2877  .2843  .2810  .2776
-0.4   .3446  .3409  .3372  .3336  .3300  .3264  .3228  .3192  .3156  .3121
-0.3   .3821  .3783  .3745  .3707  .3669  .3632  .3594  .3557  .3520  .3483
-0.2   .4207  .4168  .4129  .4090  .4052  .4013  .3974  .3936  .3897  .3859
-0.1   .4602  .4562  .4522  .4483  .4443  .4404  .4364  .4325  .4286  .4247
-0.0   .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641
In the Extreme (for z ≤ 0)

    z    Probability
-3.09    .001
-3.72    .0001
-4.26    .00001
-4.75    .000001
-5.20    .0000001
-5.61    .00000001
-6.00    .000000001
S-PLUS was used to determine information for the “In the Extreme” portion of the table.
Table A.1 Standard Normal Probabilities (for z ≥ 0)
Table probability = area under the standard Normal curve to the left of z.

  z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1   .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2   .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141
0.3   .6179  .6217  .6255  .6293  .6331  .6368  .6406  .6443  .6480  .6517
0.4   .6554  .6591  .6628  .6664  .6700  .6736  .6772  .6808  .6844  .6879
0.5   .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
0.6   .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
0.7   .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
0.8   .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
0.9   .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
1.0   .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621
1.1   .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830
1.2   .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015
1.3   .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177
1.4   .9192  .9207  .9222  .9236  .9251  .9265  .9279  .9292  .9306  .9319
1.5   .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
1.6   .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
1.7   .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633
1.8   .9641  .9649  .9656  .9664  .9671  .9678  .9686  .9693  .9699  .9706
1.9   .9713  .9719  .9726  .9732  .9738  .9744  .9750  .9756  .9761  .9767
2.0   .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817
2.1   .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
2.2   .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
2.3   .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
2.4   .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
2.5   .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
2.7   .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
2.8   .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981
2.9   .9981  .9982  .9982  .9983  .9984  .9984  .9985  .9985  .9986  .9986
3.0   .9987  .9987  .9987  .9988  .9988  .9989  .9989  .9989  .9990  .9990
3.1   .9990  .9991  .9991  .9991  .9992  .9992  .9992  .9992  .9993  .9993
3.2   .9993  .9993  .9994  .9994  .9994  .9994  .9994  .9995  .9995  .9995
3.3   .9995  .9995  .9995  .9996  .9996  .9996  .9996  .9996  .9996  .9997
3.4   .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9998
In the Extreme (for z ≥ 0)

   z    Probability
3.09    .999
3.72    .9999
4.26    .99999
4.75    .999999
5.20    .9999999
5.61    .99999999
6.00    .999999999
S-PLUS was used to determine information for the “In the Extreme” portion of the table.
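
For readers checking Table A.1 entries by computer, here is a minimal Python sketch, assuming only the standard library. It computes the cumulative standard Normal probability through the error function, $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. The function name `table_probability` is hypothetical, and the z values are a few arbitrary lookups.

```python
import math

def table_probability(z):
    # Area under the standard Normal curve to the left of z:
    # Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# A few arbitrary lookups, rounded to four decimals like the printed table.
for z in (-1.96, 0.00, 1.00, 1.44):
    print(f"z = {z:5.2f}  probability = {table_probability(z):.4f}")

# By symmetry, the area to the right of z equals the area to the left of -z.
right_tail = 1.0 - table_probability(1.96)
```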
Table A.2 t* Multipliers for Confidence Intervals and Rejection Region Critical Values
[Figure: t density curve. The central area between −t* and t* is the confidence level; each tail has area equal to the one-tailed α, which is (two-tailed α)/2.]

                           Confidence Level
df              .80     .90     .95     .98     .99     .998    .999
1              3.08    6.31   12.71   31.82   63.66   318.31  636.62
2              1.89    2.92    4.30    6.96    9.92    22.33   31.60
3              1.64    2.35    3.18    4.54    5.84    10.21   12.92
4              1.53    2.13    2.78    3.75    4.60     7.17    8.61
5              1.48    2.02    2.57    3.36    4.03     5.89    6.87
6              1.44    1.94    2.45    3.14    3.71     5.21    5.96
7              1.41    1.89    2.36    3.00    3.50     4.79    5.41
8              1.40    1.86    2.31    2.90    3.36     4.50    5.04
9              1.38    1.83    2.26    2.82    3.25     4.30    4.78
10             1.37    1.81    2.23    2.76    3.17     4.14    4.59
11             1.36    1.80    2.20    2.72    3.11     4.02    4.44
12             1.36    1.78    2.18    2.68    3.05     3.93    4.32
13             1.35    1.77    2.16    2.65    3.01     3.85    4.22
14             1.35    1.76    2.14    2.62    2.98     3.79    4.14
15             1.34    1.75    2.13    2.60    2.95     3.73    4.07
16             1.34    1.75    2.12    2.58    2.92     3.69    4.01
17             1.33    1.74    2.11    2.57    2.90     3.65    3.97
18             1.33    1.73    2.10    2.55    2.88     3.61    3.92
19             1.33    1.73    2.09    2.54    2.86     3.58    3.88
20             1.33    1.72    2.09    2.53    2.85     3.55    3.85
21             1.32    1.72    2.08    2.52    2.83     3.53    3.82
22             1.32    1.72    2.07    2.51    2.82     3.50    3.79
23             1.32    1.71    2.07    2.50    2.81     3.48    3.77
24             1.32    1.71    2.06    2.49    2.80     3.47    3.75
25             1.32    1.71    2.06    2.49    2.79     3.45    3.73
26             1.31    1.71    2.06    2.48    2.78     3.43    3.71
27             1.31    1.70    2.05    2.47    2.77     3.42    3.69
28             1.31    1.70    2.05    2.47    2.76     3.41    3.67
29             1.31    1.70    2.05    2.46    2.76     3.40    3.66
30             1.31    1.70    2.04    2.46    2.75     3.39    3.65
40             1.30    1.68    2.02    2.42    2.70     3.31    3.55
50             1.30    1.68    2.01    2.40    2.68     3.26    3.50
60             1.30    1.67    2.00    2.39    2.66     3.23    3.46
70             1.29    1.67    1.99    2.38    2.65     3.21    3.44
80             1.29    1.66    1.99    2.37    2.64     3.20    3.42
90             1.29    1.66    1.99    2.37    2.63     3.18    3.40
100            1.29    1.66    1.98    2.36    2.63     3.17    3.39
1000           1.282   1.646   1.962   2.330   2.581    3.098   3.300
Infinite       1.281   1.645   1.960   2.326   2.576    3.090   3.291
Two-tailed α    .20     .10     .05     .02     .01     .002    .001
One-tailed α    .10     .05     .025    .01     .005    .001    .0005
Note that the t-distribution with infinite df is the standard normal distribution.
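
The note above can be checked numerically. The sketch below assumes Python's standard-library `statistics.NormalDist` and computes the standard Normal multiplier z* for each confidence level in the table header; the results should agree with the "Infinite" row to within rounding. The helper name `z_multiplier` is hypothetical.

```python
from statistics import NormalDist

def z_multiplier(confidence_level):
    # z* leaves (1 - confidence level) / 2 in each tail of N(0, 1).
    one_tailed_alpha = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - one_tailed_alpha)

confidence_levels = [0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999]
multipliers = {level: z_multiplier(level) for level in confidence_levels}

for level, z_star in multipliers.items():
    print(f"confidence {level}: z* = {z_star:.3f}")
```

The same function gives multipliers for confidence levels that the table does not list, which the printed t table can only bracket.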