**Question 1) (20 points)**

Suppose that you are asked to conduct a study to determine whether smaller class sizes lead to improved performance among fourth graders.

1. If you could conduct any experiment you want, what would you do? Be specific.

2. More realistically, suppose you can collect observational data on several thousand fourth graders in a given state. You can obtain the size of their fourth-grade class and a standardized test score taken at the end of fourth grade. Why might you expect a negative correlation between class size and test score?

3. Would a negative correlation necessarily show that smaller class sizes cause better performance? Explain.

**Question 2) (40 points)**

The data in MEAP01.XLS are for the state of Michigan in the year 2001. Use these data to answer the following questions.

1. Find the largest and smallest values of math4. Does the range make sense? Explain.

2. How many schools have a perfect pass rate on the math test? What percentage is this of the total sample?

3. How many schools have math pass rates of exactly 50%?

4. Compare the average pass rates for the math and reading scores. Which test is harder to pass?

5. Find the correlation between math4 and read4. What do you conclude?

6. The variable exppp is expenditure per pupil. Find the average of exppp along with its standard deviation. Would you say there is wide variation in per-pupil spending?

7. Suppose School A spends $6,000 per student and School B spends $5,500 per student. By what percentage does School A's spending exceed School B's? Compare this to 100 [log(6,000) – log(5,500)], which is the approximate percentage difference based on the difference in the natural logs.
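For part 7, the two quantities being compared can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `exact_pct_diff` and `log_pct_diff` are hypothetical helpers, not part of the assignment, and `log` is taken to mean the natural logarithm.

```python
import math


def exact_pct_diff(a: float, b: float) -> float:
    """Exact percentage by which a exceeds b."""
    return 100.0 * (a - b) / b


def log_pct_diff(a: float, b: float) -> float:
    """Log approximation: 100 * [log(a) - log(b)], using natural logs."""
    return 100.0 * (math.log(a) - math.log(b))


# Spending figures from part 7 (dollars per student).
spend_a, spend_b = 6000.0, 5500.0

exact = exact_pct_diff(spend_a, spend_b)
approx = log_pct_diff(spend_a, spend_b)

print(f"exact: {exact:.2f}%  log approximation: {approx:.2f}%")
```

The same two helpers can be called with other pairs of values to see how the gap between the exact and approximate figures changes as the two spending levels move further apart.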

**Question 3) (40 points)**

Use the data in BWGHT.XLS to answer this question.

1. How many women are in the sample, and how many report smoking during pregnancy?

2. What is the average number of cigarettes smoked per day? Is the average a good measure of the “typical” woman in this case? Explain.

3. Among women who smoked during pregnancy, what is the average number of cigarettes smoked per day? How does this compare with your answer from part 2, and why?

4. Find the average of fatheduc in the sample. Why are only 1,192 observations used to compute this average?

5. Report the average family income and its standard deviation in dollars.
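The kinds of calculation asked for in Question 3 (a full-sample average, an average over a subgroup, and an average over only the non-missing values of a variable) might be sketched as follows. The records are hypothetical toy values, not taken from BWGHT.XLS, and the column name `cigs` is an assumption; only `fatheduc` is named in the question.

```python
from statistics import mean

# Hypothetical toy records, NOT real data from BWGHT.XLS.
# None marks a value that was not reported.
records = [
    {"cigs": 0, "fatheduc": 12},
    {"cigs": 0, "fatheduc": None},
    {"cigs": 10, "fatheduc": 16},
    {"cigs": 0, "fatheduc": 14},
    {"cigs": 20, "fatheduc": None},
]

# Full-sample average of cigarettes per day (non-smokers count as zero).
cigs = [r["cigs"] for r in records]
mean_all = mean(cigs)

# Average restricted to the subgroup with cigs > 0.
smokers = [c for c in cigs if c > 0]
mean_smokers = mean(smokers)

# Average of fatheduc over the observations where it is reported.
fatheduc_obs = [r["fatheduc"] for r in records if r["fatheduc"] is not None]
mean_fatheduc = mean(fatheduc_obs)
n_used = len(fatheduc_obs)

print(f"all women: {mean_all}, smokers only: {mean_smokers}")
print(f"fatheduc mean: {mean_fatheduc} from {n_used} of {len(records)} records")
```

With the real file, the same logic would apply after loading the spreadsheet into a list of records or a data frame; how missing entries are encoded in the file would need to be checked first.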