Answer all five questions (all parts) clearly and concisely. You must show all your calculations and work and ATTACH ALL XLSTAT WORKSHEETS. No credit will be given for answers without detailed work and the XLSTAT worksheets.

The exam is open book and open notes, but closed to friends, colleagues, classmates, and professors. Plagiarism tools will be used to detect similarities between exams. You are NOT to discuss or share the exam and/or your answers with anyone. Any deviation from this policy will be considered academic dishonesty and will automatically result in an “F” grade, without exception.

The answers must be TYPED and TURNED IN as a WORD DOCUMENT (except for diagrams and mathematical expressions/equations, if needed). Use the CANVAS TURNITIN tool to submit it; go to the Assignment section to access it. Do not submit the exam until you are sure, because you can submit it only ONCE. Make sure to email me all the Excel worksheets as well as the exam. The exam is due back by 4:00 P.M. on Saturday, October 16. If you need to get in touch with me for any reason, you must use your STU email for all correspondence. You can also reach me at gupta505@icloud.com

Good Luck!

1. Look at the School Data.

Race: 1 = White; 2 = Black; 3 = Hispanic; 4 = Other

Gender: 0 = Male; 1 = Female

Socioeconomic Status (SES): 1 = Low; 2 = Medium; 3 = High

School Type (sctyp): 1 = Public; 2 = Private

a) Identify the types of variables (nominal, scale etc.) in the dataset.

b) How were students' average READING SCORES related to their WRITING SCORES? Make a graph, run the test, and report the results as a sentence.

c) Compare students' average MATH TEST SCORES across Public vs. Private schools (sctyp). Show the means in a graph, run the test, and report the results as a sentence.

d) Were the students' SCIENCE TEST SCORES different across SOCIOECONOMIC LEVELS (Low, Medium, High)? Show a bar graph, run the test, and report which group was highest and which was lowest.

e) Were the students' mean CIVICS test scores different across GENDER and across SOCIOECONOMIC STATUS? Show a bar graph, run the test, and report (explain) the significant results.

f) Analyze whether “Locus of Control” is a predictor of student MATH scores. Now add Socioeconomic Status, School Type, and Gender to the model alongside Locus of Control. What changed in this regression model with the added predictor variables? Explain in detail.

2. Suppose Frank has estimated a cross-sectional regression model of the demand for gasoline by state:

PCONi = 389.6 + 60.8 UHMi - 36.5 TAXi - 0.061 REGi

t-statistics: UHM = 5.92; TAX = -2.77; REG = -1.43

N = 50; R² = 0.919

Where:

PCONi = petroleum consumption in the ith state (trillions of BTUs).

UHMi = urban highway miles within the ith state.

TAXi = the gasoline tax rate in the ith state (cents per gallon).

REGi = motor vehicle registrations in the ith state (in thousands).

a) What do you expect the signs of the explanatory variables to be? Explain why.

b) According to the estimated equation, the motor vehicle registrations variable is insignificant and has a negative sign. Does this make sense to you? Why or why not? Explain carefully.

c) Suppose the simple correlation coefficient between REG and UHM is 0.98. What do you infer from that? In light of this added information, what, if anything, would you do, and why? What would you expect to find?
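The arithmetic behind parts b) and c) can be sketched in a few lines. This is only an illustration under stated assumptions: the critical value 2.013 is the two-tailed 5% value read from a standard t table for 46 degrees of freedom, and the variable names are made up for this sketch. It uses the fact that the R² from regressing REG on all the other predictors is at least the squared pairwise correlation, so r = 0.98 gives a lower bound on the variance inflation factor (VIF) for REG.

```python
# Values quoted in question 2.
n_obs = 50
n_slopes = 3
t_stats = {"UHM": 5.92, "TAX": -2.77, "REG": -1.43}
r_reg_uhm = 0.98

# Assumption: two-tailed 5% critical value from a standard t table (46 df).
T_CRIT_46 = 2.013

# Degrees of freedom: observations minus slopes minus the intercept.
dof = n_obs - n_slopes - 1

# A coefficient is flagged as significant when |t| exceeds the critical value.
significant = {name: abs(t) > T_CRIT_46 for name, t in t_stats.items()}

# Lower bound on the VIF for REG implied by its correlation with UHM alone.
vif_lower_bound = 1.0 / (1.0 - r_reg_uhm ** 2)

print("degrees of freedom:", dof)
for name, flag in significant.items():
    print(f"{name}: t = {t_stats[name]:+.2f}, significant at 5%: {flag}")
print(f"VIF for REG is at least {vif_lower_bound:.2f}")
```

The sketch only reproduces the mechanics; the interpretation asked for in parts b) and c) still has to be argued in words.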

3. a) The One-Sample t test: This t-test is used only to see if a mean differs from a hypothesized value. This value would have to be known, or given to you.

Enter these data in XLSTAT:

Home Sale Prices

 1    $99,000
 2    $109,000
 3    $85,900
 4    $116,200
 5    $101,250
 6    $88,490
 7    $86,980
 8    $102,000
 9    $94,500
10    $112,000
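As a cross-check on the XLSTAT output, the one-sample t statistic for the home sale prices above can be computed by hand as t = (sample mean - hypothesized value) / (s / sqrt(n)). The sketch below is a minimal illustration, not a substitute for the required worksheets: the function name `one_sample_t` is hypothetical, and the critical value 2.262 is assumed from a standard t table (two-tailed, 5%, 9 degrees of freedom).

```python
import math
import statistics

# Home sale prices from the table above.
prices = [99000, 109000, 85900, 116200, 101250,
          88490, 86980, 102000, 94500, 112000]

# Assumption: two-tailed 5% critical value from a standard t table (9 df).
T_CRIT_9 = 2.262


def one_sample_t(sample, mu0):
    """Return the one-sample t statistic for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample SD (n - 1 in the denominator)
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


for mu0 in (100_000, 95_000):
    t = one_sample_t(prices, mu0)
    verdict = "reject H0" if abs(t) > T_CRIT_9 else "fail to reject H0"
    print(f"H0 mean = {mu0}: t = {t:.3f} -> {verdict} at the 5% level")
```

Comparing |t| with a tabulated critical value is equivalent to checking whether the two-tailed p-value falls below .05; XLSTAT reports the p-value directly.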

i) Is the mean home sale price significantly different from $100,000 at the p < .05 level?

ii) Is the mean home sale price significantly different from $95,000 at the p < .05 level?

3. b) The Dependent (Paired) t-test

ID#    Morning (Time 1)    Afternoon (Time 2)
 1     89.8                87.3
 2     90.2                87.6
 3     98.1                87.3
 4     91.2                91.8
 5     88.9                86.4
 6     90.3                83.4
 7     99.2                91.0
 8     94.0                89.2
 9     88.7                90.1
10     83.9                81.3

In this example, the same group of 10 students took a test in the morning and then took the test again in the afternoon. THERE IS ONLY ONE GROUP, but TWO TIMES OF TESTING. THIS IS KNOWN AS A “WITHIN-SUBJECT DESIGN”. It is also called a “Longitudinal Design”, where the same subjects are tested repeatedly over time. Therefore, this design is also known as “Repeated Measures”.

i) Compare the means for these two variables using a LINE GRAPH. Analyze whether the students did better on their AM test or their PM test.

ii) RUN this “Dependent or Paired Samples t-test”.

iii) Write a sentence telling the reader what test you used and the result; report the means and SDs, and list the t-value and the p level. From this evidence, does it appear that time of day made a difference in test performance?

4. Refer to the GDP Data and answer the following questions.

a) Find the descriptive statistics and make a table that includes the mean, standard deviation, maximum, and minimum of each variable. Label your table appropriately.

b) Find the correlation coefficients. Do the signs of the correlation coefficients make sense to you? Why or why not? You must discuss the correlation coefficient for each pair of variables.

c) If you are to estimate business cycle fluctuation, i.e., how GDP per capita changes over time, what would be the best dependent variable? What are the independent variables?

d) Run a regression and write the model. Is this model statistically significant? Explain how reliable your results are, referring to both the overall fit of the model and the individual coefficients.

f) Do the signs of the regression coefficients make sense to you? Explain.

g) How are they different from the signs you found in part b? Can you explain why?

h) If the unemployment rate is 7% and the percentage of debt is 20%, what would be the GDP growth rate? Find the 95% confidence interval for the estimated GDP growth rate.
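For part h), one common way to build the interval is sketched below, assuming an ordinary least squares fit with k predictors on n observations. The notation is introduced here for illustration only: x_0 is the row vector of predictor values (with a leading 1 for the intercept), X is the design matrix, and s is the regression standard error. The values 7 and 20 must be entered in the same units as the GDP data, and this formula gives the interval for the mean response, which is narrower than a prediction interval for a single observation.

```latex
\hat{y}_0 = x_0^{\top}\hat{\beta},
\qquad
s^2 = \frac{\mathrm{SSE}}{n - k - 1},
\qquad
\hat{y}_0 \;\pm\; t_{0.025,\; n-k-1}\; s\,\sqrt{x_0^{\top}\,(X^{\top}X)^{-1}\,x_0}
```

Check which of the two intervals XLSTAT reports in its prediction output before quoting it.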