Link to source file

Just for fun (1 min)


Discuss together (5-10 min)

As an entire section, discuss these together:

  1. What are the advantages of using multiple linear vs single linear regression, especially regarding interpreting the estimated coefficients?
  2. What are the main assumptions you need to check for multiple linear regression that could suggest a poor fit, and how do these compare to simple linear regression?


Exercise

Today we’re going to revisit the mtcars dataset and analyze it using multiple linear regression. Note this is a built-in dataset provided as part of the datasets package in R.

As usual, break off into groups of 3-4 students. In your group, nominate one person to share their screen.

Background

Run ?(mtcars) in the console (do NOT add it to this Rmd file) and briefly read the help page. Specifically, take note of the following:

  1. What is the source of this data?
  2. What is this dataset measuring? (i.e. what is the response variable?)
  3. What predictors are available and what do they mean?

Feel free to also run head(mtcars, 10) or View(mtcars) to inspect the data frame briefly before moving on.

Fitting

Uncomment the line below and finish it. Specifically, use lm to run a regression of mpg on all other predictors (an easy way to do this is to use mpg ~ . as the first argument). Make sure to also include data = mtcars as an argument or it won’t know where to get the variable names from.

# lm.mtcars = lm(...)

View a summary of the regression by uncommenting and running the line below

# summary(lm.mtcars)

Briefly inspect the residuals plot by running plot(lm.mtcars,which=1:2) . What do you observe, and what does it mean?

REPLACE TEXT WITH RESPONSE

Interpretation

Uncomment the line below to get the estimated coefficients along with their standard errors.

# summary(lm.mtcars)$coefficients[,1:2]

Give an interpretation of the estimate and standard error for one of these predictor variables. Be careful in your wording of the interpretation.

REPLACE TEXT WITH RESPONSE

What does the intercept here mean? (Except for special situations, we generally don’t care much about the intercept, but you should still understand what it means.)

REPLACE TEXT WITH RESPONSE

Karl doesn’t like the R² statistic, but what is the R² for this model? (Hint: look at the output of summary) Give an interpretation of this value.

REPLACE TEXT WITH RESPONSE

Briefly read about the adjusted R² here. What is the adjusted R² of this model and how does this differ from the normal R² value? (Hint: again, look at the output of summary).

REPLACE TEXT WITH RESPONSE

Generate \(95\%\) confidence intervals for the coefficients using the confint function. Give an interpretation of these confidence intervals.

# confint(...)

REPLACE TEXT WITH RESPONSE

Prediction

According to the model, what mileage would I expect on average with a car that has 6 cylinders, 200 displacement, 120 horsepower, 3.4 rear axle ratio, 2500 pounds, 17.5 1/4 mile time, a straight engine, automatic transmission, 4 forward gears, and 3 carburetors? (Be careful of your units and how you denote the engine and transmission variables. Again, the help page ?mtcars may be very helpful here.)


Submission

As usual, make sure the names of everyone who worked on this with you is included in the header of this document. Then, knit this document and submit both this file and the HTML output on Canvas under Assignments ⇒ Discussion 9.