As an entire section, discuss these together:
How do you interpret an interaction term? (For example, if TV and radio advertising spending were found to have a significant interaction term with estimate 0.0011 when predicting sales, what does the 0.0011 physically mean?) Can you give 2 different interpretations?
Remember that adding any term to your lm() formula will never increase the RSS, and will almost always decrease it, even if the term is useless (like Karl said, you can test this by adding a randomly-generated column to any regression). How then do you test if adding an effect is significant?
How do you include a higher order term in a regression, and what does it represent? (For example, if TV advertising was found to have a significant quadratic term with estimate -0.002, what does it physically mean?)
What is the hierarchy principle, what does it mean, and why should we follow it?
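To make the RSS question above concrete, here is a minimal sketch. It uses the built-in mtcars data (not the modified data for this exercise) and a made-up random column called noise; the seed and the column name are arbitrary choices for illustration. It first checks that the RSS does not go up when the useless column is added, then compares the two nested models with an F-test via anova().

```r
# Sketch: a useless predictor still does not raise RSS, so compare nested models with an F-test.
set.seed(1)                                  # arbitrary seed, only for reproducibility
cars <- mtcars[, c("mpg", "cyl", "disp", "hp")]
cars$noise <- rnorm(nrow(cars))              # hypothetical random column, unrelated to mpg

fit_small <- lm(mpg ~ cyl + disp + hp, data = cars)
fit_big   <- lm(mpg ~ cyl + disp + hp + noise, data = cars)

rss_small <- sum(resid(fit_small)^2)
rss_big   <- sum(resid(fit_big)^2)
print(c(small = rss_small, big = rss_big))   # the bigger model's RSS is never larger

# F-test for nested models: is the drop in RSS larger than chance alone would explain?
f_test <- anova(fit_small, fit_big)
print(f_test)
```

When only one term is added, the p-value in the second row of this table matches the t-test p-value for that term in summary(fit_big); anova() becomes more useful when several terms are added at once.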
Remember that interactions can exist between any combination (2 or more) of categorical and quantitative variables. If you feel like you have a good understanding of how to interpret each of these cases, feel free to move on to the next section; if you don’t, briefly reading this page is highly recommended (pay close attention to the difference in wording for each interpretation and how that difference is reflected in the plots).
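If it helps to see the arithmetic behind the interaction and quadratic examples above, here is a small sketch. Only the estimates 0.0011 and -0.002 come from the questions; every other coefficient below is made up purely for illustration, and the function names are hypothetical.

```r
# Interaction: only b_int = 0.0011 is from the text; b0, b_tv, b_radio are made up.
b0 <- 5; b_tv <- 0.02; b_radio <- 0.03; b_int <- 0.0011
sales_hat <- function(tv, radio) b0 + b_tv * tv + b_radio * radio + b_int * tv * radio

# Slope of predicted sales in TV, holding radio fixed at a given level
tv_slope <- function(radio) sales_hat(1, radio) - sales_hat(0, radio)
# Slope of predicted sales in radio, holding TV fixed at a given level
radio_slope <- function(tv) sales_hat(tv, 1) - sales_hat(tv, 0)

# Interpretation 1: one more unit of radio changes the TV slope by b_int
print(tv_slope(11) - tv_slope(10))
# Interpretation 2 (symmetric): one more unit of TV changes the radio slope by b_int
print(radio_slope(101) - radio_slope(100))

# Quadratic: q2 = -0.002 is from the text; q1 is made up.
q1 <- 0.5; q2 <- -0.002
quad_slope <- function(tv) q1 + 2 * q2 * tv   # derivative of q1 * tv + q2 * tv^2
# The TV slope itself changes by 2 * q2 for each extra unit of TV (diminishing returns)
print(quad_slope(51) - quad_slope(50))
```

The two interaction differences are the same number because the product term is symmetric in TV and radio; which wording you pick depends on which variable you treat as the one being "modified".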
For the exercise, I have modified mtcars by reducing it to the first four columns (mpg is still the dependent variable; your predictor variables are now cyl, disp, and hp), and adding in some combination of significant interactions and/or quadratic terms.
Run the code below to import the new modified data frame, then fit a complete model with ALL interaction and quadratic terms (it is not uncommon to add all interaction terms, but you usually wouldn’t add all quadratic terms like this unless you had a good reason; we’re just doing it here for the sake of the exercise).
library(tibble) # for as_tibble(); already attached if tidyverse is loaded
mtcars2 = as_tibble(read.csv( # first use base R read.csv, then convert to tibble
  row.names = 1, # row.names=1 means treat first column as row names
  # text='....' means use this string of text as the data
  text = ",mpg,cyl,disp,hp
Mazda RX4,14.8,6,160,110
Mazda RX4 Wag,14.8,6,160,110
Datsun 710,14.5,4,108,93
Hornet 4 Drive,21.4,6,258,110
Hornet Sportabout,28.1,8,360,175
Valiant,15.7,6,225,105
Duster 360,23.7,8,360,245
Merc 240D,17.6,4,147,62
Merc 230,15.8,4,141,95
Merc 280,13.4,6,168,123
Merc 280C,12,6,168,123
Merc 450SE,17.8,8,276,180
Merc 450SL,18.7,8,276,180
Merc 450SLC,16.6,8,276,180
Cadillac Fleetwood,33.8,8,472,205
Lincoln Continental,32.1,8,460,215
Chrysler Imperial,33.7,8,440,230
Fiat 128,23.3,4,78.7,66
Honda Civic,21.3,4,75.7,52
Toyota Corolla,24.7,4,71.1,65
Toyota Corona,13.7,4,120,97
Dodge Challenger,20.7,8,318,150
AMC Javelin,19.1,8,304,150
Camaro Z28,21.7,8,350,245
Pontiac Firebird,33.2,8,400,175
Fiat X1-9,18.2,4,79,66
Porsche 914-2,18.2,4,120,91
Lotus Europa,21.8,4,95.1,113
Ford Pantera L,24.3,8,351,264
Ferrari Dino,12.9,6,145,175
Maserati Bora,18.6,8,301,335
Volvo 142E,13.6,4,121,109"
))
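One possible way to write a formula with all two-way interactions and all quadratic terms is sketched below. To keep the snippet self-contained it uses the unmodified built-in mtcars as a stand-in, and the object names cars and fit_full are my own; swap in the modified data frame when you do the exercise.

```r
# Stand-in data: the unmodified built-in mtcars, reduced to the same four columns.
cars <- mtcars[, c("mpg", "cyl", "disp", "hp")]

# (cyl + disp + hp)^2 expands to all main effects plus all two-way interactions.
# I() is needed so that ^2 means "square the variable" instead of formula crossing.
fit_full <- lm(mpg ~ (cyl + disp + hp)^2 + I(cyl^2) + I(disp^2) + I(hp^2),
               data = cars)

# List the terms that were actually fit
print(names(coef(fit_full)))
```

Writing the interactions this way keeps every main effect in the model automatically, which is in line with the hierarchy principle. If you also want the three-way interaction, cyl * disp * hp expands to all main effects, all two-way interactions, and cyl:disp:hp.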
After importing the new mtcars data, repeat all interpretation and prediction steps of discussion 9. Specifically, answer the following questions:
What are the coefficient estimates and standard errors? Give an interpretation of one of the new terms added in this discussion.
REPLACE TEXT WITH RESPONSE
What are the R² and adjusted R² for this model? Give an interpretation of both.
REPLACE TEXT WITH RESPONSE
Give \(95\%\) confidence intervals for all coefficients. Give an interpretation of one of the new intervals.
REPLACE TEXT WITH RESPONSE
According to the model, what mileage would I expect on average with a car that has 6 cylinders, 200 displacement, and 120 horsepower? Be careful in your calculation with the interactions and quadratic terms.
REPLACE TEXT WITH RESPONSE
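As a reminder of the mechanics for these four questions, here is a sketch on a stand-in model. It uses the unmodified built-in mtcars and an arbitrary formula that I picked only to include one interaction and one quadratic term, so the numbers it produces are not the answers to this exercise.

```r
cars <- mtcars[, c("mpg", "cyl", "disp", "hp")]            # stand-in for the modified data
fit <- lm(mpg ~ cyl + disp * hp + I(hp^2), data = cars)    # arbitrary example model

s <- summary(fit)
print(s$coefficients[, c("Estimate", "Std. Error")])       # estimates and standard errors
print(c(r2 = s$r.squared, adj_r2 = s$adj.r.squared))       # R-squared and adjusted R-squared
print(confint(fit, level = 0.95))                          # 95% confidence intervals

# predict() rebuilds the interaction and squared columns from the raw inputs
new_car <- data.frame(cyl = 6, disp = 200, hp = 120)
pred <- predict(fit, newdata = new_car)

# Manual check: plug the raw values into every term, including disp:hp and hp^2
b <- coef(fit)
manual <- b["(Intercept)"] + b["cyl"] * 6 + b["disp"] * 200 + b["hp"] * 120 +
  b["I(hp^2)"] * 120^2 + b["disp:hp"] * 200 * 120
print(all.equal(unname(pred), unname(manual)))
```

The manual check is where mistakes usually happen: the interaction coefficient multiplies the product 200 * 120, and the quadratic coefficient multiplies 120^2, not 120.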
As usual, make sure the names of everyone who worked on this with you are included in the header of this document. Then, knit this document and submit both this file and the HTML output on Canvas under Assignments ⇒ Discussion 10.