Interpreting your Model's Slopes


Fitting a LinearRegression() model to our training features matrix, which includes the dummy variables, gives the following multiple linear regression equation.

$\hat{price}=-77.98+9.24accommodates+86.17bedrooms+12.10beds$
$\qquad -12.47neighborhood_{Logan\_Square}+103.03neighborhood_{Near\_North\_Side}$
$\qquad +47.45neighborhood_{Near\_West\_Side}+35.58neighborhood_{West\_Town}+14.62roomtype_{private\_room}$

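import pandas as pd
from sklearn.linear_model import LinearRegression
# Fit the multiple linear regression model on the dummy-encoded training features
main_model = LinearRegression()
main_model.fit(X_train_dummies, y_train)
# Show the fitted intercept, then each slope labeled by its feature name
print('intercept: ', main_model.intercept_)
print('slopes:')
pd.DataFrame(main_model.coef_.T, index=X_train_dummies.columns)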
    intercept:  -77.97650910336392
    slopes:
                                           0
    accommodates                    9.242633
    bedrooms                       86.169861
    beds                           12.101156
    neighborhood_Logan Square     -12.471008
    neighborhood_Near North Side  103.034109
    neighborhood_Near West Side    47.451976
    neighborhood_West Town         35.581501
    room_type_Private room         14.624617

Numerical Explanatory Variable Slopes

Let's try putting the bedrooms slope 86.17 into words.

Question: Do you think the interpretation that we are making below is valid?

"If we add one more bedroom to an Airbnb, then the price of the listing will increase by $86.17."

There are actually a few issues with this interpretation.

  1. First, consider the fact that this dataset was collected without any kind of random assignment. Thus, any interpretation of your statistical analysis should not imply a causal relationship between your variables. (The phrase "will increase" implies a causal relationship.)
  2. Second, remember that there are now many other variables included in this model. The interpretation makes no mention of what is going on with these variables.
  3. Third, this interpretation implies that this impact will definitely happen in real life. It makes no mention that this is merely our model's prediction of what may happen on average, and not reality.

To make a more mathematically valid and insightful interpretation, we can ask which manipulations of our linear regression equation would isolate this 86.17 slope. What we can do is find the difference between:

  • predicted price of a listing with a certain number of bedrooms plus one more
  • predicted price of a listing with a certain number of bedrooms

and then hold every other variable in this difference constant. We get some nice cancellations when we do this below, such that the bedrooms slope 86.17 is isolated.

$\hat{price}_{bedrooms+1}-\hat{price}_{bedrooms}$

$= \Big( -77.98+9.24accommodates+\mathbf{86.17(bedrooms+1)}+12.10beds$
$\qquad -12.47neighborhood_{Logan\_Square}+103.03neighborhood_{Near\_North\_Side}$
$\qquad +47.45neighborhood_{Near\_West\_Side}+35.58neighborhood_{West\_Town}+14.62roomtype_{private\_room} \Big)$
$\qquad - \Big(-77.98+9.24accommodates+\mathbf{86.17(bedrooms)}+12.10beds$
$\qquad -12.47neighborhood_{Logan\_Square}+103.03neighborhood_{Near\_North\_Side}$
$\qquad +47.45neighborhood_{Near\_West\_Side}+35.58neighborhood_{West\_Town}+14.62roomtype_{private\_room} \Big)$

$= \mathbf{86.17}$
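We can also check this cancellation numerically. The sketch below is only an illustration: it hard-codes the intercept and slopes, rounded to two decimals from the model output above, and the helper `predict_price` and the example listing are hypothetical names and values, not part of the original analysis.

```python
# Intercept and slopes rounded to two decimals from the fitted model output.
INTERCEPT = -77.98
SLOPES = {
    "accommodates": 9.24,
    "bedrooms": 86.17,
    "beds": 12.10,
    "neighborhood_Logan Square": -12.47,
    "neighborhood_Near North Side": 103.03,
    "neighborhood_Near West Side": 47.45,
    "neighborhood_West Town": 35.58,
    "room_type_Private room": 14.62,
}


def predict_price(listing):
    """Evaluate the regression equation; any variable not given counts as 0."""
    return INTERCEPT + sum(
        slope * listing.get(name, 0) for name, slope in SLOPES.items()
    )


# Hypothetical listing (values invented purely for illustration).
listing = {"accommodates": 4, "bedrooms": 2, "beds": 3, "neighborhood_West Town": 1}

# The same listing with one more bedroom and every other variable held constant.
one_more_bedroom = {**listing, "bedrooms": listing["bedrooms"] + 1}

difference = predict_price(one_more_bedroom) - predict_price(listing)
print(round(difference, 2))
```

Changing any of the other values in `listing` leaves the printed difference unchanged, because those terms cancel exactly as they do in the algebra.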

Formal Definition of the numerical explanatory variable slope

Thus, when interpreting a given numerical explanatory variable slope in a multiple linear regression model, the following template is a valid one to use.

"All else held equal, by increasing the given explanatory variable by 1, we expect the predicted response variable to increase/decrease by $|\hat{\beta}_i |$ on average."

  • We say "increase" if $\hat{\beta}_i>0$, and "decrease" if $\hat{\beta}_i<0$.
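As a small illustration, the template can be filled in programmatically: the sign of the slope picks the verb and its absolute value gives the size of the change. The function name `interpret_numeric_slope` and its parameters are hypothetical, and this sketch does not treat a slope of exactly 0 as a special case.

```python
def interpret_numeric_slope(variable, slope, response="response variable"):
    """Fill in the interpretation template for a numerical explanatory variable slope."""
    # The sign of the slope decides the verb; its absolute value is the magnitude.
    direction = "increase" if slope > 0 else "decrease"
    return (
        f"All else held equal, by increasing {variable} by 1, we expect the "
        f"predicted {response} to {direction} by {abs(slope):.2f} on average."
    )


print(interpret_numeric_slope("bedrooms", 86.17, response="price"))
```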

Indicator Variable Slopes

Let's try putting the slope for $neighborhood_{Logan\_Square}$ -12.47 into words. Let's think about how we might isolate this slope.

What we can do is find the difference between:

  • predicted price of a Logan Square listing
  • predicted price of a Lake View listing (i.e., the reference level of the neighborhood categorical explanatory variable)

and then hold every other variable in this difference constant. We get some nice cancellations when we do this below, such that the slope -12.47 is isolated.

$\hat{price}_{Logan\_Square} - \hat{price}_{Lake\_View}$

$=\Big(-77.98+9.24accommodates+86.17bedrooms+12.10beds$
$\qquad \mathbf{-12.47(1)+103.03(0)+47.45(0)+35.58(0)}+14.62roomtype_{private\_room}\Big)$
$\qquad - \Big(-77.98+9.24accommodates+86.17bedrooms+12.10beds$
$\qquad \mathbf{-12.47(0)+103.03(0)+47.45(0)+35.58(0)}+14.62roomtype_{private\_room}\Big)$

$= \mathbf{-12.47}$
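The same comparison works for every neighborhood indicator, not just Logan Square. The sketch below repeats the calculation for each non-reference neighborhood, using coefficients rounded to two decimals from the model output. The function `predict_price` and the fixed listing characteristics are hypothetical and chosen only for illustration.

```python
# Coefficients rounded to two decimals from the fitted model output.
INTERCEPT = -77.98
NUMERIC_SLOPES = {"accommodates": 9.24, "bedrooms": 86.17, "beds": 12.10}
NEIGHBORHOOD_SLOPES = {
    "Logan Square": -12.47,
    "Near North Side": 103.03,
    "Near West Side": 47.45,
    "West Town": 35.58,
}
PRIVATE_ROOM_SLOPE = 14.62


def predict_price(accommodates, bedrooms, beds, neighborhood, private_room):
    """Evaluate the regression equation for one listing."""
    numeric_part = (
        NUMERIC_SLOPES["accommodates"] * accommodates
        + NUMERIC_SLOPES["bedrooms"] * bedrooms
        + NUMERIC_SLOPES["beds"] * beds
    )
    # Lake View is the reference level: it has no indicator, so it contributes 0.
    neighborhood_part = NEIGHBORHOOD_SLOPES.get(neighborhood, 0.0)
    return INTERCEPT + numeric_part + neighborhood_part + PRIVATE_ROOM_SLOPE * private_room


# Hypothetical characteristics, held constant across every comparison.
fixed = dict(accommodates=2, bedrooms=1, beds=1, private_room=0)

reference_price = predict_price(neighborhood="Lake View", **fixed)
gaps = {
    name: predict_price(neighborhood=name, **fixed) - reference_price
    for name in NEIGHBORHOOD_SLOPES
}

for name, gap in gaps.items():
    print(f"{name} vs. Lake View: {gap:+.2f}")
```

Each printed gap matches the slope of the corresponding indicator variable, whatever values we pick for `fixed`.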

Formal Definition of the indicator variable slope

Thus, when interpreting a given indicator variable slope in a multiple linear regression model, the following template is a valid one to use.

"All else held equal, we expect the predicted response variable value that corresponds to the given indicator variable level to be $|\hat{\beta}_i |$ higher/lower than the reference level, on average."

  • We say "higher" if $\hat{\beta}_i>0$, and "lower" if $\hat{\beta}_i<0$.

Intercept

Finally, we can get the intercept of a regression model itself by plugging in all 0's for the numerical explanatory variables and indicator variables.

$\hat{price}$
$=-77.98+9.24accommodates+86.17bedrooms+12.10beds-12.47neighborhood_{Logan\_Square}$
$\qquad +103.03neighborhood_{Near\_North\_Side}+47.45neighborhood_{Near\_West\_Side}$
$\qquad +35.58neighborhood_{West\_Town}+14.62roomtype_{private\_room}$
$=-77.98+9.24(0)+86.17(0)+12.10(0)-12.47(0)$
$\qquad+103.03(0)+47.45(0)$
$\qquad+35.58(0)+14.62(0)$
$= -77.98$
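A quick numerical version of this substitution is sketched below. The coefficients are rounded to two decimals from the model output, and the variable names are hypothetical.

```python
# Intercept and slopes rounded to two decimals from the fitted model output,
# in the same order as the terms of the regression equation.
intercept = -77.98
slopes = [9.24, 86.17, 12.10, -12.47, 103.03, 47.45, 35.58, 14.62]

# Plug in 0 for every numerical explanatory variable and every indicator variable.
all_zero_listing = [0] * len(slopes)

predicted = intercept + sum(b * x for b, x in zip(slopes, all_zero_listing))
print(predicted)
```

Every slope term vanishes, so the prediction for this all-zero listing is the intercept itself.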

Formal Definition of the Intercept

Thus, when interpreting the intercept in a multiple linear regression model, the following template is a valid one to use.

"We expect the predicted response variable value that corresponds to the observation in which all explanatory and indicator variable values are 0 to be our intercept value, on average."

Example: We would say that we expect the predicted price of a Chicago Airbnb listing in Lake View that is an entire house/apartment, accommodates 0 people, has 0 beds, and has 0 bedrooms to be -$77.98. Of course, this negative value is nonsensical from an interpretation perspective.