Slope and Intercept Interpretations


The way we interpret the intercept and slopes of a logistic regression model is similar to how we interpret the intercept and slopes of a linear regression model; however, there are some key distinctions as well.

Why is interpretation language important?

As with the interpretations we make for a linear regression model, we want our language to be precise and thorough so as not to mislead non-technical audiences.

Non-Causal Language

Just like in a linear regression model, unless our dataset was collected via random assignment (i.e., the data did not come from an observational study), we want to make sure that our interpretations do not imply a causal relationship between our explanatory variables and the response variable.

Instead, we should use language like "we expect" and "on average" when describing how a change in an explanatory variable value impacts the response variable value in the model.

Considering the Effect of Other Explanatory Variables

Also similar to a multiple linear regression model, when we are interpreting the slope of just a single explanatory/indicator variable in a multiple logistic regression model, we want to hold all other explanatory/indicator variable values equal/constant. To express this, our interpretation should also stipulate "all else held equal/constant."

Intercept Interpretation

Let's use the logistic regression equation version that predicts the odds of success to interpret our intercept $\hat{\beta}_0$.

$\hat{odds}=\frac{\hat{p}}{1-\hat{p}}=e^{\hat{\beta}_0+\hat{\beta}_1x_1+...+\hat{\beta}_px_p}$

If we plug in 0's for all of our explanatory/indicator variables in the equation above, then we are able to isolate $e^{\hat{\beta}_0}$ by itself.

$\hat{odds}=\frac{\hat{p}}{1-\hat{p}}=e^{\hat{\beta}_0+\hat{\beta}_1(0)+...+\hat{\beta}_p(0)}=e^{\hat{\beta}_0}$

So typically rather than interpreting $\hat{\beta}_0$ by itself, we instead interpret $e^{\hat{\beta}_0}$ and we call it the baseline odds. It represents the predicted odds of success for the observation in which $x_1=0,...,x_p=0$.

Example: Let's interpret the baseline odds $e^{\hat{\beta}_0}$ of our fitted logistic regression equation below.

\begin{align*} \hat{odds} = \frac{\hat{p}}{1-\hat{p}} =\exp\left(\begin{aligned} &-37.47 \\ &+ 30.73(\,\text{has a profile pic}[T.yes]) \\ &+ 2.60(\,\text{number of words in name}) \\ &+ 0.087(\,\text{num characters in bio}) \\ &- 0.0060(\,\text{number of posts}) \\ &+ 0.025(\,\text{number of followers}) \\ &- 0.0046(\,\text{number of follows}) \end{aligned}\right) \end{align*}

By exponentiating our intercept $e^{-37.47}=5.33\times 10^{-17}$ we can make the following interpretations.

Numerical Odds

"We predict the odds of an Instagram account without a profile picture that has no words in their name, no characters in their bio, no posts, no followers, and does not follow anyone being real are $5.33\times 10{-17}$."

Prose Odds

The above interpretation used the numerical odds. We can also convert this numerical odds interpretation to a prose odds interpretation as follows.

$5.33\times 10^{-17}=\frac{5.33\times 10^{-17}\ \text{real chances}}{1\ \text{fake chance}} =\frac{10^{17}}{10^{17}}\cdot\frac{5.33\times 10^{-17}\ \text{real chances}}{1\ \text{fake chance}} = \frac{5.33\ \text{real chances}}{10^{17}\ \text{fake chances}}$.

"We predict the odds of an Instagram account without a profile picture that has no words in their name, no characters in their bio, no posts, no followers, and does not follow anyone being real are 5.33 to $10^{17}$."

Or in other words, the model predicts this type of "low activity" account is very likely fake.

import numpy as np

np.exp(-37.47)

5.333174121454362e-17
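
Since $\hat{p}=\frac{\hat{odds}}{1+\hat{odds}}$, we can also convert this baseline odds back into a predicted probability. A quick sketch of that conversion (not part of the original output above) shows the predicted probability is essentially zero, matching the "very likely fake" reading.

baseline_odds = np.exp(-37.47)
baseline_odds / (1 + baseline_odds)    # predicted probability p = odds/(1+odds), ~5.33e-17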

Numerical Explanatory Variable Slope Interpretation

Let's also use the logistic regression equation version that predicts the odds of success to interpret a slope $\hat{\beta}_i$ that corresponds to a numerical explanatory variable.

$\hat{odds}=\frac{\hat{p}}{1-\hat{p}}=e^{\hat{\beta}_0+\hat{\beta}_1x_1+...+\hat{\beta}_ix_i+...+\hat{\beta}_px_p}$

Using the exponent property $e^{a+b}=e^a\cdot e^b$, the equation above is equivalent to the following.

$\hat{odds}=\frac{\hat{p}}{1-\hat{p}}=e^{\hat{\beta}_0}\cdot e^{\hat{\beta}_1x_1}\cdot...\cdot e^{\hat{\beta}_ix_i}\cdot...\cdot e^{\hat{\beta}_px_p}$

Specifically, let's create two odds predictions: $\hat{odds}_{old}$ and $\hat{odds}_{new}$. We hold all other explanatory/indicator variables equal in these two equations except for $x_i$. In $\hat{odds}_{new}$, we add 1 additional unit to $x_i$.
  • $\hat{odds}_{old}=e^{\hat{\beta}_0}\cdot e^{\hat{\beta}_1x_1}\cdot...\cdot e^{\hat{\beta}_ix_i}\cdot...\cdot e^{\hat{\beta}_px_p}$

  • $\hat{odds}_{new}=e^{\hat{\beta}_0}\cdot e^{\hat{\beta}_1x_1}\cdot...\cdot e^{\hat{\beta}_i(x_i+1)}\cdot...\cdot e^{\hat{\beta}_px_p}$

Let's take the ratio of these two odds terms. This ratio gives us some nice cancellation that isolates the $e^{\hat{\beta}_i}$ term.

$\frac{\hat{odds}_{new}}{\hat{odds}_{old}} = \frac{e^{\hat{\beta}_0}\cdot e^{\hat{\beta}_1x_1}\cdot...\cdot e^{\hat{\beta}_i(x_i+1)}\cdot...\cdot e^{\hat{\beta}_px_p}}{e^{\hat{\beta}_0}\cdot e^{\hat{\beta}_1x_1}\cdot...\cdot e^{\hat{\beta}_i(x_i)}\cdot...\cdot e^{\hat{\beta}_px_p}}=\frac{e^{\hat{\beta}_i(x_i+1)}}{e^{\hat{\beta}_i(x_i)}} = \frac{e^{\hat{\beta}_ix_i}\cdot e^{\hat{\beta}_i}}{e^{\hat{\beta}_ix_i}}=e^{\hat{\beta}_i}$

Thus, rather than interpreting the slope $\hat{\beta}_i$ by itself, we instead interpret $e^{\hat{\beta}_i}$ as what we call the odds multiplier of the explanatory variable $x_i$.
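
As a quick numerical check of this cancellation, the sketch below (not from the original notes) uses arbitrary, hypothetical coefficient and variable values and confirms that adding one unit to $x_1$ multiplies the predicted odds by exactly $e^{\hat{\beta}_1}$.

b0, b1, b2 = -1.0, 0.5, -0.2                  # hypothetical coefficients
x1, x2 = 3.0, 7.0                             # hypothetical variable values
odds_old = np.exp(b0 + b1*x1 + b2*x2)
odds_new = np.exp(b0 + b1*(x1 + 1) + b2*x2)   # add 1 unit to x1
odds_new / odds_old                           # equals np.exp(b1) = np.exp(0.5)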

Example: Let's interpret the odds multiplier of the number of followers explanatory variable, which is numerical.

\begin{align*} \hat{odds} = \frac{\hat{p}}{1-\hat{p}} =\exp\left(\begin{aligned} &-37.47 \\ &+ 30.73(\,\text{has a profile pic}[T.yes]) \\ &+ 2.60(\,\text{number of words in name}) \\ &+ 0.087(\,\text{num characters in bio}) \\ &- 0.0060(\,\text{number of posts}) \\ &+ 0.025(\,\text{number of followers}) \\ &- 0.0046(\,\text{number of follows}) \end{aligned}\right) \end{align*}

Number of Followers Slope Interpretation

By exponentiating the number of followers slope as $e^{0.025}=1.025$, we can make the following interpretation.

"All else held equal if we were to increase the number of followers an account has by 1, then we would expect the odds of the account being real to increase by a factor (ie. multiple) of 1.025, on average."

This language is very similar to how we interpret the slope for a numerical explanatory variable in a linear regression model, except now we are stating that the odds of the account being real increase by a multiple (i.e., a product) of 1.025 rather than by an addition of 1.025.

From our definitions above, it follows that:

$\frac{\hat{odds}_{new}}{\hat{odds}_{old}}=1.025$

And thus,

$\hat{odds}_{new}=1.025\hat{odds}_{old}$
So we are multiplying $\hat{odds}_{old}$ by 1.025 to get $\hat{odds}_{new}$.

Because 1.025 is only slightly larger than 1 (and multiplying by 1 changes nothing), one extra follower only slightly increases the predicted odds that the account is real.

np.exp(0.025)

1.0253151205244289
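
One consequence of this multiplicative scale (a side sketch, not from the original notes): a $k$-follower increase multiplies the predicted odds by the odds multiplier $k$ times, i.e. by $(e^{0.025})^k=e^{0.025k}$, all else held equal.

np.exp(0.025) ** 10    # 10 extra followers: odds multiplied by ~1.284
np.exp(0.025 * 10)     # same value, since (e^0.025)^10 = e^0.25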

Number of Follows Slope Interpretation

Alternatively, let's interpret the slope for the number of follows explanatory variable.

"All else held equal if we were to increase the number of accounts that a given account follows by 1, then we would expect the odds of the account being real to decrease by a factor (ie. multiple) of $e^{-0.0046}=0.995$, on average."

Notice that because $e^{-0.0046}=0.995<1$, the odds of this account being real decrease by a factor of 0.995 when the account follows one more account (on average, as the model predicts). However, because 0.995 is very close to 1, this decrease is quite small.

$\hat{odds}_{new}=0.995\,\hat{odds}_{old}$

np.exp(-0.0046)

0.9954105637959723
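
Equivalently (a small side note, using the same fitted slope), one fewer account followed multiplies the predicted odds by the reciprocal, $e^{0.0046}=1/0.995\approx 1.0046$, all else held equal.

np.exp(0.0046)         # multiplier for one *fewer* account followed, ~1.0046
1 / np.exp(-0.0046)    # same value, the reciprocal of the odds multiplier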

Indicator Variable Slope Interpretation

Finally, let's interpret the slope that corresponds to a 0/1 indicator variable in our logistic regression model.

For instance, let's interpret the slope that corresponds to our has_a_profile_pic[T.yes] indicator variable in our logistic regression model below.

\begin{align*} \hat{odds} = \frac{\hat{p}}{1-\hat{p}} =\exp\left(\begin{aligned} &-37.47 \\ &+ 30.73(\,\text{has a profile pic}[T.yes]) \\ &+ 2.60(\,\text{number of words in name}) \\ &+ 0.087(\,\text{num characters in bio}) \\ &- 0.0060(\,\text{number of posts}) \\ &+ 0.025(\,\text{number of followers}) \\ &- 0.0046(\,\text{number of follows}) \end{aligned}\right) \end{align*}

Next, holding all other variables constant, let's calculate the following two odds.

  1. the odds for the level that this particular slope corresponds to (i.e., has_a_profile_pic='yes')
\begin{align*} \hat{odds}_{yes} =\exp\left(\begin{aligned} &-37.47 \\ &+ \mathbf{30.73(1)} \\ &+ 2.60(\,\text{number of words in name}) \\ &+ 0.087(\,\text{num characters in bio}) \\ &- 0.0060(\,\text{number of posts}) \\ &+ 0.025(\,\text{number of followers}) \\ &- 0.0046(\,\text{number of follows}) \end{aligned}\right) \end{align*}

  2. the odds for the reference level of the categorical explanatory variable has_a_profile_pic (i.e., 'no')
\begin{align*} \hat{odds}_{no} =\exp\left(\begin{aligned} &-37.47 \\ &+ \mathbf{30.73(0)} \\ &+ 2.60(\,\text{number of words in name}) \\ &+ 0.087(\,\text{num characters in bio}) \\ &- 0.0060(\,\text{number of posts}) \\ &+ 0.025(\,\text{number of followers}) \\ &- 0.0046(\,\text{number of follows}) \end{aligned}\right) \end{align*}

Then, let's calculate the log of the ratio of these two odds.

\begin{align*} \log\left(\frac{\hat{odds}_{yes}}{\hat{odds}_{no}}\right)&=\log(\hat{odds}_{yes}) - \log(\hat{odds}_{no})\\ &=\left(\begin{aligned} &-37.47 \\ &+ \mathbf{30.73(1)} \\ &+ 2.60(\,\text{number of words in name}) \\ &+ 0.087(\,\text{num characters in bio}) \\ &- 0.0060(\,\text{number of posts}) \\ &+ 0.025(\,\text{number of followers}) \\ &- 0.0046(\,\text{number of follows}) \end{aligned}\right)-\left(\begin{aligned} &-37.47 \\ &+ \mathbf{30.73(0)} \\ &+ 2.60(\,\text{number of words in name}) \\ &+ 0.087(\,\text{num characters in bio}) \\ &- 0.0060(\,\text{number of posts}) \\ &+ 0.025(\,\text{number of followers}) \\ &- 0.0046(\,\text{number of follows}) \end{aligned}\right) \\ &=30.73 \end{align*}

By using properties of logarithms and some nice cancellations, this equation will isolate the slope 30.73 of our has_a_profile_pic[T.yes] indicator variable.

$\log\left(\frac{\hat{odds}_{yes}}{\hat{odds}_{no}}\right)=30.73$

The log of a ratio of odds is not as interpretable, so we can exponentiate both sides of the equation above to get the following.

$\frac{\hat{odds}_{yes}}{\hat{odds}_{no}}=e^{30.73}=22175296168523.9$

Or in other words...

$\hat{odds}_{yes}=22175296168523.9\,\hat{odds}_{no}$

We call $e^{30.73}=22175296168523.9$ the odds ratio of the has_a_profile_pic[T.yes] indicator variable, and we interpret it as follows.

"All else held equal we expect that the odds that an account WITH a profile picture is real to be a multiple of 22175296168523.9 times higher than he odds that an account WITHOUT a profile picture is real, on average."

Obviously, having a profile picture in this model makes it much more likely for the account to be predicted as real.

np.exp(30.73)

22175296168523.855
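
As a sanity check (a sketch using made-up values for the other variables, not part of the original output), we can compute the two predicted odds directly and confirm that their ratio reduces to $e^{30.73}$ no matter which values we choose, since the common terms cancel.

other_terms = 2.60*2 + 0.087*30 - 0.0060*10 + 0.025*100 - 0.0046*50   # hypothetical account
odds_yes = np.exp(-37.47 + 30.73*1 + other_terms)
odds_no = np.exp(-37.47 + 30.73*0 + other_terms)
odds_yes / odds_no     # ~2.2175e13, i.e. np.exp(30.73)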