Odds and Probability


To understand what the terms $ln(\frac{\hat{p}}{1-\hat{p}})$ and $\frac{\hat{p}}{1-\hat{p}}$ represent in a logistic regression model, we need a second way, besides probability, of measuring and talking about "chance". Specifically, in this section we will talk about the odds of an event happening.

There are often two ways to talk about the odds of a particular event happening:

  1. the prose odds definition
  2. the numerical odds definition

Prose Odds Definition

You have most likely heard the prose odds definition being used before, especially if you have ever been interested in the chances of, say, a politician winning an election. For instance, you might have heard a political pundit say that the "odds of candidate Prudence Waffleton winning the election are 4 to 1". What does this mean? And how can we convert these odds into probability?

Two Possible Event Outcomes

When talking about the odds of an event happening, we always consider there to be just two possible outcomes of the event:

  • the success outcome and
  • the failure outcome.

For instance, if we choose to define "Prudence Waffleton winning" as our success outcome, then "Prudence Waffleton NOT winning" is the failure outcome.

Prose Odds Chances

When talking about the odds of a given outcome happening, we are always comparing two "chance numbers" to each other:

  • the "chances of success" vs.
  • the "chances of failure".

So for instance, when we say "odds of candidate Prudence Waffleton winning the election are 4 to 1", we are saying that for every 4 chances of her winning (4 success chances) there is 1 chance of her not winning (1 failure chance).

Using the prose odds definition specifically, we can actually define two types of odds:

  • the odds of success (also called the odds against failure)
  • the odds of failure (also called the odds against success)

Prose Odds of Success Definition


Using the prose definition, we define odds of success (or the odds against failure) by comparing the “number of success chances” to the “number of failure chances” (in this order).

Thus we might say the following equivalent statements:

  • the odds of candidate Prudence Waffleton winning the election (ie. our success outcome) are 4 to 1
  • the odds against candidate Prudence Waffleton losing the election (ie. our failure outcome) are 4 to 1

Prose Odds to Probability Conversion

Thus when it comes to Prudence winning, we are saying that there are 4 "chances" that she will win for every 1 "chance" that she will not win. When talking about odds, we can think of each of these "chances" as a slip of paper with the corresponding outcome written on it. We then throw each slip into a hat. In this case the hat holds 5 slips of paper: 4 with "Prudence wins" and 1 with "Prudence loses".

The probability of "Prudence winning" (the success outcome) is equivalent to randomly selecting a "Prudence wins" slip out of the hat. Thus
$p=P(Prudence\ Wins)=4/5$

Similarly, the probability of "Prudence NOT winning" (the failure outcome) is equivalent to randomly selecting a "Prudence loses" slip out of the hat. Thus
$1-p=P(Prudence\ Loses)=1/5$
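
The slips-in-a-hat calculation above can be sketched in a few lines of Python. The function name `prose_odds_to_probability` is our own choice for illustration; exact fractions are used so the output matches the hand calculation.

```python
from fractions import Fraction

def prose_odds_to_probability(success_chances, failure_chances):
    """Return (P(success), P(failure)) for odds of success 'A to B'."""
    total = success_chances + failure_chances  # number of slips in the hat
    p_success = Fraction(success_chances, total)
    p_failure = Fraction(failure_chances, total)
    return p_success, p_failure

# Odds of Prudence winning are 4 to 1.
p_win, p_lose = prose_odds_to_probability(4, 1)
print(p_win, p_lose)  # 4/5 1/5
```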

Equivalent Prose Odds Formats

There are actually infinitely many equivalent ways to represent the chances of Prudence winning in the prose odds format. For instance, we could equivalently have said "the odds of Prudence winning are 12 to 3".

Why is this statement equivalent?

In this statement, we are now told that there are $12(=3\cdot 4)$ winning (ie. success) chances and $3(=3\cdot 1)$ not winning (ie. failure) chances. Similarly, if we throw these 15 "chance slips" into a hat, then we end up with the same probabilities.


$p=P(Prudence\ Wins)=12/15=4/5$
$1-p=P(Prudence\ Loses)=3/15=1/5$


Thus, for any number $c> 0$, the following prose odds express equivalent statements of chance. (We need $c$ to be strictly positive, because $c=0$ would leave no chances of either kind.)
  • The odds of success are $A$ to $B$
  • The odds of success are $c\cdot A$ to $c\cdot B$
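
As a quick check of this equivalence, the sketch below compares the probability of success implied by two prose odds statements. The helper name `same_chance` is hypothetical, and the sketch assumes the chance counts are whole numbers.

```python
from fractions import Fraction

def same_chance(odds_a, odds_b):
    """True if two prose odds, given as (success, failure) pairs,
    imply the same probability of success."""
    (a1, b1), (a2, b2) = odds_a, odds_b
    return Fraction(a1, a1 + b1) == Fraction(a2, a2 + b2)

print(same_chance((4, 1), (12, 3)))  # True: 12 to 3 is 4 to 1 scaled by c = 3
print(same_chance((4, 1), (1, 4)))   # False: the order of the two numbers matters
```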

Prose Odds of Failure Definition


Using the prose definition, we define odds of failure (or the odds against success) by comparing the “number of failure chances” to the “number of success chances” (in this order).

Thus we might say the following equivalent statements:

  • the odds of candidate Prudence Waffleton losing the election (ie. our failure outcome) are 1 to 4
  • the odds against candidate Prudence Waffleton winning the election (ie. our success outcome) are 1 to 4

Prose Odds to Chance Conversion

Similarly, we would toss 1 "Prudence loses" slip and 4 "Prudence wins" slips into the hat, and then randomly draw a slip to calculate the probability of each outcome.

Thus, we would arrive at the same probabilities that we got from odds of success definition.


$p=P(Prudence\ Wins)=4/5$
$1-p=P(Prudence\ Loses)=1/5$

Numerical Odds (and Log Odds) Definitions

Another way to express the odds of a given event happening is to use the numerical odds definition. Rather than writing out the "success chances" and "failure chances" in a sentence, as we do with the prose odds definition, for mathematical ease we simply calculate the ratio of these two numbers.

Calculating Numerical Odds

When you know the prose odds definition

If we have prose odds of a given event, then we define/convert the numerical odds as follows.

  • the odds of success (or the odds against failure) $=\frac{number\ of\ success\ chances}{number\ of\ failure\ chances}$
  • the odds of failure (or the odds against success)$=\frac{number\ of\ failure\ chances}{number\ of\ success\ chances}$

So similarly, given that the odds of Prudence winning are 4 to 1, then:

  • the numerical odds of her winning are $\frac{4\ win\ chances}{1\ lose\ chance}=4$
  • the numerical odds of her losing are $\frac{1\ lose\ chance}{4\ win\ chances}=1/4$
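
A minimal sketch of this conversion follows. The function name is ours, and it assumes both chance counts are nonzero so that neither ratio divides by zero.

```python
def prose_to_numerical_odds(success_chances, failure_chances):
    """Return (odds of success, odds of failure) as plain numbers."""
    odds_of_success = success_chances / failure_chances
    odds_of_failure = failure_chances / success_chances
    return odds_of_success, odds_of_failure

# Odds of Prudence winning are 4 to 1.
print(prose_to_numerical_odds(4, 1))  # (4.0, 0.25)
```

Note that the two numerical odds are reciprocals of each other, so knowing one gives you the other.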

When you know the probability

If we have the probability $p$ of a success happening, then we define/convert the numerical odds as follows.

If $p$ is the probability of a success, then we call:
  • the odds of success (or the odds against failure)=$\frac{p}{1-p}$
  • the odds of failure (or the odds against success)$=\frac{1-p}{p}$

So given that we know $p=P(Prudence\ Wins)=4/5$, then the:

  • numerical odds of Prudence winning (ie. the odds against her losing) is $\frac{p}{1-p}=\frac{4/5}{1-4/5}=\frac{4/5}{1/5} = \frac{4}{1}=4$
  • numerical odds of Prudence losing (ie. the odds against her winning) is $\frac{1-p}{p}=\frac{1-4/5}{4/5}=\frac{1/5}{4/5} = \frac{1}{4}$
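
The same two formulas can be written as small Python functions. This is only a sketch: the function names are ours, and it assumes $0<p<1$ so that neither denominator is zero. `Fraction` is used in the example so the result is exact rather than a rounded float.

```python
from fractions import Fraction

def odds_of_success(p):
    """Numerical odds of success, p / (1 - p)."""
    return p / (1 - p)

def odds_of_failure(p):
    """Numerical odds of failure, (1 - p) / p."""
    return (1 - p) / p

p = Fraction(4, 5)  # P(Prudence Wins)
print(odds_of_success(p), odds_of_failure(p))  # 4 1/4
```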

Log Odds

Similarly, when we talk about the log odds of a given outcome happening, we're simply taking the natural log of our numerical odds.

If we have the probability $p$ of a success happening, then we define/convert the log odds as follows.

If $p$ is the probability of a success, then we call:
  • the log odds of success (or the log odds against failure)=$log(\frac{p}{1-p})$
  • the log odds of failure (or the log odds against success)$=log(\frac{1-p}{p})$
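
To make the definition concrete, here is a short sketch using Python's `math.log`, which is the natural log. The function names are ours, and $0<p<1$ is assumed.

```python
import math

def log_odds_of_success(p):
    """Natural log of the odds of success, log(p / (1 - p))."""
    return math.log(p / (1 - p))

def log_odds_of_failure(p):
    """Natural log of the odds of failure, log((1 - p) / p)."""
    return math.log((1 - p) / p)

p = 0.8  # P(Prudence Wins)
print(log_odds_of_success(p))  # roughly 1.386, i.e. log(4)
print(log_odds_of_failure(p))  # roughly -1.386, i.e. log(1/4)
```

Because the odds of failure are the reciprocal of the odds of success, the two log odds differ only in sign. A probability of 0.5 gives log odds of exactly 0.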

Logistic Regression Left-Hand-Side

Recall that we defined $\hat{p}=P(y=1)=P(y=success)$ in our logistic regression model below. Thus, we can now see that the left hand side of the representation below represents the log odds of success (or the log odds that Y=1) for the corresponding $x$ value.

Simple Logistic Regression Model

$log(\hat{odds})=log(\frac{\hat{p}}{1-\hat{p}})=\hat{\beta}_0+\hat{\beta}_1x$
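
To see how the left-hand side connects back to a probability, the sketch below evaluates the model at one $x$ value and undoes the log. The coefficient values are hypothetical and chosen only for illustration; they do not come from any fitted model in this text.

```python
import math

# Hypothetical fitted coefficients, chosen only for illustration.
beta0_hat = -1.0
beta1_hat = 0.5

def predicted_log_odds(x):
    """Right-hand side of the model: the predicted log odds of success."""
    return beta0_hat + beta1_hat * x

def predicted_probability(x):
    """Undo the log to get the odds, then convert the odds to a probability."""
    odds = math.exp(predicted_log_odds(x))
    return odds / (1 + odds)

print(predicted_log_odds(2.0))     # 0.0
print(predicted_probability(2.0))  # 0.5
```

With these made-up coefficients, $x=2$ gives log odds of 0, which means odds of 1 (that is, 1 to 1) and a predicted probability of 0.5.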

Conversions

At this point we have defined three ways to talk about chance:

  • probability
  • numerical odds
  • prose odds

Below is a summary of the 6 ways you can convert between these types.

Probability to Numerical Odds

In 4.2.1 we talked about how to convert a probability into a numerical odds.

If $p$ is the probability of a success, then we call:
  • the odds of success (or the odds against failure)=$\frac{p}{1-p}$
  • the odds of failure (or the odds against success)$=\frac{1-p}{p}$

Numerical Odds to Probability

Alternatively, by manipulating the equations above we can convert a numerical odds number into a probability using the following equations.

  • If you know the odds of success
    • $P(success)=p=\frac{odds\ of\ success}{1+odds\ of\ success}$
    • $P(failure)=1-p=\frac{1}{1+odds\ of\ success}$

  • If you know the odds of failure
    • $P(failure)=1-p=\frac{odds\ of\ failure}{1+odds\ of\ failure}$
    • $P(success)=p=\frac{1}{1+odds\ of\ failure}$
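
The manipulation is short. Starting from $odds=\frac{p}{1-p}$, multiplying both sides by $1-p$ gives $odds-odds\cdot p=p$, so $odds=p\cdot(1+odds)$ and $p=\frac{odds}{1+odds}$. The sketch below applies this in both directions; the function names are ours, and `Fraction` is used so the Prudence example comes out exact.

```python
from fractions import Fraction

def probabilities_from_odds_of_success(odds):
    """Return (P(success), P(failure)) given the numerical odds of success."""
    p_success = odds / (1 + odds)
    return p_success, 1 - p_success

def probabilities_from_odds_of_failure(odds):
    """Return (P(success), P(failure)) given the numerical odds of failure."""
    p_failure = odds / (1 + odds)
    return 1 - p_failure, p_failure

# Odds of Prudence winning are 4; odds of her losing are 1/4.
print(*probabilities_from_odds_of_success(Fraction(4)))     # 4/5 1/5
print(*probabilities_from_odds_of_failure(Fraction(1, 4)))  # 4/5 1/5
```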

Prose Odds to Probability

In 4.1 we talked about how to go from prose odds to probability.

  • $P(success)=p=\frac{number\ of\ success\ chances}{total\ number\ of\ chances}$
  • $P(failure)=1-p=\frac{number\ of\ failure\ chances}{total\ number\ of\ chances}$

Prose Odds to Numerical Odds

Also in 4.2.1 we talked about how to convert prose odds into numerical odds.

  • the odds of success (or the odds against failure) $=\frac{number\ of\ success\ chances}{number\ of\ failure\ chances}$
  • the odds of failure (or the odds against success)$=\frac{number\ of\ failure\ chances}{number\ of\ success\ chances}$

Numerical Odds to Prose Odds

Suppose someone tells us that the (numerical) odds of it raining today are 1.6. Unfortunately, a numerical odds number tends to be harder to interpret by itself. How can we convert this numerical odds number to a prose odds statement?

Well, there are actually infinitely many ways that you can do this. Remember that the numerical odds of success simply represent the ratio of a set of possible "success chances" to "failure chances".

So given that $1.6=\frac{1.6}{1}$, we could technically say $1.6=\frac{1.6\ success\ chances}{1\ failure\ chance}$, or that there are "1.6 chances of it raining (ie. success)" to "1 chance of it not raining (ie. failure)".

So we might say that "the odds of it raining are 1.6 to 1". However, some people do not like non-integer odds numbers. So we might multiply these two numbers by, say 10, and then make an equivalent statement saying "the odds of it raining are 16 to 10".

  • the odds of success (or the odds against failure) $=\frac{odds\ of\ success}{1} = \frac{success\ chances}{failure\ chances}$
  • "success chances"=odds of success
  • "failure chances"=1
  • The odds of success are "success chances" to "failure chances".
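
If whole-number chances are preferred, one possible approach (our own suggestion, not a standard convention) is to let Python's `fractions.Fraction` find the smallest whole-number pair with the same ratio. The function name and the `max_denominator` cap are assumptions made for this sketch.

```python
from fractions import Fraction

def numerical_to_prose_odds(odds_of_success, max_denominator=1000):
    """Return (success chances, failure chances) as the smallest whole
    numbers whose ratio matches the numerical odds of success."""
    # Going through str() reads 1.6 as the decimal "1.6" rather than as
    # its nearest binary floating-point value.
    ratio = Fraction(str(odds_of_success)).limit_denominator(max_denominator)
    return ratio.numerator, ratio.denominator

success, failure = numerical_to_prose_odds(1.6)
print(f"The odds of it raining are {success} to {failure}")
# The odds of it raining are 8 to 5
```

The statement "8 to 5" is equivalent to "1.6 to 1" and to "16 to 10", since all three have the same ratio of success chances to failure chances.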