Margin call: Is odds ratio really that good?

3 minute read

Published:

It has been commonly argued in sociology that odds ratio—as a measure of association between categorical variables—is appealing, because of its “margin-free” property. This has always baffled me, so I decided to articulate my bafflement here as a sanity check.

To begin with, here’s my understanding of what people are really trying to achieve when they use a “margin-free” measure. The idea is to separate absolute/marginal/uniform/across-the-board changes in the marginal distribution from (relative) changes in the disparity/association (e.g., Mare, 1981). In other words, people want a measure of disparities/associations that’s impervious to marginal changes. And odds ratio is such a measure in terms of cell or joint probabilities.

Consider a simple two-by-two table, where the dependent variable is a binary indicator of college graduation, and the independent variable is a binary SES indicator. Then the odds ratio is defined as

\begin{equation} \left. \frac{Pr(college=1 | SES=high)}{Pr(college=0 | SES=high)} \middle/ \frac{Pr(college=1 | SES=low)}{Pr(college=0 | SES=low)} \right., \end{equation} which can be written as \begin{equation} \left. \frac{Pr(college=1, SES=high)}{Pr(college=0, SES=high)} \middle/ \frac{Pr(college=1, SES=low)}{Pr(college=0, SES=low)} \right.. \end{equation}

So Expression (1) is in terms of conditional probabilities, and Expression (2) is in terms of joint probabilities (or equivalently, cell probabilities, referring to cells of the 2-by-2 table). And odds ratio is margin-free in the sense that if we multiply both $Pr(college=1, SES=high)$ and $Pr(college=1, SES=low)$ by a number $u$, and correspondingly multiply $Pr(college=0, SES=high)$ and $Pr(college=0, SES=low)$ by $v=\frac{1-\alpha[Pr(college=1, SES=high)+Pr(college=1, SES=low)]}{Pr(college=0, SES=high) + Pr(college=0, SES=low)}$, then the odds ratio remains unchanged (the value of $v$ is such so that the the four joint probabilities still sum to 1). Note that this logic only works on Exression (2), not Expression (1), due to the constraint that $Pr(college=1 \mid SES=high)+Pr(college=0 \mid SES=high)=1$ and $Pr(college=1 \mid SES=low)+Pr(college=0 \mid SES=low)=1$.

And this is why I find the margin-free property of odds ratio not that appealing. To me, joint probabilities and changes defined in terms of them are substantively not as natural and relevant as conditional probabilities and the corresponding changes. I think conditional probabilities are more natural because I tend to think about them when I think about stratification. For example, it feels right to talk about the probability of college graduation and how it varies by SES, not the probability of being in a cell jointly defined by graduation status and SES. I think this intuition also works well with how statistical models are set up in practice: we often model the conditional probability as, for example, $Pr(college=1 \mid SES)= invlogit(\alpha+\beta SES)$, but we never model a joint probability.

In fact, once we focus on conditional probabilities, both risk difference and risk ratio seem to be measures that are more naturally “margin free”. In particular, if there had been an across-the-board additive expansion in education (adding the same number to $Pr(college=1 \mid SES=high)$ and $Pr(college=1 \mid SES=low)$), the risk difference, \begin{equation} Pr(college=1 \mid SES=high)-Pr(college=1 \mid SES=low), \end{equation} wouldn’t change. And if there had been an across-the-board multiplicative expansion in eduaction (multiplying $Pr(college=1 \mid SES=high)$ and $Pr(college=1 \mid SES=low)$ by the same number), the risk ratio, \begin{equation} \frac{Pr(college=1 \mid SES=high)}{Pr(college=1 \mid SES=low)}, \end{equation} wouldn’t change. These changes are marginal in the sense that they happen uniformly to the two SES groups, and risk difference and risk ratio are each impervious to one of these changes. In contrast, odds ratio does not remain the same under either of these changes.

Leave a Comment