<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>COMPAS on Sebastian Spicker</title>
    <link>https://sebastianspicker.github.io/tags/compas/</link>
    <description>Recent content in COMPAS on Sebastian Spicker</description>
    <image>
      <title>Sebastian Spicker</title>
      <url>https://sebastianspicker.github.io/og-image.png</url>
      <link>https://sebastianspicker.github.io/og-image.png</link>
    </image>
    <generator>Hugo -- 0.160.0</generator>
    <language>en</language>
    <lastBuildDate>Fri, 08 Mar 2024 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://sebastianspicker.github.io/tags/compas/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>You Cannot Have All Three: The Fairness Impossibility Theorem</title>
      <link>https://sebastianspicker.github.io/posts/fairness-impossibility-ai-bias/</link>
      <pubDate>Fri, 08 Mar 2024 00:00:00 +0000</pubDate>
      <guid>https://sebastianspicker.github.io/posts/fairness-impossibility-ai-bias/</guid>
      <description>Three natural fairness criteria for an AI classifier — calibration, equal false positive rates, equal false negative rates — cannot all hold simultaneously when base rates differ across groups. This is not an engineering failure. It is a theorem. Choosing which criterion to satisfy is a political decision, not a technical one.</description>
      <content:encoded><![CDATA[<h2 id="summary">Summary</h2>
<p>In 2016 ProPublica published an investigation showing that COMPAS — a widely used recidivism risk
assessment tool — assigned higher risk scores to Black defendants than to White defendants with
equivalent actual recidivism rates. The tool&rsquo;s developer responded that COMPAS is well-calibrated:
among defendants of any race assigned a given score, the subsequent recidivism rates are
consistent with that score. Both claims were correct.</p>
<p>The apparent contradiction between them is resolved by a mathematical result that was proved
independently by two groups the same year. The fairness impossibility theorem establishes that
calibration, equal false positive rates, and equal false negative rates cannot all hold
simultaneously when base rates differ between groups — unless the classifier is perfect.</p>
<p>This is not a property of COMPAS specifically. It is not fixed by a better algorithm, more
diverse training data, or more careful engineering. It is a constraint that holds for any
probabilistic classifier operating on groups with unequal prevalence of the predicted outcome.</p>
<p>The question this forces is not &ldquo;how do we make the algorithm fair?&rdquo; The question is &ldquo;which
fairness criterion do we endorse, and can we defend that choice to the people it disadvantages?&rdquo;
That is not a technical question.</p>
<h2 id="the-compas-investigation">The COMPAS Investigation</h2>
<p>Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner published &ldquo;Machine Bias&rdquo; in ProPublica
on 23 May 2016 (<a href="#ref-angwin2016">Angwin et al., 2016</a>). They had obtained COMPAS risk scores for
approximately 7,000 defendants in Broward County, Florida, along with actual two-year recidivism
data. Their finding: among defendants who did not go on to reoffend, Black defendants were
falsely labelled high-risk at roughly twice the rate of White defendants. In other words, the
false positive rate was substantially higher for Black defendants.</p>
<p>Northpointe (now Equivant), the tool&rsquo;s developer, responded that ProPublica&rsquo;s analysis was
misleading. COMPAS is <em>calibrated</em>: within any given score band, the actual recidivism rate is
the same regardless of race. A score of 7 means approximately the same thing for a Black
defendant as for a White defendant. This is a genuine and important property for a risk assessment
to have.</p>
<p>Both analyses were conducted correctly. The tension between them is not a matter of one side
being wrong. It is a matter of two legitimate fairness criteria that cannot, as a matter of
mathematics, be satisfied at the same time.</p>
<h2 id="three-definitions-of-fairness">Three Definitions of Fairness</h2>
<p>Let \(Y \in \{0, 1\}\) be the true outcome (reoffend/not), \(S\) the classifier&rsquo;s risk
score, \(\hat{Y}\) its binary prediction, and \(A \in \{0, 1\}\) indicate group membership.</p>
<p><strong>Calibration</strong> (predictive parity): for all score values \(s\),</p>
$$P(Y = 1 \mid S = s, A = 0) = P(Y = 1 \mid S = s, A = 1)$$<p>If the model assigns a score of 7 to a defendant, the actual reoffending rate should be the
same regardless of race. This is what COMPAS satisfies.</p>
<p><strong>False positive rate parity</strong>:</p>
$$P(\hat{Y} = 1 \mid Y = 0, A = 0) = P(\hat{Y} = 1 \mid Y = 0, A = 1)$$<p>Among defendants who will not reoffend, the probability of being incorrectly labelled high-risk
should be equal across groups. This is what ProPublica found violated.</p>
<p><strong>False negative rate parity</strong>:</p>
$$P(\hat{Y} = 0 \mid Y = 1, A = 0) = P(\hat{Y} = 0 \mid Y = 1, A = 1)$$<p>Among defendants who will reoffend, the probability of being incorrectly labelled low-risk
should be equal across groups.</p>
<p>All three properties seem like reasonable things to ask of a fair classifier. The impossibility
theorem says you cannot have all three at once — with a precise exception.</p>
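<p>As a concrete illustration, all three criteria can be read straight off a per-group confusion matrix. The counts below are invented for illustration and are not COMPAS estimates:</p>

```python
# Hypothetical per-group confusion-matrix counts (illustrative only).
groups = {
    "group_0": {"tp": 80, "fp": 40, "tn": 160, "fn": 20},
    "group_1": {"tp": 30, "fp": 10, "tn": 240, "fn": 20},
}

def fairness_metrics(c):
    """Return (PPV, FPR, FNR) for one group's confusion counts."""
    ppv = c["tp"] / (c["tp"] + c["fp"])  # P(Y=1 | Yhat=1): predictive parity / calibration
    fpr = c["fp"] / (c["fp"] + c["tn"])  # P(Yhat=1 | Y=0): false positive rate
    fnr = c["fn"] / (c["fn"] + c["tp"])  # P(Yhat=0 | Y=1): false negative rate
    return ppv, fpr, fnr

for name, counts in groups.items():
    ppv, fpr, fnr = fairness_metrics(counts)
    print(f"{name}: PPV={ppv:.2f}  FPR={fpr:.2f}  FNR={fnr:.2f}")
```

<p>Comparing the three numbers across the two groups makes the question "fair by which criterion?" precise.</p>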
<h2 id="the-impossibility-theorem">The Impossibility Theorem</h2>
<p>Alexandra Chouldechova proved the relevant result in 2017 using Broward County data as her case
study (<a href="#ref-chouldechova2017">Chouldechova, 2017</a>). Jon Kleinberg, Sendhil Mullainathan, and
Manish Raghavan proved an equivalent result independently (<a href="#ref-kleinberg2017">Kleinberg et al., 2017</a>).</p>
<p>The argument is straightforward. Suppose a classifier is calibrated and produces a binary
prediction (high/low risk). Let \(p_0\) and \(p_1\) be the base rates — the actual reoffending
rates — in groups 0 and 1. For a binary classifier with positive predictive value PPV and
negative predictive value NPV:</p>
<ul>
<li>The false positive rate satisfies (via Bayes): \(\text{FPR} = \frac{\text{TPR} \cdot \text{PR} \cdot (1-\text{PPV})}{\text{PPV} \cdot (1-\text{PR})}\) where PR is prevalence and TPR is sensitivity</li>
<li>The false negative rate satisfies (via Bayes): \(\text{FNR} = \frac{\text{TNR} \cdot (1-\text{PR}) \cdot (1-\text{NPV})}{\text{NPV} \cdot \text{PR}}\) where TNR is specificity</li>
</ul>
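<p>The first identity is just Bayes&rsquo; theorem rearranged. By definition,</p>
$$\text{PPV} = P(Y = 1 \mid \hat{Y} = 1) = \frac{\text{TPR} \cdot \text{PR}}{\text{TPR} \cdot \text{PR} + \text{FPR} \cdot (1 - \text{PR})}$$<p>and solving for FPR yields the expression above. The FNR identity follows in the same way from
the definition of NPV.</p>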
<p>If calibration holds — PPV and NPV are equal across groups — and the base rates \(p_0 \neq p_1\),
then the FPR and FNR in each group are functions of that group&rsquo;s specific base rate. They cannot
both be equalized across groups unless either:</p>
<ol>
<li>\(p_0 = p_1\): the base rates are equal, or</li>
<li>The classifier is perfect: FPR = FNR = 0.</li>
</ol>
<p>In the real case — unequal base rates, imperfect classifier — calibration and equalized error
rates are mutually exclusive. You can have one or the other but not both: given the base rates,
fixing any two of PPV, FPR, and FNR determines the third. It is an algebraic constraint, not an
engineering limitation.</p>
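<p>The arithmetic is easy to check numerically. The sketch below holds TPR and PPV fixed across two groups (imposing calibration in the predictive-parity sense) and varies only the base rate; the identity above then forces the false positive rates apart. All numbers are illustrative:</p>

```python
# FPR implied by Bayes when sensitivity (TPR), positive predictive value
# (PPV), and base rate p are held fixed. Illustrative values, not COMPAS.
def fpr_from_calibration(tpr: float, ppv: float, p: float) -> float:
    return tpr * p * (1 - ppv) / (ppv * (1 - p))

tpr, ppv = 0.8, 0.8
for p in (0.5, 0.2):  # unequal base rates in the two groups
    print(f"base rate {p:.1f} -> implied FPR {fpr_from_calibration(tpr, ppv, p):.3f}")
# -> base rate 0.5 -> implied FPR 0.200
# -> base rate 0.2 -> implied FPR 0.050
```

<p>Equalizing the two FPRs would require changing PPV or TPR per group, which is exactly what breaks calibration.</p>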
<h2 id="a-structural-analogy">A Structural Analogy</h2>
<p>The structural similarity to another impossibility result is worth noting.</p>
<p>The Robertson inequality in quantum mechanics (<a href="#ref-robertson1929">Robertson, 1929</a>) states that
for any two observables \(\hat{A}\) and \(\hat{B}\):</p>
$$\Delta A \cdot \Delta B \geq \frac{1}{2} \left| \langle [\hat{A}, \hat{B}] \rangle \right|$$<p>This is not an engineering failure. It is a consequence of the algebraic structure of the theory:
if \([\hat{A}, \hat{B}] \neq 0\), then \(\Delta A\) and \(\Delta B\) cannot simultaneously be
made arbitrarily small. No measurement apparatus, however precise, can violate it. The constraint
is in the mathematics, not the hardware.</p>
<p>The fairness impossibility has the same character. Three desiderata, a structural constraint that
prevents simultaneous satisfaction, and no algorithmic escape route. A better model does not help.
Richer training data does not help. The constraint is in the arithmetic of conditional
probabilities and base rates.</p>
<p>The disanalogy is this: in quantum mechanics, \(\hbar\) is a fundamental constant — you cannot
reduce it. In fairness, the base rates are not constants of nature. They are historical outcomes
of social processes: incarceration rates, policing patterns, economic conditions, educational
access. The theorem does not tell you that unequal base rates are acceptable; it tells you that
given unequal base rates, the three fairness criteria cannot all be satisfied.</p>
<h2 id="gender-bias-in-ai-systems">Gender Bias in AI Systems</h2>
<p>The impossibility theorem applies to any binary classification setting with unequal base rates.
The empirical landscape of AI gender bias gives several concrete instances where one criterion was
satisfied while others were not.</p>
<p>In October 2018, Reuters reported that Amazon had developed and then abandoned an internal
AI-based recruiting tool that systematically downgraded résumés from women
(<a href="#ref-dastin2018">Dastin, 2018</a>). The model had been trained on a decade of hiring decisions,
in which successful hires were predominantly male. The model learned that &ldquo;male&rdquo; features were
associated with success and penalized female indicators accordingly. Calibration to the training
data produced systematic gender bias in output.</p>
<p>Tolga Bolukbasi and colleagues showed in 2016 that word embeddings trained on large text corpora
encoded gender stereotypes in their geometric structure
(<a href="#ref-bolukbasi2016">Bolukbasi et al., 2016</a>). The analogy \(\text{man} : \text{computer
programmer} :: \text{woman} : \text{homemaker}\) could be recovered directly from the vector
arithmetic of the embedding space. The embedding was calibrated to the text corpus, which reflected
the occupational distribution of the time — and perpetuated it.</p>
<p>Jieyu Zhao and colleagues found that image captioning and activity recognition models amplified
existing gender associations (<a href="#ref-zhao2017">Zhao et al., 2017</a>). &ldquo;Cooking&rdquo; was associated with
women in 67% of training images; the models amplified that to 84% at inference.
The amplification is a consequence of models learning the easiest features that predict the label
— and in a world where cooking is disproportionately female, &ldquo;female appearance&rdquo; becomes a
feature that predicts &ldquo;cooking.&rdquo;</p>
<p>Joy Buolamwini and Timnit Gebru&rsquo;s &ldquo;Gender Shades&rdquo; study found error rates of up to 34.7% for
darker-skinned women in commercial facial recognition systems, compared to 0.8% for lighter-skinned
men (<a href="#ref-buolamwini2018">Buolamwini &amp; Gebru, 2018</a>). The classifiers were calibrated on
predominantly light-skinned training data. Calibration on the majority group produced large errors
on the minority group — exactly the pattern the impossibility theorem describes.</p>
<p>Hadas Kotek and colleagues tested four large language models on gender-stereotyped occupational
prompts in 2023 (<a href="#ref-kotek2023">Kotek et al., 2023</a>). The models were three to six times more
likely to choose the gender-stereotyped occupation when responding to ambiguous prompts. The
models were calibrated to human-generated text; human-generated text encodes human stereotypes.</p>
<h2 id="the-solutions-and-their-limits">The Solutions and Their Limits</h2>
<p>Three broad approaches exist to algorithmic debiasing, and all three face the same constraint.</p>
<p><strong>Pre-processing</strong> removes bias from training data before training. Zemel and colleagues proposed
&ldquo;Learning Fair Representations&rdquo; — a latent embedding that encodes the data usefully while
obscuring group membership (<a href="#ref-zemel2013">Zemel et al., 2013</a>). This can reduce bias in the
learned representation, but it cannot simultaneously satisfy all three fairness criteria; it
trades one against another by compressing the group-informative dimensions.</p>
<p><strong>Post-processing</strong> adjusts the classifier&rsquo;s decisions after training. Moritz Hardt, Eric Price,
and Nathan Srebro&rsquo;s equalized odds approach (<a href="#ref-hardt2016">Hardt et al., 2016</a>) adjusts
decision thresholds separately per group to achieve FPR/FNR parity. This satisfies equalized
odds by construction — but only by giving up calibration, a trade-off that Chouldechova&rsquo;s
theorem shows is unavoidable when base rates differ.</p>
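<p>A minimal sketch of the per-group-threshold idea, on synthetic scores. The actual Hardt et al. construction additionally randomizes between thresholds to hit equalized odds exactly; everything below, including the score distributions, is invented for illustration:</p>

```python
import random

random.seed(0)

def synth_group(n, base_rate, noise):
    """Synthetic (score, label) pairs where scores weakly track the label."""
    data = []
    for _ in range(n):
        y = 1 if random.random() < base_rate else 0
        score = min(1.0, 0.5 * y + random.uniform(0.0, 2 * noise))
        data.append((score, y))
    return data

def fpr_at(data, threshold):
    """Share of true negatives pushed over the threshold."""
    neg = [s for s, y in data if y == 0]
    return sum(s >= threshold for s in neg) / len(neg)

def threshold_for_fpr(data, target, grid=100):
    """Lowest threshold on a grid whose FPR does not exceed the target."""
    for t in (i / grid for i in range(grid + 1)):
        if fpr_at(data, t) <= target:
            return t
    return 1.0

# Two groups with different base rates and different score distributions.
group_a = synth_group(2000, base_rate=0.5, noise=0.25)
group_b = synth_group(2000, base_rate=0.2, noise=0.35)

target = 0.10
t_a = threshold_for_fpr(group_a, target)
t_b = threshold_for_fpr(group_b, target)
print(f"thresholds: A={t_a:.2f}  B={t_b:.2f}")
print(f"FPRs:       A={fpr_at(group_a, t_a):.3f}  B={fpr_at(group_b, t_b):.3f}")
```

<p>Both groups land at or just under the target FPR, but at different thresholds. By the theorem, thresholds chosen to equalize error rates will generally not preserve calibration.</p>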
<p><strong>In-processing</strong> incorporates a fairness constraint into the training objective. Agarwal and
colleagues proposed a reductions approach that allows the practitioner to specify which fairness
constraint to impose (<a href="#ref-agarwal2018">Agarwal et al., 2018</a>). But you must choose. The
algorithm can optimize for any one of the three criteria; it cannot optimize for all three
simultaneously when base rates differ.</p>
<p>A 2021 survey by Mitchell and colleagues confirms that all three paradigms face the same
impossibility (<a href="#ref-mitchell2021">Mitchell et al., 2021</a>). The choice of paradigm is a choice
about which criterion to prioritize, and that choice has distributional consequences that fall
differently on different groups.</p>
<h2 id="the-political-choice">The Political Choice</h2>
<p>This is where Arvind Narayanan&rsquo;s framing becomes essential. His 2018 tutorial catalogued 21
distinct definitions of algorithmic fairness and titled it &ldquo;21 Fairness Definitions and Their
Politics&rdquo; (<a href="#ref-narayanan2018">Narayanan, 2018</a>). The title is the argument: the definitions
are not equivalent, choosing among them is not a technical decision, and the choice encodes a
prior about what justice requires.</p>
<p>In the criminal justice context: a false positive (predicting recidivism when the defendant will
not reoffend) imposes a cost on the defendant — higher bail, longer sentence, restricted
conditions of release. A false negative (predicting non-recidivism when the defendant will
reoffend) imposes a cost on potential future victims and on public safety. When we choose to
equalize false positive rates, we are choosing to protect defendants from false accusation. When
we choose to equalize false negative rates, we are choosing to protect the public from missed
offenders. These are both defensible values. They produce different error distributions across
groups.</p>
<p>Choosing overall accuracy as the metric — which is what maximizing predictive performance
typically means — is itself a value choice: it weights every error equally, so errors
concentrated in smaller groups or on rarer outcomes count for relatively little in the
aggregate. When racial disparities in base rates are products of historical injustice, this
choice compounds that injustice.</p>
<p>Solon Barocas, Moritz Hardt, and Arvind Narayanan&rsquo;s textbook <em>Fairness and Machine Learning</em>
(2023) makes explicit that the choice between fairness criteria is a normative, not technical,
decision (<a href="#ref-barocas2023">Barocas et al., 2023</a>). The book does not tell you which criterion
to choose. It tells you that you must choose, that the choice has political content, and that
presenting it as a technical optimization problem conceals that content.</p>
<p>Reuben Binns&rsquo; analysis through political philosophy confirms that different fairness criteria
correspond to different underlying theories of justice: Rawlsian, Dworkinian, luck egalitarian
framings all generate different orderings of the three criteria
(<a href="#ref-binns2018">Binns, 2018</a>). The choice of fairness criterion is the choice of a
theory of justice, whether or not the engineers implementing the system have thought of it in
those terms.</p>
<h2 id="the-theorem-is-not-the-problem">The Theorem Is Not the Problem</h2>
<p>I want to be clear about what the impossibility theorem does and does not say.</p>
<p>It does not say that algorithmic fairness is impossible. It says that you must choose among
competing fairness criteria when base rates differ across groups, and that the choice has
distributional consequences. Systems can be built that satisfy calibration, or equalized odds,
or demographic parity — just not all three at once with unequal base rates.</p>
<p>It does not say that base rate disparities are natural or acceptable. The disparities in
recidivism rates, hiring rates, image training sets, and text corpora are products of social
history. The theorem constrains what a classifier can do <em>given</em> those disparities; it does not
prescribe them.</p>
<p>What it does say is that &ldquo;we built a fair algorithm&rdquo; is not a statement that can be made without
specifying which fairness criterion was satisfied and which was not. It is not a statement that
can be defended on purely technical grounds. And it is not a statement that escapes political
accountability by hiding behind mathematical precision.</p>
<p>The fairness debate in AI is, at its core, a debate about which errors we are willing to make, in
whom, with what consequences. The theorem makes that debate unavoidable. Whether we have the
vocabulary and the will to conduct it in those terms is a different question entirely.</p>
<h2 id="references">References</h2>
<ul>
<li><span id="ref-angwin2016"></span>Angwin, J., Larson, J., Mattu, S., &amp; Kirchner, L. (2016, May 23). Machine bias. <em>ProPublica</em>. <a href="https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing">https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing</a></li>
<li><span id="ref-chouldechova2017"></span>Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. <em>Big Data</em>, 5(2), 153–163. <a href="https://doi.org/10.1089/big.2016.0047">DOI: 10.1089/big.2016.0047</a></li>
<li><span id="ref-kleinberg2017"></span>Kleinberg, J., Mullainathan, S., &amp; Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. In <em>Proceedings of the 8th Innovations in Theoretical Computer Science Conference</em> (ITCS 2017). <a href="https://doi.org/10.4230/LIPIcs.ITCS.2017.43">DOI: 10.4230/LIPIcs.ITCS.2017.43</a></li>
<li><span id="ref-robertson1929"></span>Robertson, H. P. (1929). The uncertainty principle. <em>Physical Review</em>, 34, 163–164. <a href="https://doi.org/10.1103/PhysRev.34.163">DOI: 10.1103/PhysRev.34.163</a></li>
<li><span id="ref-dastin2018"></span>Dastin, J. (2018, October 10). Amazon scraps secret AI recruiting tool that showed bias against women. <em>Reuters</em>. <a href="https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G">https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G</a></li>
<li><span id="ref-bolukbasi2016"></span>Bolukbasi, T., Chang, K.-W., Zou, J., Saligrama, V., &amp; Kalai, A. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In <em>Advances in Neural Information Processing Systems 29</em> (NeurIPS 2016). arXiv:1607.06520</li>
<li><span id="ref-zhao2017"></span>Zhao, J., Wang, T., Yatskar, M., Ordonez, V., &amp; Chang, K.-W. (2017). Men also like shopping: Reducing gender bias amplification using corpus-level constraints. In <em>Proceedings of EMNLP 2017</em>, pp. 2979–2989. <a href="https://aclanthology.org/D17-1323/">ACL Anthology: D17-1323</a></li>
<li><span id="ref-buolamwini2018"></span>Buolamwini, J., &amp; Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. In <em>Proceedings of the 1st Conference on Fairness, Accountability and Transparency</em> (FAT* 2018), PMLR Vol. 81, pp. 77–91. <a href="https://proceedings.mlr.press/v81/buolamwini18a.html">https://proceedings.mlr.press/v81/buolamwini18a.html</a></li>
<li><span id="ref-kotek2023"></span>Kotek, H., Dockum, R., &amp; Sun, D. Q. (2023). Gender bias and stereotypes in large language models. In <em>Proceedings of The ACM Collective Intelligence Conference</em> (CI &lsquo;23), pp. 12–24. <a href="https://doi.org/10.1145/3582269.3615599">DOI: 10.1145/3582269.3615599</a></li>
<li><span id="ref-zemel2013"></span>Zemel, R., Wu, Y., Swersky, K., Pitassi, T., &amp; Dwork, C. (2013). Learning fair representations. In <em>Proceedings of the 30th International Conference on Machine Learning</em> (ICML 2013), PMLR Vol. 28, No. 3, pp. 325–333. <a href="https://proceedings.mlr.press/v28/zemel13.html">https://proceedings.mlr.press/v28/zemel13.html</a></li>
<li><span id="ref-hardt2016"></span>Hardt, M., Price, E., &amp; Srebro, N. (2016). Equality of opportunity in supervised learning. In <em>Advances in Neural Information Processing Systems 29</em> (NeurIPS 2016), pp. 3323–3331. arXiv:1610.02413</li>
<li><span id="ref-agarwal2018"></span>Agarwal, A., Beygelzimer, A., Dudik, M., Langford, J., &amp; Wallach, H. (2018). A reductions approach to fair classification. In <em>Proceedings of the 35th International Conference on Machine Learning</em> (ICML 2018), PMLR Vol. 80, pp. 60–69. arXiv:1803.02453</li>
<li><span id="ref-mitchell2021"></span>Mitchell, S., Potash, E., Barocas, S., D&rsquo;Amour, A., &amp; Lum, K. (2021). Algorithmic fairness: Choices, assumptions, and definitions. <em>Annual Review of Statistics and Its Application</em>, 8, 141–163. <a href="https://doi.org/10.1146/annurev-statistics-042720-125902">DOI: 10.1146/annurev-statistics-042720-125902</a></li>
<li><span id="ref-narayanan2018"></span>Narayanan, A. (2018). <em>21 Fairness Definitions and Their Politics</em>. Tutorial at FAT* 2018. <a href="https://facctconference.org/static/tutorials/narayanan-21defs18.pdf">PDF</a></li>
<li><span id="ref-barocas2023"></span>Barocas, S., Hardt, M., &amp; Narayanan, A. (2023). <em>Fairness and Machine Learning: Limitations and Opportunities</em>. MIT Press. <a href="https://fairmlbook.org">https://fairmlbook.org</a></li>
<li><span id="ref-binns2018"></span>Binns, R. (2018). Fairness in machine learning: Lessons from political philosophy. In <em>Proceedings of the 2018 Conference on Fairness, Accountability, and Transparency</em> (FAT* 2018), PMLR Vol. 81, pp. 149–159. arXiv:1712.03586</li>
</ul>
<hr>
<h2 id="changelog">Changelog</h2>
<ul>
<li><strong>2025-11-05</strong>: Updated the Zhao et al. (2017) cooking statistics to match the paper: 67% female agents for cooking in the training set (33% was the male share), amplified to 84% female at inference.</li>
</ul>
]]></content:encoded>
    </item>
  </channel>
</rss>
