Building AI Fairness by Reducing Algorithmic Bias
Emily Diana explores algorithmic bias in machine learning and outlines three stages of intervention for mitigating algorithmic discrimination: pre-processing, in-processing, and post-processing.
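To make one of these stages concrete, the sketch below shows a minimal post-processing step in Python: choosing a separate decision threshold for each group so that selection rates come out roughly equal. The scores, group labels, and target rate are illustrative assumptions, not details from Diana's work.

```python
import numpy as np

# Hypothetical model scores and group labels (illustrative only).
rng = np.random.default_rng(0)
scores = rng.uniform(0, 1, size=1000)   # model scores, higher = more likely selected
groups = rng.integers(0, 2, size=1000)  # 0 = group A, 1 = group B
# Simulate a score gap between groups so the thresholds have something to correct.
scores[groups == 1] *= 0.8

def threshold_for_rate(s, target_rate):
    """Score cutoff above which roughly target_rate of the group falls."""
    return float(np.quantile(s, 1 - target_rate))

target = 0.30  # desired selection rate in each group (assumed)
# Post-processing: one threshold per group so both select ~30 percent.
thresholds = {g: threshold_for_rate(scores[groups == g], target) for g in (0, 1)}
group_thresh = np.where(groups == 1, thresholds[1], thresholds[0])
decisions = scores >= group_thresh

for g in (0, 1):
    rate = decisions[groups == g].mean()
    print(f"group {g}: threshold={thresholds[g]:.3f}, selection rate={rate:.2%}")
```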
Yan Huang shows how Zillow's Zestimate, despite its varying accuracy across neighborhoods, benefits both buyers and sellers by reducing market uncertainty and improving transaction outcomes, with the largest gains in underserved areas, potentially lessening existing inequality compared to human-only decision-making.
Algorithms now inform high-stakes decisions, from hiring and lending to healthcare and criminal justice. At first, these systems were seen as objective and impartial. However, researchers and practitioners have since documented algorithmic bias, showing that these tools can reinforce or exacerbate existing social inequalities.
As machine learning, artificial intelligence, and algorithms become more common, algorithmic bias is getting more public attention. High-profile media coverage, such as ProPublica’s investigation into racial bias in criminal risk assessments, MIT News’ reporting on biased facial recognition systems, and Reuters’ and Business Insider’s stories on Amazon’s scrapped AI hiring tool, has highlighted how algorithms can go wrong. In response, some individuals and organizations have become wary of adopting algorithmic systems, fearing their unintended consequences.
In one of my recent studies, my coauthors and I examined the impact of Zestimate, Zillow’s algorithm-based home value estimate, on the housing market. For most buyers and sellers, estimating the value of a property is both important and difficult. Zestimate aims to help by offering an estimate generated using Zillow’s proprietary algorithm trained on large-scale real estate data. At the population level, it performs reasonably well: according to Zillow, the median error for on-market homes is about 2 percent. But at the individual property level, any prediction error could lead buyers or sellers to misjudge a property’s value. This raises the question: Does Zestimate help or hurt market participants?
To find out, we studied how the presence of Zestimate affects housing transactions. We found, surprisingly, that it benefits both buyers and sellers: buyer surplus increases by 5.9 percent, and seller profit increases by 4.4 percent. Home sales are often seen as a zero-sum game in which one party’s gain comes at the other’s expense, so a gain on both sides is unexpected.
Why the win-win? Even though Zestimate is not perfectly accurate, it reduces uncertainty. Sellers become more confident in their asking prices and are more patient in finding the right buyer. This improves the overall match quality between buyers and sellers, leading to better outcomes for both sides.
We also examined fairness. While Zestimate is statistically unbiased overall, meaning its prediction errors center around zero, it is less accurate in poorer neighborhoods, as measured by error rate, the metric Zillow uses to report Zestimate’s accuracy. This suggests the potential for Zestimate to reinforce existing inequalities. However, our analysis shows the opposite: poorer neighborhoods benefit more from Zestimate than richer ones. In areas where qualified real estate agents and quality information are scarce and market participants face greater uncertainty, Zestimate provides a helpful signal that improves decision-making. And if its accuracy in poorer neighborhoods could be raised to match that of wealthier areas, total welfare would rise by an additional 31 percent.
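To see how an estimate can be unbiased on average yet unevenly accurate, here is a small illustrative Python sketch using simulated data (not Zillow's): prediction errors average to roughly zero in both areas, but the typical absolute error is several times larger in the poorer one.

```python
import numpy as np

# Simulated home values and predictions (illustrative assumptions, not Zillow data).
rng = np.random.default_rng(1)

true_rich = rng.uniform(400_000, 800_000, 5000)  # home values, wealthier area
true_poor = rng.uniform(80_000, 200_000, 5000)   # home values, poorer area

# Errors are mean-zero in both areas (statistically unbiased)
# but noisier in the poorer area (lower accuracy).
pred_rich = true_rich * (1 + rng.normal(0, 0.02, 5000))
pred_poor = true_poor * (1 + rng.normal(0, 0.08, 5000))

def report(name, true_v, pred_v):
    signed = (pred_v - true_v) / true_v          # signed percentage error
    abs_pct = np.abs(signed)                     # absolute percentage error
    print(f"{name}: mean signed error = {signed.mean():+.3%}, "
          f"median absolute error = {np.median(abs_pct):.1%}")

report("wealthier area", true_rich, pred_rich)
report("poorer area   ", true_poor, pred_poor)
# Both mean signed errors hover near zero, yet the poorer area's
# median absolute error is several times larger.
```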
This study highlights the need to evaluate algorithms not just by how biased their outputs appear, but by their overall effects compared to the status quo.
Yes, algorithms can produce biased outputs. But in many cases, the alternative is a decision made solely by humans, who may be even more biased. Even an imperfect algorithm can reduce inequality if, for example, it improves outcomes for underserved or disadvantaged groups more than human-only decision-making does.
Of course, this does not mean we should accept bias in algorithms. Improving algorithmic fairness remains essential. However, if we focus solely on disparities in predictions, we may reject tools that, while imperfect, still improve equity compared to the status quo.