An image of a statue holding a set of scales in one hand and a sword in the other, overlaid with computer code.

How Biased Data and Algorithms Can Harm Health

Public health researchers are working to uncover and correct unfairness in AI.

By Carrie Arnold • Photo Illustrations by Patrick Kirchner

It seemed like an ideal problem for machine learning.

Knee pain from osteoarthritis affects 16% of adults worldwide, according to a recent eClinicalMedicine study, but doctors still aren’t good at predicting a person’s pain levels based on X-rays, MRIs, and other imaging data. Researchers trained early algorithms on thousands of X-rays and physician notes. After months of work, the algorithm was able to predict patient-reported pain from X-rays, but only for white patients.

For people of color, AI didn’t perform any better than random chance.

Kadija Ferryman, PhD, MA, wasn’t surprised. Ferryman has spent much of the past decade studying machine learning and health and has learned that algorithms are often as biased as the rest of society. In this case, researchers had trained the models on physician reports of pain, and because doctors are less likely to believe marginalized people when they report pain, the algorithm replicated that bias. However, when a team of computer scientists at the University of California, Berkeley, tweaked the algorithm to factor in patients’ own pain reports rather than physicians’, they eliminated that racial bias, paving the way for more equitable treatment of osteoarthritis.

“Data is never raw,” says Ferryman, an assistant professor in Health Policy and Management and a core faculty member at the Berman Institute of Bioethics. “It’s always cooked.”

Every day, humans create 2.5 million terabytes of data. This almost unfathomable quantity of information fuels the engines of commerce, medicine, and public health, which rely on increasingly sophisticated algorithms to make sense of this data tsunami. Many researchers hoped that the emotionless calculations of artificial intelligence and automated analyses would strip away some of the racial, ethnic, and gender biases that have come to pervade every aspect of our lives. Yet far from erasing medical bias, the rise of big data has the potential to further entrench prejudices and inequalities, Ferryman says.

There are still a lot of areas in health where there are biases, says Ferryman—and better algorithms and even better data won’t simply fix the problem.

Instead, says Nicholas Reed, AuD, an assistant professor in Epidemiology, researchers need to understand the origins of these biases and their own role in the process of doing science.

“We’ve so long viewed the world almost like a Petri dish, and we're just sitting on the outside, observing,” Reed says. “In reality, science is active. It's not passive. We're involved in it as well.”

Rather than pretending that data is neutral, Ferryman says scientists need to grasp the limitations—and biases—of the information they use every day.

IN-YOUR-FACE BIASES

Far from standing against racism, many of public health’s founding fathers actively promoted racist ideas and eugenics. These early scientists effectively cemented inequalities into the very foundation of public health practice, says Bloomberg School historian Karen Kruse Thomas, PhD. Statistics may have been intended to be a way to improve population health, but it “became a tool to promote scientific racism,” she says. That racism reached its apogee in the eugenics movement.

These biases aren’t historical artifacts of a bygone era, Ferryman says. They’re right here, in our faces—even if they are sometimes hard to spot because they are embedded in the dominant culture.

Scientists may be taught to believe that they, and the data they use, are objective and unbiased, but nothing could be further from the truth, says Biostatistics Professor Scott Zeger, PhD, MS. The variables researchers choose to focus on, the material they collect, and the means they use to analyze it can slant the results and conclusions scientists draw.

“Data are just measurements made by human beings,” Zeger says. “Bias reflects the difference between what a human says or concludes and the actual state of nature.”

Decisions made by researchers can explain how an equation used to predict kidney function from common laboratory values led to unintentionally racist outcomes. Scientists believed that people of African descent had more muscle mass than those with European ancestry. Since muscle mass is a key variable in estimating kidney function from creatinine levels in the blood, scientists introduced a “race corrector” into the equation. The calculations systematically overestimated renal performance in Black patients, leading to reduced access to lifesaving dialysis and kidney transplants. In a November 2021 New England Journal of Medicine study, a team of researchers from the Chronic Kidney Disease Epidemiology Consortium developed a newer, more accurate equation without the need to consider race.
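To make the effect of a “race corrector” concrete, here is a minimal sketch in Python. The article does not name the equation involved; the sketch assumes it resembles the widely published 2009 CKD-EPI creatinine equation, in which the estimate for Black patients was multiplied by 1.159. The function name and the sample patient values are hypothetical and chosen only for illustration.

```python
def egfr_ckd_epi_2009(scr_mg_dl, age, female, apply_race_coefficient):
    """Estimate kidney function (eGFR, mL/min/1.73 m^2) from serum creatinine.

    Assumption: coefficients follow the published 2009 CKD-EPI creatinine
    equation. Illustrative sketch only, not for clinical use.
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = 141 * min(ratio, 1) ** alpha * max(ratio, 1) ** -1.209 * 0.993 ** age
    if female:
        egfr *= 1.018
    if apply_race_coefficient:
        # The "race corrector": a flat multiplier applied to Black patients.
        egfr *= 1.159
    return egfr


# Hypothetical patient: same lab value, same age, same sex.
without_corrector = egfr_ckd_epi_2009(1.4, 60, female=False, apply_race_coefficient=False)
with_corrector = egfr_ckd_epi_2009(1.4, 60, female=False, apply_race_coefficient=True)

print(f"eGFR without race coefficient: {without_corrector:.1f}")
print(f"eGFR with race coefficient:    {with_corrector:.1f}")
print(f"Inflation factor:              {with_corrector / without_corrector:.3f}")
```

Under these assumptions, two patients with identical lab results receive estimates that differ by about 16%, and the higher number makes the kidneys of the patient recorded as Black look healthier on paper. When eligibility for a treatment depends on the estimate falling below a cutoff, that gap can delay care.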

A NEW EPIDEMIC OF MISDIAGNOSIS?

While bias in medicine is nothing new, the deluge of data in the last decade has made the problem more acute. Everything from genetic sequencing and electronic health records to real-time GPS monitoring and fitness trackers has opened new opportunities for public health endeavors. However powerful this data might be, scientists need to tread carefully, Ferryman warns.

Because much of this data is collected automatically, she says, it’s easy to perceive it as somehow less biased. Computers aren’t racist. They operate using strings of binary code with no opportunity for prejudice. The problem is that even with the most advanced computing systems, humans are still involved. We decide how health data is collected. We write the code. We teach the algorithms how to recognize cancer in images. We decide what information is included with these images. These algorithms can amplify biases that researchers didn’t even know existed.

Ferryman points to an international effort in 2021 to study how AI models learned about race. The team built a machine learning algorithm to analyze chest X-rays, something that, in theory, should be race-blind. (Clinicians can’t recognize the race or ethnicity of the person behind a radiograph.) But when the researchers tried to see if it was possible to use AI to predict race from an X-ray, they found it was easy to do so, even without giving the system any information about the patients’ ethnic background. The authors went back into the algorithm to try to work out what information the model was using to make its predictions, but even after poring over the data, they couldn’t figure out how a computer could look at a black-and-white X-ray of someone’s chest and identify their race. It’s a cautionary tale, Ferryman says, because it shows that these models can make predictions using background information that we can’t detect.

On some level, it may seem innocuous. So what if a computer model can predict race on radiographs? But to Ferryman, the work shows how race is embedded in every aspect of health. It means that AI models could, for example, misclassify all the X-rays from Black patients, and the humans reviewing the data wouldn’t be able to detect any anomalies, even if they tried.

“Even if we think our model is fair, maybe we just need to always audit and assume there's going to be some level of unfairness,” Ferryman says. “It’s not a technical fix.”
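One small piece of the routine auditing Ferryman describes can be sketched in code: breaking a model’s accuracy down by subgroup instead of reporting a single overall number. The example below is a minimal illustration in Python, not a method from the studies mentioned here. The function name, the group labels “A” and “B,” and the toy predictions are all made up, and a real audit would look at far more than accuracy.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} so gaps between subgroups become visible."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Made-up toy data, for illustration only.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
report = accuracy_by_group(y_true, y_pred, groups)

print(f"Overall accuracy: {overall:.3f}")
for group, accuracy in sorted(report.items()):
    print(f"Group {group} accuracy: {accuracy:.3f}")
```

In this toy data, a middling overall score hides the fact that the model is right every time for one group and wrong most of the time for the other. A breakdown like this can flag a gap, but it cannot explain where the gap comes from or whether the labels themselves are biased, which is why such a check is a starting point rather than a fix.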

What could result is a new epidemic of misdiagnosis and missed treatments that could further widen health disparities around the world. Ferryman says that this study makes clear the brave new world of AI and big data hasn’t miraculously cured the problem of bias.

A COURSE CORRECTION

The data deluge also brings opportunities for those willing to ask the right questions. Biostatistics faculty members Carrie Wright, PhD, Ava Hoffman, PhD, and Professor Michael Rosenblum, PhD, MS, designed a course with SOURCE that pairs students with community-based organizations in Baltimore. The goal: Partner with CBOs to address their data-related needs. Students working with groups like HeartSmiles, the Baltimore Transit Equity Coalition, and the No Boundaries Coalition will develop a dashboard, data analysis tool, series of visualizations, or other data science product. The course is designed to help students ask key questions: How will data be used? How can we ensure data privacy? What kinds of stories can the data tell? How do we support the organizations’ ongoing data capabilities?

“By facilitating connections between data science students and CBOs, we’re hoping to build lasting relationships and help the CBOs take their own data-related goals further,” Wright says.

Thomas, the School historian, sees these as small steps toward a more just future.

“When thinking about bias in data becomes the norm, it's going to really revolutionize all of science and medicine,” Thomas says. “It’s going to create a ripple effect for good.”

Where Electronic Health Records Fail

Electronic health records hold great promise to facilitate both medical practice and research, but they have limitations, says Nicholas Reed, an epidemiologist at the Johns Hopkins Disability Health Research Center.

Example: Many people with hearing loss don’t recognize it or acknowledge it. “We know people aren’t getting their hearing checked and many doctors don’t really code for it,” Reed says. So if he relied solely on health record data, Reed wouldn’t have included many struggling with hearing loss in a recent study. Instead, Reed and colleagues used a mathematical model for a 2019 JAMA Otolaryngology study, showing that untreated hearing loss increased the likelihood of hospitalization and caused $22,434 per person in extra health care costs.

“Not everything is medical in this world, and not everything is diagnosed,” Reed says.