Diversity in AI: All About Data

Nikki Hallgrimsdottir

One important angle in making machine learning and AI products fair and inclusive is the collection and annotation of training data: the fuel that makes AI possible.

One of the most public and embarrassing examples of what can go wrong when training data has blind spots is the infamous incident in which Google Photos labeled images of black people as gorillas.

These errors do not reflect conscious bias on Google's part, but rather the unconscious bias that arises when a product's creators lack representation from certain groups and so fail to catch problems with the training data.

Google's response included the claim that it can't anticipate and test for every "fringe case," but this would likely not have been a fringe case to engineers who look like the consumers who were surprised to find their pictures in an automatically created album named "Gorillas." In 2017, Apple reported that only nine percent of its engineers were black, and the number working on machine learning is likely far smaller.

Biased Data Will Lead Us to the Wrong Conclusions

Before founding Roovillage, Yukari Takata Schneider, PhD, MPH, was a public health researcher; she points out that most study participants in medical research are men.

Consequently, most health information and public health messages are tailored to male physiology.

For example, heart attacks have a higher fatality rate for women than for men. Why? Because we experience them differently, but society didn't know to tell us that, because that's not what the data said.

As Yukari puts it, "Data matters. Diversity matters. Diversity of data matters. Data informs policies, interventions, and their effects snowball from there. We need to make sure AI is learning from an inclusive data set."

Women Are More Conscious of the Consequences of Bias in Data

At a recent keynote panel entitled "Women in AI & Blockchain" at the 6th Annual Global Big Data Conference in Santa Clara, the panel was asked "Why do we need a Women in AI panel?"

Jennifer Prendki, PhD, vice president of machine learning at Figure Eight, offered an example: a team of machine learning engineers at her company, led mostly by women, was working on a client's data set and proactively flagged several potential sources of bias in the training data, then discussed how to head off the inclusivity issues that could follow.

Jennifer noted that women tend to be more sensitive to the harm that can result when AI products are exclusionary, and she used this example to argue that having women on the teams that handle training data brings a proactive approach to avoiding bias that an all-male team would not necessarily emphasize.

There are not enough women working in machine learning and AI. Wired reports that just 12 percent of machine learning researchers are women, a worrying statistic for a field that is supposedly reshaping society. Until the deck is more evenly stacked, we need to raise the profile of women in the field to encourage other women to enter.




What Do We Do About It?

Like most complex issues, this one has no single obvious solution, but acknowledging that there is a potential problem and proactively recognizing and correcting it are the first steps we can take.

One of the best ways to combat bias is to put power in the hands of people who are able and willing to recognize it, and when the bias impacts underrepresented populations, the most obvious answer is to increase their participation in the process.
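What might being proactive look like in practice? One concrete first step is to audit the training data itself before any model is trained. The sketch below is a minimal, illustrative Python example, not a method prescribed by anyone quoted here: it counts how often each group appears in an annotated data set and flags groups that fall well below an even share. The demographic_group field, the 50-percent-of-even-share threshold, and the toy manifest are all assumptions made for illustration.

from collections import Counter

# Minimal sketch: flag under-represented groups in an annotated
# training set before a model is trained. The "demographic_group"
# field and the 50%-of-even-share threshold are illustrative
# assumptions, not a standard.

def audit_representation(samples, group_key="demographic_group",
                         min_share_ratio=0.5):
    """Return groups whose share of the data falls below
    min_share_ratio times an even split across all groups."""
    counts = Counter(sample[group_key] for sample in samples)
    total = sum(counts.values())
    expected_share = 1.0 / len(counts)  # baseline: an even split
    return {group: n / total
            for group, n in counts.items()
            if n / total < min_share_ratio * expected_share}

# Hypothetical usage with a toy annotation manifest:
training_samples = [
    {"image": "img_001.jpg", "label": "person", "demographic_group": "A"},
    {"image": "img_002.jpg", "label": "person", "demographic_group": "A"},
    {"image": "img_003.jpg", "label": "person", "demographic_group": "A"},
    {"image": "img_004.jpg", "label": "person", "demographic_group": "A"},
    {"image": "img_005.jpg", "label": "person", "demographic_group": "B"},
]

for group, share in audit_representation(training_samples).items():
    print(f"Group {group!r} is only {share:.0%} of the data -- "
          f"collect and annotate more examples before training.")

A check like this would not have prevented every failure described above, but it makes blind spots visible early, when adding data is cheap, rather than after an embarrassing product launch.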

Nikki Hallgrimsdottir is a co-founder at Algo.ai, where she leverages Artificial Intelligence, Augmented Reality, and Automation in a unique way to help retailers, distributors, and manufacturers create an agile approach to increase profits and cut costs.
