How Biased Datasets Cause Inequality

Will AI make the world a better place? Maybe, but one major setback is that AI and other technologies need human-made datasets to learn from. Even when companies like Google make their data available, the problem remains that humans don't always see the world in 1s and 0s. Human bias can spill into technology very easily, and often without intent.


For example, white privilege has gained a lot of attention in the past few decades. The term refers to how White people, especially wealthy White men, run our institutions and develop major technologies. However, privilege isn't only a white thing, a class thing or a male thing; it can come from many sources: having higher education, not having a disability, looking and acting gender normative, being straight, or not having a regional accent. Most people developing technology fall into at least some of these categories. Therefore, the needs of people who don't tick the same boxes can be sidelined by predominantly white male developers.


Bigotry or a Bug?

When it launched in 2011, Siri couldn't understand Scottish accents. Fast forward to 2017, and Alexa and Google Assistant still had trouble understanding some accents, especially those belonging to regional and non-native speakers. The problem wasn't just that people with non-standard accents couldn't use these devices properly; it was that it also reinforced the idea that non-standard accents are somehow lesser.


A popular idea for centuries was that in order to speak "properly" one should speak like the upper classes. In more recent decades, the focus shifted to the "standard" accents used in the media, which even spawned the 1940s 'transatlantic accent' in Hollywood. The logic at the time was that regional accents would be harder to "sell" to mainstream audiences. Hence, a home assistant that doesn't understand an accent implies to those users that they sound too regional, too poor or too foreign to use the product. It has even resulted in users changing their accents just to be understood by the device.


Human biases infused into tech can actually be dangerous too. Research revealed that self-driving cars potentially have issues identifying dark-skinned pedestrians in low light, putting them at risk of being run over. However, cameras having trouble identifying and displaying dark skin has been an issue for a long time, and it predates digital photography.


Shirley Set The Standard

It all goes back to the introduction of colour film, back in the days of the 'Shirley Card'. This was a photo of a white woman (dubbed 'Shirley') used as a reference in photography: in photo labs, the developer would measure lightness and darkness against her photo when processing pictures. A dataset for analogue photography, so to speak. Black consumers quickly noticed their photos didn't render the details of dark-skinned people well. However, it wasn't the complaints from these consumers that prompted the photography companies to do something about it; it was the chocolate and furniture industries, who noticed the photos didn't portray their products accurately.


'White Shirley' was later joined by "Asian", "Black" and "Latina" Shirleys. By that point, however, 'White Shirley' had already become a digital algorithm. As a result, some digital imaging software and cameras continued to follow the same principles the Shirley Card was invented for. Digital imagery did improve over time for People of Color, but 'White Shirley' still lived on in some cameras and photo software. This skin-tone bias still exists today, just not as much as it did in the past.


An example of a “Shirley Card”


In 2009, HP's motion-sensing webcams failed to follow the faces of Black users. In 2015, Google Photos tagged two Black people as gorillas. This is, quite literally, software not recognizing Black people as human. More shocking still, even automatic soap dispensers were not immune from discriminating. While these incidents nowadays tend to be glitches in the software, they are reminiscent of the 'White Shirley' prejudice.


Why is there Bias In Tech?

So why do these things happen? It has a lot to do with the data fed to the algorithms that run these technologies. The software learns to distinguish people from non-humans through examples of how humans look, speak and behave. But we humans come in all shapes and sizes, with different features. There are two theories as to why well-educated, wealthy, 'soft-spoken' White people don't suffer prejudice at the hands of tech. One theory is that the developers of these products are mostly white men, and that the areas where tech giants and other huge companies are based tend to have less regional-sounding accents. As a consequence, these developers are not in a great position to access a diverse group of people, which limits what their software will learn about the diversity of humans.
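To see how a skewed training set translates into skewed behaviour, here is a deliberately simplified sketch (all numbers and the "brightness" feature are invented for illustration). A naive detector calibrates itself to the average of the data it was trained on; because 95% of that data comes from one group, the other group ends up outside the detector's tolerance:

```python
import random

random.seed(0)

# Synthetic "face brightness" readings for two groups of users.
# The training set is skewed: 95% of examples come from group A.
group_a = [random.gauss(0.70, 0.05) for _ in range(950)]  # lighter skin tones
group_b = [random.gauss(0.30, 0.05) for _ in range(50)]   # darker skin tones
train = group_a + group_b

# The detector "calibrates" itself to the average brightness it saw,
# accepting anything within a fixed tolerance of that average.
center = sum(train) / len(train)
TOL = 0.15
def detect(x):
    return abs(x - center) <= TOL

# Evaluate on balanced, held-out samples from each group.
test_a = [random.gauss(0.70, 0.05) for _ in range(1000)]
test_b = [random.gauss(0.30, 0.05) for _ in range(1000)]

acc_a = sum(detect(x) for x in test_a) / len(test_a)
acc_b = sum(detect(x) for x in test_b) / len(test_b)
print(f"detection rate, group A: {acc_a:.0%}")
print(f"detection rate, group B: {acc_b:.0%}")
```

No real detector is this crude, but the mechanism is the same: the model is near-perfect on the majority group and fails almost completely on the minority group, even though nothing in the code mentions either group explicitly.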




Another theory is that early adopters of high-tech products tend to be wealthy White people. Therefore, the data collected from devices and used for learning is biased towards these people.


Part of the solution is to improve the datasets themselves. In speech, for example, older datasets like Switchboard contained a high proportion of speakers with standard-sounding accents. Nowadays, more and more datasets are being produced with a more inclusive range of accents. Similarly, image datasets now contain more diversity for training facial recognition software.
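One common way to make an existing dataset more inclusive is to audit how many samples each group contributes and rebalance accordingly. The sketch below (the clip IDs and accent labels are made up for illustration) oversamples the underrepresented accents until every group matches the largest one:

```python
import random
from collections import Counter, defaultdict

random.seed(1)

# Hypothetical labelled speech samples: (clip_id, accent) pairs.
# The raw set is heavily skewed toward one "standard" accent.
samples = (
    [(f"clip_{i}", "standard") for i in range(90)]
    + [(f"clip_{i}", "scottish") for i in range(90, 98)]
    + [(f"clip_{i}", "non_native") for i in range(98, 100)]
)

def rebalance(samples):
    """Oversample minority groups until every group matches the largest."""
    by_group = defaultdict(list)
    for clip, group in samples:
        by_group[group].append((clip, group))
    target = max(len(items) for items in by_group.values())
    balanced = []
    for group, items in by_group.items():
        balanced.extend(items)
        # Draw extra copies at random to fill the gap.
        balanced.extend(random.choices(items, k=target - len(items)))
    return balanced

balanced = rebalance(samples)
print(Counter(group for _, group in balanced))
```

Oversampling is only a stopgap: duplicating the same eight Scottish clips doesn't add new variety, which is why collecting genuinely diverse recordings remains the better fix.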


Microsoft's Tay is a good lesson in how not to teach AI. The bot was meant to learn how to tweet like a human by having real users talk to it. Microsoft quickly learned that Tay was being taught by trolls when it started spouting offensive tweets. This really brings home the point: AI, at the moment, is a reflection of its creators and users.



AI is going to change the world. How it does so is up to those who develop it and those who use it. If we want AI with minimal bias, it requires work: more inclusive teams, reaching a larger range of consumers and, of course, using more diverse datasets. AI has to reach a point where it doesn't dehumanize people. As AI gets smarter, we need to get smarter about how we train it.


Here at Wiredelta, we do believe AI is the future and diversity in tech firms is a key part of building that future. To read more about AI and other advances in tech, keep up to date with us by subscribing to our newsletter.
