The Dangers of Machine Learning Bias

April 28, 2022

Also known as AI bias, machine learning bias occurs when an AI system outputs results that are discriminatory and structurally biased because of errors in the machine learning process. Factors that affect the quality of the resulting system include the quantity, quality, and variety of the training data used.

Types of machine learning bias

  1. Algorithm bias: The algorithm performing the machine learning calculations is itself flawed, creating unjust outcomes. Often, this advantages certain groups of users over others.

  2. Sample bias: The training data is not representative of the population and/or not large enough to teach the system accurately. For example, a model trained only on images of male doctors will learn that all doctors are men, which is untrue. A quick way to catch this is to count how groups are represented in the training set, as sketched after this list.

  3. Prejudice bias: Similar to sample bias, prejudice bias is caused by training data that reflects real-world stereotypes. These stereotypes are instilled into the system, which then perpetuates harmful assumptions about the people it serves.

  4. Measurement bias: This bias derives from faulty training data that has been measured incorrectly or contaminated by the people involved in collecting it. For example, people participating in a photoshoot for new training data may be influenced by the participant-expectancy effect and behave differently than usual because they know the study's purpose. Similarly, if values in the training data are always rounded down, the system will be biased towards lower numbers.

  5. Exclusion bias: Training data is extracted in a way that leaves out important data. This can be intentional (data is excluded so results fit expectations) or unintentional (data points are wrongly judged unimportant and left out).
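
Several of these biases can be caught before training with simple checks on the data itself. Below is a minimal Python sketch of a representation check for sample and exclusion bias; the `representation_report` function and the toy doctor dataset are illustrative assumptions, not drawn from any real system.

```python
from collections import Counter

def representation_report(examples, attribute):
    """Count how often each value of a demographic attribute appears
    in a training set, to flag under-represented groups."""
    counts = Counter(ex[attribute] for ex in examples)
    total = sum(counts.values())
    for value, count in counts.most_common():
        print(f"{attribute}={value}: {count} examples ({count / total:.0%})")

# Hypothetical training set for a "doctor" image classifier.
training_set = [
    {"label": "doctor", "gender": "man"},
    {"label": "doctor", "gender": "man"},
    {"label": "doctor", "gender": "man"},
    {"label": "doctor", "gender": "woman"},
]

representation_report(training_set, "gender")
# gender=man: 3 examples (75%)
# gender=woman: 1 examples (25%)
```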

Machine learning bias affects people's everyday lives. For example, facial recognition systems are less capable of recognizing the faces of women and people of colour than those of men and white people. Notably, research has consistently shown that facial recognition systems have the poorest accuracy for Black women aged 18 to 30. This leads to discriminatory errors in law enforcement surveillance, employment decisions, and airport passenger screening, all common uses of facial recognition technology.
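
Gaps like these surface only when evaluation metrics are disaggregated by demographic group instead of being reported as a single overall accuracy. Here is a minimal sketch of such an evaluation, assuming each test record carries a group label; `accuracy_by_group` and the toy records are hypothetical.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Report accuracy separately per demographic group, since a
    single overall number can hide large gaps between groups."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        total[rec["group"]] += 1
        correct[rec["group"]] += rec["predicted"] == rec["actual"]
    return {g: correct[g] / total[g] for g in total}

# Hypothetical face-recognition test results.
results = [
    {"group": "white men", "predicted": "match", "actual": "match"},
    {"group": "white men", "predicted": "match", "actual": "match"},
    {"group": "Black women", "predicted": "match", "actual": "match"},
    {"group": "Black women", "predicted": "no match", "actual": "match"},
]

print(accuracy_by_group(results))
# {'white men': 1.0, 'Black women': 0.5}
```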

Another way that marginalized communities are affected by machine learning bias is through the job market. Today, many large companies use recruiting algorithms to sift through the thousands of resumes they receive daily. However, not all of these algorithms are fair. An experimental recruiting tool built by multinational e-commerce company Amazon was found to penalize resumes containing certain word patterns, favouring men over women by downgrading resumes that contained the word "women's".
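
To see how such a penalty can arise without anyone programming it in, consider the toy model below. It is a sketch of the general mechanism, not Amazon's actual system (whose code has not been published): a word-based classifier trained on historically male-skewed hiring decisions ends up assigning the token "women" a negative weight. The resumes and labels are invented, and scikit-learn is required.

```python
# Toy reconstruction of the mechanism, not any company's real system.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

resumes = [
    "captain of chess club",            # historically hired
    "led software engineering team",    # historically hired
    "captain of women's chess club",    # historically rejected
    "led women's engineering society",  # historically rejected
]
hired = [1, 1, 0, 0]  # biased historical labels

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(resumes)
model = LogisticRegression().fit(X, hired)

# The token "women" gets a negative coefficient on this data: the model
# has learned the historical prejudice, not actual job performance.
weights = dict(zip(vectorizer.get_feature_names_out(), model.coef_[0]))
print(f"weight for 'women': {weights['women']:+.2f}")
```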

How to prevent AI bias

  1. Use training data that is representative of the population, including women of colour, disabled people, queer and trans people, religious minorities, and more. The training data also needs to be large enough to combat sample bias and prejudice bias.

  2. Have multiple experts review the algorithm(s) and conduct varied tests to ensure the system is not affected by algorithm bias.

  3. Over time, review the machine learning system while it runs to prevent it from developing bias as it continues to learn; a simple periodic audit of this kind is sketched below.
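
As a minimal sketch of this third step, assuming decisions are logged with a group label, the function below compares each group's rate of favourable outcomes against the best-treated group and flags large gaps. The name `audit_positive_rates`, the 0.2 threshold, and the toy decision log are illustrative assumptions, not an established auditing standard.

```python
from collections import defaultdict

def audit_positive_rates(decisions, threshold=0.2):
    """Flag groups whose rate of favourable outcomes falls far below
    that of the best-treated group. Run periodically as the system
    keeps learning from new data."""
    positives = defaultdict(int)
    totals = defaultdict(int)
    for group, outcome in decisions:
        totals[group] += 1
        positives[group] += outcome
    rates = {g: positives[g] / totals[g] for g in totals}
    best = max(rates.values())
    return [g for g, r in rates.items() if best - r > threshold]

# Hypothetical month of logged decisions: (group, 1 = favourable outcome).
log = [("A", 1), ("A", 1), ("A", 0), ("B", 1), ("B", 0), ("B", 0)]
print(audit_positive_rates(log))  # ['B']: a rate gap of 0.33 exceeds 0.2
```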

Machine learning bias creates tangible negative impacts by disadvantaging marginalized groups of people while elevating the privileged. These biases can be reduced, and often avoided, through proper procedures for reviewing algorithms, more representative data sets, and the removal of bias from our real-world settings.
