Crowds deliver diverse data sets to produce effective AI

Deriving the greatest benefits from AI requires businesses to understand and be comfortable with what it can and can’t do, and how it should be leveraged for the best results. In this article, Jonathan Zaleski, Sr. Director of Engineering at Applause and Head of Applause Labs, explores how businesses can overcome any difficulties associated with implementing AI and automated systems, and the importance of successfully training and crowdtesting their AI/ML models to remove any potential harmful bias.
Defining purpose and overcoming objections

Before deploying AI, businesses should always begin by clearly defining the need for the technology in their organization, asking themselves what purpose it will serve and how it can be used to accomplish that original objective. It’s important, too, to establish where AI isn’t needed. Many organizations, for example, don’t understand that not all their business processes can – or need to be – automated. Only once its purpose has been defined can AI begin to be used for best effect.

Any AI deployment is likely to face some resistance, of course. The age-old concern that AI represents a threat to people’s jobs can often be overcome by demonstrating the efficiency benefits that automation offers when compared to time-consuming, traditionally manual tasks. Addressing the issue of bias in AI, however, can be a little more challenging.

The issue of bias

An AI’s basic purpose and functionality are fed into its underlying algorithm. But if the AI were to develop an inherent bias, it would have a detrimental effect on that algorithm, which could seriously impact the precision and efficiency the AI is expected to deliver. This, in turn, can limit its ability to fulfil its commercial requirements, and that can be bad for business.

Unfortunately, despite the best intentions of developers, bias can always find a way to permeate an AI algorithm. Biases rooted in business decisions, skewed training data, and even conscious prejudice continue to crop up. And, as well as affecting efficiency, such bias can also negatively impact the perception of a brand. Unintended gender bias, for example, resulted in the Apple Card offering lower credit limits to female applicants than to male applicants. The resulting backlash on social media was, unsurprisingly, harsh. If customers feel they’re being treated unfairly by an AI system, they’ll think twice about engaging with that particular brand again.

Examples like this only add to the skepticism around AI and can make it difficult for businesses to justify investing in the technology. To avoid such situations occurring in the first place, businesses should therefore place more emphasis on the training of their AI algorithms and consider a crowdtesting approach to ensure a suitably diverse data set.

Real-world training and testing

Every successful AI algorithm is built on training data. But, with AI, as with any learning process, the student is influenced by the teacher. The scope of an AI’s education is dependent on the curriculum. So, it stands to reason that a more varied and diverse curriculum will produce a more enlightened student. Likewise, using a larger and more diverse data set will help to produce more precise and efficient AI algorithms capable of making smarter decisions, and with less inherent bias.

Sourcing the data needed to meet a business’s requirements can be challenging, though, especially for mass market consumer applications and services. In-house teams of developers, software engineers, and quality assurance specialists will typically be from the same age range, gender, and socio-economic background. As a result, bias can often occur during the process of collecting and labelling data. It’s best, then, when building an AI algorithm, not to rely on a single person or small group to provide the data that’s going to be used to train that algorithm. To properly train it, and minimize the risk of bias, requires different types of data and inputs. 

It would be far more productive to use crowdtesting, a model that provides the AI algorithm with exposure to a diverse pool of people and experiences which are much closer to the customers it’s designed to serve. By using this model, businesses can train their algorithms to respond to real-world scenarios, detect where biases occur, and reduce their potential impact.
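To make “detect where biases occur” concrete, here is a minimal, illustrative sketch (not from the article): it compares positive-outcome rates across demographic groups in a batch of test results, the kind of data a diverse crowdtesting pool could supply. The group names, outcomes, and the 0.8 “rule of thumb” threshold are all hypothetical.

```python
# Illustrative sketch: flag unequal positive-outcome rates across groups.
# All data below is hypothetical.
from collections import defaultdict

def approval_rates(records):
    """Compute the positive-outcome rate per demographic group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}

def disparate_impact(rates):
    """Ratio of the lowest group rate to the highest (1.0 = parity)."""
    return min(rates.values()) / max(rates.values())

# Hypothetical results gathered from a crowdtesting pool
results = [("group_a", 1), ("group_a", 1), ("group_a", 0), ("group_a", 1),
           ("group_b", 1), ("group_b", 0), ("group_b", 0), ("group_b", 0)]
rates = approval_rates(results)
print(rates)                    # group_a: 0.75, group_b: 0.25
print(disparate_impact(rates))  # ~0.33, well below a common 0.8 rule of thumb
```

A check like this only surfaces one narrow kind of bias, but it shows why breadth of test data matters: without results from both groups, the disparity would be invisible.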

Rich variety of data and inputs

An AI algorithm needs to be tested under real-world conditions, interacting with real people that mimic a company’s target audience to ensure it works as intended.

Businesses need to source training data from a pool that provides quality and diversity – as well as quantity. Indeed, without diversity in the training data, the algorithm won’t be able to recognize an especially broad range of possibilities, thereby limiting its effectiveness. The necessary diversity and scope of data can be found in carefully vetted communities of testers offering specific demographics – including gender, race, age, geography, native language, location, and skill set, among others.
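As a rough illustration of vetting a tester community for the demographics listed above, the sketch below (hypothetical names and attributes, not a real Applause API) checks which required values of an attribute a pool covers and which are missing:

```python
# Illustrative sketch: check demographic coverage of a tester pool.
# Tester records and attribute values are hypothetical.
from collections import Counter

testers = [
    {"gender": "female", "age_band": "18-29", "region": "EU"},
    {"gender": "male",   "age_band": "30-44", "region": "NA"},
    {"gender": "female", "age_band": "45-59", "region": "APAC"},
    {"gender": "male",   "age_band": "18-29", "region": "NA"},
]

def coverage(pool, attribute, required_values):
    """Count how many testers cover each value and list any gaps."""
    present = Counter(t[attribute] for t in pool)
    missing = [v for v in required_values if v not in present]
    return present, missing

present, missing = coverage(testers, "region", ["NA", "EU", "APAC", "LATAM"])
print(dict(present))  # {'EU': 1, 'NA': 2, 'APAC': 1}
print(missing)        # ['LATAM']
```

The same check can be run per attribute (gender, age band, language, and so on) before data collection begins, so gaps in the pool are fixed before they become gaps in the training data.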

Without exposure to such a rich variety of data and inputs, AI can fail to deliver on its potential, having been limited only to in-house lab testing practices. By supplementing an organization’s in-house capabilities for training algorithms to study and recognize voices, text, images, and biometrics, for instance, this crowdtesting approach can provide businesses with strong outputs that will service the needs of a diverse customer base.

Delivering on its purpose

AI technology represents considerable efficiency benefits for businesses. It’s important to understand, though, how AI will deliver those benefits, and the issues that could hinder its efficiency and its wider acceptance.

Businesses need to appreciate that, while AI will never be perfect, it’s constantly learning, and the best machine models are those based on large and diverse data sets. Without diversity in the training data, the AI algorithm will be unable to recognize a broad range of possibilities, which risks rendering it ineffective. What’s more, inherent biases arising from limited input can impact not only the AI’s efficiency and precision, but also the reputation of the business using that AI.

The best policy, then, is to take a crowdtesting approach, and source that training data from a pool that provides quantity, quality, and diversity. That way, an organization’s AI will be best placed to deliver on the purpose originally defined before its deployment.

Amber Donovan-Stevens

Amber is a Content Editor at Top Business Tech
