Crowds deliver diverse data sets to produce effective AI

Deriving the greatest benefits from AI requires businesses to understand and be comfortable with what it can and can’t do, and how it should be leveraged for the best results. In this article, Jonathan Zaleski, Sr. Director of Engineering at Applause and Head of Applause Labs, explores how businesses can overcome the difficulties associated with implementing AI and automated systems, and the importance of properly training and crowdtesting their AI/ML models to remove potentially harmful bias.

Defining purpose and overcoming objections

Before deploying AI, businesses should always begin by clearly defining the need for the technology in their organization, asking themselves what purpose it will serve and how it can be used to accomplish that objective. It’s important, too, to establish where AI isn’t needed. Many organizations, for example, don’t realize that not all of their business processes can – or need to – be automated. Only once its purpose has been defined can AI be used to best effect.

Any AI deployment is likely to face some resistance, of course. The age-old concern that AI threatens people’s jobs can often be overcome by demonstrating the efficiency gains automation offers over time-consuming, traditionally manual tasks. Addressing the issue of bias in AI, however, can be a little more challenging.

The issue of bias

An AI’s basic purpose and functionality are fed into its underlying algorithm. But if the AI were to develop an inherent bias, it would have a detrimental effect on that algorithm, which could seriously impact the precision and efficiency the AI is expected to deliver. This, in turn, can limit its ability to fulfil its commercial requirements, and that can be bad for business.

Unfortunately, despite the best intentions of developers, bias can always find a way to permeate an AI algorithm. Biases stemming from business decisions and training data – and even conscious bias – continue to crop up. As well as affecting efficiency, such bias can also damage the perception of a brand. Unintended gender bias, for example, resulted in the Apple Card offering lower credit limits to female applicants than to male applicants. The resulting backlash on social media was, unsurprisingly, harsh. If customers feel they’re being treated unfairly by an AI system, they’ll think twice about engaging with that particular brand again.
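
As a simple illustration – and emphatically not a reconstruction of any real credit model – a fairness audit of this kind often begins by comparing outcome rates across demographic groups. The minimal Python sketch below uses hypothetical audit data to compute per-group approval rates and a disparate impact ratio; the 0.8 threshold is the commonly cited “four-fifths rule”, used here only as a rough flag for review.

from collections import defaultdict

def approval_rates_by_group(decisions):
    # Per-group approval rate; `decisions` holds (group, approved) pairs.
    totals, approvals = defaultdict(int), defaultdict(int)
    for group, approved in decisions:
        totals[group] += 1
        if approved:
            approvals[group] += 1
    return {g: approvals[g] / totals[g] for g in totals}

def disparate_impact(rates, privileged, protected):
    # Ratio of the protected group's approval rate to the privileged
    # group's; the "four-fifths rule" flags ratios below 0.8 for review.
    return rates[protected] / rates[privileged]

# Hypothetical audit data, not real application records
decisions = ([("male", True)] * 80 + [("male", False)] * 20
             + [("female", True)] * 55 + [("female", False)] * 45)

rates = approval_rates_by_group(decisions)
print(rates)                                      # {'male': 0.8, 'female': 0.55}
print(disparate_impact(rates, "male", "female"))  # 0.6875 -> below 0.8, flag it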

Examples like this only add to the skepticism around AI and can make it difficult for businesses to justify investing in the technology. To avoid such situations occurring in the first place, businesses should therefore place more emphasis on the training of their AI algorithms and consider a crowdtesting approach to ensure a suitably diverse data set.

Real-world training and testing

Every successful AI algorithm is built on training data. But, with AI, as with any learning process, the student is influenced by the teacher. The scope of an AI’s education is dependent on the curriculum. So, it stands to reason that a more varied and diverse curriculum will produce a more enlightened student. Likewise, using a larger and more diverse data set will help to produce more precise and efficient AI algorithms capable of making smarter decisions, and with less inherent bias.

Sourcing the data needed to meet a business’s requirements can be challenging, though, especially for mass market consumer applications and services. In-house teams of developers, software engineers, and quality assurance specialists will typically be drawn from the same age range, gender, and socio-economic background. As a result, bias can often creep in during the process of collecting and labelling data. It’s best, then, when building an AI algorithm, not to rely on a single person or small group to provide its training data. Properly training the algorithm, and minimizing the risk of bias, requires many different types of data and inputs.
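
One hedged way to surface labelling bias of this kind – a sketch under assumed annotator metadata, not any vendor’s actual process – is to compare how different annotator groups label the same pool of content. A persistent gap in label rates between groups suggests the data, not just the model, needs attention.

from collections import defaultdict

def positive_label_rate(annotations):
    # Share of items each annotator group labelled positive; `annotations`
    # holds (annotator_group, label) pairs for the same pool of content.
    totals, positives = defaultdict(int), defaultdict(int)
    for group, label in annotations:
        totals[group] += 1
        if label == "positive":
            positives[group] += 1
    return {g: positives[g] / totals[g] for g in totals}

# Hypothetical labels from two annotator age bands
annotations = ([("under-35", "positive")] * 60 + [("under-35", "negative")] * 40
               + [("over-35", "positive")] * 40 + [("over-35", "negative")] * 60)

print(positive_label_rate(annotations))
# {'under-35': 0.6, 'over-35': 0.4} -> a 20-point gap worth investigating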

It would be far more productive to use crowdtesting, a model that provides the AI algorithm with exposure to a diverse pool of people and experiences which are much closer to the customers it’s designed to serve. By using this model, businesses can train their algorithms to respond to real-world scenarios, detect where biases occur, and reduce their potential impact.
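
In practice, that detection step usually means slicing the model’s real-world results by tester demographic and looking for groups the model underserves. The following minimal sketch, built on hypothetical crowdtest session data (the group names and tolerance are assumptions), flags any group whose error rate sits well above the average.

from collections import defaultdict

def error_rates(results):
    # Per-group error rate; `results` holds (tester_group, correct) pairs,
    # where `correct` records whether the output matched expectations.
    totals, errors = defaultdict(int), defaultdict(int)
    for group, correct in results:
        totals[group] += 1
        if not correct:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}

def flag_outliers(rates, tolerance=0.05):
    # Flag groups whose error rate exceeds the mean by more than `tolerance`.
    mean = sum(rates.values()) / len(rates)
    return [g for g, r in rates.items() if r - mean > tolerance]

# Hypothetical crowdtest sessions across two age bands
results = ([("18-30", True)] * 95 + [("18-30", False)] * 5
           + [("60+", True)] * 80 + [("60+", False)] * 20)

rates = error_rates(results)
print(rates)                 # {'18-30': 0.05, '60+': 0.2}
print(flag_outliers(rates))  # ['60+'] -> the model underserves older testers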

Rich variety of data and inputs

An AI algorithm needs to be tested under real-world conditions, interacting with real people who reflect a company’s target audience, to ensure it works as intended.

Businesses need to source training data from a pool that provides quality and diversity – as well as quantity. Indeed, without diversity in the training data, the algorithm won’t be able to recognize a suitably broad range of possibilities, limiting its effectiveness. The necessary diversity and scope of data can be found in carefully vetted communities of testers spanning specific demographics – including gender, race, age, native language, location, and skill set, among others.
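
A first, simple check on such a data set is a coverage audit: count how each demographic attribute is represented among the people contributing examples, and flag values that fall below a minimum share. The sketch below is illustrative only; the attribute names and the 10% threshold are assumptions, not a prescribed standard.

from collections import Counter

def coverage_report(samples, attributes):
    # Count how often each value of each demographic attribute appears
    # among the contributors who supplied the training examples.
    return {attr: Counter(s.get(attr, "unknown") for s in samples)
            for attr in attributes}

def underrepresented(report, min_share=0.1):
    # List (attribute, value) pairs that fall below a minimum share.
    flags = []
    for attr, counts in report.items():
        total = sum(counts.values())
        flags += [(attr, v) for v, n in counts.items() if n / total < min_share]
    return flags

# Hypothetical contributor metadata for a voice-training corpus
samples = ([{"gender": "male", "native_language": "English"}] * 70
           + [{"gender": "female", "native_language": "English"}] * 25
           + [{"gender": "female", "native_language": "Spanish"}] * 5)

report = coverage_report(samples, ["gender", "native_language"])
print(underrepresented(report))  # [('native_language', 'Spanish')] -> fill the gap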

Without exposure to such a rich variety of data and inputs, AI can fail to deliver on its potential, having been limited to in-house lab testing practices. By supplementing an organization’s in-house capabilities for training algorithms to study and recognize voices, text, images, and biometrics, for instance, this crowdtesting approach can provide businesses with strong outputs that will serve the needs of a diverse customer base.

Delivering on its purpose

AI technology represents considerable efficiency benefits for businesses. It’s important to understand, though, how AI will deliver those benefits, and the issues that could hinder its efficiency and its wider acceptance.

Businesses need to appreciate that, while AI will never be perfect, it’s constantly learning, and the best machine learning models are those built on large and diverse data sets. Without diversity in the training data, the AI algorithm will be unable to recognize a broad range of possibilities, which risks rendering it ineffective. What’s more, inherent biases arising from limited input can impact not only the AI’s efficiency and precision, but also the reputation of the business using it.

The best policy, then, is to take a crowdtesting approach, and source that training data from a pool that provides quantity, quality, and diversity. That way, an organization’s AI will be best placed to deliver on the purpose originally defined before its deployment.

Author

Jonathan Zaleski is a highly skilled, versatile and technical leader with a demonstrated history of working in the internet industry. He has more than 15 years of engineering and technology experience across multiple verticals and platforms. Jonathan is a polyglot skilled in software development, scalability and Agile methodologies who uses his breadth of knowledge and skill to get the best out of his team. He is a dedicated leader who continuously strives for excellence. As the senior director of engineering at Applause, Jonathan and his team build best-in-class software with an eye toward innovation and next-generation concepts. They work to improve the capabilities of the Applause Platform, using cutting-edge technologies like artificial intelligence and machine learning to make the company more efficient.
