October 8, 2021
3 min read

What does research tell us about the future of AI Quality? 💡

We look into the latest research to understand the future of AI/ML testing.

Research literature
Jean-Marie John-Mathews, Ph.D.

Testing AI systems is an active research area, and AI is often described as non-testable. Summing up the academic literature (Zhang et al., 2021), here are three reasons why.

1. AI follows a data-driven programming paradigm

According to Paleyes (2021), unlike regular software products, where changes happen only in the code, AI systems change along three axes: the code, the model, and the data. The model's behavior evolves as new data is regularly provided, so a test must be sensitive to all three axes (see the sketch below).

More information: Challenges in Deploying Machine Learning: a Survey of Case Studies
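To make this concrete, here is a minimal sketch of such a test (the dataset, model, and accuracy threshold are illustrative stand-ins, not from the papers cited above). Because the assertion targets learned behavior rather than code paths, the test has to be re-run whenever any of the three axes changes:

```python
# Minimal sketch: a behavioral test that is sensitive to all three axes.
# The dataset, model, and 0.9 threshold are illustrative choices.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def test_model_behavior_is_stable():
    # Axis 1, the data: a pinned reference dataset (iris as a stand-in).
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0
    )
    # Axis 2, the code: the training pipeline under test.
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    # Axis 3, the model: assert on what it learned, not on how it is written.
    assert model.score(X_test, y_test) >= 0.9
```

A code change, a new model version, or a refreshed dataset can each make this test fail on its own, which is exactly what sets AI testing apart from classic unit testing.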

2. AI is not easily broken down into small unit components

Some AI properties (e.g., accuracy) only emerge as a combination of different components such as the training data, the learning program, and the learning library. It is hard to break the AI system into smaller components that can be tested in isolation.
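As a hedged illustration (the helper and test names are hypothetical), compare a component that can be unit-tested with a property that cannot:

```python
import numpy as np

def standardize(x: np.ndarray) -> np.ndarray:
    # A deterministic component: its contract is checkable in isolation.
    return (x - x.mean()) / x.std()

def test_standardize_unit():
    out = standardize(np.array([1.0, 2.0, 3.0]))
    assert abs(out.mean()) < 1e-9        # zero mean
    assert abs(out.std() - 1.0) < 1e-9   # unit variance

# By contrast, "accuracy" has no isolated unit to assert on: it only exists
# once training data, learning program, and learning library are combined,
# so it can only be checked end to end, on a fully trained model.
```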

3. AI errors are systemic and self-amplifying

AI is characterized by many feedback loops and interactions between components. The output of one model can be ingested into the training data of another. As a result, AI errors can be difficult to identify, measure, and correct.
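As a toy illustration of this amplification (all numbers are invented), the simulation below feeds each generation's noisy labels back as the next generation's training data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, error_rate = 10_000, 0.05           # 5% of labels corrupted per generation
truth = rng.integers(0, 2, size=n)     # ground-truth labels
labels = truth.copy()                  # generation 0 trains on clean labels

for generation in range(1, 6):
    # Each generation learns from the previous one's outputs and adds its
    # own fresh 5% of mistakes, which the next generation then inherits.
    flips = rng.random(n) < error_rate
    labels = np.where(flips, 1 - labels, labels)
    print(f"generation {generation}: label error = {(labels != truth).mean():.1%}")
```

Each individual model looks only mildly noisy, yet the compounded error across the loop grows from 5% toward 20% within a few generations, which is why such errors are hard to localize and correct.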

At Giskard, we think testing AI systems is a solvable challenge. Want to know more?

Contact us at hello@giskard.ai.

