News
August 23, 2022
5 min read

Does User Experience Matter to ML Engineers? Giskard Latest Release

What are the preferences of ML Engineers in terms of UX? A summary of key learnings, and how we implemented them in Giskard's latest release.

Alex Combessie

Synthwave astronauts polishing the hull of a giant marine turtle in space - Generated by OpenAI DALL·E

"Captain's Log, Stardate 2022.8. The GISKARD crew is working hard, polishing the ship hull and tuning our warp engine."

Of course, Machine Learning Engineers need software tools with a fantastic User Experience (UX). The interesting question is: what are ML Engineers' preferences in terms of UX?

This summer, we have been working hard to better understand what that means in practice. Here is a summary of our key learnings and how we implemented them.

🕵️ Our Methodology to Understand User Pains & Preferences

To immerse ourselves in the world of our users, we combine qualitative and quantitative approaches.

First, we commit weekly time to speaking to our first users and design partners. We offer asynchronous channels (Discord / Slack / Email) and weekly catchup meetings. It is essential to listen to their findings, frustrations & happy moments. A special thank you to the Citibeats & Altaroads teams for participating in Usability Tests of our latest feature, AI Test.

We carefully record and organize this qualitative feedback using the Harvestr.io product management platform. This feedback is our core resource to define our product roadmap. Here is a sample of our discoveries:

Giskard product discoveries on Harvestr.io

In parallel, we use Mixpanel to understand usage patterns at an aggregated level. 

Analyzing this quantitative feedback is very useful for improving our product. It helps us understand where users struggle and which common errors we should fix first.

Giskard product usage analytics flow on Mixpanel

This usage reporting is entirely optional for users. We take great care of privacy and are GDPR-compliant. We only collect metadata, not personal data like IP addresses. If you have any questions, feel free to contact us at privacy@giskard.ai.
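
For the curious, here is a minimal sketch of what metadata-only tracking can look like with the Mixpanel Python SDK. The event name and properties below are hypothetical examples, not our actual tracking schema:

```python
# Minimal sketch of metadata-only usage tracking with the Mixpanel
# Python SDK (pip install mixpanel). The event name and properties are
# hypothetical examples, not our actual tracking schema.
import uuid

from mixpanel import Mixpanel

mp = Mixpanel("YOUR_PROJECT_TOKEN")  # placeholder project token

# A random, anonymous identifier instead of anything personal.
anonymous_id = str(uuid.uuid4())

# Only aggregated metadata is sent: no IP address, no personal data.
mp.track(anonymous_id, "model_uploaded", {
    "model_type": "classification",
    "library": "scikit-learn",
})
```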

🤘 Improve Robustness with Better API Design & Error Handling

The number one preference of ML Engineers is robustness. Software used for model validation & model deployment should follow two principles:

"Simple things should be simple; complex things should be possible." (Alan Kay)

The simple, standard case should work out of the box, with as little configuration as possible. Then, additional parameters should allow advanced use cases.

Fail early and help users debug.

If input parameters are invalid, the software should detect them as early as possible. Error messages and logs should allow the user to understand what went wrong and how to fix it.
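
As a toy illustration of the fail-early principle (not Giskard's actual code), input validation can run before any heavy computation and raise errors that say exactly how to fix the problem:

```python
def validate_classification_labels(labels, n_classes):
    """Fail early: check user inputs before any heavy computation.

    Toy example of the principle, not Giskard's actual validation code.
    """
    if not labels:
        raise ValueError(
            "classification_labels is empty. Pass one human-readable "
            "label per output class, e.g. ['negative', 'positive']."
        )
    if len(labels) != n_classes:
        raise ValueError(
            f"Got {len(labels)} labels but the model predicts "
            f"{n_classes} classes. Both counts must match."
        )

validate_classification_labels(["negative", "positive"], n_classes=2)  # OK
```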

Over the past two months, we have done our best to implement these software engineering principles in our product. For instance, we spent a lot of time redesigning our API for classification labels. It is now much simpler to set human-readable labels that business users can interpret.
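
The sketch below captures the spirit of this redesign. The function and parameter names are simplified stand-ins for this post, not the exact Giskard API, so refer to our documentation for the real signatures:

```python
# Hypothetical stand-in for the real Giskard call: names and signatures
# are simplified for illustration, not the exact API.
def upload_model(prediction_function, classification_labels=None):
    labels = classification_labels or ["0", "1"]
    print(f"Uploading model; classes will display as: {labels}")

def predict_proba(df):
    ...  # placeholder prediction function

# The simple case works out of the box, with no extra configuration.
upload_model(prediction_function=predict_proba)

# One optional parameter maps raw classes (0, 1) to human-readable
# labels that business users can interpret.
upload_model(
    prediction_function=predict_proba,
    classification_labels=["no churn", "churn"],
)
```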

If you are curious, we made our decision process public on GitHub Discussions:

Giskard public forum to discuss product decisions on GitHub

Next, we fixed multiple edge cases when uploading datasets and models through the Giskard Python library. When possible, we now collect metadata automatically to power the generation of tests for Machine Learning models.

In addition, we invested in error handling to detect mistakes sooner, with better error messages. For instance, we display a custom message if ML libraries are missing from our ML inference backend. We also added sections to our GitBook documentation to help troubleshoot common issues.
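
For example, a missing-dependency check can follow the classic Python import-guard pattern. This is a simplified illustration; the actual messages in our inference backend differ:

```python
# Simplified illustration of a friendlier missing-dependency error;
# the actual messages in Giskard's inference backend differ.
def load_pytorch():
    try:
        import torch
    except ImportError as err:
        raise ImportError(
            "PyTorch is required to run inference on this model, but it "
            "is not installed in the inference backend environment. "
            "Install it with: pip install torch"
        ) from err
    return torch
```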

That effort is continuous, and we strive to accelerate bug fixing. If you encounter any issues, join the #support channel on our Discord community. We also welcome contributors!

⚡️ Optimize Time-to-Value with Faster Navigation

If you care about user experience, speed matters. Tools should save time and deliver value quickly. We all have a lot to do in our work lives, and every slight delay adds to mounting frustration! The famous 2006 Amazon study found that every 100ms of latency costs 1% of sales.

The 100ms rule of latency

This rule can be challenging to apply in the world of Big Data. Moreover, Artificial Intelligence models are growing ever larger with the Transformers revolution. Ensuring fast processing of these "Big Models" on Big Data requires optimized software and hardware.

In Giskard's latest release, we have implemented multiple optimizations to make our tool faster. First, we are now scoring data more efficiently: we use debouncing where possible and avoid sending model prediction requests when they are not needed. This makes navigation between samples faster when evaluating an ML model on a row-by-row basis.
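
Debouncing delays a call until the user has stopped interacting for a short window, so only the last request in a burst is sent. Our real implementation lives in the web frontend, but the idea can be sketched in a few lines of Python:

```python
import threading

def debounce(wait_seconds):
    """Delay calls until `wait_seconds` have passed without a new call.

    Minimal sketch of the debouncing idea; our real implementation
    lives in the web frontend, not in Python.
    """
    def decorator(fn):
        timer = None
        lock = threading.Lock()

        def debounced(*args, **kwargs):
            nonlocal timer
            with lock:
                if timer is not None:
                    timer.cancel()  # drop the superseded request
                timer = threading.Timer(wait_seconds, fn, args, kwargs)
                timer.start()

        return debounced
    return decorator

@debounce(0.3)
def request_predictions(row_id):
    print(f"Requesting model predictions for row {row_id}")

# Rapid navigation: only the last row triggers a prediction request.
for row in range(5):
    request_predictions(row)
```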

Second, we have improved the performance of our dataset filtering capabilities. This feature helps ML Engineers find sensitive data slices on which to evaluate their ML models. We chose Java and Tablesaw (another great open-source project) to run these operations with real-time streaming.

There are still many optimizations ahead of us. We are currently working to optimize Giskard for Huggingface, PyTorch and TensorFlow. Our next goal is to make it possible to run Giskard's ML inference backend in the user's own environment, which should make dependency management easier.

🎩 Integrations, Integrations, Integrations

The world of Data Science and MLOps is multi-faceted. To explain it, the figure below can be helpful. I highlighted in green where Giskard can help.

What AI actually is - and where Giskard can help!

First, Giskard helps evaluate ML models in collaboration with business teams. The goal is to avoid bias and ensure ML models respect legal, ethical & security constraints. 

Then, testing is a critical step before operationalizing ML models in production. ML testing protects AI providers against the risks of drift, regression and bias when retraining models.

To implement this ML lifecycle, Data Scientists and ML Engineers need multiple tools. Some choose all-in-one platforms that combine proprietary tools. However, there is a rapid shift in tech companies toward a Modular AI Infrastructure. It gives you complete control, choice & flexibility of technologies.

If you look at mature spaces like software development, engineers use 10-15 technologies concurrently. But with great flexibility comes great responsibility. For this modular approach to deliver value, tools must be well integrated.

At Giskard, we are committed to being deeply integrated with the MLOps ecosystem. That is why we are proud to announce a new integration with AWS!

It is now possible to start a new Giskard instance on AWS in a few clicks:

Start your Giskard application in a few clicks on AWS

📍 What's Next? 

Our latest release was an initial step toward improving Giskard's User Experience. We have identified many remaining areas for improvement:

  • Make the experience of uploading models & data easier
  • Add a command-line interface to set up Giskard
  • Offer an option to run the ML inference backend in the user's environment
  • Improve inference speed for large Deep Learning models using caching
  • Add more integrations, in particular MLflow, Huggingface and DagsHub

We are hiring exceptional software engineers to accelerate this roadmap. If you are interested or know someone who may be a good fit, let's talk! Our programming languages are Java, Python and TypeScript.

 👉 Apply here: https://gisk.ar/jobs
