

Blanca Rivera Campos


Blog

Secure AI Agents: Exhaustive testing with continuous LLM Red Teaming

Testing AI agents presents significant challenges as vulnerabilities continuously emerge, exposing organizations to reputational and financial risks when systems fail in production. Giskard's LLM Evaluation Hub addresses these challenges through adversarial LLM agents that automate exhaustive testing, annotation tools that integrate domain expertise, and continuous red teaming that adapts to evolving threats.

Blanca Rivera Campos


Giskard integrates with LiteLLM to simplify LLM agent testing

News

[Release notes] Giskard integrates with LiteLLM: Simplifying LLM agent testing across foundation models

Giskard's integration with LiteLLM enables developers to test their LLM agents across multiple foundation models. The integration enhances Giskard's core features - LLM Scan for vulnerability assessment and RAGET for RAG evaluation - by allowing them to work with any supported LLM provider: whether you're using major cloud providers like OpenAI and Anthropic, local deployments through Ollama, or open-source models like Mistral.

Blanca Rivera Campos


Giskard integrates with NVIDIA NeMo

News

Evaluating LLM applications: Giskard Integration with NVIDIA NeMo Guardrails

Giskard has integrated with NVIDIA NeMo Guardrails to enhance the safety and reliability of LLM-based applications. This integration allows developers to better detect vulnerabilities, automate rail generation, and streamline risk mitigation in LLM systems. By combining Giskard with NeMo Guardrails, organizations can address critical challenges in LLM development, including hallucinations, prompt injection, and jailbreaks.

Blanca Rivera Campos


ArGiMi Consortium

News

Giskard leads GenAI Evaluation in France 2030's ArGiMi Consortium

The ArGiMi consortium, including Giskard, Artefact and Mistral AI, has won a France 2030 project to develop next-generation French LLMs for businesses. Giskard will lead efforts in AI safety, ensuring model quality, conformity, and security. The project will be open source to encourage collaboration, and it aims to make AI more reliable, ethical, and accessible across industries.

Blanca Rivera Campos


Giskard LLM scan multi-model

News

[Release notes] LLM app vulnerability scanner for Mistral, OpenAI, Ollama, and Custom Local LLMs

We are releasing an upgraded version of Giskard's LLM scan for comprehensive vulnerability assessments of LLM applications. New features include more accurate detectors through optimized prompts and expanded multi-model compatibility supporting OpenAI, Mistral, Ollama, and custom local LLMs. This article also includes an initial setup guide for evaluating LLM apps.

Blanca Rivera Campos


Why LLM evaluation is important

Blog

Guide to LLM evaluation and its critical impact for businesses

As businesses increasingly integrate LLMs into their applications, ensuring the reliability of AI systems is key. LLMs can generate biased, inaccurate, or even harmful outputs if not properly evaluated. This article explains why LLM evaluation matters and how to do it (methods and tools). It also presents Giskard's comprehensive solutions for evaluating LLMs, combining automated testing, customizable test cases, and human-in-the-loop review.

Blanca Rivera Campos


Red Teaming LLM Applications course

News

New course with DeepLearningAI: Red Teaming LLM Applications

Our new course, created in collaboration with the DeepLearningAI team, provides training on red teaming techniques for Large Language Model (LLM) and chatbot applications. Through hands-on attacks using prompt injections, you'll learn how to identify vulnerabilities and security failures in LLM systems.

Blanca Rivera Campos


Giskard's LLM Red Teaming

News

LLM Red Teaming: Detect safety & security breaches in your LLM apps

Introducing our LLM Red Teaming service, designed to enhance the safety and security of your LLM applications. Discover how our team of ML Researchers uses red teaming techniques to identify and address LLM vulnerabilities. Our new service focuses on mitigating risks like misinformation and data leaks by developing comprehensive threat models.

Blanca Rivera Campos


Giskard’s LLM Testing solution is launching on Product Hunt

News

Our LLM Testing solution is launching on Product Hunt 🚀

We have just launched Giskard v2, extending the testing capabilities of our library and Hub to Large Language Models. Support our launch on Product Hunt and explore our new integrations with Hugging Face, Weights & Biases, MLflow, and DagsHub. A big thank you to our community for helping us reach over 1,900 stars on GitHub.

Blanca Rivera Campos


Giskard team at DEFCON31

Blog

AI Safety at DEFCON 31: Red Teaming for Large Language Models (LLMs)

This year, the AI Village at DEFCON, one of the world's premier hacker conventions, had a unique focus: red teaming of Large Language Models (LLMs). Instead of conventional hacking, participants were challenged to use words to uncover AI vulnerabilities. The Giskard team was fortunate to attend, witnessing firsthand the event's emphasis on understanding and addressing potential AI risks.

Blanca Rivera Campos


White House pledge targets AI regulation

News

White House pledge targets AI regulation with Top Tech companies

In a significant move towards AI regulation, President Biden convened a meeting with top tech companies, leading to a White House pledge that emphasizes AI safety and transparency. Companies like Google, Amazon, and OpenAI have committed to pre-release system testing, data transparency, and AI-generated content identification. As tech giants signal their intent, concerns remain regarding the specificity of their commitments.

Blanca Rivera Campos


LLM Scan: Advanced LLM vulnerability detection

News

1,000 GitHub stars, 3M€, and new LLM scan feature 💫

We've reached an impressive milestone of 1,000 GitHub stars and received strategic funding of 3M€ from the French Public Investment Bank and the European Commission. With this funding, we plan to enhance our Giskard platform, helping companies meet upcoming AI regulations and standards. Moreover, we've upgraded our LLM scan feature to detect even more hidden vulnerabilities.

Blanca Rivera Campos


Scan your AI model to find vulnerabilities

News

Giskard’s new beta is out! ⭐ Scan your model to detect hidden vulnerabilities

Giskard's new beta release lets you quickly scan your AI model and detect vulnerabilities directly in your notebook. The new beta also includes simple one-line installation, automated test suite generation and execution, an improved user experience for collaboration on testing dashboards, and a ready-made test catalog.

Blanca Rivera Campos


Giskard interview for BFM Business' FocusPME

News

Exclusive Interview: How to eliminate risks of AI incidents in production

During this exclusive interview for BFM Business, Alex Combessie, our CEO and co-founder, spoke about the potential risks of AI for companies and society. As new AI technologies like ChatGPT emerge, concerns about the dangers of untested models have increased. Alex stresses the importance of Responsible AI, which involves identifying ethical biases and preventing errors. He also discusses the future of EU regulations and their potential impact on businesses.

Blanca Rivera Campos


SafeGPT - The safest way to use ChatGPT and other LLMs

News

🔥 The safest way to use ChatGPT... and other LLMs

With Giskard’s SafeGPT, you can say goodbye to errors, biases & privacy issues in LLMs. Its features include an easy-to-use browser extension and a monitoring dashboard (for ChatGPT users), and a ready-made and extensible quality assurance platform for debugging any LLM (for LLM developers).

Blanca Rivera Campos


Giskard's turtle slicing some veggies!

News

Giskard 1.4 is out! What's new in this version? ⭐

With Giskard’s new Slice feature, you can identify business areas in which your AI models underperform. This makes it easier to debug performance biases or identify spurious correlations. We have also added an export/import feature to share your projects, as well as other minor improvements.

Blanca Rivera Campos
