Giskard's integration with LiteLLM enables developers to test their LLM agents across multiple foundation models. The integration enhances Giskard's core features, LLM Scan for vulnerability assessment and RAGET for RAG evaluation, by allowing them to work with any supported LLM provider, whether you're using major cloud providers like OpenAI and Anthropic, local deployments through Ollama, or open-source models like Mistral.
Giskard has integrated with NVIDIA NeMo Guardrails to enhance the safety and reliability of LLM-based applications. This integration allows developers to better detect vulnerabilities, automate rail generation, and streamline risk mitigation in LLM systems. By combining Giskard with NeMo Guardrails, organizations can address critical challenges in LLM development, including hallucinations, prompt injection, and jailbreaks.
The ArGiMi consortium, including Giskard, Artefact and Mistral AI, has won a France 2030 project to develop next-generation French LLMs for businesses. Giskard will lead efforts in AI safety, ensuring model quality, conformity, and security. The project will be open source to encourage collaboration, and it aims to make AI more reliable, ethical, and accessible across industries.
We are releasing an upgraded version of Giskard's LLM scan for comprehensive vulnerability assessments of LLM applications. New features include more accurate detectors through optimized prompts and expanded multi-model compatibility supporting OpenAI, Mistral, Ollama, and custom local LLMs. This article also covers an initial setup guide for evaluating LLM apps.
As businesses increasingly integrate LLMs into their applications, ensuring the reliability of AI systems is key. LLMs can generate biased, inaccurate, or even harmful outputs if not properly evaluated. This article explains why LLM evaluation matters and which methods and tools can be used to do it. It also presents Giskard's comprehensive solutions for evaluating LLMs, combining automated testing, customizable test cases, and human-in-the-loop review.
Our new course, in collaboration with the DeepLearningAI team, provides training on red teaming techniques for Large Language Model (LLM) and chatbot applications. Through hands-on attacks using prompt injections, you'll learn how to identify vulnerabilities and security failures in LLM systems.
Introducing our LLM Red Teaming service, designed to enhance the safety and security of your LLM applications. Discover how our team of ML Researchers uses red teaming techniques to identify and address LLM vulnerabilities. Our new service focuses on mitigating risks like misinformation and data leaks by developing comprehensive threat models.
We have just launched Giskard v2, extending the testing capabilities of our library and Hub to Large Language Models. Support our launch on Product Hunt and explore our new integrations with Hugging Face, Weights & Biases, MLFlow, and Dagshub. A big thank you to our community for helping us reach over 1900 stars on GitHub.
This year, DEFCON, one of the world's premier hacker conventions, brought a unique focus to its AI Village: red teaming of Large Language Models (LLMs). Instead of conventional hacking, participants were challenged to use words to uncover AI vulnerabilities. The Giskard team was fortunate to attend, witnessing firsthand the event's emphasis on understanding and addressing potential AI risks.
In a significant move towards AI regulation, President Biden convened a meeting with top tech companies, leading to a White House pledge that emphasizes AI safety and transparency. Companies like Google, Amazon, and OpenAI have committed to pre-release system testing, data transparency, and AI-generated content identification. As tech giants signal their intent, concerns remain regarding the specificity of their commitments.
We've reached an impressive milestone of 1,000 GitHub stars and received strategic funding of 3M€ from the French Public Investment Bank and the European Commission. With this funding, we plan to enhance our Giskard platform, aiding companies in meeting upcoming AI regulations and standards. Moreover, we've upgraded our LLM scan feature to detect even more hidden vulnerabilities.
Giskard's new beta release enables you to quickly scan your AI model and detect vulnerabilities directly in your notebook. The new beta also includes simple one-line installation, automated test suite generation and execution, improved user experience for collaboration on testing dashboards, and a ready-made test catalog.
During this exclusive interview for BFM Business, Alex Combessie, our CEO and co-founder, spoke about the potential risks of AI for companies and society. As new AI technologies like ChatGPT emerge, concerns about the dangers of untested models have increased. Alex stresses the importance of Responsible AI, which involves identifying ethical biases and preventing errors. He also discusses the future of EU regulations and their potential impact on businesses.
With Giskard’s SafeGPT, you can say goodbye to errors, biases, and privacy issues in LLMs. For ChatGPT users, it offers an easy-to-use browser extension and a monitoring dashboard. For LLM developers, it offers a ready-made and extensible quality assurance platform for debugging any LLM.
With Giskard’s new Slice feature, you can identify the business areas in which your AI models underperform. This makes it easier to debug performance biases or identify spurious correlations. We have also added an export/import feature to share your projects, as well as other minor improvements.