Evaluating LLM applications
LLMs now power a wide range of applications, such as chatbots, question-answering systems, and retrieval-augmented generation (RAG) pipelines. However, these applications can be vulnerable to misuse or may not behave as expected. Evaluating them is crucial to identify vulnerabilities and areas for improvement, ensuring they work reliably and safely.
That's why we're excited to announce significant enhancements to our open-source LLM scan, designed to provide a comprehensive assessment of your domain-specific LLMs' vulnerabilities.
What’s Giskard's LLM scan and what's new
Giskard's LLM scan is a tool that combines heuristics-based and LLM-assisted detectors to automatically detect vulnerabilities in your LLM applications. Issues detected include hallucinations, harmful content, prompt injection, data leakage, stereotypes and many more.
Here's what's new in our latest release:
- More efficient prompts and improved detections: We've optimized our prompts for better detector accuracy, ensuring comprehensive evaluations while consuming fewer resources.
- New model compatibility: The LLM-assisted detectors now work with a wide range of models, including OpenAI, Mistral, Ollama, local LLMs, and any other custom model. This enhancement allows you to conduct in-depth assessments of LLM applications, regardless of the models used for generation and evaluation.
For more detailed information about the LLM scan, you can visit our documentation.
Setting up the LLM Client
To start using Giskard's enhanced LLM scan, you'll first need to set up the LLM client. Here's a quick guide to get you started:
1. Install the Giskard library: You can install it via pip using the command below.
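At the time of writing, the LLM-specific dependencies ship as an optional extra, so the install command looks like this (check the documentation if the extra name has changed in your version):

```bash
pip install "giskard[llm]"
```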
2. Import the necessary modules: After installation, import the necessary modules into your Python script.
3. Initialize the LLM client: Create an instance of the LLM client, specifying the model you want to use. In this example, we’ll set up Mistral:
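Here is a minimal sketch of that setup, assuming a `MISTRAL_API_KEY` environment variable is set; the exact client class and module path may vary between Giskard versions, so double-check our documentation for your release:

```python
import giskard
from giskard.llm.client.mistral import MistralClient

# Use Mistral for the scan's LLM-assisted detectors
# (the client reads MISTRAL_API_KEY from the environment)
giskard.llm.set_default_client(MistralClient())
```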
You can also set up other models, such as Ollama, local models, or any other custom model. For businesses, a custom local model can be advantageous: it lets you keep control over your data and model, which helps with privacy and regulatory compliance.

Our documentation explains how to set up each supported LLM client.
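As an illustration, a locally served model (for example via Ollama) can typically be plugged in through an OpenAI-compatible endpoint. The snippet below is a sketch of that pattern; the wrapper class and its parameters are based on the documentation at the time of writing and may differ in your version:

```python
import giskard
from openai import OpenAI
from giskard.llm.client.openai import OpenAIClient

# Point the OpenAI SDK at the local Ollama server (OpenAI-compatible API)
local_client = OpenAI(base_url="http://localhost:11434/v1/", api_key="ollama")

# Use the locally served model for the scan's LLM-assisted detectors
giskard.llm.set_default_client(OpenAIClient(model="mistral:latest", client=local_client))
```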
4. Wrap your model: Before running the LLM scan, you need to wrap your model to give it a common format along with its metadata. You can wrap anything, as long as you can represent it as a Python function (for example, an API call to Azure or OpenAI). We also have pre-built wrappers for LangChain objects, or you can create your own wrapper by extending the `giskard.Model` class if you need to wrap a complex object, such as a custom-made RAG pipeline communicating with a vector store.

When wrapping the model, it's very important to provide the `name` and `description` parameters describing what the model does. These will be used by our scan to generate domain-specific probes.
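As a sketch, here is what wrapping a simple question-answering function might look like. The `answer_question` helper and the `name`/`description` values are placeholders for your own generation logic and metadata:

```python
import giskard
import pandas as pd

def answer_question(question: str) -> str:
    # Placeholder for your own generation logic
    # (e.g. an API call to your LLM provider or a RAG chain)
    ...

def model_predict(df: pd.DataFrame) -> list[str]:
    # Giskard passes a DataFrame with one column per feature
    # and expects one output per row
    return [answer_question(q) for q in df["question"]]

giskard_model = giskard.Model(
    model=model_predict,
    model_type="text_generation",
    # name and description are used by the scan to generate domain-specific probes
    name="Product documentation assistant",
    description="Answers customer questions based on our product documentation.",
    feature_names=["question"],
)
```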
5. Run the LLM scan: Call the scan on your wrapped model; the LLM client you configured will be used by the LLM-assisted detectors.
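A minimal sketch of that last step, reusing the wrapped model from the previous example:

```python
import giskard

# Runs both heuristic and LLM-assisted detectors against the wrapped model
scan_results = giskard.scan(giskard_model)

# Inspect the results in a notebook, or export a standalone HTML report
display(scan_results)
scan_results.to_html("llm_scan_report.html")
```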
For a more detailed guide on setting up the LLM client and how to run the LLM scan, please refer to our documentation.
Conclusion
Giskard's enhanced LLM scan is a significant step forward in evaluating and securing your LLM applications. With improved efficiency, better accuracy, and compatibility across LLM providers, it offers a comprehensive way to identify vulnerabilities and keep your LLMs performing as expected. We invite you to explore these new features and see the benefits they bring to your applications. As always, we're here to support you in deploying your LLMs safely!