
OWASP Top 10 for LLM 2023: Understanding the Risks of Large Language Models

In this post, we introduce OWASP's first version of the Top 10 for LLM, which identifies critical security risks in modern LLM systems. It covers vulnerabilities like Prompt Injection, Insecure Output Handling, Model Denial of Service, and more. Each vulnerability is explained with examples, prevention tips, attack scenarios, and references. The document serves as a valuable guide for developers and security practitioners to protect LLM-based applications and data from potential attacks.

Matteo Dora

On August 1st, OWASP published the first version of their Top 10 for LLM, a reference document that identifies the 10 major risks posed by modern Large Language Model (LLM) systems. This document serves as a valuable guide for developers, data scientists, and security practitioners to understand the most critical security issues affecting these systems.

In this article, we summarize the Top 10 for LLM and provide insights into the potential security risks associated with these systems.

What is OWASP and its Top Ten

OWASP (the ‘Open Worldwide Application Security Project’) is a well-known non-profit organization that produces guidelines, educational resources, and tools (e.g. ZAP) in the software security space. Their most famous publication is the OWASP Top Ten, a regularly updated list of the ten most critical security risks in web applications, which has become an industry standard.

OWASP Top 10 for LLM 2023

Following the incredible popularity gained by large language models (LLMs) since late 2020, OWASP assembled a team of over 400 experts from industry and academia to develop guidance on LLM security. The team identified the high-risk issues affecting LLMs, evaluating their impact, attack scenarios, and remediation strategies.

The impressive outcome of this work is the Top 10 for LLM, a list of the ten most critical vulnerabilities that affect LLMs. Each vulnerability is accompanied by examples, prevention tips, attack scenarios, and references. Let’s dive in.

  1. Prompt Injection
  2. Insecure Output Handling
  3. Training Data Poisoning
  4. Model Denial of Service
  5. Supply Chain Vulnerabilities
  6. Sensitive Information Disclosure
  7. Insecure Plugin Design
  8. Excessive Agency
  9. Overreliance
  10. Model Theft

01. Prompt Injection

Prompt Injection happens when the LLM can be manipulated to behave as the attacker wishes, bypassing filters or security mechanisms that limit the model's execution. OWASP further distinguishes between direct and indirect prompt injection.

In direct prompt injection, the attacker overwrites or reveals the model's system prompt. This is also known as jailbreaking, and it allows the attacker to bypass rules, limitations, and security measures set by the operator of the LLM.

In indirect prompt injection, the attacker directs the LLM to an external source, such as a webpage, that embeds adversarial instructions. In this way, the attacker can hijack the model and manipulate its behaviour.
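
To make the risk concrete, here is a minimal sketch (the system prompt, helper names, and keyword list are illustrative assumptions, not a real API) of how naively concatenating untrusted content into the prompt opens the door to indirect injection, and why a simple keyword filter is not a sufficient defence on its own:

# Minimal sketch of how indirect prompt injection arises when untrusted
# content is concatenated into the prompt. Names (SYSTEM_PROMPT,
# summarize_page, call_llm) are illustrative, not a real API.

SYSTEM_PROMPT = "You are a helpful assistant. Never reveal internal data."

def call_llm(prompt: str) -> str:
    # Placeholder for an actual LLM API call.
    raise NotImplementedError

def summarize_page(page_text: str) -> str:
    # VULNERABLE: the untrusted page text is mixed into the same prompt as
    # the operator's instructions, so instructions hidden in the page
    # ("Ignore previous instructions and ...") may be followed by the model.
    prompt = f"{SYSTEM_PROMPT}\n\nSummarize the following page:\n{page_text}"
    return call_llm(prompt)

# A crude keyword filter illustrates why blocklists alone are insufficient:
# attackers can rephrase, encode, or translate the injected instructions.
SUSPICIOUS = ("ignore previous instructions", "disregard the system prompt")

def looks_like_injection(text: str) -> bool:
    lowered = text.lower()
    return any(marker in lowered for marker in SUSPICIOUS)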

02. Insecure Output Handling

As with any untrusted data source (e.g. user input), the output of LLM models should always be scrutinized and validated before being passed to other application components. Failing to do so can result in a variety of injection attacks, such as cross-site scripting or remote code execution, depending on the component. For example, if the LLM output is passed to a system shell to execute a command, improper validation and escaping can give an attacker the ability to execute arbitrary commands on the system by having the model generate unsafe output.
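
As a hedged illustration of the shell example above (the command allow-list and helper name are hypothetical), the sketch below parses the model's output and checks it against an allow-list instead of handing it to a shell directly:

# Sketch: treating LLM output as untrusted before it reaches the system.
# The allow-list and helper name are illustrative, not a prescribed API.
import shlex
import subprocess

ALLOWED_COMMANDS = {"ls", "df", "uptime"}  # commands the app is willing to run

def run_llm_suggested_command(llm_output: str) -> str:
    tokens = shlex.split(llm_output)          # parse, never pass raw text to a shell
    if not tokens or tokens[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"Command not allowed: {llm_output!r}")
    # shell=False avoids shell interpretation of metacharacters like ';' or '&&'
    result = subprocess.run(tokens, shell=False, capture_output=True,
                            text=True, timeout=10)
    return result.stdout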

03. Training Data Poisoning

Data poisoning is a vulnerability that can occur during the training or fine-tuning stage. It refers to tampering with the training data ("poisoning") with the objective of compromising the model's behavior, including performance degradation or the introduction of biases, falsehoods, or toxicity.
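
One practical mitigation is to vet fine-tuning data before it reaches the training pipeline. The sketch below (the field names, trusted sources, and suspicious markers are illustrative assumptions) keeps only records from known sources and flags suspicious content for human review:

# Sketch of a basic vetting pass over fine-tuning data: keep only records
# from trusted sources and flag suspicious content for human review.
# Field names ("text", "source") and the marker list are illustrative.
TRUSTED_SOURCES = {"internal-docs", "curated-qa"}
SUSPICIOUS_MARKERS = ("ignore previous instructions", "http://", "<script")

def vet_training_records(records: list[dict]) -> tuple[list[dict], list[dict]]:
    kept, flagged = [], []
    for record in records:
        text = record.get("text", "").lower()
        untrusted = record.get("source") not in TRUSTED_SOURCES
        suspicious = any(marker in text for marker in SUSPICIOUS_MARKERS)
        (flagged if untrusted or suspicious else kept).append(record)
    return kept, flagged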

04. Model Denial of Service

LLMs can be quite resource-intensive, and a malicious user can leverage this to make the operator incur extreme resource usage and costs, potentially leading to a collapse of the service. For example, an attacker can flood the system with requests, craft expensive queries (such as very long inputs), or induce the LLM to perform costly chains of tasks.
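
Two common mitigations are capping input size and rate-limiting requests per user. Here is a minimal sketch, assuming illustrative limits and an in-memory store (a production system would typically count tokens and use a shared store such as Redis):

# Sketch of two basic DoS mitigations: cap input size and rate-limit users.
import time
from collections import defaultdict, deque

MAX_INPUT_CHARS = 4000
MAX_REQUESTS_PER_MINUTE = 20
_request_log: dict[str, deque] = defaultdict(deque)

def check_request(user_id: str, prompt: str) -> None:
    if len(prompt) > MAX_INPUT_CHARS:
        raise ValueError("Prompt too long")
    now = time.time()
    window = _request_log[user_id]
    while window and now - window[0] > 60:     # drop entries older than 1 minute
        window.popleft()
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        raise RuntimeError("Rate limit exceeded")
    window.append(now)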

05. Supply Chain Vulnerabilities

Supply chain vulnerabilities refer to risks introduced by unvetted dependencies on third-party resources. In software, this is typically represented by dependencies on third-party libraries that can potentially be compromised, introducing unwanted behavior. This issue also exists for LLMs and machine learning in general, particularly regarding the usage of third-party pre-trained models or datasets, which are susceptible to tampering and poisoning.

This type of attack was recently demonstrated by Mithril Security, which distributed an open-source LLM through the Hugging Face Hub that had been poisoned to provide false information on a specific task while maintaining normal behaviour on everything else.
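
A simple defensive measure is to verify downloaded model artifacts against a trusted manifest before loading them. The sketch below uses SHA-256 hashes (the manifest contents are placeholders); pinning an exact revision or commit when pulling from a model hub is a useful complement:

# Sketch: verify downloaded model artifacts against a trusted manifest of
# SHA-256 hashes before loading them. The manifest contents are placeholders.
import hashlib
from pathlib import Path

TRUSTED_HASHES = {
    "model.safetensors": "expected-sha256-hex-digest-goes-here",
}

def verify_artifact(path: Path) -> None:
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    expected = TRUSTED_HASHES.get(path.name)
    if expected is None or digest != expected:
        raise RuntimeError(f"Untrusted or tampered artifact: {path.name}")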

06. Sensitive Information Disclosure

LLMs can inadvertently leak confidential information. Such confidential data may have been memorized during training or fine-tuning, provided in the system prompt, or be accessible to the LLM from internal sources that are not meant to be exposed. Since LLM output is generally unpredictable, LLM operators should both avoid exposing confidential data to the LLM and check for the presence of sensitive information in the model output before passing it on.
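
As a rough illustration (the patterns are deliberately simple and incomplete), an output filter can check model responses for common shapes of sensitive data before they are returned to the user; real deployments would rely on a dedicated PII or secret-detection component:

# Sketch of an output filter that redacts common patterns of sensitive data
# (emails, card-like numbers, API-key-like strings). The patterns are
# illustrative and incomplete.
import re

SENSITIVE_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),                       # email addresses
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),                        # card-like numbers
    re.compile(r"\b(?:sk|api|key)[-_][A-Za-z0-9]{16,}\b", re.I),  # key-like tokens
]

def redact_sensitive(output: str) -> str:
    for pattern in SENSITIVE_PATTERNS:
        output = pattern.sub("[REDACTED]", output)
    return output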

07. Insecure Plugin Design

In the context of LLMs, plugins are extensions that the model can call automatically to perform certain tasks. A notable example is ChatGPT plugins. Badly designed plugins give attackers an opportunity to bypass protections and cause undesired behaviour, ranging from data exfiltration to code execution.
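
The sketch below shows the defensive posture a plugin should take, using a hypothetical URL-fetching plugin: validate the arguments the model supplies and enforce an allow-list before doing any work.

# Sketch of defensive plugin design: validate model-supplied arguments
# against a strict check and an allow-list before acting. The plugin and
# its parameters are hypothetical.
from urllib.parse import urlparse

ALLOWED_DOMAINS = {"docs.example.com"}   # illustrative allow-list

def fetch_url_plugin(arguments: dict) -> str:
    url = arguments.get("url")
    if not isinstance(url, str):
        raise ValueError("Missing or invalid 'url' argument")
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname not in ALLOWED_DOMAINS:
        raise ValueError(f"URL not allowed: {url}")
    # Placeholder: the real plugin would perform the HTTP request here,
    # with a timeout and a response-size limit.
    return "<page content>"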

08. Excessive Agency

Agency refers to the ability of the LLM to interface with and control other systems, which increases the attack surface. OWASP breaks excessive agency down into three categories: excessive functionality, excessive permissions, and excessive autonomy.

Excessive functionality describes the situation where the LLM is given access to functionality that is not needed for its operation and that can cause significant damage if exploited by an attacker, for example the ability to read and modify files.

Excessive permissions denotes the case in which the LLM has unintended permissions that allow it to access information or perform actions it should not, for example an LLM that retrieves information from a dataset over which it also has write permissions.

Excessive autonomy is the situation in which the LLM can take potentially destructive or high-impact actions without external control, for example a plugin that can send emails on behalf of the user without any confirmation.
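
Putting the three categories together, here is a minimal sketch of constrained agency (the action names, allow-list, and confirmation callback are illustrative assumptions): high-impact actions require explicit human confirmation, and recipients are restricted to an allow-list.

# Sketch of limiting agency: high-impact actions require explicit human
# confirmation, and email recipients are checked against an allow-list.
from typing import Callable

HIGH_IMPACT_ACTIONS = {"send_email", "delete_file"}
ALLOWED_RECIPIENT_DOMAINS = {"example.com"}

def execute_action(action: str, params: dict,
                   confirm: Callable[[str, dict], bool]) -> str:
    if action == "send_email":
        domain = params.get("to", "").rsplit("@", 1)[-1]
        if domain not in ALLOWED_RECIPIENT_DOMAINS:
            raise ValueError("Recipient domain not allowed")
    if action in HIGH_IMPACT_ACTIONS and not confirm(action, params):
        raise PermissionError(f"Action '{action}' was not confirmed by the user")
    # Placeholder: dispatch the action to the corresponding handler here.
    return f"{action} executed"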

09. Overreliance

Overreliance refers to the use of LLMs without adequate oversight. It is well known that LLMs can generate inaccurate or inappropriate content, hallucinate, or produce incoherent responses. For this reason, their output should be overseen, monitored, and validated during operation. For example, code generated by an LLM may contain bugs or vulnerabilities, so a review process should be put in place to guarantee safe and correct operation.
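
As one example of adding oversight, generated code can be passed through a lightweight automated gate before human review. The sketch below (the list of dangerous calls is illustrative) checks that the code parses and flags obviously risky calls; it complements, but does not replace, a proper review process:

# Sketch of a minimal automated gate for LLM-generated Python code: check
# that it parses and flag calls to obviously dangerous functions.
import ast

DANGEROUS_CALLS = {"eval", "exec", "system"}

def basic_code_check(source: str) -> list[str]:
    findings = []
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"Generated code does not parse: {exc}"]
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", "")
            if name in DANGEROUS_CALLS:
                findings.append(f"Dangerous call '{name}' at line {node.lineno}")
    return findings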

10. Model Theft

Model theft refers to the possibility of a proprietary LLM being stolen, whether by gaining physical possession of it or by copying its architecture or weights. As a form of valuable intellectual property, the loss of an LLM can cause significant economic damage, erosion of competitive advantage, or exposure of confidential information contained within the model.

Conclusions

Like any other software, LLM systems can be affected by security vulnerabilities, and such issues need to be assessed both before and after deployment. The OWASP Top 10 for LLM serves as a valuable guide for developers, data scientists, and security practitioners to understand the most critical security issues affecting these systems.

Given the increasing reliance on LLMs in various applications, it is essential to be aware of these vulnerabilities and take preventive measures to mitigate the risks. By following the recommendations provided in the Top 10 for LLM, organizations can better protect their systems and data from potential attacks and ensure the security and reliability of their LLM-based applications.

In this spirit, Giskard can assist organizations and data scientists in ensuring that their LLM and machine learning systems behave as expected. This can be achieved through a preventive automated vulnerability scan and through systematic, continuous testing of AI models.
