
Understanding Model Threats

This resource is designed to provide detailed information on various threat categories, helping you understand and mitigate potential risks in AI and machine learning systems.

Deserialization Threats

Deserialization threats occur when untrusted data or code is used to reconstruct objects, leading to potential exploitation. In AI and machine learning systems, this can allow malicious actors to inject harmful code during the deserialization process, exploiting vulnerabilities to gain unauthorized access or manipulate your system's behavior. Understanding deserialization threats is crucial for securing data integrity and preventing unauthorized code execution in your AI models.

Overview

Machine learning models are typically developed in PyTorch or another framework and then converted to GGUF, a binary format optimized for fast loading and saving of models. The GGUF format is developed by the llama.cpp team and is designed for efficient inference.
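As a quick illustration, the metadata stored in a GGUF file can be inspected with the GGUFReader class from the gguf Python package; the model path below is a placeholder, not a real file:

    Python
    from gguf.gguf_reader import GGUFReader

    # Placeholder path to a local GGUF file
    gguf_model_path = "enter model path here"

    reader = GGUFReader(gguf_model_path)

    # reader.fields maps metadata key names (for example "general.architecture"
    # or "tokenizer.chat_template") to their stored values
    for key in reader.fields:
        print(key)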

A chat template is often used with large language models for prompt formatting. A security risk arises when the Jinja chat template is not rendered in a sandboxed environment, which can lead to arbitrary code execution.

When a Jinja template is rendered in a sandboxed environment, any unsafe operation in the template raises an exception. Rendering a Jinja template in a sandboxed environment therefore allows developers to confirm the template is safe to load for downstream tasks.
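As a minimal sketch (the template string here is a contrived example, not taken from any real model), Jinja's SandboxedEnvironment blocks unsafe attribute access that a standard Environment would resolve without complaint:

    Python
    import jinja2
    import jinja2.sandbox

    # A contrived template that reaches into Python internals via attribute access
    unsafe_template = "{{ ''.__class__.__mro__ }}"

    # A standard environment renders it without complaint
    print(jinja2.Environment().from_string(unsafe_template).render())

    # The sandboxed environment raises SecurityError on the unsafe attribute access
    try:
        jinja2.sandbox.SandboxedEnvironment().from_string(unsafe_template).render()
    except jinja2.exceptions.SecurityError as err:
        print(f"Blocked by sandbox: {err}")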

If a model reportedly has this issue, it means:

  1. The model is serialized in the GGUF format.
  2. The model contains potentially malicious code in its Jinja template, which will execute when the model is loaded.

Key Points

  1. GGUF models consist of tensors and a standardized set of metadata. A chat template can be included as part of the GGUF model's metadata.
  2. GGUF uses Jinja2 templating to format the prompt.
  3. Attackers can insert malicious code into the Jinja template (see the sketch after this list).
  4. Loading a GGUF model that uses a Jinja template will execute any code in the template, malicious or otherwise.
  5. Only load models from trusted sources.
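For illustration only, here is a sketch of what a poisoned chat template might look like: the formatting logic is plausible, but a generic Jinja server-side template injection payload is hidden inside it (this payload is a well-known pattern, not one taken from a specific model):

    Python
    # Hypothetical example of a poisoned chat template: it formats messages as
    # expected, but also reaches into Python internals to run a shell command.
    malicious_chat_template = (
        "{% for message in messages %}"
        "{{ message['role'] }}: {{ message['content'] }}\n"
        "{% endfor %}"
        # Classic Jinja SSTI payload hidden among the formatting logic
        "{{ self.__init__.__globals__.__builtins__.__import__('os')"
        ".popen('id').read() }}"
    )

Rendered outside a sandbox, the final expression would execute the shell command; a SandboxedEnvironment raises a SecurityError at that point instead.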

Jinja Template Rendering in a Sandboxed Environment

Here is sample code we can run to check whether a GGUF model's chat template is safe:

  1. First, we retrieve the chat template string from the GGUF model's metadata using get_chat_template_str():

    Python
    from gguf.gguf_reader import GGUFReader
    import jinja2.sandbox

    def get_chat_template_str(file_path):
        reader = GGUFReader(file_path)

        # Look up the chat template field in the model's metadata
        value = None
        for key, field in reader.fields.items():
            if key == "tokenizer.chat_template":
                value = field.parts[field.data[0]]
                break

        if value is None:
            raise ValueError("No chat template found in the GGUF metadata")

        # Convert the array of integer byte values to a string
        chat_template_str = ''.join(chr(i) for i in value)
        return chat_template_str

    gguf_model_path = "enter model path here"
    gguf_chat_template = get_chat_template_str(gguf_model_path)

  2. Load the retrieved chat template into Jinja's sandboxed environment using run_jinja_template():

    Python
    def run_jinja_template(chat_template: str) -> bool:
        try:
            # Render the template in Jinja's sandboxed environment
            sandboxed_env = jinja2.sandbox.SandboxedEnvironment()
            template = sandboxed_env.from_string(chat_template)
            template.render()
        except jinja2.exceptions.SecurityError:
            # The sandbox flagged an unsafe operation in the template
            return True
        except Exception:
            # Other rendering errors (e.g. undefined variables) are not security findings
            pass

        return False

    print(f"Testing GGUF model:\nFound security error in Jinja template? {run_jinja_template(gguf_chat_template)}")

  3. Running the last cell will print either True or False, indicating whether a security error was found in the Jinja template.

Further reading:

  1. CVE-2024-34359
  2. Jinja template
  3. GGUF
  4. llama.cpp

Impact

An attacker could exploit a compromised template to:

  1. Access sensitive information (e.g., SSH keys, cloud credentials)
  2. Execute malicious code on your system
  3. Use the compromised system as a vector for broader attacks

Note: Malicious code execution via the Jinja template can be achieved without impacting the model's performance, so the user may never know that an attack has happened or is ongoing.


Best Practices

You should:

  1. Only load and execute models from trusted sources
  2. Implement a vetting process for third-party models before use
  3. Use sandboxing techniques when loading untrusted models
  4. Regularly update GGUF and related libraries to benefit from security patches

Remediation

GGUF models often ship with a Jinja chat template. If possible, render the template in a sandboxed Jinja environment before loading the model.
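Here is a minimal sketch of how the two helpers above could gate model loading; loading with llama-cpp-python is shown only as one possible consumer and is an assumption about your setup:

    Python
    # Minimal sketch: check the chat template before loading the model.
    # Assumes get_chat_template_str() and run_jinja_template() from the code above.
    gguf_model_path = "enter model path here"

    chat_template = get_chat_template_str(gguf_model_path)

    if run_jinja_template(chat_template):
        raise RuntimeError("Unsafe Jinja chat template detected; refusing to load model")

    # Only load the model once the template has passed the sandbox check
    # (llama-cpp-python shown here as one possible consumer).
    from llama_cpp import Llama
    llm = Llama(model_path=gguf_model_path)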

If sandboxed rendering is not possible, reach out to the model creator and alert them that the model has failed our scan. You can even link to the specific page in our Insights Database to provide our most up-to-date findings.

The model provider should also report what they did to correct this issue as part of their release notes.

Protect AI's security scanner detects threats in model files
With Protect AI's Guardian you can scan models for threats before ML developers download them for use, and apply policies based on your risk tolerance.