
Understanding Model Threats

This resource provides detailed information on various threat categories to help you understand and mitigate potential risks in AI and machine learning systems.

Deserialization Threats

Deserialization threats occur when untrusted data or code is used to reconstruct objects, opening the door to exploitation. In AI and machine learning systems, malicious actors can inject harmful code during the deserialization process, exploiting vulnerabilities to gain unauthorized access or manipulate your system's behavior. Understanding deserialization threats is crucial for protecting data integrity and preventing unauthorized code execution in your AI models.

Overview

Some machine learning frameworks, NeMo in particular, store data in compressed archives such as ZIP or TAR files. These formats are vulnerable to archive slip attacks, in which malicious archive members contain paths that escape the intended extraction destination.

These paths can include patterns such as “..” (the parent directory), a leading “/” or “\” (an absolute path), “~” (the home directory), “$” (an environment variable), or a Windows drive letter such as “C:”. Malicious members can also be hard links or symlinks pointing to other directories on the system.

When these malicious member files are extracted from the archive, either manually or during model deserialization, they are written to the path they indicate, giving the attacker access to the user’s file system at that location. From there, the attacker can read or overwrite critical files or create new executables, which can lead to remote command execution.
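To make that concrete, the short sketch below (illustrative only; the extraction directory and member name are hypothetical) resolves a crafted member path the way a naive extractor would, and shows that the destination falls outside the intended directory:

```python
from pathlib import Path

# Hypothetical extraction directory and a crafted archive member name.
extract_dir = Path("/tmp/model_extract").resolve()
member_name = "../../home/user/.bashrc"  # ".." segments climb out of extract_dir

# A naive extractor simply joins the member name onto the extraction directory.
destination = (extract_dir / member_name).resolve()

print(destination)                              # /home/user/.bashrc
print(destination.is_relative_to(extract_dir))  # False -> the write escapes
```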

If a model reportedly has this issue, it means:

  1. The model is serialized using an archive format.
  2. The model contains a member file capable of executing an archive slip attack and traversing the file system upon extraction.

Key Points

  • Machine learning models using compressed archives are vulnerable to archive slip attacks.
  • Archive slips execute when the model is deserialized and the members are extracted.
  • Malicious archive member files can traverse directories and write to the user’s file system outside of the intended extraction directory.
  • Attacks can result in critical files being overwritten or executables being created, leading to remote command execution.
  • Mitigation strategies include using safe formats that do not use archives, using thorough vetting processes for models, or removing detected malicious members from the model archive.

Background Information

Further reading:

  1. https://security.snyk.io/research/zip-slip-vulnerability
  2. https://cwe.mitre.org/data/definitions/22.html

Impact

This attack can harm the organization in the following ways:

  1. Executing Unauthorized Commands - The attack can create or overwrite executables, libraries, or other critical files used to execute code, leading to arbitrary code execution.
  2. Modifying Files or Directories - The attack can overwrite configurations, credentials, files for security measures, other models on the system, etc.
  3. Reading Existing Files - The attack can read the contents of existing files on the system and steal sensitive data such as credentials.
  4. Program Corruption or Denial of Service - The attack can delete or corrupt critical files needed for the system or programs on it to run, or lock out users from the system by corrupting credentials or authentication.

How the Attack Works
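As a minimal sketch of the mechanics (file names and payload are invented for illustration), the following builds an in-memory TAR archive containing a single traversal member of the kind a scanner would flag:

```python
import io
import tarfile

# Craft a TAR archive whose only member climbs out of the extraction
# directory. In a real attack this member would sit alongside legitimate
# model weights inside the archive.
payload = b"ssh-rsa AAAA... attacker@host\n"
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    info = tarfile.TarInfo(name="../../home/victim/.ssh/authorized_keys")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# What a scanner (or a careful user) sees when listing the members:
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    for member in tar.getmembers():
        print(member.name)  # ../../home/victim/.ssh/authorized_keys

# Calling tar.extractall("model_dir") on such an archive without any checks
# would write the payload into the victim's home directory rather than into
# model_dir. (Recent Python releases add extraction filters that mitigate
# this, but many model loaders and older runtimes do not use them.)
```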

Best Practices

You should:

  1. Only load and execute models from trusted sources
  2. Implement a vetting process for third-party models before use
  3. Use sandboxing techniques when loading untrusted models
  4. Check all member file names in archives for patterns indicating archive slip attacks. Remove malicious member files if possible; if not, do not use the archive (see the sketch after this list).
  5. Use model formats that don't use archives, such as SafeTensors, which provides a safe alternative to traditional serialization formats
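A minimal sketch of the check in item 4 for TAR-based model archives (the same idea applies to ZIP via the zipfile module; the function and path handling here are illustrative, not a complete vetting process):

```python
import tarfile
from pathlib import Path

def extract_model_safely(archive_path: str, dest_dir: str) -> None:
    """Extract a TAR-based model archive, refusing members that would
    escape dest_dir."""
    dest = Path(dest_dir).resolve()
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            target = (dest / member.name).resolve()
            # Block parent-directory traversal and absolute paths.
            if not target.is_relative_to(dest):
                raise ValueError(f"Blocked archive slip member: {member.name}")
            # Block hard links and symlinks that point outside dest_dir.
            if member.issym() or member.islnk():
                link_target = (target.parent / member.linkname).resolve()
                if not link_target.is_relative_to(dest):
                    raise ValueError(f"Blocked unsafe link: {member.name}")
            tar.extract(member, path=dest)
```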

Remediation

If possible, remove detected malicious member files from the model archive before use; if not, do not use the archive. Note that in some cases removing the detected file may render the model unusable.

If possible, use a different model format such as SafeTensors that does not rely on archives, eliminating the risk of archive slip attacks entirely.
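For example, if your weights are PyTorch tensors, converting a trusted checkpoint to SafeTensors can look roughly like this (a sketch; file names are hypothetical, and it assumes the checkpoint loads to a flat dict of tensors):

```python
import torch
from safetensors.torch import save_file, load_file

# Load weights from a checkpoint you already trust, then re-save them as
# SafeTensors. save_file expects a flat {name: tensor} dict of contiguous
# tensors, so adapt this if your checkpoint nests its state_dict.
state_dict = torch.load("trusted_model.ckpt", map_location="cpu")
save_file({k: v.contiguous() for k, v in state_dict.items()}, "model.safetensors")

# Loading SafeTensors reads raw tensor data only; no archive extraction or
# pickle deserialization is involved.
weights = load_file("model.safetensors")
```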

If that is not possible, reach out to the model creator and alert them that the model has failed our scan. You can even link to the specific page in our Insights Database to provide our most up-to-date findings.

The model provider should also report what they did to correct this issue as part of their release notes.

Protect AI's security scanner detects threats in model files
With Protect AI's Guardian you can scan models for threats before ML developers download them for use, and apply policies based on your risk tolerance.