Understanding Model Threats

This resource provides detailed information on the main threat categories to help you understand and mitigate risks in AI and machine learning systems.

Deserialization Threats

Backdoor Threats

Runtime Threats

PAIT-LITERT-300: LiteRT Model Contains Unknown Operators
PAIT-LITERT-301: LiteRT Model Contains Unsafe Operator Execution at Model Run Time
PAIT-LITERT-302: LiteRT Model Contains Suspicious Operator Execution at Model Run Time
PAIT-LMAFL-300: Llamafile Can Execute Malicious Code During Inference
PAIT-KERAS-301: Keras Model Custom Layer Detected at Model Run Time
PAIT-TF-300: TensorFlow SavedModel Contains Unknown Operators
PAIT-TF-301: TensorFlow SavedModel Contains Unsafe Operator Execution at Model Run Time
PAIT-TF-302: TensorFlow SavedModel Contains Suspicious Operator Execution at Model Run Time
PAIT-TMT-300: Transitive Model Threat Detected with A Suspicious Model Dependency
PAIT-TMT-301: Transitive Model Threat Detected with Unsafe Model Dependency
PAIT-TCHST-300: TorchScript Model Arbitrary Code Execution Detected at Model Load Time
PAIT-TCHST-301: TorchScript Model Arbitrary Code Execution Suspected at Model Load Time

Runtime Threats

Like deserialization threats, runtime threats arise when untrusted data or code is used to reconstruct objects, leading to potential exploitation. The difference lies in when the malicious code is triggered. With a basic deserialization threat, this happens at model load time. A runtime threat is triggered when the model is used for inference or any other form of execution. In AI and machine learning systems, this allows malicious actors to inject harmful code into a serialized model and exploit it to gain unauthorized access or manipulate your system’s behavior. Understanding runtime threats is crucial for protecting data integrity and preventing unauthorized code execution in your AI models.
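The timing difference described above can be illustrated with a toy sketch. Nothing here is a real framework API: ToyModel, load, predict, and payload are hypothetical names made up for this illustration, and the "payload" only appends to a list. The point is that a check which observes only the load step would see nothing unusual, because the hook fires during inference.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Hook = Callable[[List[str]], None]


@dataclass
class ToyModel:
    """Hypothetical stand-in for a model file; not a real framework object."""
    on_load: Optional[Hook] = None  # would fire during deserialization
    on_run: Optional[Hook] = None   # would fire during inference


def load(model: ToyModel, events: List[str]) -> ToyModel:
    """Simulate loading: a deserialization threat triggers here."""
    events.append("load")
    if model.on_load:
        model.on_load(events)
    return model


def predict(model: ToyModel, x: float, events: List[str]) -> float:
    """Simulate inference: a runtime threat triggers here."""
    events.append("predict")
    if model.on_run:
        model.on_run(events)
    return 2.0 * x  # the visible result is unaffected by the hook


def payload(events: List[str]) -> None:
    """Harmless placeholder for attacker-controlled code."""
    events.append("payload executed")


events: List[str] = []
model = load(ToyModel(on_run=payload), events)
events_after_load = list(events)  # snapshot: nothing suspicious yet
result = predict(model, 3.0, events)
```

After the load step the event log contains only the load entry; the payload entry appears once predict is called, while the returned prediction looks normal.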

Overview

The SavedModel format saves a model’s architecture (such as its layers) as a graph. The graph represents the computation and flow of data in terms of nodes (operators) and edges (flow). In this sense, a model saved using SavedModel does not depend on the original model-building code to run; that is, the SavedModel format includes all model-building code as well as any trained parameters.

The SavedModel format improves model portability because it does not require the model-building code. At the same time, attackers can exploit this code-inclusive serialization format to ship suspicious code to users.

Please note that the SavedModel format is deprecated with the introduction of Keras 3. The recommended format is .keras.

Models flagged for this threat meet the following criteria:

The model format is detected as TensorFlow’s (TF) SavedModel.
The model contains a potentially suspicious operator that can read arbitrary files and, when combined with other techniques, could lead to sensitive data leakage.
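As a rough sketch of how such criteria could be checked, the following Python parses a saved_model.pb file as a protobuf message (without running the graph), collects operator names, and counts any that appear on a denylist. This is not how Protect AI's scanner works; it is an illustration under stated assumptions. The function names and the denylist are hypothetical: only ReadFile comes from the text, while WriteFile and MatchingFiles are other TensorFlow file operators added here as examples. The TensorFlow import is deferred so the pure-Python check can be used without TensorFlow installed.

```python
from typing import Dict, FrozenSet, Iterable, List

# "ReadFile" is named in the text above; the other entries are
# illustrative assumptions, not an official list of flagged operators.
SUSPICIOUS_OPS: FrozenSet[str] = frozenset({"ReadFile", "WriteFile", "MatchingFiles"})


def list_saved_model_ops(pb_path: str) -> List[str]:
    """Return operator names found in a saved_model.pb file.

    The file is only parsed as a protobuf message; the graph is not executed.
    """
    # Deferred import: requires TensorFlow to be installed.
    from tensorflow.core.protobuf import saved_model_pb2

    saved_model = saved_model_pb2.SavedModel()
    with open(pb_path, "rb") as fh:
        saved_model.ParseFromString(fh.read())

    ops: List[str] = []
    for meta_graph in saved_model.meta_graphs:
        graph_def = meta_graph.graph_def
        # Operators in the top-level graph.
        ops.extend(node.op for node in graph_def.node)
        # Operators inside library functions (for example tf.function bodies).
        for function in graph_def.library.function:
            ops.extend(node.op for node in function.node_def)
    return ops


def flag_suspicious(
    ops: Iterable[str], denylist: FrozenSet[str] = SUSPICIOUS_OPS
) -> Dict[str, int]:
    """Count how often each denylisted operator appears in ops."""
    hits: Dict[str, int] = {}
    for op in ops:
        if op in denylist:
            hits[op] = hits.get(op, 0) + 1
    return hits
```

A possible use would be flag_suspicious(list_saved_model_ops("model_dir/saved_model.pb")), where the path is a placeholder. An empty result does not prove a model is safe; it only means none of the listed operators were found.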

Key Points

TensorFlow models saved using SavedModel should be treated as running “packaged code”.
The SavedModel format saves model code and trained parameters in a graph data structure.
Attackers can exploit some standard TensorFlow operators, such as ReadFile, to read the contents of arbitrary files on a user’s machine.
Only load and use models from trusted sources.
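One way to put the "trusted sources" point into practice is to pin the hash of a model file you have already vetted and refuse to load anything that does not match. The sketch below uses only the Python standard library. The function names and the idea of a pinned-digest dictionary keyed by file name are assumptions made for this example, not part of any scanner or framework.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted(path: Path, pinned: Dict[str, str]) -> bool:
    """Return True only if the file's digest matches the pinned digest.

    `pinned` maps file names to SHA-256 hex digests recorded when the
    model was vetted (a hypothetical allowlist for this example).
    """
    expected = pinned.get(path.name)
    return expected is not None and sha256_of_file(path) == expected
```

A SavedModel is a directory, so in practice each file in it (saved_model.pb and the files under variables/) would need its own pinned digest. A matching hash only confirms the file is the one you vetted; it says nothing about whether that file was safe in the first place.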

Impact

Depending on the attacker’s proficiency, any of the following is possible:

Read the contents of arbitrary system files using known TensorFlow operators such as ReadFile
Read sensitive model information or dataset content
Change model configuration settings

Note: Suspicious code execution using standard TF operators can be achieved without affecting a model’s performance, so the user may never know that an attack has happened or is ongoing.

How The Attack Works

Remediation

If possible, avoid using models in the TF SavedModel format, since they contain code that is executed when the model is loaded or run.

If that is not possible, reach out to the model creator and alert them that the model has failed our scan. You can even link to the specific page on our Insights Database to provide our most up-to-date findings.

The model provider should also report what they did to correct this issue as part of their release notes.

Protect AI's security scanner detects threats in model files

With Protect AI's Guardian you can scan models for threats before ML developers download them for use, and apply policies based on your risk tolerance.
