Understanding Model Threats

This resource is designed to provide detailed information on various threat categories, helping you understand and mitigate potential risks in AI and machine learning systems.

Runtime Threats

Like deserialization threats, runtime threats occur when untrusted data or code is used to reconstruct objects, opening the door to exploitation. The difference lies in when the malicious code is triggered. A basic deserialization threat executes at model load time, whereas a runtime threat is triggered when the model is used for inference or any other form of execution. In AI and machine learning systems, this allows malicious actors to inject harmful code that exploits vulnerabilities to gain unauthorized access or manipulate your system’s behavior. Understanding these threats is crucial for protecting data integrity and preventing unauthorized code execution in your AI models.
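
For contrast, the canonical load-time case is a Python pickle payload, where attacker-controlled code runs during deserialization itself. The intentionally harmless sketch below shows that trigger point; the SavedModel threat described on this page differs in that the code fires later, at inference time:

    import os
    import pickle

    class Payload:
        # __reduce__ tells pickle how to rebuild the object; an attacker can
        # abuse it to make pickle call an arbitrary function during loading.
        def __reduce__(self):
            return (os.system, ("echo code ran at load time",))

    blob = pickle.dumps(Payload())
    pickle.loads(blob)  # the payload executes here, at deserialization time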

Overview

The SavedModel format saves a model’s architecture (such as its layers) as a graph. The graph represents the computation and the flow of data in terms of nodes (operators) and edges (data flow). In this sense, a model saved using SavedModel does not depend on the original model-building code to run; the SavedModel format is inclusive of all model-building code as well as any trained parameters.

SavedModel therefore extends model portability, since running a model does not require its original building code. At the same time, attackers can exploit this code-inclusive serialization format to ship malicious code to users.
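
A minimal sketch of that round trip, assuming a TF 2.x environment (the directory name and layer sizes are illustrative):

    import tensorflow as tf

    # Build and export a small model. The exported directory contains
    # saved_model.pb (the serialized computation graph) and a variables/
    # subdirectory holding the trained parameters.
    inputs = tf.keras.Input(shape=(8,))
    hidden = tf.keras.layers.Dense(4, activation="relu")(inputs)
    outputs = tf.keras.layers.Dense(1)(hidden)
    model = tf.keras.Model(inputs, outputs)
    tf.saved_model.save(model, "exported_model")

    # Loading needs none of the model-building code above: the graph in
    # saved_model.pb is enough to run inference.
    restored = tf.saved_model.load("exported_model")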

Models flagged for this threat meet the following criteria:

  1. The model format is detected as TensorFlow’s (TF) SavedModel.
  2. The model contains an unknown/custom operator that will execute code when the model is used for inference.

Guardian has detected an operator that is not a standard TF operator and has therefore raised an issue to make users aware of the potential security risk.

Key Points:

  1. TensorFlow models saved using SavedModel should be treated as running “packaged code”.
  2. The SavedModel format saves model code and trained parameters in a graph data structure.
  3. TF allows custom operators - user-defined operators that give developers additional flexibility. Though custom operators offer powerful functionality, attackers can exploit them to inject malicious code.
  4. Only use/load models from trusted sources.

Background Information

SavedModel Format

For the TF framework, SavedModel remains the universal serialization format. JAX programs can also be natively serialized to the SavedModel format using jax2tf. Note, however, that the SavedModel format is deprecated in the Keras 3 framework, where the recommended format is .keras.
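
As a minimal sketch of that JAX path, assuming current jax and tensorflow packages (the function, shapes, and directory name are illustrative):

    import jax.numpy as jnp
    import tensorflow as tf
    from jax.experimental import jax2tf

    def f(x):
        return jnp.sin(x) * 2.0

    # Wrap the converted JAX function in a tf.Module so it can be exported
    # as a SavedModel like any other TF computation.
    module = tf.Module()
    module.f = tf.function(
        jax2tf.convert(f),
        autograph=False,
        input_signature=[tf.TensorSpec([3], tf.float32)],
    )
    tf.saved_model.save(module, "jax_exported_model")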

Custom Operators

Guardian reports an issue if any operator in a TF SavedModel is not a standard TF operator, because unknown TF operators can be used by attackers to inject malicious code that executes when an unsuspecting user runs the model.

TensorFlow custom operators extend the framework's native functionality by allowing users to implement specialized operations not available in the standard library. These user-defined operators integrate seamlessly with TensorFlow's existing ecosystem, enabling the development of more efficient and tailored machine learning solutions.

Though custom operators provide valuable flexibility when novel algorithms or domain-specific operations must be implemented, they also give attackers an opportunity to deliver malicious code into a victim’s ML system or pipeline.
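
Seen from the consumer side, the mechanism looks like this: a model that depends on a custom operator needs the matching native library loaded into the process, and whatever code that library’s kernels contain runs with the privileges of the inference process. A hedged sketch, where the .so filename and op name are hypothetical:

    import tensorflow as tf

    # Load a compiled custom-op library (hypothetical file). This registers
    # the native kernels it contains with the TensorFlow runtime.
    my_ops = tf.load_op_library("./my_custom_op.so")

    # Calling the op executes the library's native code - which could do
    # anything, not just math.
    result = my_ops.my_custom_add([1.0, 2.0], [3.0, 4.0])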

Further reading:

  1. TensorFlow lazy execution using graphs
  2. SavedModel Format
  3. Custom operator
  4. Standard TF operators

Impact

While custom operators provide powerful capabilities, they can be misused by malicious actors in any of the following ways:

  1. Code Injection: Attackers might attempt to inject malicious code into custom operators, potentially compromising an entire ML pipeline.
  2. Resource Exhaustion: Poorly implemented or intentionally malicious operators could consume excessive computational resources, leading to denial-of-service.
  3. Data Exfiltration: Custom operators with access to sensitive data could be designed to leak information covertly.
  4. Model Manipulation: Malicious operators could alter model behavior in subtle ways, potentially introducing backdoors or biases.
  5. Versioning Attacks: Discrepancies between operator versions could be exploited to introduce vulnerabilities or unexpected behaviors.

Note: Malicious code execution via custom TF operators can be achieved without impacting a model’s performance - the user may never know that an attack has happened or is ongoing.

How The Attack Works

  1. The attacker builds a model containing a custom operator whose implementation carries malicious code, and exports the model in the SavedModel format.
  2. The attacker distributes the model, for example through a public model repository.
  3. The victim downloads the model and loads the custom operator library the model requires.
  4. When the victim runs inference, the malicious operator executes, giving the attacker code execution inside the victim’s ML system or pipeline.

Remediation

To mitigate the risks introduced by custom operators, implement strict code review processes, use only trusted sources for custom operators, and regularly audit your TensorFlow environment for suspicious activity or unexpected behavior. One way to audit a model before loading it is sketched below.
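
The following is a minimal audit sketch, not a description of how Guardian works: it parses saved_model.pb directly, without loading (and therefore without executing) the model, and reports operator names that are not registered in the local TensorFlow build. Treating dir(tf.raw_ops) as the set of standard operators is an approximation.

    import tensorflow as tf
    from tensorflow.core.protobuf import saved_model_pb2

    def unknown_ops(saved_model_pb_path):
        """Report op names in a SavedModel that this TF build doesn't know."""
        sm = saved_model_pb2.SavedModel()
        with open(saved_model_pb_path, "rb") as f:
            sm.ParseFromString(f.read())

        known = set(dir(tf.raw_ops))  # approximation of standard TF ops
        found = set()
        for meta_graph in sm.meta_graphs:
            graph_def = meta_graph.graph_def
            for node in graph_def.node:
                found.add(node.op)
            # Ops can also appear inside library functions (tf.function bodies).
            for fn in graph_def.library.function:
                for node in fn.node_def:
                    found.add(node.op)
        return sorted(op for op in found if op not in known)

    for op in unknown_ops("suspect_model/saved_model.pb"):
        print("non-standard operator:", op)

Anything this reports deserves scrutiny before the model is loaded or run.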

If possible, avoid using models in the TF SavedModel format, since they contain code that will execute when the model is used.
