Introduction
Artificial Intelligence and Machine Learning (AI/ML) is becoming increasingly democratized and accessible. This trend is partially due to the availability of powerful open source “Foundational Models” hosted on model hubs such as Hugging Face. At Protect AI we believe this trend, and the service that Hugging Face provides, are critical pieces in fostering open, safe, and secure AI/ML.
Because model repositories host freely exchanged files, there is an inherent risk that some users will abuse that openness to propagate malicious executable code to the community. Last year at Protect AI, we released ModelScan, an open source tool that scans AI/ML model artifacts to help secure systems from supply chain attacks.
Since then, we’ve used ModelScan to evaluate over 400,000 model artifacts hosted on Hugging Face in order to identify malicious models and compare our findings with the existing security scans Hugging Face performs. What we’ve found is that the existing security measures from Hugging Face can provide a false sense of security and in turn jeopardize the use of open-source machine learning in enterprises. In this blog, we’ll expand on those findings and discuss the often hidden risks of using models from open repositories.
Background
The security posture of openly shared machine learning models has been identified as a critical risk to enterprise use of public models. Malicious actors can upload useful, but also compromised, models that execute arbitrary code when the model is loaded or executed. Models can covertly make calls to sensitive functions like os.system or builtins.exec with payloads that put your organization at risk for credential theft, data theft, or any other attack which uses the privileges of the user running the model.
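To make that risk concrete, here is a minimal sketch of how a pickled "model" can smuggle a call to os.system. The command is a harmless echo standing in for a real payload; an attacker would substitute credential theft, data exfiltration, or a reverse shell.

```python
import os
import pickle

class MaliciousPayload:
    def __reduce__(self):
        # pickle records this as "call os.system(...) during unpickling"
        return (os.system, ("echo arbitrary code executed",))

# The attacker ships the payload inside (or alongside) real model weights...
with open("model.pkl", "wb") as f:
    pickle.dump(MaliciousPayload(), f)

# ...and the command runs the moment the victim loads the file.
with open("model.pkl", "rb") as f:
    pickle.load(f)
```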
Certain model formats such as Python’s pickle (PyTorch’s default format) are known to be unsafe, yet models continue to be distributed in pickle since it is a convenient and flexible way to share them. Other model formats carry their own risks for arbitrary code execution as well. Multi-backend Keras (TensorFlow, PyTorch, and JAX) models support Lambda layers that run user-defined code as a layer within your model, while TensorFlow and ONNX support custom C++ operators that can be distributed as accompanying shared object files.
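To illustrate the Lambda layer risk, the sketch below embeds arbitrary Python in an otherwise ordinary Keras model; the payload is a harmless print, and the exact safe-mode behavior varies by Keras version.

```python
import keras
import numpy as np

model = keras.Sequential([
    keras.Input(shape=(4,)),
    # The lambda's bytecode is serialized into the saved model archive.
    keras.layers.Lambda(lambda x: (print("arbitrary code executed"), x)[1]),
    keras.layers.Dense(1),
])
model.save("innocuous_model.keras")

# Recent Keras versions refuse to rebuild serialized lambdas unless the loader
# opts out of safe mode -- a flag many tutorials casually suggest enabling.
loaded = keras.models.load_model("innocuous_model.keras", safe_mode=False)
loaded.predict(np.zeros((1, 4)))  # the payload runs on inference
```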
Users can try to avoid certain file formats, but doing so requires constant vigilance from everyone in your organization. In a space that moves as quickly as AI/ML, enterprises would be remiss to assume that, without proper enterprise controls, every employee trying new models will prioritize safety. As an example, one of the most popular open source models (Mistral 7B) is still distributed using a pickle file format:
https://mistral.ai/news/announcing-mistral-7b/
https://files.mistral-7b-v0-1.mistral.ai/mistral-7B-v0.1.tar
For several months after its release, it was only available on Hugging Face in pickle format.
Current State of Hugging Face Repository Security
Hugging Face uses a version of picklescan to evaluate uploaded pickle files, but it does not scan other popular model formats. When a call to a dangerous function is found, it is reported on the repository webpage:
By Hugging Face’s own admission in the documentation: “This is not 100% foolproof. It is your responsibility as a user to check if something is safe or not. We are not actively auditing python packages for safety, the safe/unsafe imports lists we have are maintained in a best-effort manner.”
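For context, an import-list scan of a pickle file boils down to walking its opcode stream and flagging references to known-dangerous callables. The sketch below is our rough approximation of that approach, not picklescan’s actual implementation, and the deny list is truncated for brevity.

```python
import pickletools

DENYLIST = {("os", "system"), ("builtins", "exec"), ("builtins", "eval")}

def flag_dangerous_imports(path):
    hits, strings = [], []
    with open(path, "rb") as f:
        data = f.read()
    for opcode, arg, _pos in pickletools.genops(data):
        # Track string pushes so STACK_GLOBAL's module/name can be recovered.
        if opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
            strings.append(arg)
        elif opcode.name == "GLOBAL":
            module, _, name = arg.partition(" ")
            if (module, name) in DENYLIST:
                hits.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            if (strings[-2], strings[-1]) in DENYLIST:
                hits.append((strings[-2], strings[-1]))
    return hits

print(flag_dangerous_imports("model.pkl"))
```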
Sometimes this will trigger a banner to be applied on the model page to highlight the risk, but as we’ll show, this is done in an inconsistent manner and should not be depended on:
More importantly, if a user pulls and loads a model programmatically, no warning is given and the model (plus executable code) is simply loaded onto the unsuspecting victim’s machine:
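The sketch below illustrates that silent failure mode; the repository and filename are placeholders rather than a specific malicious model.

```python
from huggingface_hub import hf_hub_download
import torch

# Download a pickle-format checkpoint exactly as countless tutorials do.
path = hf_hub_download(repo_id="some-org/some-model", filename="pytorch_model.bin")

# torch.load unpickles the file; any embedded os.system/exec payload runs here
# with no banner, prompt, or log line to tip off the user. (Newer PyTorch
# releases default to weights_only=True, but a great deal of existing code
# explicitly disables it.)
state_dict = torch.load(path, weights_only=False)
```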
The lack of programmatic warning is also true for datasets on Hugging Face that have been flagged by ClamAV as containing a virus:
Threat Research Findings
Note: the following results reflect the state of Hugging Face at the time of this publication.
Protect AI’s ModelScan also searches for dangerous function calls, but maintains a more comprehensive set of unsafe operator codes, which we expand in frequently published ModelScan releases. Additionally, ModelScan detects potential arbitrary code execution by flagging Keras Lambda layers and TensorFlow custom operators.
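As one illustration of what detecting Keras Lambda layers can look like (a simplified sketch, not ModelScan’s actual implementation), a .keras archive contains a config.json describing every layer, so suspicious layer classes can be flagged without deserializing any code.

```python
import json
import zipfile

def find_lambda_layers(keras_path):
    # A .keras file is a zip archive; config.json holds the layer graph.
    with zipfile.ZipFile(keras_path) as archive:
        config = json.loads(archive.read("config.json"))

    flagged = []

    def walk(node):
        if isinstance(node, dict):
            if node.get("class_name") == "Lambda":
                flagged.append(node.get("config", {}).get("name", "<unnamed>"))
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(config)
    return flagged

print(find_lambda_layers("innocuous_model.keras"))
```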
Protect AI has scanned over 400,000 Hugging Face models since ModelScan’s release. During this evaluation, we found 3354 models that use functions which can execute arbitrary code on model load or inference. 1347 of those models are not marked as “unsafe” by the current Hugging Face security scans. Example repositories that have the ability to execute arbitrary code are akhaliq/frame-interpolation-film-style, ongkn/attraction-classifier, and timinar/baby-llama-58m.
While the majority of the 1347 missed models stem from insufficient scanner coverage, we found a few examples of function calls that should have been detected by the current security scans but simply were not: badmoh/testfirst (builtins.exec) and admko/evil_test (builtins.__eval__). Additionally, some repositories have detected dangerous calls but do not receive a security banner: sheigel/best-llm and ModelsForAll/First_Clean_Experiment. Lastly, we found examples of likely malicious actors name-squatting on established organizations such as Meta: facebook-llama.
Lambda Layer Example Deep Dive
Protect AI’s open-source ModelScan will flag models with Lambda layers as having the ability to execute arbitrary code. Going a step further, Protect AI has additional scanners available in our commercial offering that can deserialize the Lambda code to determine what is actually being run.
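As a rough illustration of the general technique (a sketch, not our commercial scanner), the bytecode Keras stores for a serialized lambda can be disassembled for inspection without ever executing it; field names and encodings vary by Keras version.

```python
import base64
import dis
import marshal

def disassemble_lambda(encoded_code):
    # `encoded_code` is the base64-encoded, marshal-dumped code object pulled
    # from the Lambda layer's serialized function in the model config.
    # Note: marshal data is only readable by a matching Python version.
    code_obj = marshal.loads(base64.b64decode(encoded_code))
    dis.dis(code_obj)  # reveals LOAD_GLOBAL os/system calls, URLs, constants
```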
Those scanners are how we flagged opendiffusion/sentimentcheck’s Lambda layer for running 7 unsafe functions, which will:
- Reach out to an AWS lambda url
- Download an executable
- Run the executable using a base64 encoded command
When deserialized and decoded, the payload was:
How to secure your workloads
As discussed in the Hugging Face documentation: “It is your responsibility as a user to check if something is safe or not.” Simply trusting that the current security scans are sufficient is dangerous, and it assumes a level of protection that Hugging Face and other repositories cannot commit to.
Individual users are advised to stay vigilant: run open source scanners like ModelScan, prefer safer model formats when available, and verify the origin of models using GPG signing.
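For example, preferring the safetensors format removes the unpickling risk entirely, since the file holds raw tensors and no serialized Python (the filename below is a placeholder).

```python
from safetensors.torch import load_file

state_dict = load_file("model.safetensors")  # pure tensor data, no code paths
print(list(state_dict.keys()))
```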
For enterprises, the risk of using publicly exchanged files as foundational components in AI/ML workloads is not something that can be solved by trusting individual users. Enforcing that models come from trusted sources and are safe requires controls at the administrative level. For that reason, Protect AI has launched Guardian as a secure gateway to enforce ML model security.