AIØD: MLflow Could Expose Entire ML System


TL;DR: MLflow versions prior to 2.2.2 contain AI Zero Day (AIØD) vulnerabilities.

  • Protect AI has uncovered two critical vulnerabilities, AI Zero Days (AIØD), in MLflow, a popular MLOps platform, affecting versions prior to 2.2.2

  • The CVEs are CVE-2023-1176 (Blind LFI in register-model/get?name=), fixed in version 2.2.2, and CVE-2023-1177 (LFI/RFI in register-model/get-artifact), fixed in version 2.2.1.

  • Remotely test one or more MLflow servers by downloading our CVE-2023-1177-scanner tool.

  • Hackers can gain access and take complete control of the system, perform a cloud provider takeover, and walk off with valuable IP in the form of your ML system and its models.

  • Protect AI offers significant bounties for AI vulnerability research and collaboration; request our latest bounty rewards list by email.

  • From a business perspective, trained models can be worth millions in revenue, and far more in enterprise value, to any organization that relies on ML for a critical business operation or function.

  • From an AppSec/InfoSec perspective, this vulnerability creates a new threat surface that is often unknown, unseen, and hard to patch, because current frameworks, tools, and scripts offer blue teams little help in quickly remediating this AI Zero Day.

  • Protect AI shields your ML. Contact us to learn more about this and other ML Supply Chain security considerations, and how our technologies help you build a more secure AI application environment. 

Preventing AI Zero Days: A Key Focus of Protect AI

When Protect AI launched in December 2022, one of our core missions for customers was to make the ML ecosystem more secure from AI Zero Days. You can read more about AI Zero Days here. Today, we announce our discovery of a vulnerability in MLflow that allows for a Local File Inclusion/Remote File Inclusion (LFI/RFI) exploit, which can lead to complete system or cloud provider takeover. We detail the tools and attack methods, and provide suggestions for remediation, in the accompanying blogs listed at the end of this piece.

Our research and this vulnerability highlight the gap between existing security tools, processes, and frameworks and the unique needs of ML systems and MLOps processes. Vulnerabilities such as those we discovered, along with the growing challenge of other ML Supply Chain risks, will multiply as AI adoption continues its exponential growth. Developers who write to the platform abstraction layer of models such as OpenAI's GPT will expand the ML software supply chain, thereby increasing the likelihood of security vulnerabilities and threats in ML systems and AI applications.

Protect AI extensively researches popular open source assets in the ML ecosystem. The use of open source packages in the ML software supply chain is pervasive, and packages like MLflow, with over 13 million monthly downloads, are among the most popular.

Context: MLflow and OSS in ML Development

ML in 2023 is reminiscent of the Mobile Web in 2010, when companies were fixated on creating mobile apps. Today, many C-suite executives are similarly obsessed with AI. ML development is an emerging field, characterized by swift innovation and a lack of emphasis on security. This is typical of new tech sectors, which are often plagued by security threats due to the speed of development and the challenges encountered by pioneering engineers.

OSS (Open Source Software) packages, such as TensorFlow, PyTorch, Apache Spark, and MLflow, are commonly used in academia, research, and industry to develop and deploy AI models. Although these tools are relatively new, they have already experienced rapid success in terms of widespread adoption and development.

However, security is frequently overlooked in these cutting-edge OSS tools, much like in the early days of mobile web OSS frameworks. This is because the primary objective is to accelerate production cycles rather than to build security features into the architecture of these tools. As a result, even commercial providers may not adequately address vulnerabilities, leaving popular ML platforms susceptible to AI Zero Days and ML software supply chain vulnerabilities. Rectifying these issues can be difficult, particularly when dealing with customized toolchains for MLOps pipelines and ML systems.

When Protect AI notified the MLflow maintainers under our standard Responsible Disclosure Policy, they immediately began working to repair and patch the code. The latest, patched version can be downloaded here. Our engagement with the maintainers and the Databricks personnel is a great model for what the ecosystem needs between ML OSS maintainers and researchers working to build a safer AI-powered world. We commend the MLflow team for implementing such a quick, robust fix.

MLflow, created by Databricks in 2018 and released as open source, is a popular OSS platform for managing ML experiments and models. Data scientists can track experiments and models, as well as package and deploy models, with ease. It has become a go-to platform for managing critical steps in ML experiments. Given the platform's flexibility and breadth, the speed at which Databricks and the MLflow maintainers fixed the vulnerability is noteworthy.

MLflow has several primary use cases, including experiment tracking, model packaging, reproducibility, and model deployment. The open source variant is used by several publicly referenced companies, including Red Hat, Intel, H&M, and Siemens, and it has a large and active community of contributors and users, making it one of the most popular platforms for managing the machine learning lifecycle.

Our scan found over 800 publicly accessible servers at entities and corporations using MLflow, including some of the Fortune 500. However, the pervasiveness of this vulnerability likely extends to even more entry points if one assumes there could be rogue ML actors with appropriate permissions and a desire to inflict harm. (As my uncle used to say, “Always prepare for the ‘Oh (*&)! moment.’”) In short, all of these publicly identified entities, and likely many more, have (or had) this vulnerability unless the pipeline is patched. But closing this vulnerability quickly and easily is made more complex by the lack of visibility and of tools that help with patch automation.

OSS in ML: Supply Chain Vulnerabilities That Often Cannot Be Seen or Fixed

OSS democratizes machine learning by making cutting-edge tools and frameworks accessible to anyone with an internet connection, the necessary data sources, and infrastructure capabilities. However, when OSS meets ML, there are critical gaps between the tools used to create AI applications and the current security landscape designed to prevent system vulnerabilities.

As seen with vulnerable OSS tools such as WordPress, Jenkins, and Tomcat, organizations often deploy OSS services before implementing security measures. Many OSS tools lack automatic updates, and end users are responsible for implementing complicated patch management systems. Common tools used to identify out-of-date servers, such as Nmap or Nessus, lack fingerprints for ML services, making it difficult to identify which version is in use and leaving the burden of tracking installations and updates on end users. This leaves security teams, and the companies they protect, blind to potentially catastrophic vulnerabilities. Put simply, automated and digestible frameworks for insight into the ML workflow ingredients do not exist, leaving ML systems vulnerable to security threats. This is the reason we created Protect AI: to help customers secure their AI applications and ML investments.

Patching MLflow: Finding and Testing this AIØD

Currently, common network scanners like Nmap do not recognize MLflow service fingerprints. We have submitted an MLflow fingerprint to the Nmap database.

One way to find MLflow servers on your network is to perform an Nmap service scan, e.g. nmap -sV 192.168.0.0/24 -oX nmapscan.xml, and search the output for <title>MLflow</title> to identify the relevant hosts, as in the sketch below.
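Here is a minimal Python sketch of that search, assuming the scan command above wrote nmapscan.xml. Because Nmap does not yet recognize MLflow, the page title typically shows up in the unmatched service fingerprint that -sV records (or in http-title NSE output, if scripts were enabled):

```python
import xml.etree.ElementTree as ET

NEEDLE = "<title>MLflow</title>"

for host in ET.parse("nmapscan.xml").getroot().iter("host"):
    address = host.find("address")
    addr = address.get("addr") if address is not None else "?"
    for port in host.iter("port"):
        service = port.find("service")
        # -sV stores unmatched banners (which can include the HTML title)
        # in the servicefp attribute; NSE script output lives in <script>.
        blobs = [service.get("servicefp", "")] if service is not None else []
        blobs += [script.get("output", "") for script in port.iter("script")]
        if any(NEEDLE in blob for blob in blobs):
            print(f"Possible MLflow server: {addr}:{port.get('portid')}")
```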


Testing for this vulnerability in your organization can be done by logging into the MLflow server and running mlflow --version. If the reported version is below 2.2.2, your server is vulnerable to at least one of these CVEs.
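For scripting that check across hosts, here is a minimal sketch, assuming mlflow is on the PATH and prints its usual banner (e.g. mlflow, version 2.1.1) with a plain x.y.z version string:

```python
import subprocess

# Run the CLI and grab its banner, e.g. "mlflow, version 2.1.1".
banner = subprocess.run(
    ["mlflow", "--version"], capture_output=True, text=True, check=True
).stdout.strip()
installed = banner.rsplit(" ", 1)[-1]  # last token is the version string

# Compare numerically; assumes a plain x.y.z release (no rc/dev suffix).
if tuple(int(p) for p in installed.split(".")) < (2, 2, 2):
    print(f"MLflow {installed} is vulnerable -- upgrade to 2.2.2 or later")
else:
    print(f"MLflow {installed} includes both fixes")
```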


To remotely test one or many servers, download and run our CVE-2023-1177-scanner tool.

Protect AI: Shielding Your ML

Protect AI offers products, services, and research insights that help our customers understand their AI and ML system threat surfaces. Our clear visualization and context mapping of unique ML assets deliver a stronger security posture over entire ML systems, helping our customers shield their ML with improved threat visibility, tailored ML security testing, and remediation runbooks.

Protect AI offers rewards for AI vulnerability research and collaboration in cooperative bug bounty hunting. Contact our security research leadership to learn more, see if you qualify, and receive our latest bounty rewards list via email.

Contact us to learn more and build a safer AI-powered world. To ensure your MLflow-powered system is secure, read our blogs below and patch all MLflow instances and installations immediately. If you have questions, email us or connect with us on any of our social media channels.

Further reading: 

Hacking AI: Steal Models from MLflow, No Exploit Needed
Hacking AI: System and Cloud Takeover via MLflow Exploit