What’s Old is New - Natural Language as the Hacking Tool of Choice
May 25, 2023 • Daryan Dehghanpisheh • 4 min read
What we’re reading: We came across Daniel Miessler’s excellent blog and framework, The AI Attack Surface Map v1.0. Since we focus on the security of AI applications and ML systems, we could not pass this up, and we’re glad we didn’t. Neither should you. Daniel’s elegant and simple method of structuring the surface for many forms of possible attacks on an AI system is a master class in “Invent and Simplify.” The best part is that the hacking tool of choice is (drum roll!): Natural Language! With this backdrop, his map is akin to a flow chart of the AI Wild West, but instead of bandits and outlaws, those of us in AI development need to worry about poisoned data and abuse from rogue APIs, where all of this menace is unleashed with the simplicity of “plain speak.” If you're building AI, be sure to arm yourself with this knowledge before you head out into the wilderness.
Relevance to ML Security: The piece gives developers and offensive security practitioners a way to think about the various attack surfaces within an AI system. The AI Attack Surface Map is divided into three main sections: models, components, and attacks. Each section covers the various ways that AI systems can be attacked. He also discusses the importance of AI assistants, which combine knowledge and access, making them a new target for the next generation of Social Engineering attacks. After reading this, we’re left feeling that attacks on people's AI assistants will be both high impact and emotionally unsettling: to be useful, an assistant needs to know a massive amount about you and your mannerisms, which makes it, and us, vulnerable.
For the ML Team: ML systems are sophisticated and widely used across various applications, but their complexity also renders them more susceptible to attacks. Attackers can exploit ML systems in several ways. Data poisoning is the injection of malicious data during training to induce incorrect behavior. API abuse involves leveraging vulnerabilities in ML system APIs to gain unauthorized access or manipulate behavior. Lastly, there is targeted system tampering, where attackers physically tamper with storage or induce CPU or memory errors to gain control of the system. To safeguard ML systems, understanding these attack vectors is crucial, along with implementing appropriate security measures. These measures include secure data collection and storage practices, such as encryption and access controls; security controls at all layers of the ML stack (API, hardware, and software); and monitoring for signs of attack, like unusual activity or performance degradation. By following these steps, ML teams can fortify their systems. Additional tips include keeping ML systems updated, adhering to a secure development lifecycle, training models on secure data, and deploying them in secure environments, all of which bolster protection against attacks.
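As one way to picture "monitoring for performance degradation", here is a minimal sketch that compares rolling accuracy on labelled traffic against a trusted baseline. The class name, the window size, and the 5-point drop threshold are all assumptions chosen for illustration, not values from Daniel's map or from any particular product.

```python
from collections import deque


class DegradationMonitor:
    """Flags when rolling accuracy falls well below a trusted baseline.

    Hypothetical helper for illustration; thresholds are arbitrary.
    """

    def __init__(self, baseline_accuracy, window=100, max_drop=0.05):
        self.baseline = baseline_accuracy
        self.max_drop = max_drop
        self.outcomes = deque(maxlen=window)  # 1 = correct, 0 = incorrect

    def record(self, prediction, label):
        self.outcomes.append(1 if prediction == label else 0)

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def is_degraded(self):
        # Only judge once the window is full, to avoid noisy early alerts.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.baseline - self.max_drop
```

A sustained drop is only a signal, not proof of poisoning or tampering. It is a prompt to investigate recent changes to data, models, and APIs.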
For the SEC Team: Securing an ML system in an enterprise starts with understanding the vulnerabilities and security risks associated with those systems. ML environments, with their increasing sophistication and widespread use, have become attractive targets for attackers using techniques and tools that are both old and new. Being aware of the potential new attack vectors is essential. For example, ML teams often store sensitive data (PII, PHI, and secrets) in Jupyter notebooks, but these notebooks are often outside the scope of normal data discovery processes. To protect ML systems from such attacks, the AppSec team should focus on implementing appropriate security measures. These include fine-grained roles and permissions, more secure data collection and storage practices, and encryption and access controls, with security controls prioritized at all layers of the ML stack, including APIs, hardware, and software components. You can get started on this at nbdefense.ai, an open source software solution for scanning Jupyter notebooks that helps accelerate ML security.
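To make the notebook risk concrete, the sketch below walks a parsed `.ipynb` file (which is JSON) and flags cells whose source or output matches a few secret-like or PII-like patterns. It is an illustration only and is not how NB Defense works. The function names and the three regular expressions are assumptions, and a real scanner would use a far broader rule set.

```python
import json
import re

# Illustrative patterns only; a real scanner would use a much broader rule set.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "hardcoded_credential": re.compile(
        r"(?i)\b(?:api_key|secret|password|token)\s*=\s*['\"][^'\"]{8,}['\"]"
    ),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def _as_text(value):
    # Notebook JSON stores text either as one string or a list of lines.
    return "".join(value) if isinstance(value, list) else (value or "")


def scan_notebook(notebook):
    """Return (cell_index, location, rule_name) findings for a parsed .ipynb dict."""
    findings = []
    for index, cell in enumerate(notebook.get("cells", [])):
        texts = [("source", _as_text(cell.get("source")))]
        for output in cell.get("outputs", []):
            # Stream outputs keep text under "text"; rich outputs under "data".
            texts.append(("output", _as_text(output.get("text"))))
            data = output.get("data", {})
            texts.append(("output", _as_text(data.get("text/plain"))))
        for location, text in texts:
            for rule, pattern in PATTERNS.items():
                if pattern.search(text):
                    findings.append((index, location, rule))
    return findings


def scan_notebook_file(path):
    with open(path, encoding="utf-8") as handle:
        return scan_notebook(json.load(handle))
```

Scanning outputs as well as source matters because a secret printed during a run stays in the saved notebook even after the code that produced it is edited.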
Our thoughts: Protect AI is dedicated to building a platform that fosters a safer AI-powered world. We see immense value in Daniel's map and framework for constructing secure ML systems and applications. His resource provides a structured way to understand vulnerabilities and attack vectors within ML systems, with particular emphasis on AI assistants as a new target for Social Engineering.
While implementing security measures like secure data collection, encryption, access controls, and layered security controls can bolster the security of ML systems, current tools often fall short in seamlessly and intuitively providing these capabilities to ML engineers and AppSec teams. These tools frequently lack the ability to effectively reconstruct and map version differences across ML pipelines and systems. This limitation not only hinders the identification of threats but also impedes the remediation process. After all, you cannot fix a problem you cannot identify. Vulnerabilities remain when changes in the model and the entire ML system, including who modified what, when, and where, are not visible. This situation leads to the emergence of AI Zero Days and a host of problems, carrying substantial risks for enterprises.
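A small sketch of what "mapping version differences" can mean at the artifact level: hash every file in a pipeline directory, then compare two snapshots to see what was added, removed, or changed. The function names are hypothetical, and this does not describe Protect AI's platform. It also covers only the "what" of a change. Recording "who" and "when" would need metadata from version control or audit logs.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's relative path under root to its SHA-256 digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(old, new):
    """Report which artifacts were added, removed, or changed between snapshots."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(k for k in set(old) & set(new) if old[k] != new[k]),
    }
```

Storing a manifest alongside each release makes an unexpected change to a model file or dataset visible as a one-line entry in the diff.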
Protect AI recognizes the urgent need to address these challenges and is developing a platform that empowers both security professionals and ML engineers to enhance their security posture. We provide a solution that doesn't hinder innovation and productivity in the AI field but instead enables secure practices to thrive. Our platform easily integrates with your existing security tools and extends your perimeter of defenses to your ML system, fosters visibility of your entire ML threat surface, and helps you mitigate risks by moving from MLOps to MLSecOps in a seamless and intuitive way, using AI Radar. Contact us to learn more.