Second in a five-part series on implementing Secure by Design principles in AI system development

In part 1, we explored the evolving AI security landscape and introduced CISA's Secure by Design framework. Now, we'll dive deeper into how organizations can implement these principles through a comprehensive security strategy that spans the entire AI development lifecycle.

Fundamental Security Requirements for AI

The implementation of secure design principles for GenAI systems requires a focused approach to security fundamentals. The CIA triad—Confidentiality, Integrity, and Availability—forms the cornerstone of this framework when adapted to AI contexts.

1/ Confidentiality in GenAI

GenAI systems demand robust access controls and encryption for both training data and model parameters. This prevents unauthorized exposure of sensitive information that might be embedded within models or extracted through sophisticated prompting techniques.

2/ Integrity for AI Systems

Integrity requires mechanisms to verify that AI outputs remain accurate and unaltered. This includes defense against adversarial attacks that could subtly manipulate model responses, as well as enabling traceability between inputs and outputs.

3/ Availability Considerations

Availability focuses on maintaining consistent AI system performance while preventing denial of service (DoS) through resource exhaustion or prompt injection attacks.

AI Model and Data Governance

Effective model and data governance complements these principles through:

  • Comprehensive inventories of models and datasets
  • Clear documentation of data provenance and model limitations
  • Regular security assessments focusing on AI-specific vulnerabilities
  • Robust change management protocols
  • Continuous monitoring for model drift or anomalous behavior

Machine Learning Security Operations (MLSecOps) creates a layered architecture where security is woven into every phase of the AI lifecycle. Organizations that implement security at only one stage leave critical vulnerabilities elsewhere in the pipeline—like securing your front door with four deadbolts while leaving the windows unlocked. To be truly effective, security for AI must span from the very initial scoping phase all the way through continuous monitoring in deployment.

Secure by Design in the MLSecOps Lifecycle

Building Secure by Design AI systems requires a defense in depth (DiD) approach, integrating security controls at every phase of the MLSecOps lifecycle.

As agentic AI systems—those capable of autonomous decision-making—become more prevalent, the convergence of MLSecOps with DevSecOps practices is crucial to manage the expanded attack surface and leverage consistent security policies across both AI-specific and traditional software risks. This integration enables comprehensive monitoring, policy enforcement, and incident response capabilities, which are essential for mitigating the unique vulnerabilities associated with agentic AI.

Let's explore how security tasks within the MLSecOps lifecycle map to the OWASP® Top 10 for LLMs and GenAI, MITRE ATLAS™, and NIST AI Risk Management Framework (AI-RMF) bodies of work, providing practical guidance on how to build AI systems that are Secure by Design.

1/ Scope Phase

The Scope phase aligns with NIST AI-RMF's "Map" function, focusing on identifying attack surfaces and defining security requirements early. This phase includes threat modeling tailored to AI systems, which helps anticipate potential risks like prompt injection (OWASP LLM01) and supply chain vulnerabilities (OWASP LLM03).

Threat models should consider the entire AI pipeline, from data ingestion to deployment, highlighting risks such as data poisoning and model inversion attacks. Key ML techniques from MITRE ATLAS that can be threat modeled during this phase include:

  • ML Supply Chain Compromise: identifying vulnerabilities in pre-trained models, datasets, and dependencies
  • Model Reconnaissance: probing to understand model boundaries and behaviors
  • Exfiltration via ML Inference: extracting sensitive information through model outputs

Security requirements must specify controls for confidentiality, integrity, and availability, ensuring that AI systems can protect sensitive information embedded in training data (OWASP LLM02) and maintain accurate, unaltered outputs. Additionally, policy considerations must address regulatory compliance and ethical use of AI, aligning with NIST's "Governance" function.

2/ Data Preparation Phase

The Data Preparation phase focuses on maintaining data integrity and privacy, aligning with the NIST AI-RMF "Measure" function. Key controls include data validation and labeling, which help prevent data and model poisoning (OWASP LLM04).

Validating that data sources are vetted and trustworthy is crucial for mitigating misinformation risks and adversarial inputs designed to corrupt model training. Privacy considerations must include techniques like differential privacy and encryption to protect sensitive information during both the training and inference phases.

Threat modeling should be refined here to include specific risks associated with the data supply chain, such as unauthorized access and tampering. Aligning these practices helps build robust defenses against techniques described in the MITRE ATLAS matrix, such as:

  • Data Manipulation
  • Data Poisoning
  • Exfiltration of Sensitive Information

3/ Model Training Phase

The Model Training phase is where secure coding practices and rigorous testing come into play. Aligning with NIST's "Manage" function, this phase must incorporate model risk assessments to identify vulnerabilities like improper output handling (OWASP LLM05) and vector and embedding weaknesses (OWASP LLM08).

AI pipeline security, including dependency management and validation of pre-trained models, is critical for preventing supply chain risks. Appropriate controls and defenses around techniques in the ATLAS matrix that map to this phase include:

  • AI Model Inference API Access
  • ML Model Access
  • ML Supply Chain Compromise
  • Poison Training Data

Key practices include secure coding standards tailored for AI, such as input validation and output sanitization, to defend against excessive agency risks (OWASP LLM06). Integrating security testing, including adversarial robustness assessments and model scanning for known vulnerabilities, help confirm that models do not produce unsafe or biased outputs.

Moreover, evaluating the trade-offs between performance and security during this phase helps in balancing risk and efficiency.

4/ Testing Phase

In the Testing phase, aligning with the NIST AI-RMF "Measure" function, security testing must be comprehensive and continuous. This includes adversarial testing (red teaming/penetration testing) to uncover vulnerabilities such as system prompt leakage (OWASP LLM07) and unbounded consumption (OWASP LLM10).

Security testing methodologies should also validate compliance with regulatory standards and internal policies, ensuring that AI systems are robust against both technical and operational threats. Protection against ATLAS ML techniques that should be tested for in this phase include:

  • Model Evasion
  • Prompt Extraction
  • Prompt Injection
  • Inference Manipulation

Testing agentic AI systems presents unique challenges, as the behavior of AI agents can differ significantly when deployed as part of a larger ecosystem. Comprehensive testing must cover not only individual model components but also the interactions between them, identifying risks that emerge only in full-system operations.

This phase should also incorporate behavioral analysis to detect anomalies and verify that AI agents act within predefined policy boundaries.

5/ Deployment and Monitoring Phases

The Deployment and Monitoring phases align with the NIST "Govern" function, emphasizing secure deployment patterns and continuous oversight. Security controls must include model signing to verify model authenticity and prevent unauthorized modifications.

Additionally, supply chain vulnerability management should address risks associated with third-party libraries and pre-trained models integrated into the AI pipeline. Continuous monitoring is critical for detecting emerging threats, such as misinformation (OWASP LLM09), and for keeping AI systems in compliance with evolving regulations.

Monitoring should include anomaly detection mechanisms to identify deviations in model behavior, potentially indicating adversarial attacks or data drift. Policy enforcement must extend to incident response, ensuring that security breaches are promptly detected, contained, and addressed. MITRE ATLAS techniques to monitor for include:

  • Denial of Service
  • Model Tampering
  • Exfiltration via both Inference APIs and Cyber Means
  • Jailbreaks

For agentic AI systems, monitoring requirements are more stringent due to the autonomous nature of decision-making processes. Effective monitoring must track not only model outputs but also the decision pathways and external interactions, providing comprehensive oversight of AI behaviors.

It is also critical to watch for lateral movement and attempts to access systems beyond authorized boundaries, particularly in agentic AI systems that interact with multiple environments. Lateral movement can occur when an AI agent leverages initial access to one system to navigate to adjacent systems, potentially expanding its reach beyond intended operational constraints.

Such movement might manifest as an agent using credentials or permissions granted for one task to access unrelated databases, APIs, or computing resources. In agentic AI, this lateral movement presents unique challenges as the agent's autonomous nature means it may discover and exploit pathways that weren't anticipated during system design.

Monitoring within these systems will need to focus on mapping the complete operational graph of agent activities, tracking not just what resources are accessed but the sequential relationship between access events.

Conclusion

As AI systems grow more complex and autonomous, securing them requires a holistic approach that extends beyond individual safeguards. Integrating security throughout the MLSecOps lifecycle is essential to address vulnerabilities at every phase—from initial scoping to continuous monitoring.

By aligning security tasks with established frameworks like the 2025 OWASP Top 10 for LLMs and GenAI, MITRE ATLAS, and NIST AI-RMF, organizations can build AI solutions that are not only Secure by Design but also resilient to evolving threats.

In the next part of our series, we'll focus on agentic AI systems and the unique security challenges they present, exploring how MLSecOps and DevSecOps must converge to create truly secure autonomous AI.

 


 

Stay tuned for Part 3 of our series: Securing Agentic AI: Where MLSecOps Meets DevSecOps

Ready to dive deeper? Get the full white paper: Securing AI’s Front Lines: Implementing Secure by Design Principles in AI System Development