This is the third in a five-part series on implementing Secure by Design principles in AI system development

In our previous articles, we explored the evolving AI security landscape and detailed how to build Secure by Design AI systems through the MLSecOps lifecycle. Now, we'll focus on a particularly challenging category of AI systems that demands special security consideration: agentic AI.

Understanding Agentic AI

Agentic AI systems go beyond traditional AI models that simply respond to queries or perform single-task predictions. Instead, agentic AI systems can:

  • Act autonomously with limited human oversight
  • Make decisions based on goals and constraints
  • Take actions that impact their environment
  • Persist across multiple interactions and contexts
  • Learn and adapt their behavior over time

Unlike conventional AI that waits for human prompts, agentic AI proactively pursues objectives, leveraging various tools and information sources to achieve its goals. Examples include AI assistants that can book appointments, virtual agents that manage customer service interactions across multiple channels, and autonomous systems that optimize business processes with minimal supervision.

What makes these systems particularly powerful—and uniquly challenging from a security perspective—is their ability to chain together multiple capabilities and interactions with external systems to achieve complex outcomes.

When AI agents connect multiple AI systems, the attack surface broadens in a uniquely complex way—inaccurate information can propagate and amplify through the sequential action path of the system. For example, if any one AI in the system has an error or supplies inaccurate information, that can become the training data for the next system's processing, potentially creating a cascade of increasingly flawed outputs and actions.

This "error propagation" is particularly concerning when later systems in the chain lack mechanisms to verify or validate incoming information. Without appropriate checks, they may take action based on incorrect data, leading to real-world consequences ranging from minor inefficiencies to potentially harmful outcomes.

The challenge intensifies when systems operate across different domains with varying levels of transparency. Each handoff introduces potential for misinterpretation or context loss, especially when systems have different training foundations or operational parameters.

These complex risks highlight the need for reliable validation, meaningful metrics, policy decision and enforcement points, and even human “in the loop” oversight in multi-agent AI architectures and workloads.

Here’s a plausible real-world scenario to help illustrate:
Imagine a bank's AI misinterpreting a customer's transaction history during routine monitoring and incorrectly flagging legitimate international payments as "suspicious wire transfers" instead of "recurring business expenses," which leads to freezing their accounts and blocking a $10 million transaction - all because the original error got worse as it passed through each connected AI system. This highlights one reason why companies need safeguards between their AI tools to prevent small mistakes from causing big real-world consequences.

The Dual Nature of Agentic AI's Attack Surface

Agentic AI presents a fundamentally different security challenge than traditional AI systems because it embodies a hybrid nature: it contains both AI components (models, embeddings, vector stores) and traditional software elements (APIs, databases, authentication systems).

This dual composition creates a unique attack surface that cannot be secured through either traditional application security or AI security approaches alone. Instead, it requires a convergence of both disciplines:

  • AI-specific vulnerabilities: Prompt injection, training data poisoning, model inversion attacks
  • Traditional software vulnerabilities: API exploits, authentication bypasses, SQL injection
  • Hybrid vulnerabilities: Novel attack patterns that emerge from the interaction between AI and traditional components

For example, an attacker might use prompt injection to manipulate an AI agent into making API calls that appear legitimate but actually exfiltrate sensitive data. This attack vector doesn't exist in either traditional software or standalone AI models—it emerges specifically from their interaction within an agentic system.

The Convergence of MLSecOps and DevSecOps

This hybrid nature of agentic AI systems means that security teams must bring together two previously separate domains: MLSecOps and DevSecOps. While these approaches share common principles, they focus on different artifacts, tools, and vulnerabilities.

Why Convergence Is Necessary

Attempting to secure agentic AI using only traditional DevSecOps practices would miss critical AI-specific vulnerabilities like prompt injection, model poisoning, or adversarial attacks. Similarly, applying only MLSecOps would fail to address the traditional software vulnerabilities in the APIs, databases, and integration points that agentic AI relies upon.

True security for agentic systems requires applying Secure by Design principles across both domains simultaneously, creating a unified approach that protects the entire system, not just its individual components.

From Parallel Tracks to Integrated Security

Traditional organizations often have separate security workflows for application development and AI/ML development:

  • DevSecOps teams focus on code security, API protection, and traditional application vulnerabilities
  • MLSecOps teams concentrate on AI-supply chain security including machine learning models, training data validation, and AI-specific risks

For agentic AI, these parallel tracks must converge into an integrated security approach. This means:

  1. Unified threat modeling that considers both AI and software attack vectors
  2. Comprehensive security testing that evaluates the entire system, not just its components
  3. Integrated security controls that protect both AI and traditional elements
  4. Holistic monitoring that can detect anomalies across the full system

Secure by Design for Agentic AI: A Unified Approach

Implementing Secure by Design for agentic AI requires adapting the Cybersecurity and Infrastructure Security Agency's three principles to address the unique challenges of these hybrid systems.

1. Taking Ownership Across the Full Stack

Organizations developing agentic AI must take ownership of security outcomes across the entire technology stack—from the foundational models to the APIs that connect them to external systems. This means:

  • Clarifying responsibility for security across AI and traditional components
  • Establishing security requirements that span both domains
  • Creating unified security processes that cover the entire agentic system
  • Building security expertise that bridges AI and application security

In practice, this might mean forming cross-functional security teams with expertise in both domains, or establishing clear handoffs and joint reviews between AI and application security specialists.

2. Radical Transparency Across Boundaries

Transparency becomes even more critical—and more challenging—in agentic systems where decision-making spans multiple components. Organizations must document:

  • How agentic systems make decisions and what factors influence them
  • What external systems and data sources the agent can access
  • What guardrails and controls limit the agent's actions
  • How the organization monitors and governs agent behavior

This transparency creates accountability and enables effective security governance of agentic AI throughout its lifecycle.

3. Leadership That Bridges Domains

Effective security for agentic AI requires leadership that understands both AI risks and traditional application security. Executive teams must:

  • Champion integrated security that spans traditional and AI domains
  • Allocate resources for securing the full agentic AI system
  • Establish governance that addresses the unique risks of autonomous systems
  • Create organizational structures that bridge traditional silos

Implementing Defense in Depth for Agentic AI

Defense in Depth (DiD) for agentic AI means implementing security controls at multiple layers across both AI and traditional software components:

Layer 1: Data Security

  • Validate training data for both security and quality concerns
  • Implement proper access controls for all data sources
  • Protect against data poisoning in both initial training and ongoing learning

Layer 2: Model Security

  • Secure the underlying models against adversarial attacks
  • Implement guardrails to prevent harmful outputs
  • Validate model updates and fine-tuning processes

Layer 3: Agent Logic Security

  • Secure the decision-making logic that governs agent behavior
  • Implement intention validation to verify agent actions align with intended goals
  • Apply least privilege principles to agent capabilities

Layer 4: API and Integration Security

  • Secure all connections between the agent and external systems
  • Implement proper authentication and authorization for all API calls
  • Validate inputs and outputs at every integration point

Layer 5: Action Execution Security

  • Monitor and validate all actions taken by the agent
  • Implement approval workflows for high-risk actions
  • Create rollback mechanisms for problematic agent behaviors

Practical Security Controls for Agentic AI

Moving from principles to practice, here are specific security controls that address the unique risks of agentic AI:

Input Validation and Sanitization

Beyond basic input validation, agentic AI requires semantic analysis of inputs to detect potential prompt injections or manipulative queries. This means implementing:

  • Intent classification to categorize incoming requests
  • Content filtering to detect potentially malicious instructions
  • Context-aware validation that considers the agent's current state and permissions

Execution Monitoring

Unlike traditional AI that simply generates outputs, agentic AI executes actions that must be monitored and controlled:

  • Action logging that records all agent decisions and their outcomes
  • Anomaly detection to identify unusual patterns of behavior
  • Rate limiting to prevent excessive or rapid-fire actions

Permission Boundaries

Agentic AI requires sophisticated permission models that go beyond simple role-based access:

  • Contextual permissions that adapt based on the user, task, and environment
  • Capability control that explicitly defines what actions an agent can perform
  • Temporal restrictions that limit when certain actions can be taken

Human-in-the-Loop Controls

For high-risk operations, human oversight provides an essential security layer:

  • Approval workflows for sensitive actions
  • Explainability features that help humans understand agent decisions
  • Override mechanisms that allow human intervention when needed

Threat Modeling for Agentic AI

Effective threat modeling for agentic AI must consider unique attack vectors that span the AI-software boundary:

Lateral Movement

Agentic AI systems often have access to multiple systems, creating risk of lateral movement where an attacker compromises one component to access others. Security teams must:

  • Map all connections between the agent and other systems
  • Implement strict access controls and authentication at each boundary
  • Monitor for unusual patterns of access across systems

Prompt Injection Leading to Code Execution

A sophisticated attack might use prompt injection to manipulate an agent into executing malicious code. Defenses include:

  • Separating instruction processing from code execution contexts
  • Implementing approval workflows for code generation and execution
  • Validating all generated code before execution

Data Exfiltration via Chained Actions

Attackers might trick agents into exfiltrating data through a series of seemingly benign actions. Mitigations include:

  • Monitoring data access patterns across agent interactions
  • Implementing data loss prevention specialized for agentic systems
  • Enforcing least privilege for data access

Conclusion: Unified Security for a Hybrid Future

Agentic AI represents a fundamental shift in how we build and deploy AI systems—moving from passive models to active agents that make decisions and take actions. This shift demands an equally fundamental evolution in our security approach, bringing together MLSecOps and DevSecOps into a unified framework.

By applying Secure by Design principles across both AI and traditional software domains, organizations can build agentic systems that harness the power of autonomous AI while maintaining robust security. This doesn't require inventing entirely new security paradigms. It means thoughtfully adapting and combining proven practices from both domains to address the unique challenges of agentic AI.

As AI continues to evolve toward greater agency and autonomy, this unified security approach will become increasingly essential for protecting not just the AI components but the entire ecosystem in which they operate.

In the next part of our series, we'll examine specialized tools and technologies needed to implement Secure by Design principles effectively throughout the AI lifecycle, from testing and vulnerability management to monitoring and incident response.


 

Stay tuned for Part 4 of our series: Tools and Technologies for Secure by Design AI Systems

Ready to dive deeper? Get the full white paper: Securing AI’s Front Lines: Implementing Secure by Design Principles in AI System Development