AI-Powered App Security — The New Attack Surface Startups Are Ignoring | Kuboid Blog

AI-Powered App Security — The New Attack Surface Startups Are Ignoring

TLDR: Your team shipped an AI feature last sprint — a chatbot, a document analyser, an AI agent that can query your database. You tested it for bugs. You tested it for accuracy. But did anyone test it for security? AI-powered features introduce a class of vulnerabilities that traditional scanners were never built to detect. Prompt injection topped OWASP's 2025 LLM Top 10 as the #1 risk. GitHub Copilot was hit with a remote code execution vulnerability via prompt injection in 2025. Real incidents are no longer theoretical. If you're shipping AI features without dedicated security testing, you've opened an attack surface that no one on your team is watching.

The Deployment Speed Gap

The pace at which engineering teams are embedding AI features into production applications has far outrun the security community's ability to keep up. What used to be a six-month research project — integrating a language model into a product — is now a two-week sprint. Frameworks like LangChain and the Model Context Protocol (MCP) make it straightforward to connect an LLM to your database, your email system, your file storage, and your APIs.

That accessibility is genuinely powerful. It's also creating a security blindspot at scale.

An Orca Security report found that 84% of organisations now use AI-related tools in the cloud, and 62% had at least one vulnerable AI package in their environments. A separate Cloud Security Alliance report found that one third of organisations experienced a cloud data breach involving an AI workload. These numbers reflect an industry that is shipping faster than it is securing.

The problem isn't that developers are being reckless. It's that the security playbook most teams follow — penetration testing, SAST scanning, dependency audits — was written for traditional web applications. It doesn't cover what an LLM introduces.

Two Attack Surfaces, Not One

When your application gains an AI layer, your attack surface doubles — and the second layer is one most security tools completely ignore.

The first layer is the traditional web application: authentication, access control, input validation, API security. These risks haven't gone away. If anything, AI features often expand this surface — new API endpoints, new third-party integrations, new data pipelines.

The second layer is LLM-specific. And it operates on entirely different principles. LLM vulnerabilities fundamentally differ from traditional application vulnerabilities due to their reliance on large, complex data sets and unique interactions with users. The nature of AI systems introduces risks such as biased outputs based on training data discrepancies, unlike more predictable software flaws found in traditional applications.

Traditional security tools scan for patterns in code. LLM vulnerabilities often aren't in the code at all — they emerge from how the model interprets natural language inputs it was never designed to anticipate. Your SAST scanner will not catch a prompt injection attack. Your WAF will not flag it. Your penetration tester, if they're testing your app without LLM-specific methodology, will miss it entirely.

The New Attack Categories You Need to Know

Prompt Injection — The #1 Risk

In 2025, the OWASP Gen AI Security Project listed prompt injection as the #1 security risk for LLM applications.

Here's the core problem: an LLM cannot reliably distinguish between instructions it was given by your system and instructions embedded in content it's processing. If your AI agent summarises documents, browses URLs, or reads emails — an attacker can embed hidden instructions in that content and manipulate the model's behaviour.

In May 2025, a vulnerability in the ChatGPT connector — which allows ChatGPT to read content from Google Drive and SharePoint documents — was exploited via prompt injection. Sensitive data, including API keys, access tokens, and confidential business files stored in connected cloud services, was exposed because the AI treated malicious instructions hidden inside processed documents as legitimate user commands.

This is not a niche research scenario. It is happening in production.

Excessive Agency

Granting LLMs unchecked autonomy to take action can lead to unintended consequences, jeopardising reliability, privacy, and trust. As teams build AI agents that can send emails, query databases, trigger workflows, or modify files, the blast radius of a successful prompt injection grows from "the model said something odd" to "the model exfiltrated data and sent it externally."

In 2025, GitHub Copilot suffered from CVE-2025-53773, allowing remote code execution through prompt injection, potentially compromising the machines of millions of developers. The vulnerability exploited the agent's ability to modify configuration files without user approval — a capability that was useful by design, and catastrophic under attack.

Sensitive Information Disclosure

LLMs can leak data in ways that are surprisingly difficult to anticipate. System prompts — the internal instructions that define how your AI behaves — can often be extracted by a determined user through carefully crafted inputs. If your system prompt contains API credentials, internal logic, or configuration details (and many do), that information can be surfaced.

Beyond prompts, a LayerX report in 2025 found that 77% of enterprise employees who use AI have pasted company data into a chatbot query, and 22% of those instances included confidential personal or financial data. Your AI feature may be the most convenient data exfiltration channel your employees have ever used — without any of them realising it.

Vector and Embedding Weaknesses

If your AI application uses Retrieval-Augmented Generation (RAG) — where the model pulls context from an internal knowledge base before responding — your vector database is now part of your attack surface. Vectors often aren't encrypted or access-controlled like raw data, creating an unexpected backdoor. In some cases, even partial data from the model's training set can be extracted via embeddings. OWASP added this as a new entry in the 2025 LLM Top 10 specifically because RAG adoption has outpaced awareness of the risks it introduces.

Why Traditional Security Tools Miss All of This

The reason this matters so much for business leaders and CTOs is that you cannot buy your way out of this problem with the same tools you've been using. Your existing security stack was not designed for it.

Traditional perimeter defences fail against prompt injection because the attack vector operates at the semantic layer, not the network or application layer. A firewall inspects packets. A WAF looks for known malicious patterns in HTTP requests. A SAST scanner analyses your code's syntax and logic. None of these have visibility into what happens when your LLM interprets a natural language instruction embedded in a PDF that a user uploaded.

This doesn't mean your existing security investments are wasted. It means they need to be complemented with testing methodologies specifically designed for AI systems.

What the OWASP LLM Top 10 Signals

The OWASP Top 10 for Large Language Model Applications started in 2023 as a community-driven effort to highlight and address security issues specific to AI applications. Since then, the technology has continued to spread across industries and applications, and so have the associated risks.

The 2025 edition of the list — the most comprehensive update yet — reflects three major shifts in the threat landscape: the rise of agentic AI with real-world permissions, the widespread adoption of RAG pipelines, and the growing exploitation of system prompt leakage. Several entries have been significantly reworked or added, addressing emerging risks and community feedback. Prompt Injection maintained its position at the top of the list.

The OWASP LLM Top 10 is the clearest signal the security community has sent to engineering teams: AI features are not just a product concern. They are a security concern, and they require their own testing framework.

What AI Security Testing Actually Involves

Testing an AI-powered application for security requires a different approach from traditional penetration testing — though it runs alongside it, not instead of it.

Prompt injection testing involves systematically attempting to manipulate the model's behaviour through crafted inputs — both directly through the user interface, and indirectly through data sources the model processes (documents, URLs, emails, database records).

Privilege and access boundary testing examines whether the AI agent's permissions are correctly scoped. If the model can query your database, can it query tables it shouldn't be able to access? If it can send emails, can it be manipulated into sending them to external addresses?

System prompt extraction attempts probe whether an attacker can recover your internal instructions through conversation manipulation — which would expose your security controls, API keys, or proprietary logic.

RAG pipeline integrity testing assesses whether your knowledge base can be poisoned — whether injecting manipulated content into your document store can alter the model's behaviour for all users.

Shadow AI discovery examines whether employees are using unsanctioned AI tools that process company data outside of your security controls — a risk that is frequently overlooked and consistently underestimated.

How Kuboid Secure Layer Can Help

At Kuboid Secure Layer, we work with engineering teams building AI-powered products who recognise that their existing security testing doesn't cover this new layer.

Our AI security assessments apply LLM-specific testing methodologies alongside traditional application security testing — covering both the web layer your team knows and the AI layer that most security teams don't yet have a methodology for.

If you're shipping AI features and want to understand what's actually exposed, let's talk. The attack surface exists whether you're testing it or not.

Final Thought

Every AI feature your team ships is a tradeoff between capability and exposure. That tradeoff isn't a reason not to build — it's a reason to build with eyes open. The teams that understand what they've introduced, test it deliberately, and design their AI systems with security boundaries in place are the ones that will ship fast without paying for it later.

The attack surface is new. The principle isn't: understand what you've built, then test whether someone else can break it.

Kuboid Secure Layer provides AI security assessments, application penetration testing, and security advisory services for businesses building on modern technology. Learn more at www.kuboid.in or explore our full services page.

AI-Powered App Security — The New Attack Surface Startups Are Ignoring