There’s AI in My Soup: A Critical Look at Artificial Intelligence in AppSec

Over the past three years, artificial intelligence has crossed from “a shiny new feature” to a default business expectation. Product roadmaps across nearly every cybersecurity domain — from penetration testing to application security — now include an “AI agent” or an “AI report generator.”
The application security (AppSec) domain has not been left out. Once defined by scanners and rule sets, the field is now framed by LLMs, agentic AI, and predictive analytics. For investors, AI features signal growth; for enterprises, they signal pressure to modernise.
However, how much of AI’s expansion in Application Security actually reflects necessity, and how much of it reflects market gravity?
AI in Application Security has drastically changed how developers and security specialists work and think. The complexity of modern software, with its microservices, rapid development cycles, and tangled dependency networks, has already outgrown what routine manual security management can handle.
The most obvious benefit AI offers the business here is scale. Major vendors such as Snyk, Checkmarx, and Semgrep have already begun embedding AI in code scanning, bug triage, various kinds of runtime analysis, code-quality checks, and more.
The latest 2025 Fastly survey reports that over 25% of companies are planning a near-term investment in AI-powered AppSec tools, a share that doubles in the Asia-Pacific market.
The trend is clear: AI is already written into the roadmaps of modern AppSec tools.
The implementation and popularization of AI also have financial implications beyond the development of bots and process workflows. Let's consider the cost categories that AI directly or indirectly affects.
| Cost category | What it looks like in practice | Why it hits | External research |
|---|---|---|---|
| Enterprise licensing | Migrating from standard licenses to enterprise-grade plans costs a fortune once private endpoints, throttling rules (SLAs), and additional governance features are involved. | Using AI at scale is always a paid feature. Small-scale use of the product is one price; building your own AI add-on or wiring in a separate GenAI provider is a completely different matter. | A GenAI pricing analysis showed that enterprise offerings easily reach six figures for an annual subscription, especially when combined with additional security and compliance features. |
| Usage-based pricing (tokens / generations) | Billing is typically charged per 1,000 tokens, requests, or generations, including regeneration attempts and background AI calls. | Daily AppSec workloads are data-heavy: large codebases, multiple branches, regular scans, continuous integration runs, and agent invocations. Every scan, comparison, or sorting suggestion burns tokens, and therefore money (see the sketch after this table). | GenAI's token-based pricing poses a serious challenge when planning AI implementation budgets, as the “context window tax” and hidden usage patterns quickly increase service costs. |
| Compute & infrastructure | GPU/CPU for inference, vector stores, indexing, and storage of code embeddings and security logs. | AI engines that analyse full repositories, SBOMs, and runtime telemetry are compute-heavy by design. | IBM’s compute report expects the average cost of compute to climb 89% between 2023 and 2025, with generative AI identified as a core driver and enough pain that some organisations are cancelling or delaying AI projects entirely. |
| Hallucinations and rework (“AI clean-up tax”) | Security analysts have to double-check AI triage, re-run scans, and correct remediation advice and false correlations. | In AppSec, a “confident but wrong” suggestion is more than annoying: it means wasted time, missed issues, and mistrust of the tool. | An “AI workslop” study shows low-quality AI output clogging workflows and costing companies millions in wasted effort; some studies estimate no measurable ROI for the majority of organizations using GenAI today. |
| Experimentation & iteration | Trying different models, prompt designs, thresholds, and UX patterns for developers and AppSec teams, then re-wiring pipelines. | AI in AppSec is not plug-and-play. Getting an acceptable signal-to-noise ratio often means multiple iterations across tools, policies, and teams. | IBM notes that rising compute and experimentation costs are already pushing organisations to pause GenAI initiatives; FinOps communities stress the need for dedicated cost controls just for AI workloads. |
| Integration & data plumbing | Connecting code repos, CI/CD, SBOMs, CMDB, ticketing, and runtime telemetry into something a model can actually reason over. | Without rich context, AI in AppSec either over-flags or under-prioritises. That context lives in systems that don’t always talk to each other. | FinOps and AI cost-overview papers repeatedly point to poor data and integration maturity as the top reason PoCs/PoVs fail to scale or deliver value, regardless of model quality. |
| People & change (the PwC angle) | Training security and dev teams, redesigning roles, building trust, handling scepticism and fear of automation. | If engineers don’t trust AI triage or remediation, they route around it. The tool technically “exists”, but adoption is shallow. | PwC stresses that AI value is fundamentally human-led and tech-powered: organizations see returns only when people understand, trust, and actively use AI in their day-to-day work. |
| Governance, risk & compliance | Policies for AI usage, audit logs, model explainability, bias checks, sector-specific rules (finance, government, healthcare), external audits. | In AppSec, AI systems influence risk registers and remediation decisions, so regulators and auditors will eventually ask “why did the model say that?” | FinOps-for-AI guidance and responsible-AI frameworks highlight governance and audit as a non-optional cost layer: bias testing, logging, retention, and third-party tools all add to the bill. |
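To make the usage-based pricing row concrete, here is a minimal back-of-the-envelope sketch in Python. Every number in it (per-token prices, token counts per finding, scan volumes) is an assumption for illustration, not any vendor's actual pricing.

```python
# Rough monthly LLM spend for AI-assisted triage of scan findings.
# All prices and volumes below are assumptions for illustration only.

PRICE_PER_1K_INPUT_TOKENS = 0.003   # USD, assumed
PRICE_PER_1K_OUTPUT_TOKENS = 0.015  # USD, assumed

def monthly_cost(scans_per_day: int,
                 findings_per_scan: int,
                 input_tokens_per_finding: int = 2_000,
                 output_tokens_per_finding: int = 400,
                 workdays: int = 22) -> float:
    """Estimate the monthly cost of sending every finding to an LLM for triage."""
    findings = scans_per_day * findings_per_scan * workdays
    input_cost = findings * input_tokens_per_finding / 1_000 * PRICE_PER_1K_INPUT_TOKENS
    output_cost = findings * output_tokens_per_finding / 1_000 * PRICE_PER_1K_OUTPUT_TOKENS
    return input_cost + output_cost

# 40 CI scans a day with 50 findings each comes out to roughly $530 a month,
# before retries, regenerations, or any "explain this fix" follow-up prompts.
print(f"${monthly_cost(scans_per_day=40, findings_per_scan=50):,.2f} per month")
```

Multiply that across teams, branches, and re-scans after every commit, and the “context window tax” from the table stops being an abstraction.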
The total cost of ownership often diverges from initial projections, particularly as models require retraining or infrastructure scaling.
The global industry restructuring of 2025 has already led to more than 100,000 layoffs, with predictable budget cuts and a push to rethink current company processes, automation above all. Certainly, a large number of processes can and should be automated, but this obsession with AI implementation risks a crisis of the kind the dot-com bubble once produced.
In this context, massive investments in AI, data centers, and process restructuring risk complicating operations rather than improving them, potentially undermining the very productivity gains many companies are counting on.
Fortunately, not every security-related issue requires an LLM. In many cases, mature automation or a review of the workflow can achieve similar (if not more sustainable) results at a much lower cost.
In the most successful application security implementations, AI is viewed more as a targeted tool for solving problems than as a universal, let alone ubiquitous, solution.
Any strategic planning requires a certain sequence of steps. Clear criteria for dividing work between automation and human oversight have become common practice in successful implementations: AI improves speed and reach, while expert analysts interpret, validate, and make decisions based on its output.
By maintaining this balance between AI and human expertise, integration stands a far better chance of success.
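As an illustration of that division of labour, here is a minimal sketch of a triage router in which the model only narrows the queue and a human still owns every consequential decision. The thresholds, statuses, and data structure are hypothetical, not any particular product's API.

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    AUTO_DISMISS = "auto_dismiss"    # model is very confident it is a false positive
    NEEDS_ANALYST = "needs_analyst"  # the default: a human looks at it
    ESCALATE = "escalate"            # likely exploitable, jumps the analyst queue

@dataclass
class Finding:
    rule_id: str
    model_fp_probability: float      # model's estimate that this is a false positive
    reachable_from_internet: bool

def route(finding: Finding) -> Verdict:
    """AI narrows and orders the queue; analysts make the final call."""
    if finding.model_fp_probability > 0.98 and not finding.reachable_from_internet:
        # Even auto-dismissed findings should be sampled and audited by analysts.
        return Verdict.AUTO_DISMISS
    if finding.reachable_from_internet and finding.model_fp_probability < 0.2:
        return Verdict.ESCALATE
    return Verdict.NEEDS_ANALYST

print(route(Finding("sqli-001", model_fp_probability=0.05, reachable_from_internet=True)))
```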
The results of using AI to ensure application security are mixed. In some areas, tangible improvements are observed. In others, the promise is less clear and reflects only the results of localized experiments.
The table below summarizes data from independent research (not vendor marketing).
| Use case | What studies found | Caveats |
|---|---|---|
| Vulnerability prioritization & prediction | A 2025 study on CVSS-based vulnerability prioritization shows that estimated remediation work hours are reduced by up to 8% compared to traditional CVSS v2.0, indicating more efficient patching when ML is used to re-rank vulnerabilities. A separate 2025 paper on automated vulnerability prioritization reported that ML-based ranking outperformed static scoring in accuracy, reduced false positives, and accelerated SOC response time in an experimental setup. Systematic work on software vulnerability prediction likewise concludes that prediction models can help engineers focus review and testing effort on the components most likely to be vulnerable (see the sketch after this table). | Most studies are evaluated on curated datasets. Several mapping and benchmarking studies highlight issues with dataset bias, lack of context (e.g. deployment criticality), and challenges in reproducing results across projects. This supports the idea that ML can improve prioritisation, but not yet as a universally “solved” problem. |
| Developer-integrated security feedback | In the “Developers Deserve Security Warnings, Too” study (SOUPS 2018), researchers patched a Python crypto library to provide integrated security advice directly in the API. In a controlled experiment with 53 developers, participants whose insecure code triggered those warnings were 15× more likely to turn that code into a secure solution than those using the unmodified library. Earlier work on just-in-time vulnerability prediction and IDE-integrated feedback also indicates that surfacing security issues close to the point of coding helps focus developer attention on risky changes. | Most of this research concerns the design of feedback and warnings. It shows that integrated, context-aware security guidance can significantly improve code security, but the “AI” part (i.e. which model generates that guidance) is less studied than the UX and integration aspects. |
| AI-based vulnerability detection | A systematic review of AI-based software vulnerability detection from 2018 to 2023 concludes that AI/ML approaches can match or exceed traditional static analysis techniques on several benchmarks, and that research interest in AI-driven detection has grown sharply. Empirical studies on just-in-time vulnerability prediction show that ML models can flag commits with a higher likelihood of containing vulnerabilities, allowing teams to focus review effort. | Models trained on one project or dataset often degrade on others. Benchmarks frequently use open-source code and labelled datasets that do not represent the full complexity of enterprise AppSec pipelines. |
| Supply-chain / dependency risk | Outside of software security specifically, multiple studies across logistics and manufacturing show that AI and ML improve supply-chain risk prediction and resilience, enabling earlier detection of disruptions and more targeted mitigation strategies. These results support the general claim that complex, graph-like risk surfaces benefit from predictive models. | There is limited independent research that evaluates AI specifically for software supply-chain or dependency risk in AppSec terms (e.g. SBOM-driven, CVE-driven AppSec pipelines). AI helps other types of supply chains and risk graphs, so it is plausible it can help here, but hard outcome data for AppSec use cases is still sparse. |
| Overall impact and maturity | A recent systematic review of vulnerability detection and prediction studies notes a clear trend toward AI methods and a rich ecosystem of models and feature sets. Empirical work continues to show that ML-based approaches can outperform or complement traditional techniques on specific tasks and datasets. | The same body of work consistently warns about generalization beyond benchmark datasets, reproducibility and data availability, and the need for explainable methods in security-critical contexts. Overall, research supports targeted application of AI in AppSec rather than blanket claims of “proven” value in every domain. |
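To make the prioritization row concrete, here is a minimal sketch of ML-assisted re-ranking, assuming scikit-learn is available and that historical findings are labelled by whether they actually turned out to be urgent. It illustrates the general idea, not the method of any specific paper cited above; the features and data are invented.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Invented features per finding: [cvss_base, public_exploit, internet_facing, asset_criticality]
X_train = np.array([
    [9.8, 1, 1, 3],
    [7.5, 0, 0, 1],
    [5.3, 1, 1, 2],
    [9.1, 0, 0, 1],
])
# Label: did this finding actually turn out to be urgent (exploited / fixed as P1)?
y_train = np.array([1, 0, 1, 0])

model = GradientBoostingClassifier().fit(X_train, y_train)

backlog = np.array([
    [8.8, 0, 0, 1],   # high CVSS, but no public exploit and not exposed
    [6.1, 1, 1, 3],   # medium CVSS, exploited, internet-facing, critical asset
])
risk = model.predict_proba(backlog)[:, 1]
order = np.argsort(-risk)   # highest predicted risk first, not highest CVSS first
print(order, risk.round(2))
```

The point of this toy example is the ordering: the medium-CVSS but exploited, exposed finding outranks the “scarier” CVSS number, which is the behaviour the cited studies measure.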
From a security analytics standpoint, it is fair and honest to say that there is growing support for certain AI use cases in AppSec, but it is not (yet) accurate to describe these areas as universally “proven” or mature. The research backs targeted, use-case-driven adoption, not “AI everywhere by default”.
Algorithms still learn patterns, and security specialists still interpret them. The combination of automation with expert human knowledge and experience remains key.
No model replicates the contextual reasoning of experienced analysts: business impact assessment, regulatory requirements, and the subtleties of human error remain inaccessible to AI. AI is a tool, not a panacea for all vulnerabilities.
Establishing processes with a manual human review step ensures interpretability, adherence to ethical boundaries, and accountability. In application security, where a false negative can amount to a breach, such oversight is vital.
One of the most sustainable and practical uses of AI in AppSec is trained AI agents, which rely on local and autonomous deployment, strict data control, and workflow orchestration.
DerCodeFix provides users with recommendations for remediating code vulnerabilities using a model specifically trained in secure coding practices and integrated into the development environment, ensuring that code never leaves your company's infrastructure.
Similarly, DerTriage locally prioritizes detected vulnerabilities by comparing them with internal risk models and context, rather than relying on open cloud APIs.
This approach circumvents key challenges related to AI implementation costs, management, token use, and the risk of data leakage, as has already happened with Anthropic.
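The underlying deployment pattern is easy to sketch. The example below is generic and hypothetical: it shows a client pointed at a self-hosted, OpenAI-compatible inference endpoint inside the company network, so prompts containing code never cross the perimeter. The endpoint URL and model name are placeholders, not DerScanner's actual interface.

```python
# Generic pattern: route inference to a model served inside your own network
# (e.g. behind vLLM or another OpenAI-compatible server) instead of a public cloud API.
# base_url and model are placeholders, not a real product endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm.appsec.internal.example/v1",  # self-hosted inference server
    api_key="internal-only",                            # no external provider involved
)

response = client.chat.completions.create(
    model="secure-coding-assistant",  # placeholder for a locally deployed, fine-tuned model
    messages=[
        {"role": "system", "content": "Suggest a fix for the static-analysis finding."},
        {"role": "user", "content": "SQL query built by string concatenation in OrderRepository.java"},
    ],
)
print(response.choices[0].message.content)
```

Whatever serves the model behind that URL, the cost and data-control picture changes: tokens are no longer metered by a third party, and the code in the prompt never leaves the infrastructure you already audit.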
Artificial intelligence has changed the modern understanding of application security processes, but its widespread adoption shouldn't call the field's fundamental principles into question. The challenge isn't to implement AI everywhere, but to use it for its intended (and proven) purposes.
The most tangible benefits accrue to the companies that take a pragmatic approach to AI: deploying it where it clearly enhances security, and avoiding it where it merely meets market expectations.