---
title: "Token Torching: How I’d burn your AI budget (so you can fix it)"
page_name: "Token Torching: How I’d burn your AI budget (so you can fix it)"
type: "blog"
slug: "token-torching-ai-attack"
published_at: "2026-01-15"
modified_at: "2026-01-15"
url: "https://www.sumologic.com/blog/token-torching-ai-attack"
canonical: "https://www.sumologic.com/blog/token-torching-ai-attack"
markdown_url: "https://www.sumologic.com/blog/token-torching-ai-attack.md"
lang: "en"
excerpt: "Learn more about Token Torching, a new MCP security risk where attackers exploit valid AI requests to amplify costs. Discover how it works and how to prevent it."
taxonomy_blog_category:
  - "AI"
  - "SecOps & Security"
---


# Token Torching: How I’d burn your AI budget (so you can fix it)

David Girvin

January 15, 2026


I spend most of my time thinking like a criminal.

Not because I’m edgy, but because that’s literally the job. And lately, everywhere I look, I see the same thing:

People are exposing MCP endpoints like they’re REST APIs, and forgetting they’re actually money execution engines.

So let’s talk about Token Torching. Yes, I invented another name.

This isn’t data theft. It’s not taking your service down.

It’s quietly, methodically, and legitimately making your AI system cost so much that someone disables it.

This is the kind of attack no one models because:

- nothing crashes,
- nothing looks “malicious,”
- and the requests are all technically valid.

Which is exactly why it works.

## **MCP changed the threat model (and most teams missed it)**

Traditional abuse assumes I want your data, your uptime, or your credentials.

With MCP, I don’t need any of that.

I just need you to:

1. accept an external request,
2. do “helpful AI things,”
3. and pay for them.

That’s it.

Every MCP-enabled system is now:

- externally triggerable,
- internally paid,
- and often cost-amplifying by design.

If you don’t believe me, keep reading.

## **Two ways Token Torching shows up in the real world**

I’ve seen both patterns. Neither is hypothetical.

### **Pattern A: “We pay for the model.”**

This one’s obvious.

An external request flows through your MCP, your LLM, then your bill.

If I can trigger:

- long reasoning,
- multi-step planning,
- retries,
- retrieval,
- tool calls,

I don’t need volume. I need **complexity**.
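
Here's a rough sketch of that math. Every number in it (prices, token counts, step counts) is a placeholder I made up for illustration, not any vendor's actual pricing, and `RequestShape` and `estimate_llm_cost` are hypothetical names. Swap in your own rates and traces.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per token. Substitute your provider's real rates.
PRICE_PER_INPUT_TOKEN = 3.00 / 1_000_000
PRICE_PER_OUTPUT_TOKEN = 15.00 / 1_000_000


@dataclass
class RequestShape:
    """Rough shape of one request once the agent starts working on it."""
    agent_steps: int             # LLM calls made while planning and acting
    input_tokens_per_step: int   # prompt + retrieved context + tool output
    output_tokens_per_step: int  # reasoning + response
    retried_steps: int = 0       # steps re-run after a failure


def estimate_llm_cost(shape: RequestShape) -> float:
    """Estimate model spend in USD for a single request."""
    steps = shape.agent_steps + shape.retried_steps
    return steps * (
        shape.input_tokens_per_step * PRICE_PER_INPUT_TOKEN
        + shape.output_tokens_per_step * PRICE_PER_OUTPUT_TOKEN
    )


simple = RequestShape(agent_steps=1, input_tokens_per_step=500,
                      output_tokens_per_step=200)
heavy = RequestShape(agent_steps=12, input_tokens_per_step=8_000,
                     output_tokens_per_step=1_500, retried_steps=3)

ratio = estimate_llm_cost(heavy) / estimate_llm_cost(simple)
print(f"simple: ${estimate_llm_cost(simple):.4f}")
print(f"heavy:  ${estimate_llm_cost(heavy):.4f}")
print(f"one heavy request costs about {ratio:.0f} simple ones")
```

With these placeholder numbers, the heavy request costs roughly 155 times the simple one. Same endpoint, same auth, both valid.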

### **Pattern B: “Bring your own key (but we still pay).”**

This is where teams get smug and wrong.

Sure, the caller brings their own model key.

But you still pay for:

- embeddings,
- vector search,
- reranking,
- orchestration,
- downstream SaaS APIs,
- retries,
- workflow execution.

Congrats. I outsourced 20% of the bill and kept the other 80%.

## **How an attacker thinks about this (at a high level)**

No step-by-step exploitation. Just mindset.

When I look at a public or semi-public MCP surface, I ask:

### **“Where does cost amplify?”**

- One request that leads to many agent steps
- One request that triggers many tool calls
- One request that leads to a large retrieval scope

### **“What retries automatically?”**

- Tool failures
- Schema mismatches
- Partial successes
- Timeouts

Retries are just *polite token burners*.

### **“What looks reasonable but is worst-case?”**

- Broad semantic queries
- High top-k retrieval
- Large structured outputs
- Inputs that sit right on validation boundaries

Nothing illegal. Nothing malformed. Just expensive.

### **“What keeps running if I walk away?”**

Streaming responses.

Background tasks.

Async workflows.

If generation continues after disconnect, that’s not resilience — that’s a billing leak.

## **How to test your own system like an adult**

If you run MCP in production, your security team should explicitly test the following.

### **1. Cost-per-request testing**

Pick a single identity and ask:

- What’s the maximum cost of one valid request?
- How many tokens?
- How many tool calls?
- How many retries?

If you don’t know, that’s already a finding.
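
If you have nothing that answers those questions today, a per-request meter is a reasonable place to start. This is a minimal sketch, not a real metering library: `RequestMeter`, its fields, and the costs in the simulated run are all assumptions. In practice you'd call the `record_*` methods from wherever your agent loop invokes the model and tools.

```python
from dataclasses import dataclass


@dataclass
class RequestMeter:
    """Running totals for one request. Field names are illustrative."""
    identity: str
    tokens: int = 0
    tool_calls: int = 0
    retries: int = 0
    cost_micro_usd: int = 0  # integer micro-dollars avoid float drift

    def record_llm_call(self, tokens: int, cost_micro_usd: int) -> None:
        self.tokens += tokens
        self.cost_micro_usd += cost_micro_usd

    def record_tool_call(self, cost_micro_usd: int, is_retry: bool = False) -> None:
        self.tool_calls += 1
        if is_retry:
            self.retries += 1
        self.cost_micro_usd += cost_micro_usd


def worst_case(meters):
    """Return the most expensive request seen in a test run."""
    return max(meters, key=lambda m: m.cost_micro_usd)


# Simulated test run: one ordinary request, one valid but heavy request.
ordinary = RequestMeter(identity="test-user")
ordinary.record_llm_call(tokens=700, cost_micro_usd=4_500)

heavy = RequestMeter(identity="test-user")
for _ in range(12):
    heavy.record_llm_call(tokens=9_500, cost_micro_usd=46_500)
    heavy.record_tool_call(cost_micro_usd=2_000)
heavy.record_tool_call(cost_micro_usd=2_000, is_retry=True)

worst = worst_case([ordinary, heavy])
print(worst)
```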

### **2. Complexity skew testing**

Compare a “normal” user request vs a valid but pathological one.

If the cost delta is 10x, 50x, or “uhhh wow” — congratulations, you found your torch.

### **3. Retry abuse testing**

Intentionally induce near-schema failures, slow tools, and partial tool errors.

Then watch how many retries fire, how much they cost, and whether there’s a hard stop.

Hope is not a control.
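
A hard stop can be as small as this. It's a sketch under assumptions: `call_with_hard_stop` is a hypothetical helper, costs are a flat integer number of micro-dollars per attempt, and there's no backoff. The point is that retries are bounded by attempts and by spend, whichever runs out first.

```python
class RetryLimitReached(Exception):
    """Raised when retries hit the attempt cap or the spend cap."""


def call_with_hard_stop(tool, *, max_attempts=3, cost_per_attempt=10_000,
                        max_spend=25_000):
    """Call tool() with retries, bounded by attempts AND by spend.

    Costs are integer micro-dollars. cost_per_attempt is an assumed flat
    estimate; a real system would meter each attempt.
    Returns (result, attempts_used, spend).
    """
    spend = 0
    last_error = None
    for attempt in range(1, max_attempts + 1):
        if spend + cost_per_attempt > max_spend:
            break  # the next attempt would blow the budget
        spend += cost_per_attempt
        try:
            return tool(), attempt, spend
        except Exception as exc:  # demo only; catch specific errors in real code
            last_error = exc
    raise RetryLimitReached(f"gave up after spending {spend} micro-USD") from last_error


attempts_seen = []


def always_times_out():
    attempts_seen.append(1)
    raise TimeoutError("slow tool")


try:
    call_with_hard_stop(always_times_out)
except RetryLimitReached as err:
    print(f"{err} ({len(attempts_seen)} attempts)")
```

With the defaults above, the spend cap stops the loop before the attempt cap does: the third attempt would cost more than the budget allows, so it never fires.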

### **4. Retrieval blast radius testing**

Test:

- max top-k
- cross-namespace queries
- ambiguous semantic searches

If one request fans out across half your vector store, that’s not “powerful AI.” That’s an unbounded cost surface.
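
One way to bound that surface is to clamp retrieval parameters before the query ever reaches the vector store. A minimal sketch, assuming a hypothetical `RetrievalRequest` shape and made-up limits:

```python
from dataclasses import dataclass

MAX_TOP_K = 20        # hypothetical ceiling; tune to your index
MAX_NAMESPACES = 1    # no cross-namespace fan-out for untrusted callers


@dataclass(frozen=True)
class RetrievalRequest:
    query: str
    top_k: int
    namespaces: tuple


def clamp_retrieval(req, allowed_namespaces):
    """Shrink a retrieval request to a bounded scope before it runs."""
    permitted = tuple(ns for ns in req.namespaces if ns in allowed_namespaces)
    if not permitted:
        raise PermissionError("caller has no access to the requested namespaces")
    return RetrievalRequest(
        query=req.query,
        top_k=max(1, min(req.top_k, MAX_TOP_K)),
        namespaces=permitted[:MAX_NAMESPACES],
    )


greedy = RetrievalRequest(query="everything about everything", top_k=5_000,
                          namespaces=("docs", "tickets", "hr", "finance"))
bounded = clamp_retrieval(greedy, allowed_namespaces={"docs", "tickets"})
print(bounded)
```

Whether you clamp silently or reject outright is a design choice. Clamping keeps honest callers working; rejecting makes the limit visible and gives you a cleaner signal to alert on.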

### **5. Disconnect behavior**

Start a request.

Disconnect early.

Watch billing.

If the system keeps thinking after you leave, you’re paying for ghosts.
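
Here's what "stop on disconnect" can look like, sketched with plain `asyncio`. The `disconnected` event is an assumption: your server framework has to set it (or give you an equivalent signal) when the client goes away, and `generate` is a stand-in for a real streaming model call.

```python
import asyncio


async def generate(billed_chunks):
    """Stand-in for a streaming model call. Every chunk costs money."""
    while True:
        await asyncio.sleep(0.01)
        billed_chunks.append("chunk")


async def handle_request(disconnected, billed_chunks):
    """Run generation, but stop it the moment the client goes away."""
    generation = asyncio.create_task(generate(billed_chunks))
    gone = asyncio.create_task(disconnected.wait())
    _, pending = await asyncio.wait(
        {generation, gone}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # this is the line that stops the billing
    await asyncio.gather(*pending, return_exceptions=True)


async def demo():
    billed_chunks = []
    disconnected = asyncio.Event()
    # Simulate the client dropping the connection after 50 ms.
    asyncio.get_running_loop().call_later(0.05, disconnected.set)
    await handle_request(disconnected, billed_chunks)
    at_disconnect = len(billed_chunks)
    await asyncio.sleep(0.05)  # give a leaked task time to keep billing
    return at_disconnect, len(billed_chunks)


at_disconnect, after_wait = asyncio.run(demo())
print(f"chunks at disconnect: {at_disconnect}, 50 ms later: {after_wait}")
```

If the two counts printed at the end differ, something kept generating after the client left.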

## **What defenders should be logging (and probably aren’t)**

If I were attacking this quietly, these are the signals I’d try to stay just under.

Which means these are exactly what you should alert on.

- cost per request (not just RPS)
- tool calls per request
- agent step count
- retries per request
- retrieval scope metrics
- spend per identity / key / IP
- endpoints ranked by *cost*, not traffic

If Finance sees this before Security does, you’ve already lost the argument.
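
As a sketch of what that looks like in practice: one structured log line per request, then a ranking by spend instead of request count. The field names and the sample traffic are illustrative assumptions, not a Sumo Logic schema.

```python
import json
from collections import defaultdict


def cost_record(endpoint, identity, cost_usd, tool_calls, agent_steps, retries):
    """One structured log line per request. Field names are illustrative."""
    return json.dumps({
        "endpoint": endpoint, "identity": identity, "cost_usd": cost_usd,
        "tool_calls": tool_calls, "agent_steps": agent_steps, "retries": retries,
    })


def rank_by_cost(log_lines, group_by):
    """Total cost per group (endpoint, identity, ...), most expensive first."""
    totals = defaultdict(float)
    for line in log_lines:
        record = json.loads(line)
        totals[record[group_by]] += record["cost_usd"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# 100 cheap requests to one endpoint, 4 expensive requests to another.
logs = [cost_record("/search", "user-a", 0.01, 1, 1, 0) for _ in range(100)]
logs += [cost_record("/research", "user-b", 0.75, 14, 12, 3) for _ in range(4)]

by_endpoint = rank_by_cost(logs, "endpoint")
by_identity = rank_by_cost(logs, "identity")
print(by_endpoint)
print(by_identity)
```

In this made-up traffic, `/search` gets 25 times the requests, but `/research` costs about three times as much. A dashboard sorted by RPS would put them in the wrong order.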

## **Controls that actually stop Token Torching**

Not vibes. Not “we’ll watch it.”

**Hard budgets**

- per request
- per identity
- per tool
- per tenant

No budget? No execution.
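
A hard budget can be sketched as a ledger that refuses a charge if any scope would go over, and refuses outright when a scope has no limit configured. `BudgetLedger`, the scope names, and the limits here are assumptions for illustration; a production version would need shared storage and time windows.

```python
from collections import defaultdict


class BudgetExceeded(Exception):
    pass


class BudgetLedger:
    """Tracks spend in integer micro-dollars against hard limits per scope."""

    def __init__(self, limits):
        self.limits = limits            # e.g. {"request": 50_000, "tenant": 1_000_000}
        self.spent = defaultdict(int)   # keyed by (scope, name)

    def charge(self, scopes, amount):
        """Charge every scope at once, or charge nothing and raise."""
        for scope, name in scopes.items():
            limit = self.limits.get(scope)
            if limit is None:
                raise BudgetExceeded(f"no budget configured for scope '{scope}'")
            if self.spent[(scope, name)] + amount > limit:
                raise BudgetExceeded(f"{scope} '{name}' would exceed {limit} micro-USD")
        for scope, name in scopes.items():
            self.spent[(scope, name)] += amount


ledger = BudgetLedger({"request": 50_000, "identity": 200_000,
                       "tool": 100_000, "tenant": 1_000_000})
scopes = {"request": "req-1", "identity": "user-42",
          "tool": "web_search", "tenant": "acme"}

ledger.charge(scopes, 30_000)       # within every limit
try:
    ledger.charge(scopes, 30_000)   # request budget is 50_000, so this is refused
except BudgetExceeded as err:
    print("blocked:", err)
```

Checking every scope before committing any of them means a refused charge leaves no partial spend behind.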

**Cheap gates before expensive brains**

Auth, validation, size limits, retrieval caps — *before* the LLM ever wakes up.
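
Roughly, the ordering looks like this. The limits, the key check, and `fake_llm` are placeholders; the only thing the sketch is trying to show is that rejected requests never reach the model.

```python
MAX_PROMPT_CHARS = 4_000       # hypothetical limits; tune for your workload
MAX_TOP_K = 20
VALID_API_KEYS = {"demo-key"}  # stand-in for a real auth check


def gate(request):
    """Cheap checks, cheapest first. Returns a rejection reason or None."""
    if request.get("api_key") not in VALID_API_KEYS:
        return "unauthenticated"
    prompt = request.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        return "invalid prompt"
    if len(prompt) > MAX_PROMPT_CHARS:
        return "prompt too large"
    top_k = request.get("top_k", 1)
    if not isinstance(top_k, int) or not 1 <= top_k <= MAX_TOP_K:
        return "top_k out of range"
    return None


def handle(request, expensive_llm):
    """Only wake the model if every gate passes."""
    reason = gate(request)
    if reason is not None:
        return {"status": "rejected", "reason": reason}
    return {"status": "ok", "answer": expensive_llm(request["prompt"])}


llm_calls = []


def fake_llm(prompt):
    llm_calls.append(prompt)
    return "answer"


oversized = handle({"api_key": "demo-key", "prompt": "x" * 50_000}, fake_llm)
normal = handle({"api_key": "demo-key", "prompt": "summarize my tickets",
                 "top_k": 5}, fake_llm)
print(oversized)
print(normal)
print("model invocations:", len(llm_calls))
```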

**Progressive trust**

Public MCPs should start weak.

Power is earned, not exposed.

**Per-tool quotas**

Some tools should never be callable from untrusted MCP traffic.

That’s not restrictive, that’s sane.
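
Progressive trust and per-tool restrictions can share one deny-by-default table. The tiers and tool names below are invented for the example:

```python
# Hypothetical trust tiers and tool names, purely for illustration.
TOOL_POLICY = {
    "anonymous": {"search_docs"},
    "verified": {"search_docs", "summarize"},
    "internal": {"search_docs", "summarize", "run_workflow", "bulk_export"},
}


def is_tool_allowed(trust_tier, tool):
    """Deny by default: unknown tiers and unlisted tools are refused."""
    return tool in TOOL_POLICY.get(trust_tier, set())


print(is_tool_allowed("anonymous", "bulk_export"))     # False
print(is_tool_allowed("internal", "bulk_export"))      # True
print(is_tool_allowed("mystery-tier", "search_docs"))  # False
```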

**Kill switches**

If you can’t shut off an expensive tool in seconds, you don’t control your system.
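
A kill switch doesn't need to be fancy. This in-process sketch uses hypothetical names (`KillSwitch`, `run_tool`); in a real deployment the flag would live in shared config or a feature-flag service so every worker sees the flip.

```python
import threading


class ToolDisabled(Exception):
    pass


class KillSwitch:
    """In-process kill switch, safe to flip from another thread."""

    def __init__(self):
        self._disabled = set()
        self._lock = threading.Lock()

    def disable(self, tool):
        with self._lock:
            self._disabled.add(tool)

    def enable(self, tool):
        with self._lock:
            self._disabled.discard(tool)

    def is_disabled(self, tool):
        with self._lock:
            return tool in self._disabled


def run_tool(switch, tool, fn):
    """Check the switch on every call, so a flip takes effect on the next one."""
    if switch.is_disabled(tool):
        raise ToolDisabled(f"{tool} is switched off")
    return fn()


switch = KillSwitch()
print(run_tool(switch, "deep_research", lambda: "ran"))

switch.disable("deep_research")
try:
    run_tool(switch, "deep_research", lambda: "ran")
except ToolDisabled as err:
    print("blocked:", err)
```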

**Control Planes**

I wrote a separate blog post about this that's worth reading: [MCP vs MoCoP](https://www.sumologic.com/blog/mcp-vs-mcp2)

## **Final thought**

MCP didn’t just make AI more capable. It made “cost” an attack surface. Talk about security as a business enablement tool.

Token Torching isn’t hypothetical.

If you expose MCP publicly and don’t test for this, you’ve built a very polite way for someone else to light your money on fire.

Curious to see how Sumo Logic protects your AI systems? [Sign up for our 30-day free trial.](https://www.sumologic.com/sign-up/)


David Girvin

Lead Technical Advocate

David Girvin is a Technical Advocate at Sumo Logic, facilitating technical accuracy in the cloud of marketing. Previously, he was an AppSec / offensive security architect for places like 1Password and Red Canary. When not working, David travels to surf destinations to surf and foil.

