PROMPT INJECTION DEFENSE

SSL for AI agents.
One call. Sub-second.

AgentShield scans web content for prompt injection, hidden instructions, and tool-abuse patterns before your agent acts on it. 85% of attacks succeed against unprotected agents — yours doesn't have to.

See pricing
$ npm i @nuketk1809/agentshield · 3.5 kB · zero deps · TS-native
LIVE PLAYGROUND

Try it on real attacks.

Pick a preset or paste your own. Detection runs against the same rule engine as production.

CAPABILITIES

Built for the agents you're shipping today, not the chatbots from 2023.

Sub-500ms detection

Rule-based regex catches the top 25 injection patterns before your model wastes a single token.
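A rule-based pass like this can be sketched in a few lines. The rule IDs below are taken from the taxonomy on this page, but the patterns and rule shapes are illustrative assumptions, not AgentShield's actual rule set:

```typescript
// Illustrative sketch only — patterns are assumptions, not the production rules.
const RULES: { id: string; pattern: RegExp }[] = [
  // Classic override phrasing: "ignore all previous instructions"
  { id: "instruction_override", pattern: /ignore (all |any )?(previous|prior) instructions/i },
  // Attempts to reassign the model's role mid-conversation
  { id: "role_hijacking", pattern: /you are now (a|an|the) /i },
  // Zero-width characters used to hide instructions from human reviewers
  { id: "invisible_text", pattern: /[\u200b\u200c\u200d\u2060]/ },
];

// Return the IDs of every rule that fires on the content.
function detect(content: string): string[] {
  return RULES.filter((r) => r.pattern.test(content)).map((r) => r.id);
}
```

Because the rules are plain regexes, a scan is a fixed number of string passes — no model call, no token spend.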

25 threat classes

Role hijacking, invisible text, encoding tricks, tool abuse — covered, scored, deduplicated.

One HTTP call

POST /api/scan with your API key. Works from any language — Node, Python, Go, cURL. No SDK required.
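The whole integration is one request. The `/api/scan` path and `x-api-key` header come from the docs above; the base URL here is a placeholder and the `{ content }` body shape is an assumption:

```typescript
// Sketch of the request shape — base URL is a placeholder,
// body shape is an assumption.
type ScanRequest = {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
};

function buildScanRequest(apiKey: string, content: string): ScanRequest {
  return {
    url: "https://example.com/api/scan", // placeholder host
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey,
    },
    body: JSON.stringify({ content }),
  };
}

// Usage from any fetch-capable runtime:
// const { url, ...init } = buildScanRequest(process.env.AS_KEY!, html);
// const verdict = await fetch(url, init).then((r) => r.json());
```

Anything that can send JSON over HTTPS can call it, which is why no SDK is required.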

URL fetch built-in

Pass a URL, we fetch and scan rendered content — including the parts an LLM would actually consume.

Deterministic scores

Same input, same score. No probabilistic drift. Auditable, reproducible, version-pinned.

Zero data retention

Content is hashed, never stored. Your scanned data never leaves the request lifecycle.
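Hash-only handling means only a digest could ever be logged. A minimal sketch, assuming SHA-256 (the actual hash scheme isn't documented here):

```typescript
// Sketch of hash-only handling — SHA-256 is an assumption.
import { createHash } from "node:crypto";

function fingerprint(content: string): string {
  // Only this digest survives the request; the raw content is
  // discarded when the request lifecycle ends.
  return createHash("sha256").update(content, "utf8").digest("hex");
}
```

The digest is enough for audit trails and duplicate detection, but the original content can't be recovered from it.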

INTEGRATION

One call. That's the deal.

Drop shield.scan() before any action that consumes external content. We return a score, you set a threshold, your agent stays safe.

Works with any language — Node, Python, Go, cURL
Synchronous response for low-latency tool gates
URL or raw content scanning in one endpoint
import { AgentShield } from "@nuketk1809/agentshield";

const shield = new AgentShield(process.env.AS_KEY!); // API key from env

// Before your agent acts on web content:
const result = await shield.scan(url);

if (result.score < 80) {
  agent.refuse({
    reason: result.verdict,
    threats: result.threats,
  });
  return;
}

// Safe to proceed
agent.run(url);
THREAT TAXONOMY

25 attack patterns. All scored.

A sample of what we detect. Every threat type is documented with example payloads and severity scoring.

HIGH · invisible_text · -30 pts per detection
HIGH · role_hijacking · -30 pts per detection
HIGH · instruction_override · -30 pts per detection
HIGH · data_exfiltration · -30 pts per detection
MEDIUM · encoding_tricks · -15 pts per detection
HIGH · agent_tool_abuse · -30 pts per detection
MEDIUM · markdown_injection · -15 pts per detection
MEDIUM · persona_injection · -15 pts per detection
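How the per-detection penalties combine into a score can be sketched as follows — assuming a model where a clean page starts at 100, each detected class's penalty is subtracted once (deduplicated), and the result is clamped at 0. The exact aggregation is an assumption; the penalty values mirror the taxonomy above:

```typescript
// Assumed scoring model: start at 100, subtract each detected
// class's penalty once, clamp at 0. Penalties mirror the table above.
const PENALTIES: Record<string, number> = {
  invisible_text: 30,
  role_hijacking: 30,
  instruction_override: 30,
  data_exfiltration: 30,
  agent_tool_abuse: 30,
  encoding_tricks: 15,
  markdown_injection: 15,
  persona_injection: 15,
};

function score(detections: string[]): number {
  const unique = Array.from(new Set(detections)); // dedupe repeated hits
  const total = unique.reduce((s, id) => s - (PENALTIES[id] ?? 0), 100);
  return Math.max(0, total); // floor at 0
}
```

So one HIGH hit plus one MEDIUM hit scores 100 - 30 - 15 = 55, which would fall below the example threshold of 80 in the integration snippet above.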
START FREE

1,000 scans a month.
No credit card.

Sign up, get an API key, and ship safer agents in the next ten minutes.