AI Security Resources
This page distills the highlights.
For the full, continuously updated list, see jassics/awesome-genai-security on GitHub - a curated list of links, books, videos, tools, CTFs, and incidents covering GenAI, LLM, RAG, MCP, Agents, and Agentic AI security.
Use this page as the "go deeper" reference for the whole AI Security section - organized so you can jump straight to standards, papers, tools, or hands-on practice depending on what you need right now.
Standards & Frameworks
| Framework | What It Covers |
|---|---|
| OWASP GenAI Security Project | The umbrella project behind the LLM Top 10, Agentic Top 10, and related guides |
| OWASP Top 10 for LLM Applications | The canonical LLM risk taxonomy - see LLM Security for a full deep dive |
| OWASP Top 10 for Agentic Applications | Risk taxonomy specific to autonomous, tool-using agents - see Agentic AI & Agent Security |
| OWASP LLM AI Security and Governance Checklist | Practical checklist bridging security and governance teams |
| MITRE ATLAS | Adversarial Threat Landscape for AI Systems - real-world adversary tactics/techniques against ML systems, ATT&CK-style |
| NIST AI Risk Management Framework (AI RMF) + Playbook | US voluntary framework for identifying and managing AI risk across the lifecycle |
| NIST AI 600-1: Generative AI Profile | GenAI-specific companion profile to the AI RMF |
| Google Secure AI Framework (SAIF) | Google's conceptual framework for securing AI systems |
| ISO/IEC 42001:2023 | AI Management System standard - the "ISO 27001 for AI governance" |
| Databricks AI Security Framework (DASF) | Practical controls mapped to AI system components and risks |
| CSA MAESTRO | Seven-layer threat modeling framework specifically for agentic AI |
Key Papers (arXiv & Research)
| Paper | Relevance |
|---|---|
| Attention Is All You Need (Vaswani et al., 2017) | The transformer architecture paper underlying every modern LLM - see Preliminary AI/ML Concepts |
| Extracting Training Data from Large Language Models (Carlini et al.) | Foundational paper on training-data extraction attacks |
| Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection (Greshake et al., 2023) | Early, influential paper on indirect prompt injection against real deployed systems |
| Prompt Injection Attacks and Defenses in LLM-Integrated Applications | Systematic treatment of injection attack/defense patterns |
| Red-Teaming LLM Multi-Agent Systems via Communication Attacks | Attacks that exploit inter-agent communication in multi-agent pipelines - see Agentic AI & Agent Security |
| Adaptive Red-Teaming Agent Against Multimodal Models | Automated/AI-driven red teaming approach - see AI Red Teaming |
| Agentic JWT: A Secure Delegation Protocol for Autonomous AI Agents (Goswami, 2025) | Proposes cryptographically binding agent actions to a verified user intent + workflow step, addressing how OAuth 2.0's deterministic-client assumptions break down for autonomous, prompt-driven agents |
| Google DeepMind: Evaluating Frontier Models for Dangerous Capabilities | Methodology for evaluating frontier-model risk before release |
Books
| Book | Author |
|---|---|
| The Developer's Playbook for Large Language Model Security | Steve Wilson |
| Not with a Bug, But with a Sticker | Ram Shankar Siva Kumar & Hyrum Anderson |
| Generative AI Security | Ken Huang & Yang Wang |
| Adversarial AI: Attacks, Mitigations, and Defense Strategies | John Sotiropoulos |
| Red Teaming AI: A Field Manual for Attacking Intelligent Systems | Philip A. Dursey (No Starch, early access) |
Videos & Talks
- Intro to LLM Security - WhyLabs
- OWASP Top 10 for LLM Applications Explained - OWASP
- Hacking LLMs and Prompt Injection - LiveOverflow
- AI Red Teaming - DEF CON AI Village
- Securing LLM Applications - SANS Institute
Tools
Defensive / Scanning
| Tool | Purpose |
|---|---|
| LLM Guard | Scans and sanitizes LLM inputs and outputs (prompt injection, data leakage, harmful content) |
| ModelScan | Scans models for serialization attacks |
| Rebuff | Prompt injection detection |
| Cisco MCP Scanner | Scans MCP servers/tools for poisoning and prompt injection |
| Snyk Agent Scan | Inventories and scans AI agents, MCP servers, and skills for 15+ risk types |
| Fickling (Trail of Bits) | Decompiler/analyzer/safety-scanner for malicious Pickle/PyTorch model files |
| ModelAudit | Static scanner for malicious code/backdoors across 40+ model file formats |
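
To make "serialization attack" concrete: a pickle file is a small program, and a few of its opcodes can import and call arbitrary Python objects when the file is loaded. The sketch below is a simplified illustration of static opcode inspection using only the standard library. It is not how ModelScan, Fickling, or ModelAudit are implemented, and the names `SUSPICIOUS_OPCODES`, `find_suspicious_opcodes`, and `Malicious` are hypothetical, chosen for this example. A match is only a signal that the file deserves a closer look, since many legitimate pickles (including real model checkpoints) also use these opcodes.

```python
import pickle
import pickletools

# Opcodes that can import or call arbitrary objects during unpickling.
# This set is an assumption for illustration, not an exhaustive blocklist.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def find_suspicious_opcodes(data: bytes) -> list[tuple[int, str, object]]:
    """Return (byte offset, opcode name, argument) for each risky opcode.

    The bytes are only disassembled with pickletools.genops; they are
    never unpickled, so no embedded code runs.
    """
    findings = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS_OPCODES:
            findings.append((pos, opcode.name, arg))
    return findings


class Malicious:
    """Hypothetical payload: unpickling this object would call print(...)."""

    def __reduce__(self):
        return (print, ("code ran during unpickling",))


# A plain dict of floats needs no imports or calls to reconstruct.
benign = pickle.dumps({"weights": [0.1, 0.2, 0.3]})
# The payload is serialized here but never loaded.
hostile = pickle.dumps(Malicious())

for label, blob in (("benign", benign), ("hostile", hostile)):
    names = sorted({name for _, name, _ in find_suspicious_opcodes(blob)})
    print(label, names)
```

For real model files, prefer the dedicated scanners in the table above; they handle archive formats, nested pickles, and allowlists that this sketch ignores.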
Offensive / Red Teaming
| Tool | Purpose |
|---|---|
| Garak (NVIDIA) | LLM vulnerability scanner |
| PyRIT (Microsoft) | Python Risk Identification Toolkit for GenAI |
| ART - Adversarial Robustness Toolbox (IBM) | Adversarial-example generation and robustness testing |
| promptmap | Prompt injection testing |
| DeepTeam | LLM & AI-agent red teaming - 50+ vulnerability types mapped to OWASP/NIST/MITRE |
| Promptfoo | LLM testing/red-teaming with CI/CD integration |
Guardrails & Firewalls
| Tool | Purpose |
|---|---|
| Guardrails AI | Input/output validation for LLMs |
| NeMo Guardrails (NVIDIA) | Programmable guardrails for LLM applications |
| Vigil | LLM prompt injection detection |
| Trylon Gateway | Self-hosted AI firewall/proxy (prompt-injection defense, PII redaction) |
| Bifrost AI Gateway | High-performance gateway unifying 20+ LLM providers with governance/policy enforcement |
Hands-On Labs & CTFs
| Lab / CTF | Focus |
|---|---|
| Gandalf (Lakera AI) | Prompt-injection challenge, beginner-friendly |
| Prompt Airlines | AI security challenges, CTF style |
| Damn Vulnerable MCP Server | Deliberately vulnerable MCP implementation - see MCP Security |
| Vulnerable MCP Servers Lab | Collection of vulnerable MCP servers |
| FinBot Agentic AI CTF (OWASP) | Agentic-security-focused CTF |
| OWASP WrongSecrets | Includes an LLM/AI secrets-leakage challenge |
| HackAPrompt | Prompt hacking competition |
| Crucible (Dreadnode) | AI/ML security challenges and CTFs |
| AI Goat | Vulnerable LLM CTF built on AWS |
| Microsoft AI Red Teaming Playground Labs | Hands-on red-teaming challenges with Docker/Kubernetes deployment |
| PortSwigger Web Security Academy: Web LLM Attacks | Free official hands-on labs on LLM API exploitation and excessive agency |
| Huntr.com | Bug bounty platform specifically for AI/ML |
Courses & Certifications
| Course / Cert | Provider |
|---|---|
| SANS SEC545: GenAI and LLM Application Security | SANS (maps to GIAC GAIPS) |
| Microsoft AI Red Teaming 101 | Microsoft Learn, free |
| Coursera: Generative AI for Cybersecurity Professionals | IBM |
| Certified AI Security Professional (CAISP) | Practical DevSecOps |
| GIAC AI Platform Security (GAIPS) | GIAC - hands-on CyberLive certification |
| GIAC AI Security Automation Engineer (GASAE) | GIAC |
| ISC2 AI Security Certificate | ISC2 |
Communities & Newsletters
- OWASP GenAI Slack - `#project-top10-for-llm` channel
- AI Village (DEF CON) - AI security research community
- MLSecOps Community
- Embrace the Red - Johann Rehberger's AI security research blog
- LLM Security - curated LLM security news and research
- Protect AI Blog (now part of Palo Alto Networks)
This Site's Own GenAI Security Toolkit
jassics/awesome-claude-security is a Claude Code plugin marketplace covering the full security lifecycle, with dedicated GenAI-security plugins you can install directly into a Claude Code session:
- `llm-security` - OWASP LLM Top 10, prompt injection testing
- `rag-security` - retrieval poisoning and isolation checks
- `agentic-ai-security` - tool-permission audits, autonomy-boundary review
- `multimodal-security` - cross-modal injection
- `mlops-security` - ML supply chain and pipeline security
- `ai-safety` - harm modeling, safety evals, responsible red-teaming (a distinct discipline from AI security - see the taxonomy note)
Install with `/plugin marketplace add jassics/awesome-claude-security` inside a Claude Code session, then `/plugin install llm-security@awesome-claude-security` (or any of the plugins above).
Where to Go Next on This Site
- Start from zero: Preliminary AI/ML Concepts
- Core attack surface: LLM Security, RAG Security, MCP Security, Agentic AI & Agent Security
- Practice: AI Red Teaming
- Learn from the past: Real-World AI Security Incidents
- Program-level: AI Threat Modeling, AI Data Security, AI Model Security, AI Governance
Credits/References
- jassics/awesome-genai-security - the full, continuously updated source this page distills
- jassics/awesome-claude-security - Claude Code plugin marketplace for security work, including GenAI security
- OWASP GenAI Security Project
- MITRE ATLAS