
LLM Security Agent

Specialized agent for detecting security vulnerabilities in Large Language Model (LLM) applications and AI systems.

Overview

The LLM Security Agent analyzes applications that integrate Large Language Models (like GPT, Claude, or other AI systems) for security vulnerabilities specific to AI/ML systems. As LLM applications become more prevalent, new security challenges emerge that traditional security tools don’t address.

Agent Type: Code & Application Security
Complexity: Medium
Scan Duration: 10-30 seconds
Best For: AI/ML applications, LLM integrations, chatbots, AI agents

What It Detects

1. Prompt Injection Vulnerabilities

Detects code vulnerable to prompt injection, where malicious user input can manipulate the LLM's behavior.

Example:

# VULNERABLE
user_input = request.json['message']
prompt = f"You are a helpful assistant. User says: {user_input}"
response = llm.complete(prompt)

# SECURE
user_input = sanitize_input(request.json['message'])
prompt = format_prompt(user_input, role="user")
response = llm.complete(prompt)
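
The secure variant assumes sanitize_input and format_prompt helpers, which the agent does not prescribe. A minimal sketch of what they might do, with hypothetical names and limits:

# Hypothetical helpers - a minimal sketch, not a prescribed implementation
import re

MAX_INPUT_LENGTH = 2000

def sanitize_input(text: str) -> str:
    """Trim, length-limit, and strip control characters that can smuggle hidden instructions."""
    text = text.strip()[:MAX_INPUT_LENGTH]
    return re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", text)

def format_prompt(user_input: str, role: str = "user") -> str:
    """Keep system instructions and user content clearly delimited."""
    return (
        "You are a helpful assistant.\n"
        f"<{role}>\n{user_input}\n</{role}>"
    )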

2. Insecure LLM Output Handling

Identifies cases where LLM outputs are used without proper validation.

Example:

# VULNERABLE - executing LLM-generated code
code = llm.generate_code(user_request)
exec(code)  # Dangerous!

# SECURE
code = llm.generate_code(user_request)
if validate_generated_code(code):
    exec(code, safe_globals, safe_locals)
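
The secure variant relies on validate_generated_code plus restricted safe_globals and safe_locals, none of which are defined here. The sketch below shows one possible approach (an AST allowlist); note that a restricted exec() is not a true sandbox, so running generated code in an isolated process is preferable.

# Hypothetical validation sketch - an AST allowlist; not a substitute for process-level sandboxing
import ast

ALLOWED_NODES = (ast.Module, ast.Expr, ast.Call, ast.Name, ast.Load,
                 ast.Constant, ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div)

def validate_generated_code(code: str) -> bool:
    """Reject code containing anything outside a small set of AST node types."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    return all(isinstance(node, ALLOWED_NODES) for node in ast.walk(tree))

safe_globals = {"__builtins__": {"print": print, "len": len}}
safe_locals = {}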

3. API Key Exposure

Finds hardcoded or improperly stored LLM API keys.

Example:

# VULNERABLE
openai.api_key = "sk-proj-abc123..."  # Hardcoded!

# SECURE
openai.api_key = os.getenv("OPENAI_API_KEY")
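
For local development, keys are commonly loaded from a .env file that stays out of version control. One common pattern uses the third-party python-dotenv package (not part of Alprina):

# Loads key=value pairs from a .env file into the environment (python-dotenv package)
import os

import openai
from dotenv import load_dotenv

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")
if not openai.api_key:
    raise RuntimeError("OPENAI_API_KEY is not set")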

4. Excessive LLM Permissions

Detects LLM agents with overly broad permissions or tool access.

Example:

# VULNERABLE - agent can execute any shell command
tools = [ShellTool(), FileTool(), NetworkTool()]

# SECURE - restricted toolset
tools = [ReadOnlyFileTool(), RestrictedShellTool(allowed_commands=["ls", "cat"])]
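
ReadOnlyFileTool and RestrictedShellTool are illustrative names rather than a specific framework's classes. An allowlist-based shell tool could be sketched roughly like this:

# Illustrative allowlist-based shell tool - a sketch, not a specific framework's API
import shlex
import subprocess

class RestrictedShellTool:
    def __init__(self, allowed_commands):
        self.allowed_commands = set(allowed_commands)

    def run(self, command: str) -> str:
        argv = shlex.split(command)
        if not argv or argv[0] not in self.allowed_commands:
            raise PermissionError(f"Command not allowed: {command!r}")
        result = subprocess.run(argv, capture_output=True, text=True, timeout=10)
        return result.stdout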

5. Training Data Poisoning Risks

Identifies potential for training data contamination in fine-tuning pipelines.

Example:

# VULNERABLE - no validation
training_data = fetch_user_submissions()
model.fine_tune(training_data)

# SECURE
training_data = fetch_user_submissions()
validated_data = validate_and_sanitize(training_data)
model.fine_tune(validated_data)
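
validate_and_sanitize is not defined by the agent; the kind of filtering it might perform (deduplication, length limits, dropping records that look like injected instructions) is sketched below with hypothetical record fields and markers:

# Hypothetical data-validation step - typical checks, not a complete defense against poisoning
SUSPICIOUS_MARKERS = ("ignore previous instructions", "system prompt", "you are now")

def validate_and_sanitize(records):
    seen = set()
    cleaned = []
    for record in records:
        text = record.get("text", "").strip()
        if not text or len(text) > 10_000:
            continue  # drop empty or oversized samples
        if any(marker in text.lower() for marker in SUSPICIOUS_MARKERS):
            continue  # drop likely injection attempts
        if text in seen:
            continue  # deduplicate
        seen.add(text)
        cleaned.append({**record, "text": text})
    return cleaned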

6. Model Denial of Service

Detects patterns that allow attackers to trigger expensive LLM calls, such as endpoints with no rate limiting or overly high token limits.

Example:

# VULNERABLE - no rate limiting
@app.route('/ask')
def ask():
    question = request.json['question']
    return llm.complete(question, max_tokens=4000)  # Expensive!

# SECURE
@app.route('/ask')
@rate_limit(max_calls=10, period=60)
def ask():
    question = request.json['question']
    return llm.complete(question, max_tokens=500)
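
The rate_limit decorator used above is not defined in this example. A minimal in-memory sketch is shown below; it limits the endpoint as a whole rather than per client, so production deployments typically use a shared store (for example Redis) or a library such as Flask-Limiter instead.

# Minimal in-memory rate limiter - per-process and per-endpoint only; a sketch, not production-ready
import time
from functools import wraps

from flask import abort

def rate_limit(max_calls: int, period: float):
    calls = []

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            calls[:] = [t for t in calls if now - t < period]  # keep timestamps in the window
            if len(calls) >= max_calls:
                abort(429)  # Too Many Requests
            calls.append(now)
            return func(*args, **kwargs)
        return wrapper
    return decorator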

7. Insecure LLM Dependencies

Checks for vulnerable versions of LLM libraries and frameworks.

Example:

# VULNERABLE
langchain==0.0.100  # Old vulnerable version

# SECURE
langchain>=0.1.0  # Updated secure version

8. Data Leakage via LLM

Identifies code that may leak sensitive data through LLM prompts.

Example:

# VULNERABLE - sending PII to LLM
prompt = f"Analyze this user: {user.email}, SSN: {user.ssn}"

# SECURE - redact sensitive data
prompt = f"Analyze this user: {redact_pii(user)}"
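
redact_pii is left undefined in the example above. A regex-based sketch is shown below with hypothetical user fields; production systems often rely on a dedicated PII-detection library or service instead:

# Hypothetical redaction helper - regex sketch only
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact_text(text: str) -> str:
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    return SSN_RE.sub("[REDACTED_SSN]", text)

def redact_pii(user) -> str:
    # 'bio' and 'account_age_days' are assumed fields; structured PII (email, SSN) is omitted entirely
    return redact_text(f"account age: {user.account_age_days} days, bio: {user.bio}")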

Supported Frameworks & Libraries

  • ✅ OpenAI API (GPT-3.5, GPT-4)
  • ✅ Anthropic Claude API
  • ✅ LangChain
  • ✅ LlamaIndex
  • ✅ Hugging Face Transformers
  • ✅ Google PaLM / Gemini
  • ✅ Cohere
  • ✅ Custom LLM integrations

Scan Examples

Basic Scan

alprina scan ./ai-app --agent llm_security

With Specific Profile

alprina scan ./chatbot --profile code-audit --agent llm_security

Scan LLM Configuration

alprina scan ./llm-config.yaml --agent llm_security

Sample Output

🛡️ LLM Security Agent Scan Results
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Target: ./ai-chatbot
Files Scanned: 23
Duration: 15.3s
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 Findings Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Total: 4 vulnerabilities
Critical: 1
High: 2
Medium: 1
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔍 Vulnerabilities
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. CRITICAL - Prompt Injection Vulnerability
   📍 chatbot/main.py:45
   💡 User input directly concatenated into prompt
   🔗 CWE-94: Improper Control of Generation of Code

2. HIGH - Hardcoded API Key
   📍 config.py:12
   💡 OpenAI API key stored in source code
   🔗 CWE-798: Use of Hard-coded Credentials

3. HIGH - Unvalidated LLM Output Execution
   📍 agents/code_executor.py:89
   💡 LLM-generated code executed without validation
   🔗 CWE-94: Improper Control of Generation of Code

4. MEDIUM - Excessive LLM Agent Permissions
   📍 agents/assistant.py:34
   💡 Agent has unrestricted file system access
   🔗 CWE-269: Improper Privilege Management

OWASP LLM Top 10 Coverage

The LLM Security Agent covers all ten categories of the OWASP Top 10 for LLM Applications:

  1. ✅ LLM01: Prompt Injection
  2. ✅ LLM02: Insecure Output Handling
  3. ✅ LLM03: Training Data Poisoning
  4. ✅ LLM04: Model Denial of Service
  5. ✅ LLM05: Supply Chain Vulnerabilities
  6. ✅ LLM06: Sensitive Information Disclosure
  7. ✅ LLM07: Insecure Plugin Design
  8. ✅ LLM08: Excessive Agency
  9. ✅ LLM09: Overreliance
  10. ✅ LLM10: Model Theft

Best Practices

Secure Prompt Engineering

# Use structured prompts
from alprina.llm import SafePromptTemplate

template = SafePromptTemplate(
    system="You are a helpful assistant.",
    user_input_sanitizer=sanitize_input,
    output_validator=validate_response
)
response = template.complete(user_message)

API Key Management

# Store keys securely
from alprina.secrets import SecretManager

secrets = SecretManager()
api_key = secrets.get("OPENAI_API_KEY")
openai.api_key = api_key

Output Validation

# Validate all LLM outputs
from alprina.llm import OutputValidator

validator = OutputValidator(
    max_length=1000,
    forbidden_patterns=[r'<script>', r'eval\('],
    require_json=True
)
response = llm.complete(prompt)
safe_response = validator.validate(response)

Integration Examples

LangChain

from langchain import OpenAI, PromptTemplate
from alprina.langchain import SecureChain

# Wrap your chain with security
llm = OpenAI(temperature=0.7)
secure_chain = SecureChain(
    llm=llm,
    input_sanitizer=sanitize_input,
    output_validator=validate_output,
    rate_limit=RateLimit(max_calls=100, period=3600)
)

Custom LLM Integration

from alprina.llm import SecureLLMWrapper

class MyLLM(SecureLLMWrapper):
    def __init__(self):
        super().__init__(
            provider="custom",
            validate_prompts=True,
            sanitize_outputs=True
        )

    def complete(self, prompt):
        # Your LLM logic with built-in security
        return super().complete(prompt)

Common Vulnerabilities Fixed

Before

# MULTIPLE ISSUES
@app.route('/chat', methods=['POST'])
def chat():
    user_msg = request.json['message']
    prompt = f"User: {user_msg}"  # Prompt injection!
    response = openai.Completion.create(
        engine="gpt-4",
        prompt=prompt,
        max_tokens=4000,  # DoS risk!
        api_key="sk-..."  # Hardcoded key!
    )
    return exec(response.choices[0].text)  # Code injection!

After

# SECURE
from alprina.llm import SecureLLM, sanitize_input, validate_code
import os

llm = SecureLLM(
    api_key=os.getenv("OPENAI_API_KEY"),
    rate_limiter=RateLimit(max_calls=10, period=60)
)

@app.route('/chat', methods=['POST'])
@llm.rate_limit
def chat():
    user_msg = sanitize_input(request.json['message'])
    response = llm.complete(
        prompt=user_msg,
        max_tokens=500,
        temperature=0.7
    )
    validated = validate_code(response.text)
    return {"response": validated}

Performance

  • Scan Speed: 10-30 seconds for typical applications
  • False Positive Rate: ~5%
  • Detection Accuracy: ~95%
  • Supported Languages: Python, JavaScript, TypeScript, Java

Limitations

  • Cannot detect runtime prompt injection attacks
  • Limited analysis of proprietary LLM implementations
  • Requires application code (not black-box testing)
  • May not catch all novel attack vectors

Feedback

This is a new agent! Help us improve:
