Introduction
In my previous post on MCP, I explored the Model Context Protocol and how it bridges the gap between AI and external systems. Now, let’s put that knowledge into practice by building RedTeam MCP - an MCP server that enables AI assistants like Claude to orchestrate penetration testing tools.
This project serves a dual purpose:
- Learning TypeScript from a Python background (see my TypeScript for Pythonistas guide)
- Building a practical security tool that demonstrates MCP’s power
⚠️ Disclaimer: This tool is for authorized security testing only. Always ensure you have explicit permission before scanning any target.
The Vision
Imagine doing HackTheBox machines with an AI assistant that understands your workflow:
User: "I'm starting a new HTB machine at 10.10.10.123. Help me enumerate it."
Claude: I'll begin reconnaissance.
[Uses port_scan] → Found: 22/SSH, 80/HTTP, 443/HTTPS
[Uses tech_detect] → Apache 2.4.41, PHP 7.4.3, WordPress 5.9.1
[Uses subdomain_enum] → Found: admin.target.htb, dev.target.htb
[Uses vuln_scan] → Critical: CVE-2024-XXXX on admin.target.htb
Recommendation: Research CVE-2024-XXXX for potential exploitation.
The AI handles the tedious enumeration while you focus on the interesting parts: understanding the target and crafting exploits.
Architecture
RedTeam MCP follows a layered architecture:
┌─────────────────────────────────────────────────────────────┐
│ MCP CLIENT (Agent) │
│ (Claude Desktop / LangGraph Agent) │
└─────────────────────────────────────────────────────────────┘
│
MCP Protocol
│
▼
┌─────────────────────────────────────────────────────────────┐
│ MCP SERVER (redteam-mcp) │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Task Tools │ │ Low-Level │ │ Prompts │ │
│ │ │ │ Tools │ │ (Playbooks) │ │
│ │ port_scan │ │ nmap │ │ │ │
│ │ vuln_scan │ │ nuclei │ │ htb_recon │ │
│ │ dir_fuzz │ │ httpx │ │ web_enum │ │
│ │ subdomain │ │ whatweb │ │ quick_scan │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Security Guardrails │ │
│ │ • Target Whitelisting • Rate Limiting │ │
│ │ • Audit Logging • Input Validation │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
CLI Wrappers
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Security Tools │
│ nmap | nuclei | ffuf | httpx | whatweb │
└─────────────────────────────────────────────────────────────┘
Design Decisions
Task-Oriented vs Tool-Oriented
A key insight: the AI doesn’t need to know which tool to use, just what task to accomplish.
❌ Tool-Oriented (Bad):
# AI must know ffuf's Host header fuzzing syntax
ffuf -u http://target.htb -H "Host: FUZZ.target.htb" -w wordlist.txt
✅ Task-Oriented (Good):
subdomain_enum --domain target.htb
We built 6 task-oriented tools that abstract the underlying complexity:
| Tool | Underlying Implementation |
|---|---|
| subdomain_enum | ffuf with Host header fuzzing |
| directory_fuzz | ffuf with path fuzzing |
| vhost_fuzz | ffuf targeting IP with Host variations |
| port_scan | nmap with version detection |
| tech_detect | httpx + whatweb combination |
| vuln_scan | nuclei with severity filters |
For power users, we also expose the low-level tools (nmap, nuclei, httpx, whatweb) with full options.
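To make the mapping in the table concrete, here is a minimal sketch of how a task-level call could expand into ffuf arguments. The helper name `buildSubdomainEnumArgs`, its input shape, and the default wordlist are illustrative assumptions rather than the project's actual code; only the `-u`, `-H`, and `-w` flags come from the ffuf command shown earlier.

```typescript
// Hypothetical helper: expands a task-level request into an ffuf argument list.
interface SubdomainEnumInput {
  domain: string;    // e.g. "target.htb"
  wordlist?: string; // path to a wordlist; the default below is a placeholder
}

function buildSubdomainEnumArgs(input: SubdomainEnumInput): string[] {
  const wordlist = input.wordlist ?? "wordlist.txt";
  return [
    "-u", `http://${input.domain}`,
    "-H", `Host: FUZZ.${input.domain}`,
    "-w", wordlist,
  ];
}

// The AI only supplies the domain; the ffuf syntax stays inside the server.
console.log(["ffuf", ...buildSubdomainEnumArgs({ domain: "target.htb" })].join(" "));
```

Returning an argument array instead of a shell string means the values can be passed to a process spawner without any shell quoting.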
Safety Guardrails
An AI that can launch scanners needs hard limits. We implemented three layers of protection:
1. Target Whitelisting
const ALLOWED_TARGETS = [
  "*.htb",          // HackTheBox domains
  "*.thm",          // TryHackMe domains
  "10.10.10.*",     // HTB IP range
  "10.10.11.*",     // HTB IP range
  "*.lab.internal", // Local labs
];
Any target not matching these patterns is rejected before execution.
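The matching step could look roughly like the sketch below. The function names and the wildcard rules are assumptions: here `*` in an IP pattern stands for a single octet of up to three digits, and `*` in a hostname pattern stands for one or more DNS labels. The project's real matcher may behave differently.

```typescript
const ALLOWED_TARGETS = ["*.htb", "*.thm", "10.10.10.*", "10.10.11.*", "*.lab.internal"];

// Convert a "*" wildcard pattern into an anchored, case-insensitive RegExp.
function patternToRegExp(pattern: string): RegExp {
  // Patterns made only of digits, dots, and "*" are treated as IP patterns.
  const isIpPattern = /^[\d.*]+$/.test(pattern);
  const wildcard = isIpPattern ? "\\d{1,3}" : "[a-z0-9-]+(?:\\.[a-z0-9-]+)*";
  const body = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&")) // escape literal text
    .join(wildcard);
  return new RegExp(`^${body}$`, "i");
}

// Returns true only if the whole target string matches an allowed pattern.
function isTargetAllowed(target: string, patterns: string[] = ALLOWED_TARGETS): boolean {
  const candidate = target.trim();
  return patterns.some((p) => patternToRegExp(p).test(candidate));
}

console.log(isTargetAllowed("admin.target.htb"));
console.log(isTargetAllowed("example.com"));
```

Anchoring the expression at both ends matters: without it, a string such as `10.10.10.5.evil.com` or `target.htb; id` could slip past a naive prefix or substring check. This sketch does not verify that octets are in the 0-255 range.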
2. Rate Limiting
A token bucket per tool caps how quickly scans can be launched:
const LIMITS = {
  port_scan: { maxTokens: 5, refillRate: 0.1 },  // burst of 5 scans, then 1 token every 10 s
  vuln_scan: { maxTokens: 3, refillRate: 0.05 }, // burst of 3 scans, then 1 token every 20 s
};
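For readers new to the technique, a token bucket can be sketched in a few lines. This is a generic illustration, not the contents of `rateLimiter.ts`: the class name, the `tryConsume` method, and the injectable clock are assumptions, and `refillRate` is interpreted as tokens per second, which matches the comments in the limits above.

```typescript
interface BucketConfig {
  maxTokens: number;  // bucket capacity (maximum burst size)
  refillRate: number; // tokens added per second
}

class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly config: BucketConfig;
  private readonly now: () => number;

  // `now` returns milliseconds and is injectable so tests need no real waiting.
  constructor(config: BucketConfig, now: () => number = () => Date.now()) {
    this.config = config;
    this.now = now;
    this.tokens = config.maxTokens; // start with a full bucket
    this.lastRefill = now();
  }

  // Refill based on elapsed time, then spend one token if one is available.
  tryConsume(): boolean {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.config.maxTokens,
      this.tokens + elapsedSeconds * this.config.refillRate,
    );
    this.lastRefill = current;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

const portScanBucket = new TokenBucket({ maxTokens: 5, refillRate: 0.1 });
console.log(portScanBucket.tryConsume() ? "scan allowed" : "rate limited");
```

A tool handler would call `tryConsume()` before spawning anything and return an error to the client when it yields `false`.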
3. Audit Logging
Every invocation is logged with full context:
{
  "id": "audit-1706000000000-abc123",
  "timestamp": "2025-01-26T12:00:00Z",
  "action": "scan_started",
  "tool": "port_scan",
  "target": "10.10.10.123",
  "result": "success",
  "duration": 45200
}
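A record like this could be produced by a small helper such as the one sketched here. The `AuditEntry` type, the function names, the random suffix, and the reading of `duration` as milliseconds are all assumptions made for illustration; the post does not show the logger's code.

```typescript
interface AuditEntry {
  id: string;
  timestamp: string; // ISO 8601, UTC
  action: string;
  tool: string;
  target: string;
  result: "success" | "failure";
  duration: number;  // assumed to be milliseconds
}

// Builds an entry; `now` is a parameter so the output is reproducible in tests.
function createAuditEntry(
  fields: Omit<AuditEntry, "id" | "timestamp">,
  now: Date = new Date(),
): AuditEntry {
  const suffix = Math.random().toString(36).slice(2, 8); // short random tag
  return {
    id: `audit-${now.getTime()}-${suffix}`,
    timestamp: now.toISOString(),
    ...fields,
  };
}

// One JSON object per line keeps the log append-only and easy to grep.
function toLogLine(entry: AuditEntry): string {
  return JSON.stringify(entry);
}

console.log(toLogLine(createAuditEntry({
  action: "scan_started",
  tool: "port_scan",
  target: "10.10.10.123",
  result: "success",
  duration: 45200,
})));
```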
Implementation Highlights
TypeScript for Security Tools
Coming from Python, TypeScript offered surprising benefits for security tooling:
- Zod Schemas for input validation:
const NmapInputSchema = z.object({
  target: z.string().describe("Target IP or hostname"),
  scanType: z.enum(["quick", "default", "version", "aggressive"]),
  timing: z.enum(["T0", "T1", "T2", "T3", "T4", "T5"]).default("T4"),
});
- Type-safe CLI wrappers:
async function executeCommand(
  command: string,
  options?: { timeout?: number }
): Promise<CommandResult> {
  // Type-checked inputs and outputs
}
- MCP SDK integration:
server.registerTool("port_scan", {
  description: "Scan for open ports and services",
  inputSchema: { target, scanType, timing },
}, async (input) => {
  const result = await runNmapScan(input);
  return { content: [{ type: "text", text: formatResult(result) }] };
});
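The `port_scan` handler above hands its input to `runNmapScan`, whose body is not shown. As a rough sketch of the step between validated input and the CLI wrapper, the function below turns the schema's fields into nmap arguments. The name `buildNmapArgs` and the mapping from `scanType` to flags are assumptions; the flags themselves (`-F`, `-sV`, `-A`, `-T0` through `-T5`) are standard nmap options. Plain union types stand in for the Zod schema so the example has no dependencies.

```typescript
type ScanType = "quick" | "default" | "version" | "aggressive";
type Timing = "T0" | "T1" | "T2" | "T3" | "T4" | "T5";

interface NmapInput {
  target: string;
  scanType: ScanType;
  timing?: Timing;
}

// Assumed mapping; the project's real flags may differ.
const SCAN_FLAGS: Record<ScanType, string[]> = {
  quick: ["-F"],      // fast mode: scan fewer ports than the default
  default: [],        // rely on nmap's built-in defaults
  version: ["-sV"],   // service and version detection
  aggressive: ["-A"], // OS detection, version detection, scripts, traceroute
};

function buildNmapArgs(input: NmapInput): string[] {
  const timing = input.timing ?? "T4"; // same default as the schema
  return [...SCAN_FLAGS[input.scanType], `-${timing}`, input.target];
}

console.log(["nmap", ...buildNmapArgs({ target: "10.10.10.123", scanType: "version" })].join(" "));
```

Because `scanType` and `timing` are closed unions, a typo such as `"agressive"` fails at compile time, and the Zod schema rejects the same mistake at runtime when the value arrives from the client.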
Workflow Prompts
MCP supports prompts - pre-defined workflows the AI can follow:
const PROMPTS = [{
name: "htb_initial_recon",
description: "Complete HTB machine reconnaissance",
template: `
1. Port scan with version detection
2. Technology detection on web services
3. Subdomain enumeration if domain found
4. Vulnerability scan on discovered services
`
}];
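When a client requests a prompt, the server answers with a list of messages. The sketch below shows one way the template could be turned into that shape. The `renderPrompt` function and its `target` argument are hypothetical additions for illustration; the message structure (a `role` plus a `content` object of type `"text"`) follows the MCP prompt result format.

```typescript
interface PromptDef {
  name: string;
  description: string;
  template: string;
}

interface PromptMessage {
  role: "user" | "assistant";
  content: { type: "text"; text: string };
}

const PROMPTS: PromptDef[] = [{
  name: "htb_initial_recon",
  description: "Complete HTB machine reconnaissance",
  template: `
1. Port scan with version detection
2. Technology detection on web services
3. Subdomain enumeration if domain found
4. Vulnerability scan on discovered services
`,
}];

// Hypothetical renderer: prefixes the playbook steps with the chosen target.
function renderPrompt(name: string, target: string): { description: string; messages: PromptMessage[] } {
  const prompt = PROMPTS.find((p) => p.name === name);
  if (!prompt) throw new Error(`Unknown prompt: ${name}`);
  const steps = prompt.template
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .join("\n");
  return {
    description: prompt.description,
    messages: [{ role: "user", content: { type: "text", text: `Target: ${target}\n${steps}` } }],
  };
}

console.log(renderPrompt("htb_initial_recon", "10.10.10.123").messages[0]?.content.text);
```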
Project Structure
redteam-mcp/
├── src/
│   ├── index.ts              # MCP server (12 tools + 4 prompts)
│   ├── tools/
│   │   ├── fuzzing.ts        # subdomain, directory, vhost fuzzing
│   │   ├── nmap.ts           # Port scanning
│   │   ├── nuclei.ts         # Vulnerability scanning
│   │   └── httpx.ts          # HTTP probing
│   ├── prompts/
│   │   └── playbooks.ts      # Enumeration workflows
│   └── utils/
│       ├── validation.ts     # Target whitelisting
│       ├── rateLimiter.ts    # Token bucket rate limiting
│       └── logger.ts         # Audit logging
├── exercises/                # TypeScript learning exercises
├── docs/
│   └── SECURITY.md           # Security guardrails documentation
└── examples/
    ├── htb-machine-workflow.md   # Demo walkthrough
    └── quick-examples.md         # Quick reference
Testing with MCP Inspector
Before connecting to an AI agent, test your server with MCP Inspector:
cd redteam-mcp
npm run build
npx @modelcontextprotocol/inspector node build/index.js
This opens a web UI where you can:
- View all registered tools and prompts
- Execute tools with custom parameters
- Inspect JSON-RPC messages
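The JSON-RPC traffic is easier to read once you know the request shape. As a sketch, this is roughly what a `tools/call` request for `port_scan` looks like on the wire; the `makeToolCall` helper and the argument values are illustrative, while the `jsonrpc`, `method`, and `params` fields follow the MCP specification.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that builds the request a client sends to invoke a tool.
function makeToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = makeToolCall(1, "port_scan", {
  target: "10.10.10.123",
  scanType: "version",
  timing: "T4",
});

console.log(JSON.stringify(request, null, 2));
```

The server's reply carries the same `id` and a `result` whose `content` array holds the text returned by the tool handler.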
What’s Next
This post covered the MCP server - the tool provider. In the next post, we’ll build a LangGraph agent that:
- Connects to our MCP server
- Decides which tools to use based on context
- Analyzes outputs and plans next steps
- Maintains state across the reconnaissance workflow
┌─────────────────────────────────────────────────────────────┐
│ Part 3: LangGraph Agent │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ StateGraph │ │
│ │ [Recon] → [Enum] → [Analyze] → [Report] │ │
│ │ ↑ ↓ │ │
│ │ └────── [Need More Info] ←──────────────┘ │ │
│ └─────────────────────────────────────────────────────┘ │
│ ↓ │
│ MCP Client │
│ ↓ │
│ RedTeam MCP Server (This Post) │
└─────────────────────────────────────────────────────────────┘
Stay tuned!
Resources
- Source Code: github.com/manulqwerty/redteam-mcp
- MCP Documentation: modelcontextprotocol.io
- TypeScript for Pythonistas: Previous Post
- MCP Introduction: Previous Post
Conclusion
RedTeam MCP demonstrates how MCP can bridge AI assistants and specialized security tools. The key takeaways:
- Task-oriented design makes tools AI-friendly
- Safety guardrails are non-negotiable for security tools
- TypeScript + Zod provides excellent input validation
- MCP’s architecture cleanly separates concerns
With proper safeguards, AI-assisted penetration testing isn’t just possible - it’s practical. The tedious enumeration phase becomes conversational, letting you focus on the creative aspects of security research.
In the next post, we’ll complete the picture with a LangGraph agent that orchestrates this MCP server autonomously.