redteam-ai-benchmark

Visit Website | GitHub Repo

AI Security / Cybersecurity Tools (Red Teaming) / DevTools · Idea · Unknown

Description

Evaluates uncensored LLMs for offensive security using targeted questions and clear scoring criteria, to measure their effectiveness in real-world penetration testing.

Founders

lpr021 (GitHub owner; individual maintainer, real name not provided)

Discovered

July 28, 2025

Added to Database

January 26, 2026

Notes

Open-source benchmark that measures how well "uncensored" LLMs perform on practical offensive-security tasks, with explicit evaluation criteria geared toward real pentesting workflows. If adopted by red teams and model builders, it could become a de facto evaluation standard for security-capable agentic LLMs.

Related Links