Hydrox AI Hugging Face

Auditor

BETA

Evaluate the security of your website Chatbots, models, and code under several compliance frameworks.

Website
Website
Website
Evaluate security of chatbot AI models on websites
API
API
API
Evaluate your local or server-deployed AI models via API
GitHub
GitHub
GitHub
Upload GitHub code to evaluate code security
Overview
Frameworks
Attacks
News
Agents-Risks
Data

AI Security Risks & Threats

Understanding potential dangers in AI systems

Updated 3 days ago
LLM Vulnerabilities

Prompt injection attacks, data poisoning, and model manipulation can expose sensitive information or generate harmful content.

View Details
Updated 1 week ago
Agent Security

AI agents with excessive permissions may leak credentials, execute unauthorized commands, or cause system compromises.

View Details
Updated 5 days ago
MCP Tool Risks

Model Context Protocol tools can be exploited to access banking systems, steal credentials, or perform unauthorized transactions.

View Details
Updated last month
Code Vulnerabilities

Traditional security flaws in applications that integrate AI systems, creating attack vectors through conventional weaknesses.

View Details
Updated 2 days ago
Privacy Exposure

Personal data leakage through training data memorization, prompt engineering, or inadequate output filtering mechanisms.

View Details
Updated 6 days ago
Harmful Content

Generation of inappropriate material including violence, terrorism, pornography, or other content that violates safety policies.

View Details
Updated 2 weeks ago
GenAI RMF

NIST's Generative AI Risk Management Framework provides guidelines for identifying, assessing, and mitigating risks in generative AI systems.

View Details
Updated 3 months ago
NIST CSF

Cybersecurity Framework that provides guidelines, standards, and best practices to manage cybersecurity risks including AI security concerns.

View Details
Updated 1 month ago
OWASP Top 10

Open Web Application Security Project's Top 10 for Large Language Model Applications - identifying the most critical security risks.

View Details
Updated 4 weeks ago
ATLAS Matrix

MITRE's Adversarial Threat Landscape for Artificial-Intelligence Systems - a comprehensive framework for AI threat modeling.

View Details
Updated yesterday
Prompt Exfiltration

Advanced techniques to extract system prompts and internal instructions from AI models through carefully crafted input sequences.

View Details
Updated 4 days ago
Jailbreaking

Sophisticated methods to bypass safety guardrails and content filters to make AI models generate prohibited or harmful content.

View Details
Updated 1 week ago
Self-Modification

Attacks that manipulate AI models to modify their own behavior, instructions, or operational parameters during runtime.

View Details
Updated 3 days ago
Covert Channel

Hidden communication methods through AI model outputs that can exfiltrate data or transmit unauthorized information.

View Details
Updated 5 days ago
Agent Privilege Escalation

Autonomous agents gaining unauthorized access levels, potentially compromising entire systems through elevated permissions.

View Details
Updated 1 week ago
Inter-Agent Communication

Security risks in agent-to-agent communication including message interception, manipulation, and unauthorized coordination.

View Details
Updated 3 days ago
Credential Exposure

Agents inadvertently exposing API keys, passwords, or sensitive authentication tokens through logs or outputs.

View Details
Updated 2 weeks ago
Recursive Execution

Agents creating infinite loops or recursive operations that can cause denial of service or resource exhaustion attacks.

View Details
Updated 1 week ago
Training Data Poisoning

Malicious manipulation of training datasets to introduce backdoors, bias, or vulnerabilities into AI models.

View Details
Updated 4 days ago
Data Leakage

Unintended exposure of sensitive training data through model outputs, inference attacks, or membership inference.

View Details
Updated 6 days ago
Model Extraction

Adversaries reverse-engineering or stealing proprietary models through carefully crafted queries and analysis.

View Details
Updated 2 days ago
Adversarial Examples

Specially crafted inputs designed to fool AI models into making incorrect predictions or classifications.

View Details