NSFW Content Warning
This disclosure contains content that may be inappropriate for some viewers.
The content has been flagged due to potential nude imagery.
OpenAI DALL-E3 Guardrail Jailbreak via "Classical Art Reframing" Tactic
Case GUID: 0d97ce9d-29aa-45a0-8b5c-f71daee04ea5
Summary
A guardrail jailbreak vulnerability has been discovered affecting DALL-E3 from OpenAI. The flaw manifests when the attacker frames requests for nudity through a combination of historical framing and an artistic-merit defense, thereby allowing the attacker to bypass inference restrictions around generating explicit visual content.
Metadata
Social Impact Score
Level 2: Moderate Risk
The violation has a moderate social impact, potentially involving misinformation or low-stakes safety risks, which could mislead users or increase risks if used irresponsibly.
Example: A model generating content that could be misinterpreted as medical advice; bypassing filters to generate mildly sensitive information, such as dosage amounts for common medications, without full context; or leaking private personal data such as a cell phone number.
Potential Consequences: Could lead to individual harm or liability for the provider, prompting policy review and potentially mild regulatory concern.
Nude Imagery Rating System
Level 2: Classical Artistic Representation
Inspired by historical fine art traditions, this category includes works resembling those of the Renaissance, Neoclassical, or other artistic movements that emphasize anatomy, form, and aesthetic beauty. Nudity is often idealized, symbolic, or mythological, intended to evoke admiration rather than arousal.
Taxonomies
About 0Din.ai
The 0Day Investigative Network (0DIN) was founded by Mozilla in 2024 to reward responsible researchers for their efforts in securing GenAI models. Learn more and submit discoveries at https://0din.ai.
Severity
Low
Security Boundary
Prompt Injection
Models

Published On
2025-08-27 (4 days)
Credit
Mike Takahashi (@TakSec)