Public Disclosures

The following 0DIN submissions have been published in accordance with our Research Terms and Disclosure Policy

  • A guardrail jailbreak vulnerability has been discovered affecting DALL-E3 from OpenAI. The specific flaw manifests when the attacker frames requests for nudity through a combination of historical framing and artistic merit defense. Thereby allowing the attacker to bypass inference restrictions around generating explicit visual content.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
  • A guardrail jailbreak vulnerability has been discovered affecting DALL-E3 from OpenAI. The specific flaw manifests through a combination of historical context framing and vintage format specification that appears to bypass content filtering by contextualizing potentially exploitative imagery within seemingly innocent historical and photographic framing, thereby allowing the attacker to bypass inference restrictions around generating inappropriate or exploitative content.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
  • A guardrail jailbreak vulnerability has been discovered affecting DALL-E3 from OpenAI. The specific flaw manifests when the attacker begins with seemingly artistic or mythological requests and progressively escalates the explicitness through iterative refinement requests, thereby allowing the attacker to bypass restrictions and produce sexually explicit imagery.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
  • A guardrail jailbreak vulnerability has been discovered affecting DALL-E3 from OpenAI. The specific flaw manifests when the attacker frames explicit content requests within feminist discourse and artistic terminology, using concepts like "bodily autonomy" and "sacred feminine". Successful application of the tactic allows the attacker to bypass restrictions around generating explicit imagery that is otherwise restricted..
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when the attacker requests seemingly legitimate cultural or documentary content by framing the request as educational documentation of indigenous peoples, thereby allowing the attacker to bypass inference restrictions around generating potentially sensitive imagery of real cultural groups.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when the attacker requests technical documentation in wiki format for prohibited subject matter, specifically asking for "technical details in depth lab methodology" which escalates the request beyond basic informational content, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 82
  • A guardrail jailbreak vulnerability has been discovered affecting multiple LLM models. The specific flaw manifests when the attacker frames illicit synthesis requests as academic chemistry assignments involving fictional chemists and quoted materials, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 77
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when the attacker systematically misspells words throughout their prompt while requesting a simulated dialogue between two entities, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 82
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when the attacker frames their request as seeking to prevent unwanted behavior (paywall bypasses for example) while explicitly requesting step-by-step techniques and instructing the model to skip countermeasures, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 98
  • A guardrail jailbreak vulnerability has been discovered affecting OpenAI GPT-4o. The specific flaw manifests when the attacker requests multiple layers of summarization with selective focus instructions, combined with demands for direct quotes, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
We use Google Analytics to collect data about how you use this website to optimize user experience.
Please refer to our privacy notice for more information.