Public Disclosures

The following 0DIN submissions have been published in accordance with our Research Terms and Disclosure Policy:

  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when the request is structured as key-value pairs, which the model interprets as configuration parameters rather than natural language. This approach bypasses content filters, eliciting, for example, detailed drug manufacturing instructions, and allows the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 91
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when an attacker structures a prompt to spell out abbreviations of a nefarious subject in a step-by-step manner before joining them. This tactic allows the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 73
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when attackers introduce random spacing within a prompt, effectively disrupting how the model tokenizes the input. By splitting words like "Chemicals" into parts such as "chem-", "ic-", and "al", the technique circumvents the model's primary content filter classifiers, which typically flag restricted phrases or tokens. This token manipulation allows attackers to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 91
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests through the use of an extremely brief prompt containing a query so minimal and ambiguous that it provides insufficient context for guardrail systems to properly evaluate intent. This brevity-based approach allows the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 91
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests through a combination of requesting comparative analysis of chemical synthesis procedures while embedding explicit instructions to avoid generalization and include specific technical details. The attacker frames the request as an academic comparison exercise, effectively bypassing inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 86
  • A guardrail jailbreak vulnerability has been discovered affecting multiple models. The specific flaw manifests when the attacker leads with an essay title and then requests that the model complete the essay, thereby allowing the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 86
  • A guardrail jailbreak vulnerability has been discovered affecting various language models. The specific flaw manifests when an attacker compacts a lengthy successful multi-turn jailbreak into a concise prompt that suggests a unified viewpoint among three entities: the language model, the user, and a desired archetype. This approach allows the attacker to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 91
  • A guardrail jailbreak vulnerability has been discovered affecting multiple AI models. The specific flaw manifests by framing requests in scientific terminology under the guise of regulatory compliance. This method further references a "protective wrapper" to obscure malicious intent, allowing attackers to bypass inference restrictions around providing information hazards and violating laws.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
    MAX SCORE: 100
  • A guardrail jailbreak vulnerability has been discovered affecting OpenAI Sora. The specific flaw manifests through a technique that combines technical anatomical terminology with creative visual elements, specifically requesting bio-luminescent symbols that conform to body contours. This approach allows the attacker to bypass inference restrictions around generating potentially inappropriate visual content by framing the request in artistic and technical language.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED:
  • A guardrail jailbreak vulnerability has been discovered affecting OpenAI Sora. The specific flaw manifests through a combination of academic framing and hypothetical scenario construction, where the attacker embeds inappropriate content requests within an educational anatomy classroom context. By repeatedly emphasizing the "hypothetical" nature of the scenario while requesting visual content that would normally be restricted, the attacker successfully bypasses inference restrictions around generating potentially inappropriate visual content.
    SEVERITY: Low
    BOUNDARY: Prompt Injection
    MODELS AFFECTED: