Question
When an AI video provider returns a sensitive-content error, the cause is not always the text prompt. It may be the media reference itself.
GenFlick investigated a production failure where a user-visible error implied sensitive content. The underlying question was whether the failure came from the prompt, generated audio, account status, text-to-video unavailability, or reference images.
Reconstructed Payload
The affected production clip used Seedance 2.0 through fal.ai. The request was reconstructed from persisted project state using the same application helpers used by the generation path:
- generation prompt composition
- labeled video reference construction
- clip lookup
- Seedance provider reference-binding behavior
The reconstructed request used Seedance reference-to-video with a 16:9, 12-second payload, a long production prompt, an approved start frame, and a character portrait reference.
The exact replay reproduced an HTTP 422 failure on image references. Replaying with generated audio disabled failed with the same class of image-reference error, ruling out audio generation as the root cause.
Sensitivity Matrix
The investigation then varied endpoint, image inputs, and audio.
Requests using human visual references failed:
- exact production replay with start frame plus character portrait
- exact production replay with audio disabled
- start frame only with a generic safe prompt
- individual generated human character portraits as reference images
Requests without photorealistic human references completed:
- the full production prompt as text-to-video with no images
- reference-to-video with a non-person room or set reference
- reference-to-video with a clearly stylized flat avatar reference
The apparent boundary was not “unsafe prompt” or “voiceover.” It was photorealistic human-likeness media passed in image reference fields.
Product Implication
AI video applications often use start frames and character portraits to preserve continuity. That is normally desirable: identity references help keep characters stable across shots. But provider policy can make that path brittle when references contain photorealistic human likenesses.
The failure mode is especially confusing because the written prompt may be safe and the text-to-video endpoint may work. If the application retries the same media payload, it can repeatedly fail without giving the user a useful recovery path.
Recommended Error Classification
The application should detect this refusal class precisely:
- HTTP 422 or equivalent provider validation failure
- content-policy or safety violation type
- error location includes image URL or image URL list
- message mentions likeness, people, privacy, or private information
The user-facing message should name the failing modality:
Seedance rejected the human reference image, not the written prompt.
That distinction matters. It tells the user and the agent what to change.
Recovery Options
Useful recovery paths include:
- retry the same written prompt without human references
- keep non-person set and prop references while dropping human portraits
- switch to a provider that accepts the desired human-reference workflow
- regenerate character references in a clearly stylized or non-photorealistic look
- ask the user to confirm a provider switch or reference-removal tradeoff
The product should not blindly retry the same media reference payload. A deterministic policy refusal is not transient.
Routing Policy
A media generation system can avoid some failures by classifying reference assets before provider selection.
For Seedance-like behavior observed here:
- photorealistic human start frames and portraits should be treated as high-risk references
- non-person sets and props can remain useful references
- stylized avatars may be acceptable, but should still be classified conservatively
- text-to-video fallback should remain available when identity continuity is less important than successful generation
The reference router should understand role and content class. A human identity reference, a room reference, and a style board should not be routed as interchangeable images.
Limitations
This was a provider-specific sensitivity test, not a universal safety benchmark. It does not prove that all Seedance deployments or all providers behave the same way. It also does not determine whether account allowlisting, different image styles, or different vendors would change the boundary.
The supporting repository and production observability exports are private, limiting external reproduction. The methodology remains useful: reconstruct the payload, isolate prompt from media from audio, and vary one modality at a time.
Conclusion
Sensitive-content errors in AI video generation should be debugged by modality. In this Seedance case, safe text prompts completed, but photorealistic human image references failed consistently. Products that depend on visual continuity should classify human-reference risk, route provider calls accordingly, and offer recovery actions that change the rejected input rather than repeating it.