What the engine does

What the engine actually does today, in precise detail.

Same prompt, same seed, same checkpoint, on the same machine — the same frame, byte for byte. That deterministic property is the working core. Below is an account of what the engine is designed to do, what is in development, and what it deliberately does not do yet. Try the Open Studio demo.

Capability matrix

What the engine is designed to produce.

This matrix is the product design target. Several capabilities are working today — the deterministic core and re-render, 1080p cinematic clips, character continuity and style continuity; the rest is in active development at varying stages and should be read as where the engine is going.

Deterministic core · working

Same prompt · same seed · same checkpoint, on the same machine and a pinned environment → the same frame, byte for byte. Measured across our SDXL, AnimateDiff and Wan render pipelines. Cross-machine reproducibility is not yet validated.
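A minimal sketch of how a byte-for-byte check like this could be scripted. The `render_frame` function is a hypothetical stand-in (a seeded pseudo-random byte generator, not the engine's renderer); a real check would call your own pipeline at that point. Only the comparison pattern is the point here.

```python
import hashlib
import random


def render_frame(prompt: str, seed: int, checkpoint: str, size: int = 64) -> bytes:
    """Hypothetical stand-in for a render call: output depends on the inputs only."""
    # Mix all three inputs into one integer so changing any of them changes the bytes.
    digest = hashlib.sha256(f"{prompt}|{seed}|{checkpoint}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return bytes(rng.randrange(256) for _ in range(size))


def frame_digest(frame: bytes) -> str:
    """SHA-256 of the raw frame bytes; equal digests mean byte-identical frames."""
    return hashlib.sha256(frame).hexdigest()


def is_reproducible(prompt: str, seed: int, checkpoint: str, runs: int = 3) -> bool:
    """Render the same inputs several times and require a single distinct digest."""
    digests = {frame_digest(render_frame(prompt, seed, checkpoint)) for _ in range(runs)}
    return len(digests) == 1


if __name__ == "__main__":
    print(is_reproducible("harbour at dawn", seed=42, checkpoint="ckpt-a"))
```

Comparing digests rather than decoded images keeps the check strict: any difference in the bytes counts as a failure, including differences that would be invisible on screen.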

Cinematic clip generation · working

The engine produces 1080p cinematic clips at 24, 25 or 30 fps, with selectable aspect ratios and camera looks. 1080p output works today; 4K and 8K are the next resolution targets.

Multi-character continuity · working

A character’s face, hair and wardrobe stay stable across cuts, scenes and longer sequences. Character continuity holds in the current pipeline.

Style continuity across episodes · working

A single style spec governs an entire series so it looks consistent end to end. Style continuity is working in the current pipeline.

Camera language modeling · in development

Planned camera control — drone aerial, dolly, handheld, lock-off, orbit, push-in, pull-back — with controllable speed and easing. In development.

Lighting and time-of-day · in development

Planned lighting control — magic hour, blue hour, overcast, neon night, hard noon, practical — with directional control and colour temperature. In development.

Style spectrum · in development

Several cinematic style families are in progress (photorealistic, watercolor, stained glass, paper-cut, oil paint, pen-and-ink), with a JSON spec for adding more. Not yet a finished library.

Deterministic re-render · working

Because output is deterministic at a fixed seed and environment, a re-render reproduces the prior frame exactly — useful for fixes and review without reroll randomness.
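One way to make use of this in review is to store a digest of each approved frame next to the inputs that produced it, then compare on re-render. The sketch below is an assumption about how that bookkeeping might look; `RenderRecord` and its field names are hypothetical and not part of the engine.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderRecord:
    """Hypothetical record of one approved frame and the inputs that produced it."""
    prompt: str
    seed: int
    checkpoint_sha256: str
    environment: str  # e.g. a pinned container tag (assumed field)
    frame_sha256: str

    def key(self) -> str:
        """Stable identifier for the input set, independent of field order."""
        payload = json.dumps(
            {
                "prompt": self.prompt,
                "seed": self.seed,
                "checkpoint": self.checkpoint_sha256,
                "environment": self.environment,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def matches_prior(record: RenderRecord, new_frame: bytes) -> bool:
    """True when the re-rendered bytes hash to the digest stored at approval time."""
    return hashlib.sha256(new_frame).hexdigest() == record.frame_sha256
```

A mismatch under an unchanged key would suggest that something outside the recorded inputs has drifted, for example a driver or library version.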

Provenance-bound output · in development

The pipeline is being designed to attach a cryptographic content-credential manifest to each output, aligned with the C2PA Content Credentials standard. This is not yet shipped.

Watermark recipe · in development

A post-render watermark recipe (ffmpeg-based) is planned for the BYOC starter pack, with an optional invisible per-render mark for leak tracing.
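Since the recipe itself is not yet published, the following is only a sketch of what an ffmpeg-based visible watermark step could look like. It builds an argument list for ffmpeg's `overlay` filter, placing a logo in the bottom-right corner; the file names, the margin and the function name are illustrative assumptions.

```python
import shlex
from typing import List


def watermark_command(src: str, logo: str, dst: str, margin: int = 24) -> List[str]:
    """Build (but do not run) an ffmpeg command that overlays `logo` on `src`."""
    # In the overlay filter, W/H are the main video size and w/h the overlay size,
    # so this expression pins the logo `margin` pixels from the bottom-right corner.
    overlay = f"overlay=W-w-{margin}:H-h-{margin}"
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-i", logo,
        "-filter_complex", overlay,
        "-c:v", "libx264",  # re-encode video as H.264
        "-c:a", "copy",     # pass any audio through untouched
        dst,
    ]


if __name__ == "__main__":
    cmd = watermark_command("render.mp4", "logo.png", "render_marked.mp4")
    print(shlex.join(cmd))  # hand the list to subprocess.run(cmd, check=True) to execute
```

Building the command as a list rather than a shell string avoids quoting problems with paths that contain spaces.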

Multi-language scene metadata · in development

Planned: scene briefs, prompts and titles carried as multilingual JSON, so localization does not require re-rendering the visual track. In development.
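As an illustration of the idea, text fields could be stored as per-language maps so that a new translation only touches the metadata. The JSON layout, field names and fallback rule below are assumptions made for this sketch, not the engine's schema.

```python
import json

# Hypothetical scene metadata: each text field maps language codes to strings.
SCENE_JSON = """
{
  "scene_id": "s01",
  "title": {"en": "Harbour at dawn", "fr": "Le port à l'aube"},
  "brief": {"en": "Fishing boats leave the harbour."}
}
"""


def localized(scene: dict, field: str, lang: str, fallback: str = "en") -> str:
    """Return the text for `lang`, falling back to `fallback` when it is missing."""
    values = scene[field]
    return values.get(lang, values[fallback])


if __name__ == "__main__":
    scene = json.loads(SCENE_JSON)
    print(localized(scene, "title", "fr"))
    print(localized(scene, "brief", "fr"))  # no French brief yet, so English is used
```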

Episode-scale orchestration · in development

Planned: a hierarchical schema (series → episode → scene → shot → frame) with cross-references and asset versioning. In development.
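A rough sketch of how such a hierarchy might be modelled. The class and field names are hypothetical, and frames are represented only by a count per shot; the planned schema may differ substantially.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Shot:
    shot_id: str
    frame_count: int
    asset_versions: Dict[str, str] = field(default_factory=dict)  # asset name -> version


@dataclass
class Scene:
    scene_id: str
    shots: List[Shot] = field(default_factory=list)


@dataclass
class Episode:
    episode_id: str
    scenes: List[Scene] = field(default_factory=list)


@dataclass
class Series:
    series_id: str
    episodes: List[Episode] = field(default_factory=list)


def frame_paths(series: Series) -> Iterator[str]:
    """Walk series -> episode -> scene -> shot -> frame and yield one path per frame."""
    for ep in series.episodes:
        for sc in ep.scenes:
            for sh in sc.shots:
                for frame in range(sh.frame_count):
                    yield f"{series.series_id}/{ep.episode_id}/{sc.scene_id}/{sh.shot_id}/{frame:04d}"
```

A path-like identifier per frame gives cross-references something stable to point at, which is the role the schema assigns to its five levels.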

Audit-bound mutation log · in development

Planned: an append-only log of every change — actor, timestamp, before- and after-state, reason. In development.
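One common way to make an append-only log tamper-evident is to chain entries by hash, so that editing any earlier entry invalidates everything after it. The sketch below uses that technique as an assumption; the source does not say how the planned log will be implemented. The entry fields follow the list above, and `prev_hash` and `hash` are additions for the chain.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(body: dict) -> str:
    """Hash a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, actor: str, before, after, reason: str, timestamp: str = None) -> dict:
    """Append one change record, linked to the hash of the previous entry."""
    body = {
        "actor": actor,
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "before": before,
        "after": after,
        "reason": reason,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    entry = dict(body, hash=_entry_hash(body))
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute every hash and link; any edited or reordered entry fails the check."""
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev or _entry_hash(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

The before- and after-states need to be JSON-serialisable for this to work; large states could be stored as digests instead.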

Where development stands

What is being built.

Development is early. The items below describe internal development progress, not externally audited metrics, and the work is ongoing rather than finished.

  • Deterministic core — working and measured across our SDXL, AnimateDiff and Wan render pipelines on a fixed environment.
  • Style families — several cinematic style configurations in progress (photorealistic, watercolor, stained glass, paper-cut, oil paint, pen-and-ink), with a JSON spec for adding more.
  • Character, camera and lighting libraries — reference-sheet and shot-control work in progress; not yet a finished system.
  • Pipeline tooling — prompt-schema parsing and render orchestration under active development and internal testing.

Design principles

What the engine is designed around.

  • Determinism. Identical prompt · identical seed · identical model checkpoint, on the same machine and a pinned environment → identical frame. This is the working core.
  • Provenance. Planned: each render emits a signed manifest so tampering is detectable. In development.
  • Watermark. Planned: a visible identifier on every output, with an optional invisible per-recipient mark. In development.
  • Audit. Planned: an append-only log of every change, including the operator’s reason text. In development.

Integration surface

How you talk to the engine.

  • JSON in, MP4 out. The prompt schema and the orchestration spec are both JSON. The render output is standard H.264 MP4. No proprietary file format anywhere.
  • BYOC compatible. The render-pipeline reference script in our starter pack adapts to ComfyUI, AUTOMATIC1111 + AnimateDiff, diffusers (HunyuanVideo / CogVideoX / Wan2.1), or any other Linux-callable video diffusion runtime.
  • No vendor lock-in on the model. You bring your own model checkpoint, LoRAs, and any fine-tuning. The engine does not require a specific model family.
  • No hosted dependency. Once the starter pack is downloaded, the pipeline runs entirely on your hardware with no network call to us.
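To make "JSON in, MP4 out" concrete, here is a small sketch of validating a render job before handing it to a runtime. The field names (`prompt`, `seed`, `checkpoint`, `fps`, `output`) are assumptions for illustration and not the published prompt schema; the allowed frame rates are the ones listed in the capability matrix.

```python
import json

ALLOWED_FPS = {24, 25, 30}  # frame rates named in the capability matrix
REQUIRED_FIELDS = {"prompt", "seed", "checkpoint", "fps"}  # hypothetical field names


def parse_job(text: str) -> dict:
    """Parse a JSON render job and reject it early if it is incomplete or invalid."""
    job = json.loads(text)
    missing = REQUIRED_FIELDS - job.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if job["fps"] not in ALLOWED_FPS:
        raise ValueError(f"unsupported fps: {job['fps']}")
    job.setdefault("output", "render.mp4")  # assumed default output name
    if not job["output"].endswith(".mp4"):
        raise ValueError("output must be an .mp4 file")
    return job


if __name__ == "__main__":
    job = parse_job('{"prompt": "harbour at dawn", "seed": 42, "checkpoint": "model.safetensors", "fps": 24}')
    print(job["output"])
```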

What the engine does NOT do yet

Honest limits.

  • Real-time generation. Render windows are minutes to hours per scene depending on your GPU and model.
  • Voice synthesis. Voice generation is out of scope; pipe in audio from your existing voice tooling.
  • Music score. Music is out of scope; pipe in score from your existing music tooling.
  • Multi-user collaboration. Single-operator only at this stage. Team collaboration is planned for beta.
  • Hosted render. We do not run cloud GPUs during alpha. BYOC only.
  • SOC 2 / formal compliance audit. Planned post-beta when revenue supports the audit cycle.

Roadmap signal

Where this is going.

  • Beta: hosted render option (operator pays compute), team collaboration, asset library across projects, reusable character system across series.
  • General availability: commercial seats with paid pricing, removed watermark obligation under separate agreement, full SLA, SOC 2 Type II.
  • Long-form research: long-context character continuity (multi-season series), branching narrative, real-time previs scrubbing.

Carbon footprint

Why this architecture is materially lower-carbon.

Generative video is electricity-hungry. The dominant pattern in 2026 is cloud-render-everywhere: every prompt routes to a hyperscale datacenter where an H100 / B200 cluster spins up, generates, then idles or rolls to the next user. Datacenter PUE (power usage effectiveness) is typically 1.3–1.5, meaning every watt of compute carries 30–50% additional cooling and infrastructure overhead. At scale this adds up: at the per-minute estimates below, a million finished minutes comes to roughly 0.4–1.2 GWh.

Our architecture changes two variables in that equation:

  1. You render on hardware you already own. A consumer GPU sitting in a workstation idles ~95% of the time anyway. Putting your generative work on it has near-zero marginal energy cost — the card and the room cooling are already there. The cloud equivalent provisions fresh hardware, pulls fresh cooling, and routes fresh network egress for every job.
  2. The deterministic core kills wasteful re-rolls. Random-seed generative pipelines burn 3–10 attempts to land a single usable output. Same prompt + same seed + same model deterministically reproduces the prior frame — once you have a take you like, iteration on parameters is byte-for-byte recoverable rather than another full random throw. Five attempts down to one is a 5× energy reduction at the workflow level.

Rough order-of-magnitude comparison for one minute of finished 1080p cinematic AI video:

  • Cloud workflow (typical): ~0.4–1.2 kWh per finished minute. Datacenter H100 cluster, PUE 1.4, includes a typical 4–6 attempt average to land a usable shot, includes network egress for input + intermediate + final.
  • BYOC + deterministic workflow (typical): ~0.06–0.15 kWh per finished minute. Consumer Nvidia GPU you already own, near-zero marginal cooling, deterministic single-attempt iteration on parameters.

That spread, roughly 5–10× lower kWh per finished minute, is order-of-magnitude only, not a measured certification. The cloud workflow numbers come from publicly available 2024–2026 industry data on datacenter PUE (Uptime Institute Global Data Center Survey 2024 reports median PUE 1.58; Lawrence Berkeley National Laboratory 2024 estimates AI-training-class workload draws 0.4–1.2 kWh per finished minute of generative video at 1080p). The BYOC numbers come from observed throughput on consumer Nvidia hardware running open-source video diffusion models. Actual numbers vary with local grid mix, GPU class, model choice, render-settings discipline, and the cloud provider compared against.

We are not selling carbon credits, we have not gone through SBTi or PCAF certification, and we have no third-party assurance. We claim a more efficient architecture and show how the math comes out. The methodology document will be published shortly in our open repo; until then, audit the inputs above and adjust for your context.
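The arithmetic behind the comparison can be reproduced from the two quoted ranges. The sketch below assumes nothing beyond those figures; the function names are illustrative. Pairing like ends of the ranges (low with low, high with high) gives roughly 6.7× and 8×, while crossing the extremes gives a wider band of about 2.7× to 20×, so the quoted figure is best read as a central estimate.

```python
from typing import Tuple

Range = Tuple[float, float]

CLOUD_KWH: Range = (0.4, 1.2)   # kWh per finished minute, cloud workflow (from the text)
BYOC_KWH: Range = (0.06, 0.15)  # kWh per finished minute, BYOC + deterministic (from the text)


def matched_ratios(cloud: Range, byoc: Range) -> Tuple[float, float]:
    """Compare low end with low end and high end with high end."""
    return cloud[0] / byoc[0], cloud[1] / byoc[1]


def extreme_ratios(cloud: Range, byoc: Range) -> Tuple[float, float]:
    """Widest possible band: best cloud case vs worst BYOC case, and the reverse."""
    return cloud[0] / byoc[1], cloud[1] / byoc[0]


if __name__ == "__main__":
    low, high = matched_ratios(CLOUD_KWH, BYOC_KWH)
    print(f"matched ends: {low:.1f}x to {high:.1f}x")
    low, high = extreme_ratios(CLOUD_KWH, BYOC_KWH)
    print(f"crossed extremes: {low:.1f}x to {high:.1f}x")
```

Swapping in your own measured kWh figures for either range is the quickest way to adjust the comparison for a specific GPU or provider.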

If our work helps a studio swap out cloud-render-everywhere for BYOC + deterministic on a single recurring production, the cumulative annual electricity savings can run into multiple megawatt-hours. That is the operational reason we built this architecture, and it is the reason we will not pivot to hosted cloud render until we can do it with energy accounting we can publish honestly.

Provenance & verification

Provenance is on the roadmap.

Generative AI media without provenance is becoming a regulatory question. The EU AI Act, California SB 942 (the California AI Transparency Act) and China’s synthetic-content labelling rules all move in the same direction: synthetic media should be able to declare itself. We are designing with that direction in mind.

The planned approach: emit a C2PA-aligned content-credential manifest alongside each render, recording the prompt and seed, the model checkpoint hash, the render timestamp and a pipeline signature, so that tampering is detectable. This feature is in development and is not yet shipped.
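To illustrate the tamper-evidence idea only, here is a sketch that records the listed fields and signs them. It is not the C2PA manifest format and not the engine's implementation: it uses an HMAC from the Python standard library as a stand-in for a real pipeline signature, and the `output_sha256` field binding the manifest to the rendered file is an assumption added for the example.

```python
import hashlib
import hmac
import json


def _canonical(claims: dict) -> bytes:
    """Deterministic JSON encoding so the same claims always sign to the same value."""
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_manifest(prompt: str, seed: int, checkpoint_bytes: bytes,
                   timestamp: str, output_bytes: bytes, key: bytes) -> dict:
    """Collect the render facts and sign them (HMAC-SHA256 stand-in, not C2PA)."""
    claims = {
        "prompt": prompt,
        "seed": seed,
        "checkpoint_sha256": hashlib.sha256(checkpoint_bytes).hexdigest(),
        "render_timestamp": timestamp,
        "output_sha256": hashlib.sha256(output_bytes).hexdigest(),  # assumed field
    }
    signature = hmac.new(key, _canonical(claims), hashlib.sha256).hexdigest()
    return {"claims": claims, "signature": signature}


def verify_manifest(manifest: dict, output_bytes: bytes, key: bytes) -> bool:
    """Check both the signature over the claims and the hash of the file itself."""
    expected = hmac.new(key, _canonical(manifest["claims"]), hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, manifest["signature"])
    output_ok = manifest["claims"]["output_sha256"] == hashlib.sha256(output_bytes).hexdigest()
    return signature_ok and output_ok
```

An HMAC needs the verifier to hold the same secret key, so it cannot serve third-party verification; a public-key signature scheme would be needed for that.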

Nothing on this page is a statement of regulatory compliance. Whether any output meets a specific legal obligation depends on how you use it and on your jurisdiction — that assessment is yours to make, and you should confirm your own obligations.

What this is intended to provide, once shipped

  • Self-labelling output — a machine-readable signal that a file is synthetic.
  • Tamper-evidence — a signed manifest where edits break the signature.
  • Source verification — a way for a downstream viewer to check that an output came from an authorised creator.

Provenance is a planned capability, not a shipped one. We are an AI media tool designing in cryptographic provenance — not a “Web3 company”, and not a compliance product.

For partnership, licensing, press and security inquiries: use the contact form. We read everything. DMCA notices and security disclosures have dedicated topics on the same form.

Ready to evaluate the engine?

Download the starter pack and run on your own hardware. No account, no signup, no credit card.

Open Studio · Talk to us