Frequently Asked Questions
Which --mask_type should I pick?
The default is mematte (Memory Efficient Matting, ViT-B
Composition-1k). It produces the cleanest hair and silhouette edges of the
three model-based options and is the right starting point.
- mematte — default. Good edges, moderate VRAM.
- vit — HuggingFace ViTMatte, tiled. Higher VRAM cost, useful if MEMatte underperforms on a specific frame.
- sam3 — raw SAM3 mask, no alpha refinement. Fastest; use only when you want SAM3's coarse output and will refine downstream.
- user — read pre-computed mattes from --mattes_folder. Use when you have alpha mattes from a different tool and only want the spline JSON from Rotobot Next.
When do I need --depth_folder, and what --z_threshold should I use?
Supply --depth_folder whenever you want depth-aware occlusion
gating — typically multi-person scenes where one person is in front of
another, or anywhere a body part crosses another body. The depth folder
must contain one EXR per input frame, in sorted 1:1 order.
The default --z_threshold 0.025 (metres at the limb midline,
scaled per-segment) is the result of a five-value wedge run on 4K clips at
[0.2, 0.1, 0.075, 0.05, 0.025]. Findings:
- 0.025 and 0.05 — no artefacts versus the no-depth baseline; the recommended range.
- 0.17 (the historical default) — leaves visible artefacts; do not use.
- Tighter thresholds also run roughly 20% faster at 4K because rays terminate earlier.
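The one-EXR-per-frame requirement can be checked before a run is launched. The following is a minimal sketch, not part of Rotobot Next: the function name, the accepted frame extensions (including .jpeg), and the use of a plain lexical sort are all assumptions, and the binary's own sort order may differ for unpadded frame numbers.

```python
from pathlib import Path

# Assumed frame extensions; the FAQ lists JPG, PNG, and EXR as accepted inputs.
FRAME_EXTS = {".jpg", ".jpeg", ".png", ".exr"}


def pair_frames_with_depth(image_folder, depth_folder):
    """Return (frame, depth) pairs matched by sorted position.

    Raises ValueError if the counts differ, since the pairing must be 1:1.
    """
    frames = sorted(
        p for p in Path(image_folder).iterdir()
        if p.suffix.lower() in FRAME_EXTS
    )
    depths = sorted(
        p for p in Path(depth_folder).iterdir()
        if p.suffix.lower() == ".exr"
    )
    if len(frames) != len(depths):
        raise ValueError(
            f"{len(frames)} input frames but {len(depths)} depth EXRs"
        )
    return list(zip(frames, depths))
```

Printing the first and last pair is usually enough to spot an off-by-one between the two sequences.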
How are licenses counted?
Per process. One rotobot_next worker = one seat. Running two
workers in parallel on the same machine consumes two seats. See the
Relay Server page for details.
Where does output land if I omit --output_folder?
In ./output/<image_folder_name>/ relative to the current
working directory. The JSON file plus any requested sidecar folders
(mattes/, debug/, trimaps/,
sam3_masks/, filled_shapes/) are written under
that root.
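For scripts that need to locate results afterwards, the default location can be reproduced as below. This is a sketch of the rule as described here; default_output_root is a hypothetical helper, not something the bundle provides.

```python
from pathlib import Path


def default_output_root(image_folder, cwd=None):
    """Mirror the documented default: ./output/<image_folder_name>/ under cwd."""
    base = Path(cwd) if cwd is not None else Path.cwd()
    # Path.name ignores a trailing slash, so "shots/sh010/" yields "sh010".
    return base / "output" / Path(image_folder).name
```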
Can I run multiple workers on one GPU?
Yes, depending on input resolution and available VRAM:
- HD (1080p) / 1440p — two workers fit comfortably on a 24 GB+ GPU (each worker takes around 14–15 GB).
- 4K / UHD — single worker. Two workers will OOM the ViTMatte / MEMatte tile pass on a 24 GB card and run unreliably on 48 GB.
Most of the batch scripts in visualisation/ use a flock-based
claim pattern to share a queue between worker processes safely.
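A flock-based claim generally looks like the sketch below. It is illustrative only and is not taken from the scripts in visualisation/: the function names, the lock directory layout, and the .lock / .done file naming are all assumptions. It relies on fcntl, so it runs on Linux and macOS but not on Windows.

```python
import fcntl
import os
from pathlib import Path


def try_claim(lock_path):
    """Try to take an exclusive, non-blocking flock on lock_path.

    Returns an open file descriptor if the claim succeeded, else None.
    The lock is held until the descriptor is closed or the process exits.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def process_queue(shot_dirs, lock_dir, run_shot):
    """Call run_shot(shot) for every shot this worker manages to claim."""
    lock_dir = Path(lock_dir)
    lock_dir.mkdir(parents=True, exist_ok=True)
    for shot in shot_dirs:
        name = Path(shot).name
        done = lock_dir / f"{name}.done"
        fd = try_claim(lock_dir / f"{name}.lock")
        if fd is None:
            continue  # another worker currently holds this shot
        try:
            if done.exists():
                continue  # already finished by a worker that released the lock
            run_shot(shot)
            done.touch()
        finally:
            os.close(fd)  # closing the descriptor releases the flock
```

Each worker walks the same list and skips anything already locked or marked done, so no central scheduler is needed. Because the kernel drops the lock when a process exits, a crashed worker does not leave a shot permanently claimed.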
What input formats does Rotobot Next accept?
JPG, PNG, and EXR sequences. EXR input goes through OpenColorIO; the
bundle ships the ACES 1.0.3 config so ACEScg-linear EXRs are colour-managed
correctly by default. Set OCIO=/path/to/config.ocio to point
at a different config.
Why was my first run slow on a cold machine?
The bundled binary materialises the SAM3, SAM3D-Body, ViTMatte / MEMatte, and MoGE-2 weights on first call (about 30–60 seconds on a typical NVMe). Subsequent runs in the same process tree reuse cached weights and start instantly.
--help itself returns in well under a second — it skips
model loading entirely.
What environment variables does the binary read?
- TOKGAN_RELAY_URL: License relay URL. Default http://0.0.0.0:6349. See the Relay Server page.
- OCIO: Path to an OpenColorIO config file (required for non-default colour management; the bundled ACES 1.0.3 config covers most VFX workflows).
- SAM3D_DETECTOR_PATH: Override the bundled vitdet detector folder.
- SAM3D_SEGMENTOR_PATH: Override the bundled SAM3 segmentor folder.
- SAM3D_FOV_PATH: Override the bundled MoGE-2 FOV estimator folder.
- SAM3D_MHR_PATH: Override the bundled Momentum Human Rig assets folder.
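When launching workers from a script, these variables can be set per process rather than exported globally. The helper below is a hypothetical sketch: build_env and its parameter names are not part of the bundle, and it only recognises the variable names listed above.

```python
import os

# The SAM3D_* override variables documented in this FAQ.
PATH_OVERRIDES = {
    "SAM3D_DETECTOR_PATH",
    "SAM3D_SEGMENTOR_PATH",
    "SAM3D_FOV_PATH",
    "SAM3D_MHR_PATH",
}


def build_env(relay_url=None, ocio_config=None, **path_overrides):
    """Return a copy of the current environment with the given variables set.

    path_overrides accepts the SAM3D_* names, e.g. SAM3D_MHR_PATH="/assets/mhr".
    Anything left as None keeps its inherited value (or stays unset).
    """
    unknown = set(path_overrides) - PATH_OVERRIDES
    if unknown:
        raise KeyError(f"not a documented variable: {sorted(unknown)}")

    env = dict(os.environ)
    if relay_url is not None:
        env["TOKGAN_RELAY_URL"] = relay_url
    if ocio_config is not None:
        env["OCIO"] = ocio_config
    env.update({name: str(value) for name, value in path_overrides.items()})
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run. Assuming the executable is on PATH under the name rotobot_next, subprocess.run(["rotobot_next", "--help"], env=build_env(ocio_config="/path/to/config.ocio")) is a quick way to confirm the process starts with the intended settings, since --help skips model loading.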