# pipeline

Pipeline stage functions, runner classes, and supporting contracts exported from corridorkey.

## Stage Functions
Scan a path for processable clips.

Accepts:

- A clips directory containing multiple clip subfolders
- A single clip folder (must contain `Input/` and optionally `AlphaHint/`)
- A single video file (reorganised in-place into a clip folder structure)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `path` | str \| Path | Path to a clips directory, a single clip folder, or a video file. | *required* |
| `reorganise` | bool | If True (default), loose video files are moved into an `Input/` subfolder in-place. If False, loose videos are reported as skipped rather than silently ignored. | `True` |
| `events` | PipelineEvents \| None | Optional `PipelineEvents` for streaming clip discovery to a GUI. `on_clip_found` fires for each valid clip as it is discovered; `on_clip_skipped` fires for each path that could not be used. | `None` |
Returns:

| Type | Description |
|---|---|
| ScanResult | `ScanResult` with clips ready for the loader stage and any skipped paths. |
Raises:

| Type | Description |
|---|---|
| ClipScanError | If the path does not exist or is an unrecognised file type. |
| PermissionError | If the top-level directory cannot be read. |
| OSError | If video reorganisation fails. |
Validate a clip and return its manifest.
For image sequence inputs, reads directly from Input/ and AlphaHint/. For video inputs, extracts frames into Frames/ and AlphaFrames/ without touching the original files.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `clip` | Clip | A `Clip` from stage 0. | *required* |
| `events` | PipelineEvents \| None | Optional `PipelineEvents` for extraction progress reporting. | `None` |
| `png_compression` | int | PNG compression level for video extraction (0–9). The default of 1 is recommended for intermediate frames. | `DEFAULT_PNG_COMPRESSION` |
Returns:

| Type | Description |
|---|---|
| ClipManifest | `ClipManifest` ready for preprocessing, or for the interface to generate alpha externally via `resolve_alpha()` if `needs_alpha` is True. |
Raises:

| Type | Description |
|---|---|
| FrameMismatchError | If validation fails. |
| ExtractionError | If video extraction fails. |
Update a manifest with an externally generated alpha sequence.
Called by the interface layer (CLI, GUI, etc.) after it has generated alpha frames using an external tool. Alpha generation is not a pipeline stage — it is entirely the interface's responsibility.
Validates that the provided alpha directory matches the stored frame count (using the count already in the manifest — no re-scan of the input directory), then returns an updated manifest with `needs_alpha=False` and `alpha_frames_dir` set, ready for preprocessing.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `manifest` | ClipManifest | A `ClipManifest` with `needs_alpha=True`. | *required* |
| `alpha_frames_dir` | Path | Path to the directory containing the generated alpha frame sequence. | *required* |
Returns:

| Type | Description |
|---|---|
| ClipManifest | Updated `ClipManifest` with `needs_alpha=False` and `alpha_frames_dir` set. |
Raises:

| Type | Description |
|---|---|
| ValueError | If the manifest already has alpha or the directory doesn't exist. |
| FrameMismatchError | If the alpha frame count doesn't match `manifest.frame_count`. |
Read VideoMetadata from the clip root directory, if present.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `clip_root` | Path | Clip root directory. | *required* |
Returns:

| Type | Description |
|---|---|
| VideoMetadata \| None | `VideoMetadata` if `video_meta.json` exists, None otherwise. |
Preprocess one frame from a clip for model inference.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `manifest` | ClipManifest | `ClipManifest` from stage 1. Must have `needs_alpha=False`. | *required* |
| `i` | int | Frame index within `manifest.frame_range`. | *required* |
| `config` | PreprocessConfig | Preprocessing configuration. | *required* |
| `image_files` | list[Path] \| None | Pre-sorted image paths (build once per clip, pass every frame). | `None` |
| `alpha_files` | list[Path] \| None | Pre-sorted alpha paths (build once per clip, pass every frame). | `None` |
Returns:

| Type | Description |
|---|---|
| PreprocessedFrame | `PreprocessedFrame` — tensor on device plus `FrameMeta` for postprocessing. |
Raises:

| Type | Description |
|---|---|
| ValueError | If the manifest still needs alpha or `i` is out of range. |
| FrameReadError | If a frame file cannot be read. |
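The 4-channel model input described by the `PreprocessedFrame` contract (ImageNet-normalised RGB plus a raw alpha-hint channel) can be assembled as below. The mean/std constants are the standard ImageNet values; the exact assembly order is inferred from the contract, so treat this as a sketch, not corridorkey's implementation:

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def build_model_input(rgb: np.ndarray, alpha_hint: np.ndarray) -> np.ndarray:
    """rgb: [H, W, 3] float32 0-1; alpha_hint: [H, W, 1] float32 0-1.
    Returns [1, 4, H, W]: normalised RGB channels plus the raw hint."""
    rgb_norm = (rgb - IMAGENET_MEAN) / IMAGENET_STD
    chw = np.concatenate([rgb_norm, alpha_hint], axis=-1).transpose(2, 0, 1)
    return chw[None]  # add batch dim -> [1, 4, H, W]
```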
Load a GreenFormer model from a checkpoint.
Constructs the model architecture, moves it to the configured device, loads the checkpoint, applies dtype fixups, and optionally compiles the model with torch.compile.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | InferenceConfig | `InferenceConfig` with `checkpoint_path`, `device`, `img_size`, `use_refiner`, `model_precision`, and `refiner_mode`. | *required* |
| `resolved_refiner_mode` | str \| None | The concrete refiner mode after "auto" has been resolved (either "full_frame" or "tiled"). If None, uses `config.refiner_mode` directly (must not be "auto"). | `None` |
Returns:

| Type | Description |
|---|---|
| Module | GreenFormer in eval mode on `config.device`, ready for inference. |
Raises:

| Type | Description |
|---|---|
| FileNotFoundError | If the checkpoint file does not exist. |
| RuntimeError | If the checkpoint cannot be loaded. |
Run model inference on a single preprocessed frame.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `frame` | PreprocessedFrame | Output of the preprocessing stage. `tensor` is `[1, 4, H, W]` on `config.device`, already ImageNet-normalised. | *required* |
| `model` | Module | Loaded GreenFormer in eval mode. | *required* |
| `config` | InferenceConfig | Inference configuration. | *required* |
| `resolved_refiner_mode` | str \| None | Pre-resolved refiner mode ("full_frame" or "tiled"). When provided, skips the per-frame VRAM probe that "auto" resolution would otherwise perform. | `None` |
Returns:

| Type | Description |
|---|---|
| InferenceResult | `InferenceResult` with alpha and fg tensors on device, plus `FrameMeta`. |
Postprocess a single inference result into output-ready numpy arrays.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `result` | InferenceResult | `InferenceResult` from the inference stage. | *required* |
| `config` | PostprocessConfig | Postprocessing options (despill, despeckle, source_passthrough). | *required* |
| `stem` | str | Filename stem for output naming (e.g. "frame_000001"). Defaults to "frame_{frame_index:06d}" when empty. | `''` |
| `output_dir` | Path \| None | Root output directory. Required when `config.debug_dump=True`. | `None` |
Returns:

| Type | Description |
|---|---|
| PostprocessedFrame | `PostprocessedFrame` with all arrays at source resolution, float32. |
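Green spill suppression of the kind `despill_strength` controls is commonly implemented by clamping the green channel toward the maximum of red and blue. A hedged numpy sketch — the exact algorithm corridorkey uses is not specified in this reference:

```python
import numpy as np

def despill(rgb: np.ndarray, strength: float) -> np.ndarray:
    """rgb: [H, W, 3] float32 0-1. strength: 0.0 = off, 1.0 = full.
    Pulls green down toward max(red, blue) wherever it exceeds both."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    limit = np.maximum(r, b)
    g_clamped = np.minimum(g, limit)
    out = rgb.copy()
    out[..., 1] = g * (1.0 - strength) + g_clamped * strength
    return out
```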
Write all enabled outputs for one postprocessed frame to disk.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `frame` | PostprocessedFrame | `PostprocessedFrame` from the postprocessor stage. | *required* |
| `config` | WriteConfig | `WriteConfig` controlling which outputs to write and where. | *required* |
Raises:

| Type | Description |
|---|---|
| OSError | If any `cv2.imwrite` call fails. |
## Runner Classes
Runs the full pipeline for a single clip on a single device.
Instantiate once per clip, call run(), then discard.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `manifest` | ClipManifest | `ClipManifest` from the loader stage. Must have `needs_alpha=False`. | *required* |
| `config` | PipelineConfig | Pipeline configuration. | *required* |
`run()`: Run the pipeline for the clip. Blocks until all frames are done.
Configuration for the full pipeline runner.

Attributes:

| Name | Type | Description |
|---|---|---|
| `preprocess` | PreprocessConfig | Preprocessing stage config (img_size, device, strategy). |
| `inference` | InferenceConfig \| None | Inference stage config (checkpoint, device, precision). None means inference is skipped (preprocess-only mode). |
| `postprocess` | PostprocessConfig | Postprocessing stage config (despill, despeckle, checkerboard). |
| `write` | WriteConfig \| None | Writer stage config (formats, enabled outputs). None means a default `WriteConfig` is derived from the manifest `output_dir`. |
| `input_queue_depth` | int | Max preprocessed frames waiting for inference. Keep small — each frame is ~64 MB on GPU at 2048 resolution. |
| `output_queue_depth` | int | Max inference outputs waiting for postprocessing. |
| `resolved_refiner_mode` | str \| None | Pre-resolved refiner mode ("full_frame" or "tiled"). When set, PipelineRunner skips the VRAM probe entirely and uses this value directly. Defaults to None. |
| `events` | PipelineEvents \| None | Optional event callbacks for all pipeline stages. Pass a `PipelineEvents` instance to receive per-frame and per-stage progress notifications. |
Runs the pipeline across multiple GPUs in parallel (frame-level dispatch).
Each GPU gets its own model instance loaded in its own thread. All GPU workers pull from a single shared input queue and push to a single shared output queue. The preprocessor and postwriter are single-threaded as usual.
Frame ordering: output frames may arrive out of order when multiple GPUs are running at different speeds. The PostWriteWorker writes frames as they arrive — if strict ordering is required, the caller should sort by `InferenceResult.meta.frame_index` after the run.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `manifest` | ClipManifest | `ClipManifest` from the loader stage. | *required* |
| `config` | MultiGPUConfig | `MultiGPUConfig` with a list of device strings. | *required* |
`run()`: Run the pipeline. Blocks until all frames are written.
Configuration for multi-GPU frame-level parallel inference.
Each device gets its own model instance and InferenceWorker thread. All workers share a single input queue (preprocessed frames) and a single output queue (inference results), so the pipeline naturally load-balances — whichever GPU finishes first picks up the next frame.
Attributes:

| Name | Type | Description |
|---|---|---|
| `devices` | list[str] | List of PyTorch device strings to use (e.g. `["cuda:0", "cuda:1"]`). Must have at least one entry. |
| `inference` | InferenceConfig | Base `InferenceConfig` shared by all workers; each worker loads its own model instance on its assigned device. |
| `preprocess` | PreprocessConfig | Preprocessing config. Runs on CPU (device-agnostic). |
| `postprocess` | PostprocessConfig | Postprocessing config. |
| `write` | WriteConfig \| None | Writer config. None → default derived from manifest `output_dir`. |
| `input_queue_depth` | int | Shared input queue depth. Scale with GPU count — a depth of 2×N gives each GPU a frame in flight plus one buffered. |
| `output_queue_depth` | int | Shared output queue depth. |
| `events` | PipelineEvents \| None | Optional pipeline event callbacks. |
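The shared-queue load balancing described above can be sketched with stdlib threading: N workers pull from one bounded input queue and push to one shared output queue, so whichever worker finishes first picks up the next frame, and results may arrive out of order. Everything here (worker body, sentinel shutdown, the doubling stand-in for inference) is illustrative, not corridorkey's actual implementation:

```python
import queue
import threading

def run_workers(frames, n_workers=2, depth=4):
    """Fan frames out to n_workers threads via one shared bounded queue."""
    inq: queue.Queue = queue.Queue(maxsize=depth)
    outq: queue.Queue = queue.Queue()

    def worker():
        while True:
            item = inq.get()
            if item is None:            # sentinel: shut this worker down
                break
            idx, value = item
            outq.put((idx, value * 2))  # stand-in for model inference

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for i, f in enumerate(frames):
        inq.put((i, f))                 # blocks when the queue is full
    for _ in threads:
        inq.put(None)                   # one sentinel per worker
    for t in threads:
        t.join()
    results = [outq.get() for _ in range(len(frames))]
    return sorted(results)              # restore frame order, as the docs advise
```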
## Contracts
Bases: BaseModel
A clip ready for stage 1. Output contract of stage 0.
Frozen — all fields are immutable after construction. Downstream stages must never mutate a Clip; create a new one if a field needs to change.
Attributes:

| Name | Type | Description |
|---|---|---|
| `name` | str | Human-readable clip name derived from the folder name. |
| `root` | Path | Absolute path to the clip folder. |
| `input_path` | Path | Path to the input asset: either the `Input/` directory (for pre-structured clips) or a video file inside `Input/` (for normalised videos). |
| `alpha_path` | Path \| None | Path to the alpha hint asset. None if absent — the interface must generate alpha externally and call `resolve_alpha()` before proceeding. |
Bases: BaseModel
Output contract of stage 1. Input to all downstream stages.
Frozen — all fields are immutable after construction. Use model_copy() to produce an updated manifest (e.g. in resolve_alpha).
Downstream stages only receive what they need — resolved frame paths, output destination, and clip metadata. All discovery, validation, and extraction decisions are made in stage 1 and are not repeated.
Attributes:

| Name | Type | Description |
|---|---|---|
| `clip_name` | str | Clip name, carried through for logging and output naming. |
| `clip_root` | Path | Absolute path to the clip folder. Parent of `Input/`, `Output/`, `Frames/`, `AlphaHint/`, etc. |
| `frames_dir` | Path | Directory containing the input frame sequence. Points to `Input/` for image-sequence clips, or `Frames/` for clips extracted from video. |
| `alpha_frames_dir` | Path \| None | Directory containing the alpha hint frame sequence. Points to `AlphaHint/` for image-sequence clips, or `AlphaFrames/` for clips extracted from video. None until alpha is resolved. |
| `output_dir` | Path | Directory where stage 6 writes all output images. Created by stage 1 under the clip root. |
| `needs_alpha` | bool | True if alpha is absent. The interface layer must generate alpha externally and call `resolve_alpha()` before preprocessing. |
| `frame_count` | int | Total number of input frames. |
| `frame_range` | tuple[int, int] | Half-open `[start, end)` range of frame indices. |
| `is_linear` | bool | True if input frames are in linear light (e.g. .exr). |
| `video_meta_path` | Path \| None | Path to the `video_meta.json` file. None if the clip was not extracted from a video. |
| `png_compression` | int | PNG compression level used during extraction (0–9). Stored so the writer stage can use the same level for consistency. |
Bases: BaseModel
Source video metadata captured at extraction time.
Carried through the pipeline so stage 6 can re-encode output with matching properties.
Attributes:

| Name | Type | Description |
|---|---|---|
| `filename` | str | Original video filename (stem + suffix). |
| `width` | int | Frame width in pixels. |
| `height` | int | Frame height in pixels. |
| `fps_num` | int | Framerate numerator. |
| `fps_den` | int | Framerate denominator. |
| `pix_fmt` | str | Pixel format string (e.g. "yuv420p"). |
| `codec_name` | str | Video codec name (e.g. "h264", "prores"). |
| `frame_count` | int | Total frame count as reported by the container. 0 if the container does not report it (use `duration_s` / fps instead). |
| `duration_s` | float \| None | Total duration in seconds. None if not reported by the container. |
| `has_audio` | bool | True if the source container has at least one audio stream. |
| `color_space` | str \| None | Color space string (e.g. "bt709"). None if not reported. |
| `color_transfer` | str \| None | Transfer characteristic (e.g. "bt709"). None if not reported. |
| `color_primaries` | str \| None | Color primaries (e.g. "bt709"). None if not reported. |
Properties:

- `estimated_frame_count`: Best-effort frame count: container value if available, else `duration * fps`.
- `fps`: Framerate as a float.
Output contract of the preprocessing stage.
Attributes:

| Name | Type | Description |
|---|---|---|
| `tensor` | Tensor | Model input tensor `[1, 4, img_size, img_size]` on device. Channels: `[R_norm, G_norm, B_norm, alpha_hint]`. dtype is float32 unless `PreprocessConfig.half_precision=True`. |
| `meta` | FrameMeta | Original frame dimensions and optional source image for postprocessing. |
Metadata carried alongside the tensor for use by postprocessing.
Attributes:

| Name | Type | Description |
|---|---|---|
| `frame_index` | int | Index of this frame within the clip's `frame_range`. |
| `original_h` | int | Frame height before resizing, in pixels. |
| `original_w` | int | Frame width before resizing, in pixels. |
| `source_image` | ndarray \| None | Original sRGB image `[H, W, 3]` float32, RGB channel order, at source resolution, used by the postprocessor's source_passthrough to replace model FG in opaque interior regions. None if source passthrough is disabled. |
| `alpha_hint` | ndarray \| None | Raw alpha hint `[H, W, 1]` float32 0-1 at source resolution, used by the postprocessor's hint_sharpen to produce a hard binary mask that eliminates soft edge tails introduced by upscaling. None if no alpha hint was provided. |
Configuration for the preprocessing stage.
Attributes:

| Name | Type | Description |
|---|---|---|
| `img_size` | int | Square resolution the model runs at. 2048 is the native training resolution — do not change unless retraining. |
| `device` | str | PyTorch device string ("cuda", "mps", "cpu"). |
| `image_upsample_mode` | ImageUpsampleMode | Interpolation mode used when the source image is smaller than `img_size`. "bicubic" (default) gives the sharpest result. "bilinear" is faster but slightly softer. Has no effect when downscaling — area mode is always used then. |
| `half_precision` | bool | If True, cast tensors to float16 before inference. |
| `source_passthrough` | bool | If True, carry the original sRGB source image in `FrameMeta` so the postprocessor can replace model FG in opaque interior regions with original source pixels. |
| `sharpen_strength` | float | Unsharp mask strength applied after upscaling. 0.3 (default) recovers softness from the antialias filter. 0.0 disables. Has no effect when downscaling. |
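The unsharp mask that `sharpen_strength` controls follows the standard form `out = img + strength * (img - blurred)`. A sketch with a simple 3×3 box blur standing in for whatever blur kernel corridorkey actually uses:

```python
import numpy as np

def unsharp_mask(img: np.ndarray, strength: float = 0.3) -> np.ndarray:
    """img: [H, W] float32 0-1. Sharpen by adding back the high-pass residual."""
    if strength == 0.0:
        return img
    # 3x3 box blur via padded neighbourhood averaging (illustrative kernel).
    padded = np.pad(img, 1, mode="edge")
    blurred = sum(
        padded[y : y + img.shape[0], x : x + img.shape[1]]
        for y in range(3)
        for x in range(3)
    ) / 9.0
    return np.clip(img + strength * (img - blurred), 0.0, 1.0)
```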
Runtime configuration for the inference stage.
Attributes:

| Name | Type | Description |
|---|---|---|
| `checkpoint_path` | Path | Path to the .pth model checkpoint file. |
| `device` | str | PyTorch device string ("cuda", "cuda:0", "mps", "cpu"). |
| `img_size` | int | Square resolution the model runs at. Must be one of 0, 512, 1024, 1536, or 2048. 0 means auto-select based on VRAM. |
| `use_refiner` | bool | Whether to enable the CNN refiner module. The refiner corrects transformer macroblocking artifacts at subject edges. Disabling it is faster but produces visibly coarser alpha mattes. |
| `mixed_precision` | bool | Run the forward pass under autocast (fp16/bf16). Ignored on CPU. Reduces VRAM usage with minimal quality impact. |
| `model_precision` | dtype | Weight dtype for the model forward pass. float32 is safe everywhere. float16/bfloat16 saves VRAM on CUDA but may reduce numerical stability. |
| `refiner_mode` | RefinerMode | Controls how the CNN refiner executes. "auto" — probe VRAM; <12 GB → tiled, else → full_frame. "full_frame" — run the refiner on the full image at once; best performance on GPUs with 12+ GB VRAM. "tiled" — run the refiner in 512×512 overlapping tiles; keeps peak VRAM flat, identical output quality to full_frame, required on low-VRAM GPUs. |
| `refiner_scale` | float | Multiplier applied to the CNN refiner's delta output. 1.0 applies full refinement. 0.0 disables the refiner output entirely (equivalent to `use_refiner=False` but without skipping the forward pass). Values between 0 and 1 blend between no refinement and full refinement. |
| `backend` | BackendChoice | Which inference backend to use. "auto" — Apple Silicon + corridorkey-mlx installed → mlx, else → torch. "torch" — always use PyTorch (CUDA / ROCm / MPS / CPU). "mlx" — always use MLX (Apple Silicon only, optional package). Can also be overridden via the CORRIDORKEY_BACKEND env var. |
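The "auto" resolution rules for `refiner_mode` and `backend` are simple enough to state as code. The VRAM probe and platform checks are abstracted into parameters here, since the real probing calls are not shown in this reference:

```python
def resolve_refiner_mode(mode: str, vram_gb: float) -> str:
    """'auto' -> tiled below 12 GB of VRAM, full_frame otherwise."""
    if mode != "auto":
        return mode
    return "tiled" if vram_gb < 12 else "full_frame"

def resolve_backend(choice: str, apple_silicon: bool, mlx_installed: bool) -> str:
    """'auto' -> mlx when on Apple Silicon with corridorkey-mlx, else torch."""
    if choice != "auto":
        return choice
    return "mlx" if (apple_silicon and mlx_installed) else "torch"
```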
Output contract of the inference stage. Input to postprocessing.
Attributes:

| Name | Type | Description |
|---|---|---|
| `alpha` | Tensor | Predicted alpha matte `[1, 1, img_size, img_size]`, sigmoid-activated, range 0-1, on device. |
| `fg` | Tensor | Predicted foreground colour `[1, 3, img_size, img_size]`, sigmoid-activated sRGB, range 0-1, on device. |
| `meta` | FrameMeta | Original frame dimensions and index, carried through from preprocessing so postprocessing can resize outputs back to source resolution. |
Configuration for the postprocessor stage.
Attributes:

| Name | Type | Description |
|---|---|---|
| `fg_upsample_mode` | FgUpsampleMode | Interpolation mode for upscaling the foreground when the model resolution is smaller than the source. "lanczos4" (default) gives the sharpest result. "bicubic" is slightly faster. "bilinear" is fastest. Downscaling always uses INTER_AREA. |
| `alpha_upsample_mode` | AlphaUpsampleMode | Interpolation mode for upscaling the alpha matte. "lanczos4" (default) gives the sharpest matte edges. "bilinear" is faster. Downscaling always uses INTER_AREA. |
| `despill_strength` | float | Green spill suppression strength (0.0 = off, 1.0 = full). |
| `auto_despeckle` | bool | Remove small disconnected alpha islands. |
| `despeckle_size` | int | Minimum connected region area in pixels to keep. |
| `despeckle_dilation` | int | Dilation radius in pixels applied after component removal to recover edges lost during binarisation. Default 25. |
| `despeckle_blur` | int | Gaussian blur radius applied after dilation to soften the hard mask edge. Default 5. |
| `checkerboard_size` | int | Tile size in pixels for the preview composite background. |
| `source_passthrough` | bool | Replace model FG in opaque interior regions with the original source pixels. Eliminates dark fringing caused by background contamination in the model FG prediction. Requires `source_image` in `FrameMeta` (set `PreprocessConfig.source_passthrough=True`). |
| `edge_erode_px` | int | Erosion radius (pixels) for the interior mask used by source_passthrough. Shrinks the interior region inward so the blend seam sits inside the subject rather than at the raw alpha edge. |
| `edge_blur_px` | int | Gaussian blur radius for the source_passthrough blend seam. Higher values produce a softer transition between model FG and source. |
| `hint_sharpen` | bool | Apply a hard binary mask derived from the alpha hint to eliminate soft edge tails introduced by upscaling. Requires an alpha hint in `FrameMeta`. Default True. |
| `hint_sharpen_dilation` | int | Dilation radius in pixels applied to the binarised hint before masking. Gives breathing room so fine model edge detail is not clipped. Default 3. |
| `debug_dump` | bool | Save raw inference output (before any postprocessing) to a debug directory under `output_dir`. |
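The despeckle pass described above (binarise, drop connected components below `despeckle_size`, then dilate and blur to soften) can be sketched with a plain BFS connected-component labelling; the real implementation presumably uses cv2 morphology, which is not assumed here:

```python
from collections import deque

def remove_small_islands(mask, min_area):
    """mask: list of lists of 0/1. Zero out 4-connected components
    smaller than min_area (the core of the despeckle step)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in mask]
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not seen[sy][sx]:
                comp, q = [], deque([(sy, sx)])
                seen[sy][sx] = True
                while q:                      # BFS over the component
                    y, x = q.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(comp) < min_area:      # island too small: remove it
                    for y, x in comp:
                        out[y][x] = 0
    return out
```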
Output contract of the postprocessor stage. Input to the writer stage.
All arrays are at original source resolution, float32, numpy.
Attributes:

| Name | Type | Description |
|---|---|---|
| `alpha` | ndarray | Alpha matte `[H, W, 1]`, linear, range 0-1. |
| `fg` | ndarray | Foreground RGB `[H, W, 3]`, sRGB straight, range 0-1. In transparent regions the values are undefined — use `processed` instead. |
| `processed` | ndarray | Premultiplied linear RGBA `[H, W, 4]`, range 0-1. This is the primary output for compositing. Transparent regions are correctly zeroed out (fg * alpha), so no black-blob artefacts. |
| `comp` | ndarray | Preview composite over checkerboard `[H, W, 3]`, sRGB, range 0-1. |
| `frame_index` | int | Frame index carried through from `FrameMeta`. |
| `source_h` | int | Original frame height in pixels. |
| `source_w` | int | Original frame width in pixels. |
| `stem` | str | Filename stem for output naming (e.g. "frame_000001"). |
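The `processed` and `comp` arrays follow directly from their descriptions: premultiply fg by alpha, and composite over a checkerboard for preview. A numpy sketch — the sRGB/linear conversion is omitted for brevity, so this is not colour-exact, and the checkerboard greys are arbitrary:

```python
import numpy as np

def premultiply(fg: np.ndarray, alpha: np.ndarray) -> np.ndarray:
    """fg [H, W, 3], alpha [H, W, 1] -> RGBA [H, W, 4], fg zeroed where alpha=0."""
    return np.concatenate([fg * alpha, alpha], axis=-1)

def checkerboard_comp(fg, alpha, tile=8, light=0.8, dark=0.5):
    """Composite the foreground over a checkerboard background."""
    h, w = alpha.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    checker = np.where(((yy // tile + xx // tile) % 2) == 0, light, dark)
    checker = np.repeat(checker[..., None], 3, axis=-1).astype(np.float32)
    return fg * alpha + checker * (1.0 - alpha)
```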
Configuration for the writer stage.
Controls which outputs are written and in what format.
All output subdirectories are created under output_dir.
Attributes:

| Name | Type | Description |
|---|---|---|
| `output_dir` | Path | Root directory for all outputs. |
| `alpha_enabled` | bool | Write the alpha matte. |
| `alpha_format` | ImageFormat | File format for alpha output ("png" or "exr"). |
| `fg_enabled` | bool | Write the straight sRGB foreground colour image. |
| `fg_format` | ImageFormat | File format for fg output ("png" or "exr"). |
| `processed_enabled` | bool | Write the premultiplied linear RGBA output. This is the primary compositor output — transparent regions are correctly zeroed out. Saved as EXR (float32) by default. |
| `processed_format` | ImageFormat | File format for processed output ("png" or "exr"). |
| `comp_enabled` | bool | Write the checkerboard preview composite. |
| `comp_format` | Literal['png'] | File format for comp output (always "png"). |
| `exr_compression` | str | EXR compression codec name. One of: "none", "rle", "zips", "zip", "piz", "pxr24", "dwaa", "dwab". |
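The writer's layout can be sketched as pure path construction: one destination per enabled output under `output_dir`, with the configured extension. The subdirectory names used here are hypothetical; the reference does not list them:

```python
from pathlib import Path

def output_paths(output_dir: Path, stem: str, cfg: dict) -> dict:
    """cfg maps output name -> (enabled, format). Returns paths to write.
    Subdirectory names ('alpha', 'fg', ...) are illustrative only."""
    paths = {}
    for name, (enabled, fmt) in cfg.items():
        if enabled:                   # disabled outputs are simply skipped
            paths[name] = output_dir / name / f"{stem}.{fmt}"
    return paths
```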