pipeline

Pipeline stage functions, runner classes, and supporting contracts exported from corridorkey.

Stage Functions

Scan a path for processable clips.

Accepts:

- A clips directory containing multiple clip subfolders
- A single clip folder (must contain Input/ and optionally AlphaHint/)
- A single video file (reorganised in-place into a clip folder structure)

Parameters:

Name Type Description Default
path str | Path

Path to a clips directory, a single clip folder, or a video file.

required
reorganise bool

If True (default), loose video files are moved into an Input/ subfolder in-place. If False, loose videos are reported as skipped rather than silently ignored.

True
events PipelineEvents | None

Optional PipelineEvents for streaming clip discovery to a GUI. on_clip_found fires for each valid clip as it is discovered. on_clip_skipped fires for each path that could not be used.

None

Returns:

Type Description
ScanResult

ScanResult with clips ready for the loader stage and any skipped paths.

Raises:

Type Description
ClipScanError

If the path does not exist or is an unrecognised file type.

PermissionError

If the top-level directory cannot be read.

OSError

If video reorganisation fails.
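The three accepted path shapes can be illustrated with a small pure-Python stand-in. This is not corridorkey's implementation — the real scan function also builds Clip objects, reorganises loose videos, and fires PipelineEvents — and the video suffix set here is an assumption for illustration.

```python
from pathlib import Path

VIDEO_SUFFIXES = {".mp4", ".mov", ".mkv"}  # assumed set, for illustration only

def classify(path: Path) -> str:
    """Return how the scan stage would treat `path`, per the rules above."""
    if path.is_file():
        if path.suffix.lower() in VIDEO_SUFFIXES:
            return "loose video (reorganised into a clip folder)"
        return "skipped (unrecognised file type)"
    if (path / "Input").is_dir():
        return "single clip folder"
    if path.is_dir():
        return "clips directory (scan subfolders)"
    raise FileNotFoundError(path)  # the real scan raises ClipScanError here
```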

Validate a clip and return its manifest.

For image sequence inputs, reads directly from Input/ and AlphaHint/. For video inputs, extracts frames into Frames/ and AlphaFrames/ without touching the original files.

Parameters:

Name Type Description Default
clip Clip

A Clip from stage 0.

required
events PipelineEvents | None

Optional PipelineEvents for extraction progress reporting.

None
png_compression int

PNG compression level for video extraction (0–9). Default 1 is recommended for intermediate frames.

DEFAULT_PNG_COMPRESSION

Returns:

Type Description
ClipManifest

ClipManifest ready for preprocessing, or for the interface to generate alpha externally via resolve_alpha() if needs_alpha is True.

Raises:

Type Description
FrameMismatchError

If frame validation fails (for example, the input and alpha frame counts do not match).

ExtractionError

If video extraction fails.

Update a manifest with an externally generated alpha sequence.

Called by the interface layer (CLI, GUI, etc.) after it has generated alpha frames using an external tool. Alpha generation is not a pipeline stage — it is entirely the interface's responsibility.

Validates the provided alpha directory matches the stored frame count (using the count already in the manifest — no re-scan of the input directory), then returns an updated manifest with needs_alpha=False and alpha_frames_dir set, ready for preprocessing.

Parameters:

Name Type Description Default
manifest ClipManifest

A ClipManifest with needs_alpha=True.

required
alpha_frames_dir Path

Path to the directory containing the generated alpha frame sequence.

required

Returns:

Type Description
ClipManifest

Updated ClipManifest with needs_alpha=False, ready for preprocessing.

Raises:

Type Description
ValueError

If manifest already has alpha or the directory doesn't exist.

FrameMismatchError

If the alpha frame count doesn't match manifest.frame_count.

Read VideoMetadata from the clip root directory, if present.

Parameters:

Name Type Description Default
clip_root Path

Clip root directory.

required

Returns:

Type Description
VideoMetadata | None

VideoMetadata if video_meta.json exists, None otherwise.

Preprocess one frame from a clip for model inference.

Parameters:

Name Type Description Default
manifest ClipManifest

ClipManifest from stage 1. Must have needs_alpha=False.

required
i int

Frame index within manifest.frame_range.

required
config PreprocessConfig

Preprocessing configuration.

required
image_files list[Path] | None

Pre-sorted image paths (build once per clip, pass every frame).

None
alpha_files list[Path] | None

Pre-sorted alpha paths (build once per clip, pass every frame).

None

Returns:

Type Description
PreprocessedFrame

PreprocessedFrame containing the model input tensor on device plus FrameMeta for postprocessing.

Raises:

Type Description
ValueError

If manifest still needs alpha or i is out of range.

FrameReadError

If a frame file cannot be read.

Load a GreenFormer model from a checkpoint.

Constructs the model architecture, moves it to the configured device, loads the checkpoint, applies dtype fixups, and optionally compiles the model with torch.compile.

Parameters:

Name Type Description Default
config InferenceConfig

InferenceConfig with checkpoint_path, device, img_size, use_refiner, model_precision, and refiner_mode.

required
resolved_refiner_mode str | None

The concrete refiner mode after "auto" has been resolved (either "full_frame" or "tiled"). If None, uses config.refiner_mode directly (must not be "auto").

None

Returns:

Type Description
Module

GreenFormer in eval mode on config.device, ready for inference.

Raises:

Type Description
FileNotFoundError

If the checkpoint file does not exist.

RuntimeError

If the checkpoint cannot be loaded.

Run model inference on a single preprocessed frame.

Parameters:

Name Type Description Default
frame PreprocessedFrame

Output of the preprocessing stage. tensor is [1, 4, H, W] on config.device, already ImageNet-normalised.

required
model Module

Loaded GreenFormer in eval mode.

required
config InferenceConfig

Inference configuration.

required
resolved_refiner_mode str | None

Pre-resolved refiner mode ("full_frame" or "tiled"). When provided, skips the per-frame VRAM probe that _should_tile_refiner would otherwise perform for "auto" mode. TorchBackend.run() always passes this to avoid pynvml overhead on every frame.

None

Returns:

Type Description
InferenceResult

InferenceResult with alpha and fg tensors on device, plus FrameMeta.
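A sketch combining the model-load and per-frame inference stages. `load_model` is named in the text above; the inference function name `run_inference` and the `preprocessed_frames` iterable are assumptions standing in for names the extracted page omits.

```python
from pathlib import Path
import torch
from corridorkey.pipeline import InferenceConfig, load_model, run_inference  # `run_inference` name assumed

config = InferenceConfig(
    checkpoint_path=Path("checkpoints/greenformer.pth"),
    device="cuda",
    img_size=2048,                 # native training resolution
    use_refiner=True,
    model_precision=torch.float32,
    refiner_mode="full_frame",     # must be concrete; "auto" is resolved upstream
)
model = load_model(config, resolved_refiner_mode="full_frame")

for frame in preprocessed_frames:  # PreprocessedFrame objects from the preprocessing stage
    # Passing resolved_refiner_mode skips the per-frame VRAM probe.
    result = run_inference(frame, model, config, resolved_refiner_mode="full_frame")
    # result.alpha: [1, 1, S, S] and result.fg: [1, 3, S, S], both on config.device
```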

Postprocess a single inference result into output-ready numpy arrays.

Parameters:

Name Type Description Default
result InferenceResult

InferenceResult from the inference stage.

required
config PostprocessConfig

Postprocessing options (despill, despeckle, source_passthrough).

required
stem str

Filename stem for output naming (e.g. "frame_000001"). Defaults to "frame_{frame_index:06d}" when empty.

''
output_dir Path | None

Root output directory. Required when config.debug_dump=True.

None

Returns:

Type Description
PostprocessedFrame

PostprocessedFrame with all arrays at source resolution, float32.
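The default stem rule above ("frame_{frame_index:06d}" when the stem argument is empty) can be written as a one-line helper; the helper name is ours, for illustration.

```python
def output_stem(frame_index: int, stem: str = "") -> str:
    """Mirror the postprocessor's stem default: fall back to a zero-padded frame name."""
    return stem if stem else f"frame_{frame_index:06d}"
```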

Write all enabled outputs for one postprocessed frame to disk.

Parameters:

Name Type Description Default
frame PostprocessedFrame

PostprocessedFrame from the postprocessor stage.

required
config WriteConfig

WriteConfig controlling which outputs to write and where.

required

Raises:

Type Description
OSError

If any cv2.imwrite call fails.

Runner Classes

Runs the full pipeline for a single clip on a single device.

Instantiate once per clip, call run(), then discard.

Parameters:

Name Type Description Default
manifest ClipManifest

ClipManifest from the loader stage. Must have needs_alpha=False.

required
config PipelineConfig

Pipeline configuration.

required

run()

Run the pipeline for the clip. Blocks until all frames are done.
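A hypothetical per-clip driver, one PipelineRunner per clip as prescribed above. PipelineRunner and the config classes appear on this page; the stage-function names `scan` and `load_clip` are assumptions, since the extracted page omits the function names.

```python
from pathlib import Path
from corridorkey.pipeline import (  # `scan` / `load_clip` names assumed
    PipelineConfig, PipelineRunner, PreprocessConfig, InferenceConfig,
    PostprocessConfig, load_clip, scan,
)

checkpoint = Path("checkpoints/greenformer.pth")
for clip in scan(Path("projects/shoot_01/clips")).clips:
    manifest = load_clip(clip)            # stage 1: validate / extract frames
    if manifest.needs_alpha:
        continue                          # interface must run resolve_alpha() first
    config = PipelineConfig(
        preprocess=PreprocessConfig(device="cuda"),
        inference=InferenceConfig(checkpoint_path=checkpoint, device="cuda"),
        postprocess=PostprocessConfig(),
        write=None,                       # default WriteConfig derived from manifest output_dir
    )
    PipelineRunner(manifest, config).run()  # instantiate once per clip, run, discard
```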

Configuration for the full pipeline runner.

Attributes:

Name Type Description
preprocess PreprocessConfig

Preprocessing stage config (img_size, device, strategy).

inference InferenceConfig | None

Inference stage config (checkpoint, device, precision). None means inference is skipped (preprocess-only mode).

postprocess PostprocessConfig

Postprocessing stage config (despill, despeckle, checkerboard).

write WriteConfig | None

Writer stage config (formats, enabled outputs). None means a default WriteConfig is derived from the manifest output_dir.

input_queue_depth int

Max preprocessed frames waiting for inference. Keep small — each frame is ~64MB on GPU at 2048 resolution.

output_queue_depth int

Max inference outputs waiting for postprocessing.

resolved_refiner_mode str | None

Pre-resolved refiner mode ("full_frame" or "tiled"). When set, PipelineRunner skips the VRAM probe entirely and uses this value directly. Populated by to_pipeline_config() so the VRAM probe that already ran for img_size resolution is not repeated.

events PipelineEvents | None

Optional event callbacks for all pipeline stages. Pass a PipelineEvents instance to receive per-frame and per-stage progress notifications.

Runs the pipeline across multiple GPUs in parallel (frame-level dispatch).

Each GPU gets its own model instance loaded in its own thread. All GPU workers pull from a single shared input queue and push to a single shared output queue. The preprocessor and postwriter are single-threaded as usual.

Frame ordering: output frames may arrive out of order when multiple GPUs are running at different speeds. The PostWriteWorker writes frames as they arrive — if strict ordering is required, the caller should sort by InferenceResult.meta.frame_index after the run.

Parameters:

Name Type Description Default
manifest ClipManifest

ClipManifest from the loader stage.

required
config MultiGPUConfig

MultiGPUConfig with a list of device strings.

required

run()

Run the pipeline. Blocks until all frames are written.
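A hypothetical multi-GPU sketch. MultiGPUConfig and resolve_devices are named on this page; the runner class name `MultiGPURunner` is an assumption, as is an already-resolved `manifest` with needs_alpha=False.

```python
from pathlib import Path
from corridorkey.pipeline import (  # `MultiGPURunner` name assumed
    MultiGPUConfig, MultiGPURunner, InferenceConfig, PreprocessConfig,
    PostprocessConfig, resolve_devices,
)

devices = resolve_devices("all")           # e.g. ["cuda:0", "cuda:1"]
config = MultiGPUConfig(
    devices=devices,
    inference=InferenceConfig(             # device field is overridden per worker
        checkpoint_path=Path("checkpoints/greenformer.pth"), device="cuda",
    ),
    preprocess=PreprocessConfig(),
    postprocess=PostprocessConfig(),
    input_queue_depth=2 * len(devices),    # one frame in flight + one buffered per GPU
)
MultiGPURunner(manifest, config).run()     # blocks until all frames are written
```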

Configuration for multi-GPU frame-level parallel inference.

Each device gets its own model instance and InferenceWorker thread. All workers share a single input queue (preprocessed frames) and a single output queue (inference results), so the pipeline naturally load-balances — whichever GPU finishes first picks up the next frame.

Attributes:

Name Type Description
devices list[str]

List of PyTorch device strings to use (e.g. ["cuda:0", "cuda:1"]). Must have at least one entry. Use resolve_devices("all") to populate this from all available CUDA GPUs.

inference InferenceConfig

Base InferenceConfig. The device field is overridden per-worker — all other fields (checkpoint, precision, etc.) are shared across all GPUs.

preprocess PreprocessConfig

Preprocessing config. Runs on CPU (device-agnostic).

postprocess PostprocessConfig

Postprocessing config.

write WriteConfig | None

Writer config. None → default derived from manifest output_dir.

input_queue_depth int

Shared input queue depth. Scale with GPU count — a depth of 2×N gives each GPU a frame in flight plus one buffered.

output_queue_depth int

Shared output queue depth.

events PipelineEvents | None

Optional pipeline event callbacks.

Contracts

Bases: BaseModel

A clip ready for stage 1. Output contract of stage 0.

Frozen — all fields are immutable after construction. Downstream stages must never mutate a Clip; create a new one if a field needs to change.

Attributes:

Name Type Description
name str

Human-readable clip name derived from the folder name.

root Path

Absolute path to the clip folder.

input_path Path

Path to the input asset. Either the Input/ directory (for pre-structured clips) or a video file inside Input/ (for normalised videos).

alpha_path Path | None

Path to the alpha hint asset. None if absent — the interface must generate alpha externally and call resolve_alpha() before proceeding.

Bases: BaseModel

Output contract of stage 1. Input to all downstream stages.

Frozen — all fields are immutable after construction. Use model_copy() to produce an updated manifest (e.g. in resolve_alpha).

Downstream stages only receive what they need — resolved frame paths, output destination, and clip metadata. All discovery, validation, and extraction decisions are made in stage 1 and are not repeated.

Attributes:

Name Type Description
clip_name str

Clip name, carried through for logging and output naming.

clip_root Path

Absolute path to the clip folder. Parent of Input/, Output/, Frames/, AlphaHint/, etc.

frames_dir Path

Directory containing the input frame sequence. Points to Input/ for image sequence inputs, or Frames/ for video inputs (extracted by stage 1).

alpha_frames_dir Path | None

Directory containing the alpha hint frame sequence. Points to AlphaHint/ for image sequence inputs, or AlphaFrames/ for video inputs. None if absent — the interface layer is responsible for generating alpha externally and calling resolve_alpha() to update the manifest before proceeding.

output_dir Path

Directory where stage 6 writes all output images. Created by stage 1 at clip/Output/.

needs_alpha bool

True if alpha is absent. The interface layer must generate alpha externally and call resolve_alpha() before proceeding.

frame_count int

Total number of input frames.

frame_range tuple[int, int]

Half-open range (start, end) of frames to process. Defaults to (0, frame_count) — the full sequence. Can be narrowed for partial runs or testing.

is_linear bool

True if input frames are in linear light (e.g. .exr).

video_meta_path Path | None

Path to the video_meta.json sidecar file written by stage 1 during video extraction. None for image sequence inputs. Stage 6 reads this to re-encode output with matching properties.

png_compression int

PNG compression level used during extraction (0–9). Stored so the writer stage can use the same level for consistency.
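The frame_range contract above is a half-open (start, end) pair; a trivial helper (name ours, for illustration) makes the exclusivity of `end` explicit.

```python
def frames_to_process(frame_range: tuple[int, int]) -> list[int]:
    """Expand a half-open (start, end) frame_range into concrete frame indices."""
    start, end = frame_range
    return list(range(start, end))  # half-open: `end` itself is excluded
```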

Bases: BaseModel

Source video metadata captured at extraction time.

Carried through the pipeline so stage 6 can re-encode output with matching properties.

Attributes:

Name Type Description
filename str

Original video filename (stem + suffix).

width int

Frame width in pixels.

height int

Frame height in pixels.

fps_num int

Framerate numerator.

fps_den int

Framerate denominator.

pix_fmt str

Pixel format string (e.g. "yuv420p").

codec_name str

Video codec name (e.g. "h264", "prores").

frame_count int

Total frame count as reported by the container. 0 if the container does not report it (use duration_s / fps instead).

duration_s float | None

Total duration in seconds. None if not reported by container.

has_audio bool

True if the source container has at least one audio stream.

color_space str | None

Color space string (e.g. "bt709"). None if not reported.

color_transfer str | None

Transfer characteristic (e.g. "bt709"). None if not reported.

color_primaries str | None

Color primaries (e.g. "bt709"). None if not reported.

estimated_frame_count property

Best-effort frame count: container value if available, else duration * fps.

fps property

Framerate as a float.

Output contract of the preprocessing stage.

Attributes:

Name Type Description
tensor Tensor

Model input tensor [1, 4, img_size, img_size] on device. Channels: [R_norm, G_norm, B_norm, alpha_hint]. dtype is float32 unless PreprocessConfig.half_precision=True.

meta FrameMeta

Original frame dimensions and optional source image for postprocessing.

Metadata carried alongside the tensor for use by postprocessing.

Attributes:

Name Type Description
frame_index int

Index of this frame within the clip's frame_range.

original_h int

Frame height before resizing, in pixels.

original_w int

Frame width before resizing, in pixels.

source_image ndarray | None

Original sRGB image [H, W, 3] float32, RGB channel order, at source resolution, used by postprocessor source_passthrough to replace model FG in opaque interior regions. None if source passthrough is disabled.

alpha_hint ndarray | None

Raw alpha hint [H, W, 1] float32 0-1 at source resolution, used by postprocessor hint_sharpen to produce a hard binary mask that eliminates soft edge tails introduced by upscaling. None if no alpha hint was provided.

Configuration for the preprocessing stage.

Attributes:

Name Type Description
img_size int

Square resolution the model runs at. 2048 is the native training resolution — do not change unless retraining.

device str

PyTorch device string ("cuda", "mps", "cpu").

image_upsample_mode ImageUpsampleMode

Interpolation mode used when the source image is smaller than img_size. "bicubic" (default) gives the sharpest result. "bilinear" is faster but slightly softer. Has no effect when downscaling — area mode is always used then.

half_precision bool

If True, cast tensors to float16 before inference.

source_passthrough bool

If True, carry the original sRGB source image in FrameMeta so the postprocessor can replace model FG in opaque interior regions with original source pixels.

sharpen_strength float

Unsharp mask strength applied after upscaling. 0.3 (default) recovers softness from the antialias filter. 0.0 disables. Has no effect when downscaling.

Runtime configuration for the inference stage.

Attributes:

Name Type Description
checkpoint_path Path

Path to the .pth model checkpoint file.

device str

PyTorch device string ("cuda", "cuda:0", "mps", "cpu").

img_size int

Square resolution the model runs at. Must be one of 0, 512, 1024, 1536, or 2048. 0 means auto-select based on VRAM (resolved by pipeline.to_inference_config() before the model is loaded — do not pass 0 directly to load_model). 2048 is the native training resolution and produces the best output. Smaller values reduce VRAM usage at the cost of output quality.

use_refiner bool

Whether to enable the CNN refiner module. The refiner corrects transformer macroblocking artifacts at subject edges. Disabling it is faster but produces visibly coarser alpha mattes.

mixed_precision bool

Run the forward pass under autocast (fp16/bf16). Ignored on CPU. Reduces VRAM usage with minimal quality impact.

model_precision dtype

Weight dtype for the model forward pass. float32 is safe everywhere. float16/bfloat16 saves VRAM on CUDA but may reduce numerical stability.

refiner_mode RefinerMode

Controls how the CNN refiner executes. "auto" — probe VRAM; <12 GB → tiled, else → full_frame. "full_frame" — run the refiner on the full image at once. Best performance on GPUs with 12+ GB VRAM. "tiled" — run the refiner in 512×512 overlapping tiles. Keeps peak VRAM flat. Identical output quality to full_frame. Required on low-VRAM GPUs.

refiner_scale float

Multiplier applied to the CNN refiner's delta output. 1.0 applies full refinement. 0.0 disables the refiner output entirely (equivalent to use_refiner=False but without skipping the forward pass). Values between 0 and 1 blend between no refinement and full refinement.

backend BackendChoice

Which inference backend to use. "auto" — Apple Silicon + corridorkey-mlx installed → mlx, else → torch. "torch" — always use PyTorch (CUDA / ROCm / MPS / CPU). "mlx" — always use MLX (Apple Silicon only, optional package). Can also be overridden via the CORRIDORKEY_BACKEND env var.
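The "auto" refiner-mode rule stated above (<12 GB free VRAM → tiled, otherwise full_frame) reduces to a small function; the threshold comes from the text, the helper name is ours.

```python
def resolve_refiner_mode(mode: str, free_vram_gb: float) -> str:
    """Resolve "auto" to a concrete refiner mode using the documented 12 GB cutoff."""
    if mode != "auto":
        return mode                      # "full_frame" or "tiled" pass through unchanged
    return "tiled" if free_vram_gb < 12 else "full_frame"
```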

Output contract of the inference stage. Input to postprocessing.

Attributes:

Name Type Description
alpha Tensor

Predicted alpha matte [1, 1, img_size, img_size], sigmoid-activated, range 0-1, on device.

fg Tensor

Predicted foreground colour [1, 3, img_size, img_size], sigmoid-activated sRGB range 0-1, on device.

meta FrameMeta

Original frame dimensions and index, carried through from preprocessing so postprocessing can resize outputs back to source resolution.

Configuration for the postprocessor stage.

Attributes:

Name Type Description
fg_upsample_mode FgUpsampleMode

Interpolation mode for upscaling the foreground when the model resolution is smaller than the source. "lanczos4" (default) gives the sharpest result. "bicubic" is slightly faster. "bilinear" is fastest. Downscaling always uses INTER_AREA.

alpha_upsample_mode AlphaUpsampleMode

Interpolation mode for upscaling the alpha matte. "lanczos4" (default) gives the sharpest matte edges. "bilinear" is faster. Downscaling always uses INTER_AREA.

despill_strength float

Green spill suppression strength (0.0 = off, 1.0 = full).

auto_despeckle bool

Remove small disconnected alpha islands.

despeckle_size int

Minimum connected region area in pixels to keep.

despeckle_dilation int

Dilation radius in pixels applied after component removal to recover edges lost during binarisation. Default 25.

despeckle_blur int

Gaussian blur radius applied after dilation to soften the hard mask edge. Default 5.

checkerboard_size int

Tile size in pixels for the preview composite background.

source_passthrough bool

Replace model FG in opaque interior regions with the original source pixels. Eliminates dark fringing caused by background contamination in the model FG prediction. Requires source_image in FrameMeta (set PreprocessConfig.source_passthrough=True).

edge_erode_px int

Erosion radius (pixels) for the interior mask used by source_passthrough. Shrinks the interior region inward so the blend seam sits inside the subject rather than at the raw alpha edge.

edge_blur_px int

Gaussian blur radius for the source_passthrough blend seam. Higher values produce a softer transition between model FG and source.

hint_sharpen bool

Apply a hard binary mask derived from the alpha hint to eliminate soft edge tails introduced by upscaling. Requires an alpha hint in FrameMeta. Default True.

hint_sharpen_dilation int

Dilation radius in pixels applied to the binarised hint before masking. Gives breathing room so fine model edge detail is not clipped. Default 3.

debug_dump bool

Save raw inference output (before any postprocessing) to a debug/ subfolder alongside the normal outputs. Writes four PNGs per frame: raw_alpha, raw_fg, post_hint_alpha, post_hint_fg. Useful for diagnosing whether quality issues originate in the model or in postprocessing. Default False.

Output contract of the postprocessor stage. Input to the writer stage.

All arrays are at original source resolution, float32, numpy.

Attributes:

Name Type Description
alpha ndarray

Alpha matte [H, W, 1], linear, range 0-1.

fg ndarray

Foreground RGB [H, W, 3], sRGB straight, range 0-1. In transparent regions the values are undefined — use processed for compositing work.

processed ndarray

Premultiplied linear RGBA [H, W, 4], range 0-1. This is the primary output for compositing. Transparent regions are correctly zeroed out (fg * alpha), so no black-blob artefacts.

comp ndarray

Preview composite over checkerboard [H, W, 3], sRGB, range 0-1.

frame_index int

Frame index carried through from FrameMeta.

source_h int

Original frame height in pixels.

source_w int

Original frame width in pixels.

stem str

Filename stem for output naming (e.g. "frame_000001").
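The premultiplied relationship between `processed`, `fg`, and `alpha` can be shown with numpy. Note this sketch elides the sRGB-to-linear conversion the real postprocessor performs before premultiplying.

```python
import numpy as np

def premultiply(fg: np.ndarray, alpha: np.ndarray) -> np.ndarray:
    """fg [H, W, 3] and alpha [H, W, 1] -> RGBA [H, W, 4]; fg is zeroed where alpha == 0."""
    return np.concatenate([fg * alpha, alpha], axis=-1).astype(np.float32)
```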

Configuration for the writer stage.

Controls which outputs are written and in what format. All output subdirectories are created under output_dir.

Attributes:

Name Type Description
output_dir Path

Root directory for all outputs.

alpha_enabled bool

Write the alpha matte.

alpha_format ImageFormat

File format for alpha output ("png" or "exr").

fg_enabled bool

Write the straight sRGB foreground colour image.

fg_format ImageFormat

File format for fg output ("png" or "exr").

processed_enabled bool

Write the premultiplied linear RGBA output. This is the primary compositor output — transparent regions are correctly zeroed out. Saved as EXR (float32) by default.

processed_format ImageFormat

File format for processed output ("png" or "exr").

comp_enabled bool

Write the checkerboard preview composite.

comp_format Literal['png']

File format for comp output (always "png").

exr_compression str

EXR compression codec name. One of: "none", "rle", "zips", "zip", "piz", "pxr24", "dwaa", "dwab".