Add comprehensive LTX-Video support (LTXVideo, LTXVideoI2V, LTX2Video, LTX2VideoI2V) #35
Copilot wants to merge 1 commit into
…, LTX2VideoI2V nodes
Agent-Logs-Url: https://github.com/nodetool-ai/nodetool-huggingface/sessions/3a47e6ee-3e27-42ae-bb22-f77a93bee7b9
Co-authored-by: georgi <19498+georgi@users.noreply.github.com>
Pull request overview
Adds new HuggingFace pipeline nodes to generate videos with Lightricks LTX-Video and LTX-2 models, covering both text-to-video and image-to-video workflows.
Changes:
- Add `LTXVideo` and `LTX2Video` text-to-video nodes using diffusers LTX/LTX2 pipelines.
- Add `LTXVideoI2V` and `LTX2VideoI2V` image-to-video nodes using diffusers LTX/LTX2 image2video pipelines.
- Introduce model-variant selection, seed control, CPU offload, and VAE slicing/tiling options for these nodes.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| src/nodetool/nodes/huggingface/text_to_video.py | Adds LTX-Video and LTX-2 text-to-video nodes and related diffusers pipeline imports. |
| src/nodetool/nodes/huggingface/image_to_video.py | Adds LTX-Video and LTX-2 image-to-video nodes and related diffusers pipeline imports. |
    frame_rate: int = Field(
        default=25,
        description="Frame rate for the output video file.",
        ge=1,
        le=60,
    )
This node introduces frame_rate for output FPS, but other image-to-video nodes in this file use the fps field name. For consistency across nodes (and to reduce UI/API special-casing), consider renaming this field to fps and using it consistently for video_from_frames output metadata; if the pipeline requires frame_rate, pass frame_rate=self.fps internally.
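A minimal sketch of the rename described in this comment, using a simplified stand-in for the node. `LTXNodeSketch` and `pipeline_kwargs` are hypothetical names, and the idea that the pipeline takes a `frame_rate` argument comes from the comment itself rather than from a verified signature.

```python
from dataclasses import dataclass


@dataclass
class LTXNodeSketch:
    """Hypothetical stand-in for the node; only the FPS handling is shown."""

    prompt: str
    fps: int = 25  # public field name, matching the other video nodes

    def pipeline_kwargs(self) -> dict:
        # Translate the node-level name to the name the pipeline is assumed
        # to expect, so callers of the node never see `frame_rate`.
        return {"prompt": self.prompt, "frame_rate": self.fps}


node = LTXNodeSketch(prompt="a cat walking", fps=30)
kwargs = node.pipeline_kwargs()
```

With this shape the same `fps` value can also be handed to the video encoding helper, so the node exposes a single name for the frame rate.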
    frame_rate: int = Field(
        default=25,
        description="Frame rate for the output video file.",
        ge=1,
        le=60,
    )
This node uses frame_rate for output FPS, but the rest of the video nodes in this module use the fps field name (and pass fps=... to video_from_frames/video_from_numpy). Using a different name makes the API/UI inconsistent and increases maintenance burden. Consider renaming this field to fps (and, if the diffusers pipeline needs frame_rate, pass frame_rate=self.fps internally).
Suggested change:

    - frame_rate: int = Field(
    -     default=25,
    -     description="Frame rate for the output video file.",
    -     ge=1,
    -     le=60,
    - )
    + fps: int = Field(
    +     default=25,
    +     alias="frame_rate",
    +     description="Frame rate for the output video file.",
    +     ge=1,
    +     le=60,
    + )
    +
    + @property
    + def frame_rate(self) -> int:
    +     return self.fps
    frame_rate: float = Field(
        default=24.0,
        description="Frame rate for the output video file.",
        ge=1.0,
        le=60.0,
frame_rate is defined as a float but video_from_frames requires an integer FPS, and the implementation truncates via int(self.frame_rate). This can silently change user input (e.g., 23.976 → 23). Prefer an fps: int field (consistent with other nodes) or validate/round explicitly so the stored metadata matches the requested frame rate.
Suggested change:

    - frame_rate: float = Field(
    -     default=24.0,
    -     description="Frame rate for the output video file.",
    -     ge=1.0,
    -     le=60.0,
    + frame_rate: int = Field(
    +     default=24,
    +     description="Frame rate for the output video file in integer FPS.",
    +     ge=1,
    +     le=60,
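To illustrate the truncation issue raised in this comment, here is a small sketch comparing `int()` with explicit rounding. `effective_fps` is a hypothetical helper, not part of the PR; the 1 to 60 bounds are copied from the field constraints shown above.

```python
def effective_fps(frame_rate: float, lo: int = 1, hi: int = 60) -> int:
    """Round a requested frame rate to the nearest integer, then clamp it."""
    # Note: Python's round() sends exact .5 ties to the nearest even integer.
    rounded = round(frame_rate)
    return max(lo, min(hi, rounded))


truncated = int(23.976)          # drops the fractional part
rounded = effective_fps(23.976)  # rounds to the nearest integer instead
```

Either approach still changes a fractional request, so reporting the effective value in the output metadata (as the comment proposes) keeps the stored FPS honest.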
    num_frames: int = Field(
        default=161,
        description="Total frames in the output video. Must be 8n+1 (e.g. 65, 97, 129, 161). More frames = longer video.",
        ge=9,
        le=257,
    )
num_frames is documented as requiring the form 8n+1, but the current validation only enforces a range (ge/le). This allows invalid values (e.g., 10) that contradict the docs and may cause runtime errors. Add explicit validation (e.g., enforce (num_frames - 1) % 8 == 0) or constrain the UI/field to only valid values.
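The check proposed here can be sketched as plain functions. `validate_num_frames` and `nearest_valid_num_frames` are hypothetical helpers; wiring the first one into the node (for example through a Pydantic field validator) is an assumption about how the PR might adopt it. The 9 to 257 bounds are taken from the field definition above.

```python
def validate_num_frames(num_frames: int) -> int:
    """Reject frame counts that are not of the form 8n + 1."""
    if (num_frames - 1) % 8 != 0:
        raise ValueError(
            f"num_frames must be 8n+1 (e.g. 65, 97, 129, 161), got {num_frames}"
        )
    return num_frames


def nearest_valid_num_frames(num_frames: int, lo: int = 9, hi: int = 257) -> int:
    """Snap an arbitrary frame count to the closest 8n + 1 value within [lo, hi]."""
    n = round((num_frames - 1) / 8)
    return max(lo, min(hi, 8 * n + 1))


ok = validate_num_frames(161)           # passes through unchanged
snapped = nearest_valid_num_frames(100)  # closest valid count to 100
```

Raising an error matches the documented contract strictly; snapping is the more forgiving alternative if the UI should accept any integer.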
    frame_rate: float = Field(
        default=24.0,
        description="Frame rate for the output video file.",
        ge=1.0,
        le=60.0,
frame_rate is a float but the implementation later truncates to an int for encoding (fps = int(self.frame_rate)), which can silently change user intent (e.g., 23.976 → 23). Prefer an fps: int field (matching other nodes) or validate/round explicitly and reflect the effective FPS in the output metadata.
Suggested change:

    - frame_rate: float = Field(
    -     default=24.0,
    -     description="Frame rate for the output video file.",
    -     ge=1.0,
    -     le=60.0,
    + frame_rate: int = Field(
    +     default=24,
    +     description="Integer frame rate (FPS) for the output video file.",
    +     ge=1,
    +     le=60,
    class LTXVideoI2V(HuggingFacePipelineNode):
        """
        Animates a static image into a video using LTX-Video image-to-video diffusion models.
        video, generation, AI, image-to-video, diffusion, LTX, Lightricks, animation

        Use cases:
        - Animate photographs or artwork into dynamic video clips
        - Create motion-driven video from a single input image with text guidance
        - Generate content for social media or creative projects from still images
        - Bring product images or illustrations to life
        - Produce smooth animated sequences with fine temporal control

        **Note:** LTX-Video-0.9.5 is recommended for best quality.
        """
New LTX image-to-video nodes are added without a lightweight import/smoke test. Consider adding an import-only test to ensure these nodes (and the diffusers.pipelines.ltx* imports) remain available with the supported dependency versions, similar to other HuggingFace node smoke tests.
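An import-only check along these lines might look like the sketch below. The pipeline class names are the ones listed in the PR summary; that they resolve on the top-level `diffusers` package is an assumption, and `missing_attributes` and `check_ltx_pipelines` are hypothetical helpers rather than existing test utilities in this repository.

```python
import importlib
from typing import List, Optional

# Class names as given in the PR summary; their export location is assumed.
LTX_PIPELINES = [
    "LTXPipeline",
    "LTXImageToVideoPipeline",
    "LTX2Pipeline",
    "LTX2ImageToVideoPipeline",
]


def missing_attributes(module_name: str, names: List[str]) -> Optional[List[str]]:
    """Return names that cannot be resolved on the module.

    Returns None when the module itself cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    missing = []
    for name in names:
        try:
            getattr(module, name)  # lazy modules may raise on first access
        except Exception:
            missing.append(name)
    return missing


def check_ltx_pipelines() -> None:
    """Body for a smoke test: fail if any LTX pipeline class is unavailable."""
    missing = missing_attributes("diffusers", LTX_PIPELINES)
    assert missing is not None, "diffusers is not installed"
    assert missing == [], f"missing pipelines: {missing}"
```

A test function in the repository's test suite could simply call `check_ltx_pipelines()`; no model weights are downloaded, so the check stays cheap.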
    num_frames: int = Field(
        default=161,
        description="Total frames in the output video. Must be 8n+1 (e.g. 65, 97, 129, 161). More frames = longer video.",
        ge=9,
        le=257,
    )
num_frames is documented as requiring the form 8n+1, but the current validation only enforces a range (ge/le). This allows values like 10 or 100 that will violate the documented requirement and may fail at runtime. Add explicit validation (e.g., a Pydantic validator enforcing (num_frames - 1) % 8 == 0) or adjust the field constraints/UI to only permit valid values.
        default=1024,
        description="Maximum prompt encoding length. LTX-2 supports long prompts up to 1024 tokens.",
        ge=64,
        le=2048,
The description says LTX-2 supports prompts up to 1024 tokens, but the field allows values up to 2048 (le=2048). Either update the description to match the actual supported limit or tighten the constraint to avoid allowing values that the model/pipeline may reject.
Suggested change:

    - le=2048,
    + le=1024,
    - Generate videos with improved temporal consistency and visual fidelity
    - Produce cinematic content with detailed motion and scene description
    - Build next-generation AI video generation workflows
    - Create visual content with support for audio output
The docstring claims “support for audio output”, but the node only returns frames/video and doesn’t expose any audio-related parameters or outputs. If audio isn’t actually supported here, please remove/adjust this bullet to avoid misleading users; if it is supported, the node should surface the audio output in its return type/metadata.
Suggested change:

    - - Create visual content with support for audio output
    + - Create visual content for video-first generation workflows
    class LTXVideo(HuggingFacePipelineNode):
        """
        Generates high-quality videos from text prompts using LTX-Video diffusion models.
        video, generation, AI, text-to-video, diffusion, LTX, Lightricks, cinematic

        Use cases:
        - Create smooth, high-fidelity videos from detailed text descriptions
        - Generate cinematic and artistic video content
        - Produce short animated clips for social media and creative projects
        - Build AI-driven video generation pipelines
        - Create visual content with fine-grained temporal control

        **Note:** LTX-Video-0.9.5 is recommended for best quality; older versions are also supported.
        """
New LTX nodes are added, but there’s no accompanying lightweight test to ensure these modules/nodes can be imported in the supported dependency set (e.g., that the diffusers.pipelines.ltx* entrypoints exist). Consider adding an import-only smoke test similar to existing HuggingFace node tests to catch dependency/version regressions early.
Summary
Implements comprehensive support for LTX-Video (Lightricks) models, including the newest LTX-2 architecture, with both text-to-video and image-to-video pipelines.
Changes
`src/nodetool/nodes/huggingface/text_to_video.py`
- `LTXVideo`: Text-to-video using `LTXPipeline` with three model variants:
  - `Lightricks/LTX-Video-0.9.5` (recommended, latest)
  - `Lightricks/LTX-Video-0.9.1`
  - `Lightricks/LTX-Video` (original)
- `LTX2Video`: Text-to-video using `LTX2Pipeline` for `Lightricks/LTX-2` (newest architecture with Gemma3 text encoder, long prompt support up to 1024 tokens)

`src/nodetool/nodes/huggingface/image_to_video.py`
- `LTXVideoI2V`: Image-to-video using `LTXImageToVideoPipeline` with the same three model variants as above
- `LTX2VideoI2V`: Image-to-video using `LTX2ImageToVideoPipeline` for `Lightricks/LTX-2`

Design Details
All nodes follow the established patterns in this repository:
- `enable_model_cpu_offload()` to reduce VRAM usage
- `get_basic_fields()`, `get_recommended_models()`, `get_title()` implemented
- `required_inputs()` for image-to-video nodes

LTX-Video nodes use `available_torch_dtype()` (bfloat16/float16 depending on hardware). LTX-2 nodes use `torch.bfloat16` explicitly as required by the model.