An open AI agent skill for generating videos with Alibaba's Wan models via the Atlas Cloud API. Text-to-video, image-to-video, and video-to-video, all from your terminal. Wan 2.1/2.2 are open source (Apache 2.0); Wan 2.5/2.6 are closed-source APIs. Includes LoRA support and NSFW-capable generation with Wan 2.2 Spicy.
Built for the open agent skills ecosystem: works with Claude Code, Cursor, Codex, Copilot, Gemini CLI, Windsurf, OpenCode, Kiro, and 15+ other AI coding agents.
- Text-to-Video: Generate video from a text prompt
- Image-to-Video: Animate a static image into a video
- Video-to-Video: Transform an existing video with new styles or content
- Wan Model Family: Built on Alibaba's Wan video generation models (Wan 2.1/2.2 are open source under Apache 2.0; Wan 2.5/2.6 are closed-source commercial APIs)
- LoRA Support: Use custom LoRA weights with Wan 2.2 Spicy for fine-tuned results
- NSFW Mode: Wan 2.2 Spicy for uncensored, unrestricted content generation
- Multiple Resolutions: 480p, 720p, and 1080p output
- Variable Duration: 3s to 10s video generation
- Affordable Pricing: Starting at $0.03/s with Atlas Cloud
| Model | Mode | Starting Price | Resolution | Duration | Notes |
|---|---|---|---|---|---|
| Wan 2.6 T2V | Text-to-Video | $0.07/s | Up to 720p | 5s | Latest generation, best quality |
| Wan 2.6 I2V | Image-to-Video | $0.07/s | Up to 720p | 5s | Animate any image |
| Wan 2.6 V2V | Video-to-Video | $0.07/s | Up to 720p | 5s | Transform existing videos |
| Wan 2.5 T2V | Text-to-Video | $0.05/s | Up to 720p | 5s | Balanced quality and cost |
| Wan 2.2 Spicy | Image-to-Video | $0.03/s | Up to 480p | 5s | NSFW/uncensored content |
| Wan 2.2 Spicy LoRA | Image-to-Video | $0.03/s | Up to 480p | 5s | NSFW + custom LoRA weights |
Prices shown are starting prices. Higher resolution or longer duration may cost more.
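A clip's cost is simply the per-second rate times the duration. A quick sketch using the starting prices above (actual billing may be higher at larger resolutions or durations):

```shell
# Estimate per-clip cost: starting rate (USD/s) multiplied by duration (s)
rate=0.07      # Wan 2.6 starting price per second
duration=5     # default clip length in seconds
awk -v r="$rate" -v d="$duration" 'BEGIN { printf "$%.2f\n", r * d }'   # prints $0.35
```

At these rates, even a batch of a hundred 5-second Wan 2.6 clips stays around $35.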
Requirements: Bun
```bash
# Clone the repo
git clone https://github.com/thoughtincode/wan-video-skill.git ~/tools/wan-video-skill
cd ~/tools/wan-video-skill

# Install dependencies
bun install

# Link globally (no sudo needed - uses Bun's global bin)
bun link

# Set up your API key
cp .env.example .env
# Edit .env and add your Atlas Cloud API key
```

Get an Atlas Cloud API key at Atlas Cloud.
Now you can use wan-video from anywhere.
When installed as an agent skill (via Claude Code, Cursor, Codex, Copilot, Gemini CLI, Windsurf, Kiro, and more), just say /init and your AI agent will clone the repo, install deps, and link the command for you. Then use it by saying "generate a video of..." and the agent handles the rest.
```bash
mkdir -p ~/.local/bin
ln -sf ~/tools/wan-video-skill/src/cli.ts ~/.local/bin/wan-video
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
```

```bash
# Basic text-to-video
wan-video "a cat walking across a sunny garden"

# Custom output name
wan-video "ocean waves crashing on rocks" -o waves

# Higher resolution
wan-video "futuristic city flyover" --resolution 720p

# Longer duration
wan-video "time-lapse of flowers blooming" --duration 10

# Custom output directory
wan-video "sunset timelapse" -o sunset -d ~/Videos
```

```bash
# Text-to-Video (default)
wan-video "your prompt"

# Image-to-Video: animate a static image
wan-video "make this character wave hello" --mode i2v --image character.png

# Video-to-Video: transform existing video
wan-video "convert to anime style" --mode v2v --image input.mp4
```

```bash
# Default: Wan 2.6 (latest, best quality)
wan-video "your prompt"

# Wan 2.5: balanced quality and cost
wan-video "your prompt" --model wan-2.5

# Wan 2.6 Image-to-Video
wan-video "animate this scene" --model wan-2.6-i2v --image scene.png

# Wan 2.6 Video-to-Video
wan-video "add rain effect" --model wan-2.6-v2v --image input.mp4
```

| Alias | Model ID | Best For |
|---|---|---|
| `wan-2.6`, `wan-2.6-t2v` | `alibaba/wan-2.6/text-to-video` | Best quality text-to-video |
| `wan-2.6-i2v` | `alibaba/wan-2.6/image-to-video` | Animating static images |
| `wan-2.6-v2v` | `alibaba/wan-2.6/video-to-video` | Transforming existing videos |
| `wan-2.5` | `alibaba/wan-2.5/text-to-video` | Cost-effective generation |
| `wan-spicy` | `alibaba/wan-2.2-spicy/image-to-video` | NSFW/uncensored content |
| `wan-spicy-lora` | `alibaba/wan-2.2-spicy/image-to-video-lora` | NSFW + custom LoRA |
Use the --nsfw flag to switch to Wan 2.2 Spicy for uncensored content generation:
```bash
# NSFW image-to-video (requires --image)
wan-video "your nsfw prompt" --nsfw --image input.png

# NSFW with LoRA weights
wan-video "your nsfw prompt" --nsfw --lora --image input.png
```

The `--nsfw` flag automatically selects the Wan 2.2 Spicy model. Add `--lora` to use the LoRA variant for custom fine-tuned styles.

Important: NSFW mode requires an input image (`--image` flag). Wan 2.2 Spicy only supports image-to-video generation.
```bash
# 480p (fastest, cheapest)
wan-video "quick concept video" --resolution 480p

# 720p (default, balanced)
wan-video "product demo video" --resolution 720p

# Short clip
wan-video "logo animation" --duration 3

# Long clip
wan-video "nature documentary scene" --duration 10
```

| Option | Default | Description |
|---|---|---|
| `-o, --output` | `wan-gen-{timestamp}` | Output filename (no extension) |
| `--model` | `wan-2.6` | Model alias or full model ID |
| `--mode` | `t2v` | Generation mode: `t2v`, `i2v`, `v2v` |
| `--duration` | `5` | Video duration in seconds (3-10) |
| `--resolution` | `720p` | Output resolution: `480p`, `720p`, `1080p` |
| `--nsfw` | `false` | Use Wan 2.2 Spicy for NSFW content |
| `--lora` | `false` | Use LoRA variant (only with `--nsfw`) |
| `--image` | - | Input image/video path (for i2v/v2v modes) |
| `-d, --dir` | current directory | Output directory |
| `--api-key` | - | Atlas Cloud API key (overrides env/file) |
| `-h, --help` | - | Show help |
The CLI resolves the Atlas Cloud API key in priority order:
1. `--api-key` flag on the command line
2. `ATLAS_API_KEY` environment variable
3. `.env` file in the current working directory
4. `.env` file in the repo root (next to `src/`)
5. `~/.wan-video/.env`
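Illustratively, the precedence behaves like this shell function. This is a sketch only: the real CLI implements the same lookup in TypeScript, and the last two fallback locations are omitted here for brevity:

```shell
# Illustrative precedence: flag > environment variable > ./.env
# (repo-root .env and ~/.wan-video/.env fallbacks omitted)
resolve_api_key() {
  flag_key="$1"
  if [ -n "$flag_key" ]; then echo "$flag_key"; return; fi
  if [ -n "$ATLAS_API_KEY" ]; then echo "$ATLAS_API_KEY"; return; fi
  if [ -f .env ]; then sed -n 's/^ATLAS_API_KEY=//p' .env; fi
}

resolve_api_key "from-flag"   # prints "from-flag": the flag always wins
```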
```bash
# Option 1: Environment variable
export ATLAS_API_KEY=your_key_here

# Option 2: .env file in current directory
echo "ATLAS_API_KEY=your_key_here" > .env

# Option 3: Global config
mkdir -p ~/.wan-video
echo "ATLAS_API_KEY=your_key_here" > ~/.wan-video/.env

# Option 4: Pass directly
wan-video "your prompt" --api-key your_key_here
```

The CLI uses the Atlas Cloud API to interface with Alibaba's Wan video generation models:
- Submit Request: Sends your prompt (and optional image/video) to the Atlas Cloud API
- Poll for Completion: Checks the prediction status every 5 seconds until the video is ready
- Download Result: Downloads the generated video to your specified output location
```text
POST /api/v1/model/prediction
  -> Returns request_id

GET /api/v1/model/prediction/{request_id}
  -> Poll until status: "completed"
  -> Download output video URL
```
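The poll step above can be sketched as a plain shell loop. This is illustrative only: the status check is a stub standing in for the real GET request, and the 5-second sleep is commented out so the sketch runs instantly:

```shell
# Illustrative poll loop; the stubbed check stands in for the real
# GET /api/v1/model/prediction/{request_id} request
attempts=0
status="processing"
while [ "$status" != "completed" ]; do
  attempts=$((attempts + 1))
  # Stub: pretend the third poll reports completion
  if [ "$attempts" -ge 3 ]; then status="completed"; else status="processing"; fi
  # sleep 5   # the real CLI waits 5 seconds between polls
done
echo "done after $attempts polls"   # prints "done after 3 polls"
```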
```text
Default:          alibaba/wan-2.6/text-to-video
--mode i2v:       alibaba/wan-2.6/image-to-video
--mode v2v:       alibaba/wan-2.6/video-to-video
--model wan-2.5:  alibaba/wan-2.5/text-to-video
--nsfw:           alibaba/wan-2.2-spicy/image-to-video
--nsfw --lora:    alibaba/wan-2.2-spicy/image-to-video-lora
```
The newest generation of Alibaba's Wan video models. Wan 2.6 delivers the best quality across text-to-video, image-to-video, and video-to-video tasks. It supports up to 720p resolution and 5-second clips, with significantly improved temporal consistency and prompt adherence compared to earlier versions.
- Best for: Production-quality video generation, commercial content, marketing materials
- Strengths: Superior motion quality, accurate prompt following, clean outputs
- Price: from $0.07/s, among the most affordable high-quality video APIs available
The previous generation model offering a good balance of quality and cost. Wan 2.5 produces solid results at a lower price point, making it ideal for prototyping and high-volume generation.
- Best for: Prototyping, batch generation, cost-sensitive workflows
- Strengths: Fast generation, reliable quality, lower cost
- Price: from $0.05/s
An uncensored variant of the Wan model specifically designed for adult content generation. Wan 2.2 Spicy removes all content filters and safety restrictions, allowing generation of explicit content.
- Best for: Adult content platforms, unrestricted creative work
- Strengths: No content filters, NSFW output, LoRA support for custom styles
- Price: from $0.03/s, one of the cheapest NSFW video generation APIs
- Note: Image-to-video only; requires an input image
The LoRA variant of Wan 2.2 Spicy adds support for custom fine-tuned weights. This allows you to apply specific styles, characters, or aesthetics to your NSFW generations.
- Best for: Custom character generation, specific art styles, branded content
- Strengths: All Spicy features + custom LoRA weight support
- Price: from $0.03/s
- Marketing Videos: Product demos, social media clips, ad creatives
- Prototyping: Quick video concepts before full production
- Content Creation: YouTube thumbnails-to-video, blog post illustrations
- Game Development: Cutscene concepts, trailer storyboards
- Animation: Animate static artwork, character concepts in motion
- Style Transfer: Transform existing videos into new visual styles
- Adult Content: NSFW generation with Wan 2.2 Spicy (uncensored)
- Video Production: Visual elements for Remotion/video compositions
When installed as an agent skill, the skill triggers on phrases like:
- "generate a video"
- "create a video clip"
- "animate this image"
- "make a video of"
- "convert this to video"
Your AI agent will construct the appropriate wan-video command based on your request, handling model selection, resolution, duration, NSFW mode, and output configuration automatically. Works across Claude Code, Cursor, Codex, Copilot, Gemini CLI, Windsurf, Kiro, and more.
```text
User:  "Generate a 10-second video of a dragon flying over mountains"
Agent: wan-video "a dragon flying over mountains, cinematic aerial shot" --duration 10

User:  "Animate this character image"
Agent: wan-video "character performing idle animation, smooth motion" --mode i2v --image character.png

User:  "Make an NSFW video from this image"
Agent: wan-video "your prompt" --nsfw --image input.png
```
| Feature | Wan 2.6 | Kling 2.0 | Runway Gen-3 | Sora |
|---|---|---|---|---|
| Price | from $0.07/s | $0.10+ | $0.50+ | N/A |
| Open Source | 2.1/2.2 only | No | No | No |
| NSFW Support | Yes (Spicy) | No | No | No |
| LoRA Support | Yes (Spicy) | No | No | No |
| API Access | Atlas Cloud | Limited | Runway API | Waitlist |
| Text-to-Video | Yes | Yes | Yes | Yes |
| Image-to-Video | Yes | Yes | Yes | Yes |
| Video-to-Video | Yes | No | No | No |
Writing effective prompts is key to getting great results from Wan video models. Here are some best practices:
Instead of vague descriptions, describe exactly what should move and how:
```bash
# Vague (unpredictable results)
wan-video "a person in a park"

# Specific (much better results)
wan-video "a woman walking slowly through a sunlit park, camera follows from behind, gentle breeze moving tree leaves"
```

Wan models respond well to cinematographic language:
```bash
# Dolly shot
wan-video "slow dolly forward through a dimly lit hallway, dust particles in light beams"

# Aerial shot
wan-video "drone aerial shot flying over a vast mountain range at golden hour, clouds below"

# Tracking shot
wan-video "tracking shot following a cyclist through city streets, shallow depth of field"

# Static shot
wan-video "static wide shot of a waterfall in a tropical forest, mist rising"
```

```bash
wan-video "neon-lit cyberpunk alley at night, rain reflections on wet pavement, fog rolling in"
wan-video "soft morning light streaming through curtains, dust particles visible in light beams"
wan-video "harsh midday sun casting sharp shadows on a desert landscape"
```

```bash
# Cinematic
wan-video "cinematic shot of a train crossing a bridge at sunset, 35mm film grain, anamorphic lens flare"

# Documentary
wan-video "documentary style close-up of hands crafting pottery, natural lighting, handheld camera"

# Music video
wan-video "stylized music video shot, artist performing on rooftop, city skyline background, dramatic lighting"
```

A good prompt follows this pattern:
```text
[subject] [action] [environment] [camera/style] [lighting/mood]
```
Example:
```bash
wan-video "a golden retriever (subject) running through shallow water (action) on a beach at sunset (environment), slow motion tracking shot (camera), warm golden hour lighting with lens flare (mood)"
```

Generate multiple videos with a simple shell loop:
```bash
# Generate variations of the same concept
for i in 1 2 3 4 5; do
  wan-video "abstract fluid art, colorful paint mixing in water, macro shot" \
    -o "fluid-art-$i" --duration 5
done
```

Use AI-generated images as input for Wan I2V:
```bash
# Step 1: Generate an image (using any image generation tool)
# Step 2: Animate it with Wan
wan-video "the character slowly turns to face the camera and smiles" \
  --mode i2v --image generated-character.png -o animated-character
```

Post-process your generated videos:
```bash
# Generate a video
wan-video "cinematic landscape timelapse" -o landscape

# Add slow motion (2x slower)
ffmpeg -i landscape.mp4 -filter:v "setpts=2.0*PTS" landscape-slow.mp4

# Create a loop
ffmpeg -stream_loop 3 -i landscape.mp4 -c copy landscape-looped.mp4

# Convert to GIF
ffmpeg -i landscape.mp4 -vf "fps=15,scale=480:-1" -loop 0 landscape.gif

# Add audio track
ffmpeg -i landscape.mp4 -i music.mp3 -shortest -c:v copy landscape-audio.mp4
```

```bash
# Production environment
ATLAS_API_KEY=prod_key wan-video "product demo" --resolution 720p --model wan-2.6

# Development (cheaper model for testing)
ATLAS_API_KEY=dev_key wan-video "test video" --model wan-2.5

# NSFW pipeline
ATLAS_API_KEY=nsfw_key wan-video "prompt" --nsfw --image input.png
```

Typical generation times:
- Wan 2.6: 30-90 seconds depending on resolution and duration
- Wan 2.5: 20-60 seconds
- Wan 2.2 Spicy: 15-45 seconds
The CLI polls automatically and will time out after 5 minutes if the request hasn't completed.
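At the default 5-second interval, the 5-minute timeout amounts to at most 60 status checks:

```shell
# Max status checks before timeout: timeout (s) divided by poll interval (s)
timeout=300    # 5-minute client-side timeout
interval=5     # poll interval in seconds
echo $((timeout / interval))   # prints 60
```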
PNG, JPEG, WebP, and GIF are supported for image-to-video mode. For best results, use PNG or JPEG at the target resolution.
MP4, MOV, and AVI are supported for video-to-video mode. The output is always MP4.
Yes. Wan 2.1/2.2 models are open source (Apache 2.0). Wan 2.5/2.6 are closed-source commercial APIs. This CLI is MIT licensed. You own the generated content. Check Atlas Cloud's terms for API usage specifics.
- Use Wan 2.6 (latest model)
- Write detailed, specific prompts
- Use 720p resolution (best quality/speed balance)
- Include camera and lighting descriptions in your prompt
Currently 3-10 seconds per generation. For longer videos, generate multiple clips and concatenate them with FFmpeg:
```bash
wan-video "scene 1" -o clip1 --duration 10
wan-video "scene 2" -o clip2 --duration 10
printf "file '%s'\n" clip1.mp4 clip2.mp4 > list.txt
ffmpeg -f concat -safe 0 -i list.txt -c copy final.mp4
```

Wan models currently support up to 720p natively. For 1080p output, generate at 720p and upscale with FFmpeg or an AI upscaler.
Make sure your Atlas Cloud API key is set. See API Key Configuration.
Video generation can take 30-120 seconds depending on the model and resolution. The CLI will poll automatically. If it times out after 5 minutes, try again or use a lower resolution.
Check that you're using a valid model alias. Run wan-video --help to see all available models.
Wan 2.2 Spicy only supports image-to-video. You must provide an input image with --image.
This workflow is optimized for Atlas Cloud. Move from experiment to enterprise-ready scale.
- Production-Ready: Wan 2.6 from $0.07/s, one of the cheapest high-quality video APIs available
- NSFW Support: Wan 2.2 Spicy for uncensored content from $0.03/s
- Enterprise Security: SOC 1 & SOC 2 certified, HIPAA compliant
- Zero Maintenance: Serverless architecture; focus on your product, not the servers
Start Building
MIT