Wan Video Skill

An open AI agent skill for generating videos with Alibaba's Wan models via the Atlas Cloud API. Text-to-video, image-to-video, and video-to-video, all from your terminal. Wan 2.1/2.2 are open source (Apache 2.0); Wan 2.5/2.6 are closed-source APIs. Wan 2.2 Spicy adds LoRA support and NSFW-capable generation.

Built for the open agent skills ecosystem — works with Claude Code, Cursor, Codex, Copilot, Gemini CLI, Windsurf, OpenCode, Kiro, and 15+ AI coding agents.

Features

Text-to-Video — Generate video from a text prompt
Image-to-Video — Animate a static image into a video
Video-to-Video — Transform existing video with new styles/content
Wan Model Family — Built on Alibaba's Wan video generation models (Wan 2.1/2.2 are open source under Apache 2.0; Wan 2.5/2.6 are closed-source commercial APIs)
LoRA Support — Use custom LoRA weights with Wan 2.2 Spicy for fine-tuned results
NSFW Mode — Wan 2.2 Spicy for uncensored, unrestricted content generation
Multiple Resolutions — 480p and 720p natively; 1080p via upscaling (see "Can I generate in 1080p?" in the FAQ)
Variable Duration — 3s to 10s video generation
Affordable Pricing — Starting at $0.03/s with Atlas Cloud

Model Variants

Model	Mode	Starting Price per Second	Resolution	Duration	Notes
Wan 2.6 T2V	Text-to-Video	from $0.07/s	Up to 720p	5s	Latest generation, best quality
Wan 2.6 I2V	Image-to-Video	from $0.07/s	Up to 720p	5s	Animate any image
Wan 2.6 V2V	Video-to-Video	from $0.07/s	Up to 720p	5s	Transform existing videos
Wan 2.5 T2V	Text-to-Video	from $0.05/s	Up to 720p	5s	Balanced quality and cost
Wan 2.2 Spicy	Image-to-Video	from $0.03/s	Up to 480p	5s	NSFW/uncensored content
Wan 2.2 Spicy LoRA	Image-to-Video	from $0.03/s	Up to 480p	5s	NSFW + custom LoRA weights

Prices shown are starting prices. Higher resolution or longer duration may cost more.
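
As a rough illustration of how the per-second starting prices turn into a per-clip lower bound, the sketch below multiplies rate by duration. The rates are copied from the table above; the `estimateMinCost` helper is a hypothetical name and not part of the CLI, and real charges may be higher.

```typescript
// Starting prices in USD per second of output, copied from the table above.
// All Wan 2.6 modes share the same starting rate.
const STARTING_PRICE_PER_SECOND = new Map<string, number>([
  ["wan-2.6", 0.07],
  ["wan-2.5", 0.05],
  ["wan-spicy", 0.03],
  ["wan-spicy-lora", 0.03],
]);

// Hypothetical helper: lower-bound cost of one clip, rounded to whole cents.
function estimateMinCost(model: string, durationSeconds: number): number {
  const rate = STARTING_PRICE_PER_SECOND.get(model);
  if (rate === undefined) {
    throw new Error(`Unknown model alias: ${model}`);
  }
  return Math.round(rate * durationSeconds * 100) / 100;
}

console.log(estimateMinCost("wan-2.6", 5));    // 0.35
console.log(estimateMinCost("wan-spicy", 10)); // 0.3
```

Treat the result as a floor: the table lists starting prices only, so higher resolutions or longer clips may cost more than this estimate.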

Install

Requirements: Bun

# Clone the repo
git clone https://github.com/thoughtincode/wan-video-skill.git ~/tools/wan-video-skill
cd ~/tools/wan-video-skill

# Install dependencies
bun install

# Link globally (no sudo needed - uses Bun's global bin)
bun link

# Set up your API key
cp .env.example .env
# Edit .env and add your Atlas Cloud API key

Get an Atlas Cloud API key at Atlas Cloud.

Now you can use wan-video from anywhere.

As an Agent Skill

When installed as an agent skill (via Claude Code, Cursor, Codex, Copilot, Gemini CLI, Windsurf, Kiro, and more), just say /init and your AI agent will clone the repo, install deps, and link the command for you. Then use it by saying "generate a video of..." and the agent handles the rest.

Fallback (if `bun link` doesn't work)

mkdir -p ~/.local/bin
ln -sf ~/tools/wan-video-skill/src/cli.ts ~/.local/bin/wan-video
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc

Usage

# Basic text-to-video
wan-video "a cat walking across a sunny garden"

# Custom output name
wan-video "ocean waves crashing on rocks" -o waves

# Set the resolution explicitly (720p is the default)
wan-video "futuristic city flyover" --resolution 720p

# Longer duration
wan-video "time-lapse of flowers blooming" --duration 10

# Custom output directory
wan-video "sunset timelapse" -o sunset -d ~/Videos

Modes

# Text-to-Video (default)
wan-video "your prompt"

# Image-to-Video — animate a static image
wan-video "make this character wave hello" --mode i2v --image character.png

# Video-to-Video — transform existing video
wan-video "convert to anime style" --mode v2v --image input.mp4

Models

# Default — Wan 2.6 (latest, best quality)
wan-video "your prompt"

# Wan 2.5 — balanced quality and cost
wan-video "your prompt" --model wan-2.5

# Wan 2.6 Image-to-Video
wan-video "animate this scene" --model wan-2.6-i2v --image scene.png

# Wan 2.6 Video-to-Video
wan-video "add rain effect" --model wan-2.6-v2v --image input.mp4

Alias	Model ID	Best For
`wan-2.6`, `wan-2.6-t2v`	`alibaba/wan-2.6/text-to-video`	Best quality text-to-video
`wan-2.6-i2v`	`alibaba/wan-2.6/image-to-video`	Animating static images
`wan-2.6-v2v`	`alibaba/wan-2.6/video-to-video`	Transforming existing videos
`wan-2.5`	`alibaba/wan-2.5/text-to-video`	Cost-effective generation
`wan-spicy`	`alibaba/wan-2.2-spicy/image-to-video`	NSFW/uncensored content
`wan-spicy-lora`	`alibaba/wan-2.2-spicy/image-to-video-lora`	NSFW + custom LoRA

NSFW Mode

Use the --nsfw flag to switch to Wan 2.2 Spicy for uncensored content generation:

# NSFW image-to-video (requires --image)
wan-video "your nsfw prompt" --nsfw --image input.png

# NSFW with LoRA weights
wan-video "your nsfw prompt" --nsfw --lora --image input.png

The --nsfw flag automatically selects the Wan 2.2 Spicy model. Add --lora to use the LoRA variant for custom fine-tuned styles.

Important: NSFW mode requires an input image (--image flag). Wan 2.2 Spicy only supports image-to-video generation.

Resolution and Duration

# 480p (fastest, cheapest)
wan-video "quick concept video" --resolution 480p

# 720p (default, balanced)
wan-video "product demo video" --resolution 720p

# Short clip
wan-video "logo animation" --duration 3

# Long clip
wan-video "nature documentary scene" --duration 10

Options

Option	Default	Description
`-o, --output`	`wan-gen-{timestamp}`	Output filename (no extension)
`--model`	`wan-2.6`	Model alias or full model ID
`--mode`	`t2v`	Generation mode: `t2v`, `i2v`, `v2v`
`--duration`	`5`	Video duration in seconds (3-10)
`--resolution`	`720p`	Output resolution: `480p`, `720p`, `1080p`
`--nsfw`	`false`	Use Wan 2.2 Spicy for NSFW content
`--lora`	`false`	Use LoRA variant (only with --nsfw)
`--image`	-	Input image/video path (for i2v/v2v modes)
`-d, --dir`	current directory	Output directory
`--api-key`	-	Atlas Cloud API key (overrides env/file)
`-h, --help`	-	Show help
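
The constraints in the table can be sketched as a validation step. This is an illustration rather than the CLI's actual source: the `Options` type and `validateOptions` function are hypothetical names, and the rule that `i2v` and `v2v` need `--image` is an assumption inferred from the option description.

```typescript
type Mode = "t2v" | "i2v" | "v2v";
type Resolution = "480p" | "720p" | "1080p";

interface Options {
  mode: Mode;
  duration: number;
  resolution: Resolution;
  nsfw: boolean;
  lora: boolean;
  image?: string;
}

// Defaults as listed in the Options table.
const DEFAULTS: Options = {
  mode: "t2v",
  duration: 5,
  resolution: "720p",
  nsfw: false,
  lora: false,
};

// Hypothetical validator: returns a list of problems (empty when valid).
function validateOptions(opts: Options): string[] {
  const problems: string[] = [];
  if (!Number.isFinite(opts.duration) || opts.duration < 3 || opts.duration > 10) {
    problems.push("--duration must be between 3 and 10 seconds");
  }
  if (opts.lora && !opts.nsfw) {
    problems.push("--lora only applies together with --nsfw");
  }
  if (opts.nsfw && !opts.image) {
    problems.push("NSFW mode requires --image");
  }
  // Assumption: i2v and v2v also need an input file, since --image is documented for them.
  if (!opts.nsfw && opts.mode !== "t2v" && !opts.image) {
    problems.push(`--mode ${opts.mode} requires --image`);
  }
  return problems;
}

console.log(validateOptions(DEFAULTS)); // []
console.log(validateOptions({ ...DEFAULTS, lora: true, duration: 12 }));
```

Returning a list of problems instead of throwing on the first one lets a caller report every invalid flag in a single pass.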

API Key Configuration

The CLI resolves the Atlas Cloud API key in priority order:

1. --api-key flag on the command line
2. ATLAS_API_KEY environment variable
3. .env file in the current working directory
4. .env file in the repo root (next to src/)
5. ~/.wan-video/.env

# Option 1: Environment variable
export ATLAS_API_KEY=your_key_here

# Option 2: .env file in current directory
echo "ATLAS_API_KEY=your_key_here" > .env

# Option 3: Global config
mkdir -p ~/.wan-video
echo "ATLAS_API_KEY=your_key_here" > ~/.wan-video/.env

# Option 4: Pass directly
wan-video "your prompt" --api-key your_key_here
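
The priority order above amounts to "first non-empty source wins". A minimal sketch of that lookup follows, assuming the `.env` files use plain `ATLAS_API_KEY=value` lines. The `KeySources` type and the `parseKey` and `resolveApiKey` functions are hypothetical names for illustration; the sketch takes file contents as strings and does not read the disk or the environment itself.

```typescript
// Candidate key sources, highest priority first, mirroring the list above.
interface KeySources {
  flag?: string;       // value of --api-key
  envVar?: string;     // ATLAS_API_KEY from the environment
  cwdDotenv?: string;  // contents of ./.env
  repoDotenv?: string; // contents of <repo root>/.env
  homeDotenv?: string; // contents of ~/.wan-video/.env
}

// Extract ATLAS_API_KEY from dotenv-style text; undefined if absent or empty.
function parseKey(dotenvText: string | undefined): string | undefined {
  if (dotenvText === undefined) return undefined;
  for (const rawLine of dotenvText.split("\n")) {
    const line = rawLine.trim();
    if (line.startsWith("#")) continue; // skip comment lines
    const match = /^ATLAS_API_KEY\s*=\s*(.*)$/.exec(line);
    if (match) {
      // Strip one pair of surrounding quotes, if present.
      const value = (match[1] ?? "").replace(/^["']|["']$/g, "");
      if (value !== "") return value;
    }
  }
  return undefined;
}

// Return the first non-empty candidate in priority order.
function resolveApiKey(src: KeySources): string | undefined {
  const candidates = [
    src.flag,
    src.envVar,
    parseKey(src.cwdDotenv),
    parseKey(src.repoDotenv),
    parseKey(src.homeDotenv),
  ];
  return candidates.find((c) => c !== undefined && c !== "");
}

console.log(resolveApiKey({ envVar: "env_key", homeDotenv: "ATLAS_API_KEY=home_key" })); // env_key
```

When every source is missing, the function returns `undefined`, which corresponds to the "API key not found" case in Troubleshooting.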

How It Works

The CLI uses the Atlas Cloud API to interface with Alibaba's Wan video generation models:

Submit Request — Sends your prompt (and optional image/video) to the Atlas Cloud API
Poll for Completion — Checks the prediction status every 5 seconds until the video is ready
Download Result — Downloads the generated video to your specified output location

API Flow

POST /api/v1/model/prediction
  → Returns request_id

GET /api/v1/model/prediction/{request_id}
  → Poll until status: "completed"
  → Download output video URL
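
The submit-then-poll flow above can be sketched as a small loop. This is a sketch under stated assumptions, not the CLI's implementation: only the `"completed"` status, the 5-second interval, and the 5-minute timeout come from this README, while the `"failed"` status, the `output` field name, and the `waitForVideo` and `PollDeps` names are hypothetical. The HTTP call and the timer are passed in as functions so the loop itself stays free of network details.

```typescript
// Shape of a status response. The `output` field name is an assumption.
interface Prediction {
  status: string;
  output?: string; // URL of the generated video
}

interface PollDeps {
  // Wraps GET /api/v1/model/prediction/{request_id}
  getPrediction: (requestId: string) => Promise<Prediction>;
  sleep: (ms: number) => Promise<void>;
}

const POLL_INTERVAL_MS = 5000;      // poll every 5 seconds
const TIMEOUT_MS = 5 * 60 * 1000;   // give up after 5 minutes

async function waitForVideo(requestId: string, deps: PollDeps): Promise<string> {
  let waited = 0;
  while (waited <= TIMEOUT_MS) {
    const prediction = await deps.getPrediction(requestId);
    if (prediction.status === "completed") {
      if (!prediction.output) throw new Error("Completed without an output URL");
      return prediction.output;
    }
    // Assumed failure status; stop polling instead of waiting for the timeout.
    if (prediction.status === "failed") {
      throw new Error("Generation failed");
    }
    await deps.sleep(POLL_INTERVAL_MS);
    waited += POLL_INTERVAL_MS;
  }
  throw new Error("Request timed out");
}
```

In a real client, `getPrediction` would wrap an authenticated HTTP GET and `sleep` would wrap a timer. The base URL, authentication header, and response fields are not specified in this README, so they are left to the caller here.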

Model Selection Logic

Default:         alibaba/wan-2.6/text-to-video
--mode i2v:      alibaba/wan-2.6/image-to-video
--mode v2v:      alibaba/wan-2.6/video-to-video
--model wan-2.5: alibaba/wan-2.5/text-to-video
--nsfw:          alibaba/wan-2.2-spicy/image-to-video
--nsfw --lora:   alibaba/wan-2.2-spicy/image-to-video-lora
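
The mapping above can be expressed as one small function. The model IDs and aliases come from this README; the `selectModel` name and the precedence between flags (`--nsfw` first, then an explicit `--model`, then `--mode`) are assumptions, since the README does not say how conflicting flags are resolved.

```typescript
type GenerationMode = "t2v" | "i2v" | "v2v";

interface Selection {
  model?: string; // alias or full model ID from --model
  mode?: GenerationMode;
  nsfw?: boolean;
  lora?: boolean;
}

const SPICY = "alibaba/wan-2.2-spicy/image-to-video";
const SPICY_LORA = "alibaba/wan-2.2-spicy/image-to-video-lora";

// Alias table copied from the Models section.
const ALIASES = new Map<string, string>([
  ["wan-2.6", "alibaba/wan-2.6/text-to-video"],
  ["wan-2.6-t2v", "alibaba/wan-2.6/text-to-video"],
  ["wan-2.6-i2v", "alibaba/wan-2.6/image-to-video"],
  ["wan-2.6-v2v", "alibaba/wan-2.6/video-to-video"],
  ["wan-2.5", "alibaba/wan-2.5/text-to-video"],
  ["wan-spicy", SPICY],
  ["wan-spicy-lora", SPICY_LORA],
]);

// Assumed precedence: --nsfw, then --model, then --mode.
function selectModel(sel: Selection): string {
  if (sel.nsfw) return sel.lora ? SPICY_LORA : SPICY;
  // Unknown values are passed through as full model IDs.
  if (sel.model !== undefined) return ALIASES.get(sel.model) ?? sel.model;
  switch (sel.mode ?? "t2v") {
    case "i2v":
      return "alibaba/wan-2.6/image-to-video";
    case "v2v":
      return "alibaba/wan-2.6/video-to-video";
    default:
      return "alibaba/wan-2.6/text-to-video";
  }
}

console.log(selectModel({}));                         // alibaba/wan-2.6/text-to-video
console.log(selectModel({ mode: "i2v" }));            // alibaba/wan-2.6/image-to-video
console.log(selectModel({ nsfw: true, lora: true })); // alibaba/wan-2.2-spicy/image-to-video-lora
```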

Wan Model Deep Dive

Wan 2.6 (Latest)

The newest generation of Alibaba's Wan video models. Wan 2.6 delivers the best quality across text-to-video, image-to-video, and video-to-video tasks. It supports up to 720p resolution and 5-second clips, with significantly improved temporal consistency and prompt adherence compared to earlier versions.

Best for: Production-quality video generation, commercial content, marketing materials
Strengths: Superior motion quality, accurate prompt following, clean outputs
Price: from $0.07/s, below the Kling 2.0 and Runway Gen-3 prices listed in the comparison table

Wan 2.5

The previous generation model offering a good balance of quality and cost. Wan 2.5 produces solid results at a lower price point, making it ideal for prototyping and high-volume generation.

Best for: Prototyping, batch generation, cost-sensitive workflows
Strengths: Fast generation, reliable quality, lower cost
Price: from $0.05/s

Wan 2.2 Spicy (NSFW)

An uncensored variant of the Wan model specifically designed for adult content generation. Wan 2.2 Spicy removes all content filters and safety restrictions, allowing generation of explicit content.

Best for: Adult content platforms, unrestricted creative work
Strengths: No content filters, NSFW output, LoRA support for custom styles
Price: from $0.03/s — cheapest NSFW video generation API
Note: Image-to-video only — requires an input image

Wan 2.2 Spicy LoRA

The LoRA variant of Wan 2.2 Spicy adds support for custom fine-tuned weights. This allows you to apply specific styles, characters, or aesthetics to your NSFW generations.

Best for: Custom character generation, specific art styles, branded content
Strengths: All Spicy features + custom LoRA weight support
Price: from $0.03/s

Use Cases

Marketing Videos — Product demos, social media clips, ad creatives
Prototyping — Quick video concepts before full production
Content Creation — YouTube thumbnails-to-video, blog post illustrations
Game Development — Cutscene concepts, trailer storyboards
Animation — Animate static artwork, character concepts in motion
Style Transfer — Transform existing videos into new visual styles
Adult Content — NSFW generation with Wan 2.2 Spicy (uncensored)
Video Production — Visual elements for Remotion/video compositions

Agent Skill Integration

When installed as an agent skill, the skill triggers on phrases like:

"generate a video"
"create a video clip"
"animate this image"
"make a video of"
"convert this to video"

Your AI agent will construct the appropriate wan-video command based on your request, handling model selection, resolution, duration, NSFW mode, and output configuration automatically. Works across Claude Code, Cursor, Codex, Copilot, Gemini CLI, Windsurf, Kiro, and more.

Example Skill Interactions

User: "Generate a 10-second video of a dragon flying over mountains"
Agent: wan-video "a dragon flying over mountains, cinematic aerial shot" --duration 10

User: "Animate this character image"
Agent: wan-video "character performing idle animation, smooth motion" --mode i2v --image character.png

User: "Make an NSFW video from this image"
Agent: wan-video "your prompt" --nsfw --image input.png

Comparison with Other Video Models

Feature	Wan 2.6	Kling 2.0	Runway Gen-3	Sora
Price	from $0.07/s	$0.10+	$0.50+	N/A
Open Source	2.1/2.2 only	No	No	No
NSFW Support	Yes (Spicy)	No	No	No
LoRA Support	Yes (Spicy)	No	No	No
API Access	Atlas Cloud	Limited	Runway API	Waitlist
Text-to-Video	Yes	Yes	Yes	Yes
Image-to-Video	Yes	Yes	Yes	Yes
Video-to-Video	Yes	No	No	No

Prompt Engineering Tips

Writing effective prompts is key to getting great results from Wan video models. Here are some best practices:

Be Specific About Motion

Instead of vague descriptions, describe exactly what should move and how:

# Vague (unpredictable results)
wan-video "a person in a park"

# Specific (much better results)
wan-video "a woman walking slowly through a sunlit park, camera follows from behind, gentle breeze moving tree leaves"

Include Camera Instructions

Wan models respond well to cinematographic language:

# Dolly shot
wan-video "slow dolly forward through a dimly lit hallway, dust particles in light beams"

# Aerial shot
wan-video "drone aerial shot flying over a vast mountain range at golden hour, clouds below"

# Tracking shot
wan-video "tracking shot following a cyclist through city streets, shallow depth of field"

# Static shot
wan-video "static wide shot of a waterfall in a tropical forest, mist rising"

Describe Lighting and Atmosphere

wan-video "neon-lit cyberpunk alley at night, rain reflections on wet pavement, fog rolling in"
wan-video "soft morning light streaming through curtains, dust particles visible in light beams"
wan-video "harsh midday sun casting sharp shadows on a desert landscape"

Use Style Keywords

# Cinematic
wan-video "cinematic shot of a train crossing a bridge at sunset, 35mm film grain, anamorphic lens flare"

# Documentary
wan-video "documentary style close-up of hands crafting pottery, natural lighting, handheld camera"

# Music video
wan-video "stylized music video shot, artist performing on rooftop, city skyline background, dramatic lighting"

Prompt Structure

A good prompt follows this pattern:

[subject] [action] [environment] [camera/style] [lighting/mood]

Example (the parenthesized labels only mark each part of the pattern; omit them from a real prompt):

wan-video "a golden retriever (subject) running through shallow water (action) on a beach at sunset (environment), slow motion tracking shot (camera), warm golden hour lighting with lens flare (mood)"
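
For scripted or batch use, the pattern above can be assembled from its parts. The `PromptParts` type and `buildPrompt` function below are hypothetical helpers for illustration, not part of the CLI; they simply join the five slots in the documented order and skip any optional slot left empty.

```typescript
interface PromptParts {
  subject: string;
  action: string;
  environment: string;
  camera?: string;   // camera or style instructions
  lighting?: string; // lighting or mood
}

// Join subject, action, and environment into one clause,
// then append the optional camera and lighting clauses.
function buildPrompt(parts: PromptParts): string {
  const core = `${parts.subject} ${parts.action} ${parts.environment}`;
  return [core, parts.camera, parts.lighting]
    .filter((p): p is string => p !== undefined && p.trim() !== "")
    .map((p) => p.trim())
    .join(", ");
}

console.log(
  buildPrompt({
    subject: "a golden retriever",
    action: "running through shallow water",
    environment: "on a beach at sunset",
    camera: "slow motion tracking shot",
    lighting: "warm golden hour lighting with lens flare",
  }),
);
// a golden retriever running through shallow water on a beach at sunset, slow motion tracking shot, warm golden hour lighting with lens flare
```

The resulting string is what would be passed as the quoted prompt argument to wan-video.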

Advanced Usage

Batch Generation

Generate multiple videos with a simple shell loop:

# Generate variations of the same concept
for i in 1 2 3 4 5; do
  wan-video "abstract fluid art, colorful paint mixing in water, macro shot" \
    -o "fluid-art-$i" --duration 5
done

Pipeline with Image-to-Video

Use AI-generated images as input for Wan I2V:

# Step 1: Generate an image (using any image generation tool)
# Step 2: Animate it with Wan
wan-video "the character slowly turns to face the camera and smiles" \
  --mode i2v --image generated-character.png -o animated-character

Combining with FFmpeg

Post-process your generated videos:

# Generate a video
wan-video "cinematic landscape timelapse" -o landscape

# Add slow motion (2x slower)
ffmpeg -i landscape.mp4 -filter:v "setpts=2.0*PTS" landscape-slow.mp4

# Create a loop
ffmpeg -stream_loop 3 -i landscape.mp4 -c copy landscape-looped.mp4

# Convert to GIF
ffmpeg -i landscape.mp4 -vf "fps=15,scale=480:-1" -loop 0 landscape.gif

# Add audio track
ffmpeg -i landscape.mp4 -i music.mp3 -shortest -c:v copy landscape-audio.mp4

Environment-Specific Configurations

# Production environment
ATLAS_API_KEY=prod_key wan-video "product demo" --resolution 720p --model wan-2.6

# Development (cheaper model for testing)
ATLAS_API_KEY=dev_key wan-video "test video" --model wan-2.5

# NSFW pipeline
ATLAS_API_KEY=nsfw_key wan-video "prompt" --nsfw --image input.png

Frequently Asked Questions

How long does video generation take?

Typical generation times:

Wan 2.6: 30-90 seconds depending on resolution and duration
Wan 2.5: 20-60 seconds
Wan 2.2 Spicy: 15-45 seconds

The CLI polls automatically and will time out after 5 minutes if the request hasn't completed.

What image formats are supported for I2V?

PNG, JPEG, WebP, and GIF are supported for image-to-video mode. For best results, use PNG or JPEG at the target resolution.

What video formats are supported for V2V?

MP4, MOV, and AVI are supported for video-to-video mode. The output is always MP4.

Can I use this commercially?

Yes. Wan 2.1/2.2 models are open source (Apache 2.0). Wan 2.5/2.6 are closed-source commercial APIs. This CLI is MIT licensed. You own the generated content. Check Atlas Cloud's terms for API usage specifics.

How do I get the best quality?

Use Wan 2.6 (latest model)
Write detailed, specific prompts
Use 720p resolution (best quality/speed balance)
Include camera and lighting descriptions in your prompt

What's the maximum video duration?

Currently 3-10 seconds per generation. For longer videos, generate multiple clips and concatenate them with FFmpeg:

wan-video "scene 1" -o clip1 --duration 10
wan-video "scene 2" -o clip2 --duration 10

# List the clips in order, then join them without re-encoding
printf "file '%s'\n" clip1.mp4 clip2.mp4 > clips.txt
ffmpeg -f concat -safe 0 -i clips.txt -c copy final.mp4

Can I generate in 1080p?

Wan models currently support up to 720p natively. For 1080p output, generate at 720p and upscale with FFmpeg or an AI upscaler.

Troubleshooting

"API key not found"

Make sure your Atlas Cloud API key is set. See API Key Configuration.

"Request timed out"

Video generation typically takes 15-90 seconds depending on the model, resolution, and duration (see "How long does video generation take?"). The CLI polls automatically. If it times out after 5 minutes, try again or use a lower resolution.

"Model not found"

Check that you're using a valid model alias. Run wan-video --help to see all available models.

"NSFW mode requires --image"

Wan 2.2 Spicy only supports image-to-video. You must provide an input image with --image.

🚀 Take This to Production Today

This workflow is optimized for Atlas Cloud. Move from experiment to enterprise-ready scale.

Production-Ready: Wan 2.6 from $0.07/s (cheapest API available)
NSFW Support: Wan 2.2 Spicy for uncensored content from $0.03/s
Enterprise Security: SOC I & II Certified | HIPAA Compliant
Zero Maintenance: Serverless architecture—focus on your product, not the servers

👉 Start Building

License

MIT
