
[NeurIPS' 2025] JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent

LYL1015/JarvisArt

¹Xiamen University, ²The Hong Kong University of Science and Technology (Guangzhou), ³The Chinese University of Hong Kong, ⁴ByteDance, ⁵National University of Singapore, ⁶Tsinghua University

💡 Our new work may also interest you ✨

JarvisEvo: Towards a Self-Evolving Photo Editing Agent with Synergistic Editor-Evaluator Optimization
Yunlong Lin, Lingqing Wang, Zixu Lin, Kunjie Lin, et al.


📮 Updates



📝 Overview

[Figure: JarvisArt teaser. JarvisArt workflow and results showcase.]

JarvisArt is an agent for intelligent photo retouching, driven by a multi-modal large language model (MLLM). It is designed to liberate human creativity by understanding user intent, mimicking the reasoning of professional artists, and coordinating over 200 tools in Adobe Lightroom. JarvisArt is trained with a novel two-stage framework: Chain-of-Thought supervised fine-tuning (SFT) first establishes foundational reasoning, and Group Relative Policy Optimization for Retouching (GRPO-R) then improves decision-making and tool proficiency. Supported by the newly created MMArt dataset (55K samples) and MMArt-Bench, JarvisArt outperforms GPT-4o by 60% on pixel-level metrics for content fidelity while maintaining comparable instruction-following capabilities.
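The overview names GRPO-R but does not describe its reward terms. As a rough illustration only, the sketch below shows the group-relative normalization that gives GRPO its name: several candidate responses are sampled for the same input, and each one's reward is standardized against the mean and standard deviation of its own group. The function name, the example rewards, and the use of the population standard deviation are assumptions made for this sketch; it is not the JarvisArt training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each reward against the other samples in the same group.

    `rewards` holds one scalar reward per candidate response sampled for a
    single prompt. Assumption: population standard deviation; real
    implementations may use the sample standard deviation instead.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four retouching plans sampled for one
# (image, instruction) pair; the values are made up for illustration.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)
```

Candidates that score above their group's mean receive a positive advantage and are reinforced; those below it receive a negative one. Because the baseline comes from the group itself, no separate value network is needed.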


🎬 Demo Videos

Global Retouching Case

JarvisArt Demo

Local Retouching Case

JarvisArt Demo

JarvisArt supports multi-granularity retouching goals, ranging from scene-level adjustments to region-specific refinements. Users can perform intuitive, free-form edits through natural inputs such as text prompts and bounding boxes.
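To make the distinction between scene-level and region-specific requests concrete, here is a minimal sketch of how such inputs could be represented. Everything in it (`RetouchRequest`, `Box`, `clamp_box`, the pixel-coordinate convention) is hypothetical and chosen for illustration; it does not reflect JarvisArt's actual input format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class RetouchRequest:
    """Illustrative container for a free-form edit; not the JarvisArt API."""

    instruction: str                                 # natural-language prompt
    regions: List[Box] = field(default_factory=list)  # optional bounding boxes

    def granularity(self) -> str:
        # No boxes means the edit applies to the whole scene.
        return "local" if self.regions else "global"


def clamp_box(box: Box, width: int, height: int) -> Box:
    """Clip a box to the image bounds and order its corners."""
    x0, y0, x1, y1 = box
    x0, x1 = sorted((max(0, min(x0, width)), max(0, min(x1, width))))
    y0, y1 = sorted((max(0, min(y0, height)), max(0, min(y1, height))))
    return (x0, y0, x1, y1)


global_edit = RetouchRequest("Give the photo a warm, cinematic look")
local_edit = RetouchRequest(
    "Brighten the face and soften the skin",
    regions=[clamp_box((-10, 20, 5000, 300), width=1920, height=1080)],
)
```

Keeping the bounding boxes optional lets one structure cover both cases: an empty list is a global edit, and one or more boxes scope the instruction to specific regions.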


💻 Getting Started

To run the Gradio demo, please follow:

For batch inference, please follow the instructions below:

For the Agent-to-Lightroom Protocol, please follow:

For training (SFT and GRPO-R), please follow:

For the data construction pipeline (image pairs, instructions, CoT generation, and format conversion), please follow:

For evaluation, please follow:


🎪 Checklist

  • Create repo and project page
  • Release preview inference code and Gradio demo
  • Release Hugging Face online demo
  • Release preview model weights
  • Release Agent-to-Lightroom Protocol
  • Release MMArt-PPR10K dataset with open license
  • Release SFT training code
  • Release GRPO-R training code
  • Release evaluation code
  • Release MMArt-Bench
  • Release data construction scripts

🔍 Jarvis Family

JarvisIR: An intelligent image restoration agent for diverse and complex degradations in real-world scenarios.

We are excited to expand the Jarvis family with more intelligent agents in the future. Stay tuned for upcoming releases!

🙏 Acknowledgements

We would like to thank LLaMA-Factory, gradio_image_annotator, and VLM-R1 for their valuable open-source contributions, which provided important technical references for our work.

🌤️ Discussion Group

If you have questions while trying, running, or deploying JarvisArt, or have ideas or suggestions for the project, feel free to join our WeChat discussion group!

Scan QR code to join WeChat group discussion

📧 Contact

For any questions or inquiries, please reach out to us:


📚 Citation

If you find JarvisArt useful in your research, please consider citing:

@article{jarvisart2025,
      title={JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent},
      author={Yunlong Lin and Zixu Lin and Kunjie Lin and Jinbin Bai and Panwang Pan and Chenxin Li and Haoyu Chen and Zhongdao Wang and Xinghao Ding and Wenbo Li and Shuicheng Yan},
      year={2025},
      journal={arXiv preprint arXiv:2506.17612}
}

📜 License

JarvisArt is released under the Apache License 2.0 with one additional restriction. Although Apache 2.0 ordinarily allows free use, modification, and distribution of code, for this project we specifically declare that commercial use of JarvisArt and its related code, models, and datasets is prohibited.

Our MMArt-PPR10K dataset is likewise prohibited from commercial use. Any commercial application requires explicit written permission from the authors.

For commercial cooperation or commercial licensing, please contact the project authors.
