
MagicPortrait

MagicPortrait: Temporally Consistent Face Reenactment with 3D Geometric Guidance
Mengting Wei, Yante Li, Tuomas Varanka, Yan Jiang, Guoying Zhao

arXiv | Model

This repository contains the example inference script for the MagicPortrait-preview model.

(Demo video: combined_with_transitions_24.mp4)

Installation

conda create -n mgp python=3.10 -y
conda activate mgp
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py310_cu121_pyt241/download.html
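After installation, a quick import check can confirm the environment is usable (a minimal sketch; the expected versions match the wheels pinned above):

import torch
import torchvision
import pytorch3d

# Versions should match the pins above: torch 2.4.1+cu121, torchvision 0.19.1.
print('torch:', torch.__version__)
print('torchvision:', torchvision.__version__)
print('pytorch3d:', pytorch3d.__version__)
print('CUDA available:', torch.cuda.is_available())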

Inference of the Model

Step 1: Download pre-trained models

Download our models from Hugging Face.

huggingface-cli download --resume-download mengtingwei/MagicPortrait --local-dir ./pre_trained
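If you prefer to download from Python instead of the CLI, huggingface_hub offers an equivalent call (a minimal sketch; snapshot_download resumes interrupted downloads much like --resume-download):

from huggingface_hub import snapshot_download

# Fetch the MagicPortrait weights into ./pre_trained.
snapshot_download(repo_id='mengtingwei/MagicPortrait', local_dir='./pre_trained')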

Step 2: Setup necessary libraries for face motion transfer

  1. Place the third_party_files folder downloaded in the previous step under the project root directory ./.
  2. Visit the DECA GitHub repository to download the pretrained deca_model.tar.
  3. Visit FLAME website to download FLAME 2020 and extract generic_model.pkl.
  4. Visit FLAME website to download FLAME texture space and extract FLAME_texture.npz.
  5. Visit DECA's data page and download all the files there.
  6. Visit SMIRK website to download SMIRK_em1.pt.
  7. Place the files in their corresponding locations as specified below.
decalib/
    data/
      deca_model.tar
      generic_model.pkl
      FLAME_texture.npz
      fixed_displacement_256.npy
      head_template.obj
      landmark_embedding.npy
      mean_texture.jpg
      texture_data_256.npy
      uv_face_eye_mask.png
      uv_face_mask.png
    ...
    smirk/
      pretrained_models/
        SMIRK_em1.pt
      ...
    ... 
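A short script can verify that everything landed where expected (a minimal sketch; the paths mirror the tree above, which places smirk/ under decalib/, so adjust if your layout differs, and entries elided with ... are not checked):

import os

required = [
    'decalib/data/deca_model.tar',
    'decalib/data/generic_model.pkl',
    'decalib/data/FLAME_texture.npz',
    'decalib/data/fixed_displacement_256.npy',
    'decalib/data/head_template.obj',
    'decalib/data/landmark_embedding.npy',
    'decalib/data/mean_texture.jpg',
    'decalib/data/texture_data_256.npy',
    'decalib/data/uv_face_eye_mask.png',
    'decalib/data/uv_face_mask.png',
    'decalib/smirk/pretrained_models/SMIRK_em1.pt',
]
missing = [p for p in required if not os.path.isfile(p)]
print('All files in place.' if not missing else f'Missing: {missing}')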

Step 3: Process the identity image and driving video

As our model is designed to focus only on the face, you should crop the face from your images or videos if they are full-body shots. However, if your images or videos already contain only the face and the aspect ratio is approximately 1:1, you can simply resize them to a resolution of 512 $\times$ 512 and skip the cropping steps (1 and 2) below.
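For the resize-only case, something like this is sufficient (a minimal sketch using Pillow; face_only.jpg is a placeholder name for your own input):

from PIL import Image

# Resize an already face-only, roughly square image to 512 x 512.
img = Image.open('face_only.jpg').convert('RGB')
img.resize((512, 512), Image.LANCZOS).save('face_only_512.jpg')

Otherwise, proceed with the cropping steps: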

  1. Crop the face from an image:

python crop_process.py --sign image --img_path './assets/boy.jpeg' --save_path './assets/boy_cropped.jpg'

  2. Crop the face sequence from the driving video.

    • If you have a video, first extract its frames:

       mkdir ./assets/driving_images
       ffmpeg -i ./assets/driving.mp4 ./assets/driving_images/frame_%04d.jpg

    Then crop the faces from the extracted frames:

python crop_process.py --sign video --video_path './assets/driving_images' --video_imgs_dir './assets/driving_images_cropped'

  3. Retrieve guidance images using the DECA and SMIRK models:

python render_and_transfer.py --sor_img './assets/boy_cropped.jpg' --driving_path './assets/driving_images_cropped' --save_name example1

The guidance will be saved in the ./transfers directory.
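If you process many identity/driving pairs, the three steps can also be chained from Python (a minimal sketch that wraps the exact commands above; the asset paths are the example ones from this README):

import subprocess

# Run steps 1-3 in order; check=True aborts if any step fails.
subprocess.run(['python', 'crop_process.py', '--sign', 'image',
                '--img_path', './assets/boy.jpeg',
                '--save_path', './assets/boy_cropped.jpg'], check=True)
subprocess.run(['python', 'crop_process.py', '--sign', 'video',
                '--video_path', './assets/driving_images',
                '--video_imgs_dir', './assets/driving_images_cropped'], check=True)
subprocess.run(['python', 'render_and_transfer.py',
                '--sor_img', './assets/boy_cropped.jpg',
                '--driving_path', './assets/driving_images_cropped',
                '--save_name', 'example1'], check=True)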

Step 4: Inference

Update the model and image directories in ./configs/inference/inference.yaml to match your own file locations.

Then run:

python inference.py

Acknowledgement

Our work is made possible by pioneering open-source 3D face reconstruction works (including DECA and SMIRK) and the high-quality talking-video dataset CelebV-HQ.

Contact

Open an issue here or email [email protected].
