Welcome to LPIPS-AttnWav2Lip, the ultimate audio-driven lip-syncing wizardry! This magical model transforms your audio into perfectly synced lip movements, making your talking head videos come alive. 🌟
Published in the prestigious journal Speech Communication, this work is a game-changer for talking head generation in the wild. Check out the paper here: LPIPS-AttnWav2Lip: Generic audio-driven lip synchronization for talking head generation in the wild.
Get the pre-trained model here: Baidu Drive (Password: hat7).
LPIPS-AttnWav2Lip combines the power of LPIPS perceptual loss and attention mechanisms to deliver high-quality, synchronized lip movements. Whether it’s noisy environments or complex scenarios, this model has got you covered. 💪
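For readers curious how an LPIPS term typically enters the reconstruction objective, here is a minimal sketch using the `lpips` pip package. The weighting factor and the way the term is combined with an L1 loss are illustrative assumptions for this sketch, not the exact objective from the paper.

```python
# Minimal sketch of an LPIPS-augmented reconstruction loss (illustrative only;
# the actual loss terms and weights used in LPIPS-AttnWav2Lip may differ).
import torch
import lpips  # pip install lpips

# Pretrained VGG-based perceptual metric from the lpips package.
lpips_fn = lpips.LPIPS(net='vgg')

def reconstruction_loss(generated, target, lpips_weight=0.1):
    """L1 + weighted LPIPS between generated and ground-truth frames.

    Both tensors are (B, 3, H, W); lpips expects values roughly in [-1, 1].
    """
    l1 = torch.nn.functional.l1_loss(generated, target)
    perceptual = lpips_fn(generated, target).mean()
    return l1 + lpips_weight * perceptual
```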
Below is the framework of LPIPS-AttnWav2Lip:
Our method outperforms existing approaches in both quality and synchronization. Check out the comparison below:
- Prepare Your Inputs:
  - A video or image with a face (`--face` parameter).
  - An audio file (`--audio` parameter).
- Run the Magic:

  ```bash
  python inference.py \
    --checkpoint_path <path_to_model_weights> \
    --face <path_to_face_video_or_image> \
    --audio <path_to_audio_file> \
    --outfile <path_to_output_video>
  ```

- Voilà! Your synced video is ready to dazzle. ✨
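If you have several audio tracks to sync against the same face, a small driver script can loop over the documented command. The paths below are placeholders for illustration, not files shipped with the repo.

```python
# Hypothetical batch driver around inference.py (all paths are placeholders).
import subprocess
from pathlib import Path

CHECKPOINT = "checkpoints/lpips_attnwav2lip.pth"  # assumed local checkpoint path
FACE = "inputs/speaker.mp4"                       # assumed local face video

for audio in sorted(Path("inputs/audio").glob("*.wav")):
    out = Path("results") / f"{audio.stem}_synced.mp4"
    out.parent.mkdir(parents=True, exist_ok=True)
    # Invoke the repo's inference script once per audio file.
    subprocess.run([
        "python", "inference.py",
        "--checkpoint_path", CHECKPOINT,
        "--face", FACE,
        "--audio", str(audio),
        "--outfile", str(out),
    ], check=True)
```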
- Get the Data:
  - Download the LRS2 dataset from here.
- Process It:

  ```bash
  python preprocess.py \
    --data_root <path_to_LRS2_dataset> \
    --preprocessed_root <path_to_save_preprocessed_data>
  ```

- Done! Your data is now model-ready. 🚀
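As a quick sanity check before training, you can walk the preprocessed root and confirm every clip folder looks complete. The layout assumed below (per-clip folders of frame JPEGs plus an `audio.wav`) follows the original Wav2Lip-style preprocessing convention and may differ in this repo.

```python
# Sanity-check sketch for the preprocessed LRS2 folder. The assumed layout is
# <preprocessed_root>/<speaker_id>/<clip_id>/{*.jpg, audio.wav} (Wav2Lip-style).
from pathlib import Path

def check_preprocessed(root: str) -> None:
    clips = [p for p in Path(root).glob("*/*") if p.is_dir()]
    bad = [c for c in clips
           if not list(c.glob("*.jpg")) or not (c / "audio.wav").exists()]
    print(f"Checked {len(clips)} clips, {len(bad)} look incomplete.")
    for clip in bad[:10]:
        print("  incomplete:", clip)

check_preprocessed("<path_to_save_preprocessed_data>")
```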
- Set the Stage:
  - Tweak the `hparams.py` file to set your training parameters (see the sketch after this list).
- Train Away:

  ```bash
  python hq_wav2lip_train_lpips.py \
    --data_root <path_to_preprocessed_data> \
    --checkpoint_dir <path_to_save_checkpoints> \
    --syncnet_checkpoint_path <path_to_pretrained_syncnet>
  ```

- Save the Day: The model checkpoints will be saved for your future adventures. 🏆
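The exact contents of `hparams.py` are specific to this repository; as a rough guide, Wav2Lip-style projects expose a single `hparams` object whose fields you edit before launching training. The field names and values below are typical examples only, not a verified listing of this repo's file.

```python
# Illustrative hparams.py-style snippet (field names follow common
# Wav2Lip-style conventions and may not match this repository exactly).
from types import SimpleNamespace

hparams = SimpleNamespace(
    img_size=96,                 # crop size of the face region fed to the model
    batch_size=16,               # lower this if you run out of GPU memory
    initial_learning_rate=1e-4,  # optimizer learning rate for the generator
    nepochs=200,                 # total training epochs
    checkpoint_interval=3000,    # steps between checkpoint saves
    syncnet_wt=0.03,             # weight of the SyncNet lip-sync loss term
)
```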
If you find this work useful, please give us a shout-out by citing:
```bibtex
@article{chen2024lpips,
  title={LPIPS-AttnWav2Lip: Generic audio-driven lip synchronization for talking head generation in the wild},
  author={Chen, Zhipeng and Wang, Xinheng and Xie, Lun and Yuan, Haijie and Pan, Hang},
  journal={Speech Communication},
  volume={157},
  pages={103028},
  year={2024},
  publisher={Elsevier}
}
```

Let's make the world a more synchronized place, one lip movement at a time! 😄

