This repository is a fork of the original SyncNet repository, with Retinaface face detection in place of S3FD. Retinaface is a more accurate and robust face detector than the S3FD detector used in the original repository.
- Added Retinaface face detection for more accurate and robust face detection
- Added a .devcontainer.json and a Dockerfile_retinaface to pre-install all the Python dependencies
- Updated the README.md file to reflect the changes
This repository contains the demo for the audio-to-video synchronisation network (SyncNet). This network can be used for audio-visual synchronisation tasks including:
- Removing temporal lags between the audio and visual streams in a video;
- Determining who is speaking amongst multiple faces in a video.
Please cite the paper below if you make use of the software.
pip install -r requirements.txt
In addition, ffmpeg is required.
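On Ubuntu/Debian, for example, it can be installed with:

sudo apt-get install ffmpeg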
SyncNet demo:
python demo_syncnet.py --videofile data/example.avi --tmp_dir /path/to/temp/directory
Check that this script returns:
AV offset: 3
Min dist: 5.353
Confidence: 10.021
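You can also run SyncNet from your own Python code. The sketch below mirrors what demo_syncnet.py does, assuming the SyncNetInstance API of the original repository (loadParameters and evaluate); the option names (initial_model, batch_size, vshift, tmp_dir) follow the demo's command-line flags and may differ in your checkout.

```python
# Minimal sketch of running SyncNet programmatically (mirrors demo_syncnet.py).
# Assumes the SyncNetInstance API of the original repository; the option names
# follow the demo's command-line flags and may differ in your checkout.
import argparse

from SyncNetInstance import SyncNetInstance

opt = argparse.Namespace(
    initial_model="data/syncnet_v2.model",  # weights fetched by download_model.sh
    batch_size=20,                          # demo default
    vshift=15,                              # maximum AV shift searched, in frames
    tmp_dir="/path/to/temp/directory",
)

s = SyncNetInstance()
s.loadParameters(opt.initial_model)

offset, conf, dist = s.evaluate(opt, videofile="data/example.avi")
print("AV offset: %d  Confidence: %.3f  Min dist: %.3f" % (offset, conf, dist))
```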
Full pipeline:
sh download_model.sh
python run_pipeline.py --videofile /path/to/video.mp4 --reference name_of_video --data_dir /path/to/output
python run_syncnet.py --videofile /path/to/video.mp4 --reference name_of_video --data_dir /path/to/output
python run_visualise.py --videofile /path/to/video.mp4 --reference name_of_video --data_dir /path/to/output
Outputs:
$DATA_DIR/pycrop/$REFERENCE/*.avi - cropped face tracks
$DATA_DIR/pywork/$REFERENCE/offsets.txt - audio-video offset values
$DATA_DIR/pyavi/$REFERENCE/video_out.avi - output video (as shown below)
This batch script scans a folder of videos and writes a CSV file with quantitative results on the audio-video sync offset of each video. It uses the SyncNetInstance module to measure the sync between the audio and video streams.
- Python 3.x
- SyncNetInstance module
To use the script, run it from the command line, specifying the folder containing the videos you want to check for audio-video sync and the path to the CSV file where the results will be saved. For example:
python check_synch_videos.py --folder ../data/input/avspeech/train/ --results ../data/preprocessing/avspeechdataset_results.csv
This will check all the videos in the specified folder and save the results to the specified CSV file.
The script creates a CSV file with the following columns:
- video_file: the name of the video file
- av_offset: the audio-video offset, in frames
- min_dist: the minimum distance between the audio and video signals
- confidence: the confidence score of the sync detection algorithm
If the script encounters an error while processing a video file, the av_offset, min_dist, and confidence columns will be set to "error".
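For reference, the core of the script looks roughly like the sketch below. This is simplified and illustrative, not the exact code: `s` stands for a loaded SyncNetInstance, `opt` for its options (as in the demo sketch above), and `args.folder`/`args.results` for the parsed command-line arguments.

```python
# Simplified, illustrative sketch of the batch loop; `s` is a loaded
# SyncNetInstance, `opt` its options, `args` the parsed CLI arguments.
import csv
import glob
import os

rows = []
for videofile in sorted(glob.glob(os.path.join(args.folder, "*.mp4"))):
    name = os.path.basename(videofile)
    try:
        offset, conf, dist = s.evaluate(opt, videofile=videofile)
        rows.append([name, offset, dist, conf])
    except Exception:
        # Failed videos get "error" in every metric column, as noted above
        rows.append([name, "error", "error", "error"])

with open(args.results, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["video_file", "av_offset", "min_dist", "confidence"])
    writer.writerows(rows)
```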
You can then use the av_offset value to sync-correct the videos. Alternatively, depending on your purpose and the size of your dataset, you can simply filter out videos whose av_offset exceeds a threshold, as in the sketch below.
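A minimal sketch of such threshold-based filtering (the threshold value and CSV path are illustrative):

```python
# Keep only videos whose |av_offset| is within a threshold (values illustrative).
import csv

THRESHOLD = 3  # frames; tune for your purpose and dataset

with open("../data/preprocessing/avspeechdataset_results.csv") as f:
    kept = [row for row in csv.DictReader(f)
            if row["av_offset"] != "error"
            and abs(int(row["av_offset"])) <= THRESHOLD]

print("kept %d of the checked videos" % len(kept))
```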
This script is a modified version of an existing script that downloads high-quality talking-head video datasets for training Wav2Lip or similar machine learning models. The original code can be found here (add link).
To use the script, run it from the command line and specify the dataset you want to download. For example:
python avdataset_downloader.py --dataset train
This example downloads the GRID dataset to the current directory.
This script is licensed under the MIT License. See the LICENSE file for more information.
@InProceedings{Chung16a,
author = "Chung, J.~S. and Zisserman, A.",
title = "Out of time: automated lip sync in the wild",
booktitle = "Workshop on Multi-view Lip-reading, ACCV",
year = "2016",
}

