¹Technical University of Munich  ²University Federico II of Naples  ³University of Erlangen-Nuremberg
The FaceForensics Dataset
FaceForensics is a video dataset consisting of more than 500,000 frames containing faces from 1004 videos that can be used to study image or video forgeries. To create these videos, we use an automated version of the state-of-the-art Face2Face approach. All videos are downloaded from YouTube and are cut into short continuous clips that contain mostly frontal faces. In particular, we offer two versions of our dataset:
Source-to-Target: we reenact over 1000 videos with new facial expressions extracted from other videos, which can be used, for example, to train a classifier to detect fake images or videos.
Self-Reenactment: we use Face2Face to reenact the facial expressions of each video with its own facial expressions as input, yielding pairs of videos, which can be used, for example, to train supervised generative refinement models.
Benchmarks as well as more information can be found in our paper.
To download the FaceForensics dataset, please fill out this Google form and we will send you the download link. If you have any questions, send us an email. Both dataset versions are split into train, validation, and test sets, which contain 704, 150, and 150 videos respectively. Each of these folders has the following structure:
altered: videos that are manipulated by Face2Face
original: the original video clips
mask: the face mask that we use in the Face2Face algorithm to manipulate the original video
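As a minimal sketch, assuming the folder layout above and that the altered, original, and mask clips share file names (the actual naming in the release may differ), one could pair the videos of a split like this:

```python
from pathlib import Path

# Hypothetical dataset root; adjust to wherever you extracted the dataset.
DATASET_ROOT = Path("FaceForensics/source_to_target")

def paired_videos(split):
    """Yield (original, altered, mask) video paths for one split.

    Assumes the three folders contain files with matching names,
    which is an assumption about the release layout.
    """
    split_dir = DATASET_ROOT / split
    for altered in sorted((split_dir / "altered").glob("*.mp4")):
        original = split_dir / "original" / altered.name
        mask = split_dir / "mask" / altered.name
        if original.exists() and mask.exists():
            yield original, altered, mask

for original, altered, mask in paired_videos("train"):
    print(original.name)
```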
All videos are compressed losslessly with H.264 and have a constant frame rate of 30 fps.
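Since the clips are standard H.264 videos, frames can be decoded with any common video library; below is a small sketch using OpenCV (the file path is a placeholder, not part of the dataset):

```python
import cv2

def extract_frames(video_path):
    """Decode a FaceForensics clip frame by frame."""
    capture = cv2.VideoCapture(str(video_path))
    fps = capture.get(cv2.CAP_PROP_FPS)  # should report 30 for these clips
    frames = []
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        frames.append(frame)
    capture.release()
    return fps, frames

fps, frames = extract_frames("train/original/example.mp4")  # placeholder path
print(f"{len(frames)} frames at {fps:.0f} fps")
```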
In addition, we also provide the cropped face frames that we extract for the self-reenactment dataset and use for our refinement task. For more information, check out our GitHub repository.