REALY: Rethinking the Evaluation of 3D Face Reconstruction

ECCV 2022

  • Zenghao Chai*, Haoxian Zhang*, Jing Ren, Di Kang,

  • Zhengzhuo Xu, Xuefei Zhe, Chun Yuan, Linchao Bao

  • (* Equal Contribution, Corresponding Author)


Abstract

The evaluation of 3D face reconstruction results typically relies on a rigid shape alignment between the estimated 3D model and the ground-truth scan. We observe that aligning two shapes with different reference points can largely affect the evaluation results. This poses difficulties for precisely diagnosing and improving a 3D face reconstruction method. In this paper, we propose a novel evaluation approach with a new benchmark, REALY, which consists of 100 globally aligned face scans with accurate facial keypoints, high-quality region masks, and topology-consistent meshes. Our approach performs region-wise shape alignment and leads to more accurate, bidirectional correspondences when computing shape errors. The fine-grained, region-wise evaluation results provide detailed insights into the performance of state-of-the-art 3D face reconstruction methods. For example, our experiments on single-image based reconstruction methods reveal that DECA performs the best on nose regions, while GANFit performs better on cheek regions. Besides, a new and high-quality 3DMM basis, HIFI3D++, is further derived using the same procedure as we construct REALY to align and retopologize several 3D face datasets. We will release REALY, HIFI3D++, and our new evaluation pipeline at https://realy3dface.com.

Keywords: 3D Face Reconstruction, Evaluation, Benchmark, 3DMM
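The core idea of the evaluation protocol — rigidly aligning each predicted face region to the corresponding ground-truth region before measuring bidirectional point distances — can be sketched as follows. This is a simplified illustration using nearest-neighbor ICP with the Kabsch algorithm on toy point clouds; the actual REALY pipeline additionally uses keypoints, region masks, and topology-consistent correspondences, so all names and details below are illustrative, not the official implementation:

```python
import numpy as np
from scipy.spatial import cKDTree

def rigid_fit(P, Q):
    """Kabsch algorithm: best rotation R and translation t with R @ p + t ~= q."""
    cp, cq = P.mean(0), Q.mean(0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp

def region_icp(pred, gt, iters=20):
    """Rigidly align a predicted region to the ground-truth region via ICP."""
    P = pred.copy()
    tree = cKDTree(gt)
    for _ in range(iters):
        _, idx = tree.query(P)               # nearest-neighbor correspondences
        R, t = rigid_fit(P, gt[idx])
        P = P @ R.T + t
    return P

def bidirectional_error(pred, gt):
    """Mean of pred->gt and gt->pred nearest-neighbor distances."""
    d1, _ = cKDTree(gt).query(pred)
    d2, _ = cKDTree(pred).query(gt)
    return 0.5 * (d1.mean() + d2.mean())

# Toy check: a slightly rotated/translated copy should align back to ~zero error.
rng = np.random.default_rng(0)
gt = rng.normal(size=(500, 3))
a = 0.1
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
pred = gt @ Rz.T + np.array([0.05, -0.02, 0.03])
aligned = region_icp(pred, gt)
print(bidirectional_error(aligned, gt))
```

The key point this sketch captures is that alignment is performed per region, so a misfit in one region (e.g., the forehead) cannot be hidden by a globally optimal rigid pose.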

REALY Benchmark (Region-aware 3D face benchmark based on the LYHM dataset)


The REALY benchmark uses multi-view portrait images rendered from 100 high-quality scans of LYHM to evaluate 3D face reconstruction accuracy. Examples from the REALY benchmark: First row: globally aligned high-resolution scans with textures. Second row: retopologized meshes in HIFI3D topology with semantically consistent keypoints (red points). Third row: high-quality face region masks of each scan. Fourth to seventh rows: multi-view images of each scan. Eighth row: frontal images of each scan.

The single-view reconstruction results of different methods are reported separately for the 100 frontal images and the 400 side-view images.


Important Notes: The data and code released on this website may be used for non-commercial research purposes only. Please do not copy, sell, trade, or exploit the data for any commercial purpose.



The statistical information of REALY is summarized as follows (counts out of the 100 scans):

Age: <15 (12), 15-30 (31), 30-45 (27), 45-60 (21), >60 (9)
Ethnicity: Caucasian (79), Asian (16), Black (5)
Gender: Female (48), Male (52)
BMI: Underweight (15), Normal (61), Overweight (15), Obese (7), Extreme Obese (2)


Reconstruction from Single Image (frontal-view)

Rank Method @nose (avg. med. std.) @mouth (avg. med. std.) @forehead (avg. med. std.) @cheek (avg. med. std.) all (avg.) Source — all errors in mm
1. Deep3D 1.719 1.683 0.354 1.368 1.301 0.439 2.015 2.007 0.449 1.528 1.442 0.501 1.657 Code
2. MGCNet 1.771 1.741 0.380 1.417 1.355 0.409 2.268 2.215 0.503 1.639 1.494 0.650 1.774 Code
3. GANFit 1.928 1.881 0.490 1.812 1.769 0.544 2.402 3.339 0.545 1.329 1.234 0.504 1.868 Page
4. SADRNet 1.791 1.724 0.542 1.591 1.551 0.488 2.413 2.351 0.537 1.856 1.737 0.701 1.913 Code
5. 3DDFA-v2 1.903 1.857 0.517 1.597 1.529 0.478 2.447 3.356 0.647 1.757 1.683 0.642 1.926 Code
6. DECA-c 1.697 1.654 0.355 2.516 2.465 0.839 2.394 2.256 0.576 1.479 1.400 0.535 2.010 Code
7. PRNet 1.923 1.811 0.518 1.838 1.699 0.637 2.429 2.329 0.588 1.863 1.715 0.698 2.013 Code
8. CEST 2.779 2.717 0.835 1.448 1.438 0.406 2.384 2.302 0.578 1.456 1.321 0.485 2.017 -
9. SynergyNet 2.026 2.014 0.532 1.731 1.724 0.502 2.679 2.615 0.741 1.647 1.550 0.622 2.021 Code
10. DECA-f 2.138 2.137 0.461 2.802 2.699 0.868 2.457 2.341 0.559 1.443 1.353 0.498 2.210 Code
11. RingNet 1.934 1.907 0.458 2.074 1.991 0.616 2.995 2.852 0.908 2.028 1.937 0.720 2.258 Code
12. ExpNet 2.509 2.463 0.486 1.912 1.850 0.450 3.084 2.879 1.005 1.717 1.642 0.590 2.306 Code
13. N-3DMM 2.936 2.857 0.810 2.375 2.390 0.599 4.582 4.452 1.488 1.918 1.717 0.801 2.953 Code

Reconstruction from Single Image (side-view)

Rank Method @nose (avg. med. std.) @mouth (avg. med. std.) @forehead (avg. med. std.) @cheek (avg. med. std.) all (avg.) Source — all errors in mm
1. Deep3D 1.749 1.704 0.343 1.411 1.359 0.395 2.074 2.063 0.486 1.528 1.435 0.517 1.691 Code
2. MGCNet 1.827 1.783 0.383 1.409 1.353 0.418 2.248 2.171 0.508 1.665 1.568 0.644 1.787 Code
3. 3DDFA-v2 1.883 1.865 0.499 1.642 1.611 0.501 2.465 2.402 0.622 1.781 1.737 0.636 1.943 Code
4. SADRNet 1.771 1.695 0.521 1.560 1.542 0.462 2.490 2.429 0.566 2.010 1.913 0.715 1.958 Code
5. SynergyNet 2.008 1.977 0.526 1.725 1.700 0.533 2.638 2.582 0.719 1.662 1.566 0.627 2.008 Code
6. PRNet 1.868 1.813 0.510 1.856 1.780 0.607 2.445 2.390 0.570 1.960 1.815 0.731 2.032 Code
7. DECA-c 1.903 1.700 1.050 2.472 2.348 1.079 2.423 2.308 0.720 1.630 1.456 1.135 2.107 Code
8. RingNet 1.921 1.872 0.451 1.994 1.955 0.604 3.081 2.979 0.950 2.027 1.929 0.710 2.256 Code
9. DECA-f 2.286 2.065 1.103 2.684 2.572 1.041 2.519 2.402 0.718 1.555 1.422 0.822 2.261 Code
10. ExpNet 2.508 2.453 0.491 2.160 2.094 0.448 3.393 3.226 1.076 1.842 1.774 0.609 2.476 Code

"*-c" indicates the coarse model and "*-f" the fine model. Bold font indicates the best result in each column. The benchmark is continuously updated, and researchers are encouraged to participate. We only consider methods that were public at the time of submission to REALY, or that have appeared in a peer-reviewed conference or journal. Methods are ranked by the average region-wise error; you can also sort by a specific metric by clicking the corresponding table header.
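As a worked check of the ranking metric, the "all" column appears to be the unweighted mean of the four regional average errors; the values below are transcribed from the frontal-view table and agree with the reported overall scores to within display rounding (this is plain arithmetic, not the official scoring script):

```python
def overall(nose, mouth, forehead, cheek):
    # overall score = unweighted mean of the four regional average errors (mm)
    return (nose + mouth + forehead + cheek) / 4

# Deep3D, frontal-view table: reported "all" is 1.657
print(overall(1.719, 1.368, 2.015, 1.528))  # ~1.6575
# MGCNet, frontal-view table: reported "all" is 1.774
print(overall(1.771, 1.417, 2.268, 1.639))  # ~1.77375
```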

We ensure that results submitted to the REALY benchmark are kept strictly confidential. No information about the participants or their results is released or shared until the participants explicitly permit it, or until the results appear in a peer-reviewed conference or journal. To publish results on the REALY benchmark website, follow the Participation Instructions below.


Participation Instructions

To participate in the challenge, you should first obtain access to the REALY benchmark data following the guideline here. You can then use the images to reconstruct meshes and evaluate them with our evaluation pipeline.

If you want your method(s) presented on the REALY project page, please send the reconstructed meshes, barycentric coordinates, and template mesh to zenghaochai@gmail.com; we will re-evaluate and verify the results, then update the project page accordingly. We recommend titling the email "REALY-SubmitResult-[MethodName]".

Acknowledgements and Copyrights

The benchmark is derived from the LYHM dataset; please also cite their paper if you use this benchmark. The test portrait images are rendered from the 3D models provided by the LYHM dataset, and the copyright of the portraits belongs to the original owner, the LYHM dataset. Please contact them for access to the raw scan data. Note that the LYHM dataset allows use for non-commercial research and education purposes only.


HIFI3D++ 3DMM Basis


We also present the expressive 3DMM constructed alongside our REALY benchmark. HIFI3D++ is a full-head shape basis built from about 2,000 high-quality, topology-consistent shapes obtained by retopologizing LYHM, FaceScape, and HIFI3D. HIFI3D++ has been released at REALY!
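As with any linear 3DMM, a face shape is generated from such a basis as the mean shape plus a linear combination of the basis vectors. Below is a minimal sketch of this standard formulation with a random placeholder basis; the dimensions, file format, and loading code for the real HIFI3D++ basis come with its release, so everything here is illustrative only:

```python
import numpy as np

# Placeholder dimensions (illustrative only; see the actual HIFI3D++ release).
n_vertices, k = 20481, 200
rng = np.random.default_rng(0)

mu = rng.normal(size=3 * n_vertices)       # mean shape, flattened (x, y, z per vertex)
B = rng.normal(size=(3 * n_vertices, k))   # shape basis (e.g., PCA components)
alpha = rng.normal(size=k) * 0.1           # identity coefficients

shape = mu + B @ alpha                     # linear 3DMM reconstruction
vertices = shape.reshape(n_vertices, 3)    # back to per-vertex coordinates
print(vertices.shape)                      # (20481, 3)
```

Because all shapes in the basis share the HIFI3D topology, any coefficient vector alpha yields a mesh with the same connectivity, which is what makes the basis usable for fitting and animation.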

Acknowledgements

HIFI3D++ is derived from LYHM, FaceScape, and HIFI3D. Please also cite these papers if you use this 3DMM basis.


Citation

If you find this work useful for your research, please cite the following papers.

The REALY paper:

@inproceedings{REALY,
  title={REALY: Rethinking the Evaluation of 3D Face Reconstruction},
  author={Chai, Zenghao and Zhang, Haoxian and Ren, Jing and Kang, Di and Xu, Zhengzhuo and Zhe, Xuefei and Yuan, Chun and Bao, Linchao},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year = {2022}
}

The LYHM paper:

@article{LYHM,
    title={Statistical Modeling of Craniofacial Shape and Texture},
    author={Dai, Hang and Pears, Nick and Smith, William and Duncan, Christian},
    journal={International Journal of Computer Vision},
    year={2019}
}

The FaceScape paper:

@inproceedings{FaceScape, 
    title={FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction}, 
    author={Yang, Haotian and Zhu, Hao and Wang, Yanru and Huang, Mingkai and Shen, Qiu and Yang, Ruigang and Cao, Xun}, 
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 
    year={2020} 
}

The HIFI3D paper:

@article{HIFI3D,
    title={High-Fidelity 3D Digital Human Head Creation from RGB-D Selfies},
    author={Bao, Linchao and Lin, Xiangkai and Chen, Yajing and Zhang, Haoxian and Wang, Sheng and Zhe, Xuefei and Kang, Di and Huang, Haozhi and Jiang, Xinwei and Wang, Jue and Yu, Dong and Zhang, Zhengyou},
    journal={ACM Transactions on Graphics},
    year={2021}
}

Contact

If you have any questions, please contact Zenghao Chai or Linchao Bao.