
NPRportrait segmentation


Introduction

Style Transfer (ST) algorithms from the field of Non-photorealistic Rendering (NPR) are notoriously hard to compare because there is no objective metric for doing so. As a first step, Mould and Rosin introduced the NPRgeneral dataset, which contains 20 diverse images of various motifs. With it, algorithms can at least be evaluated on a common set of standard images, rather than each author picking their own and, wittingly or not, introducing even more subjective bias into the comparison. As a follow-up, Rosin et al. later introduced the NPRportrait dataset, which, as the name implies, contains only portraits. In 2022, Rosin et al. released an updated version, dubbing the initial version 0.1 and the new one 1.0. Both datasets use a leveled approach, in which each level includes 20 images of increasing difficulty, e.g. with partially or fully covered facial features. Version 0.1 comes with two levels and version 1.0 with three.

ST algorithms, e.g. Neural Style Transfer (NST) algorithms, can be applied to an image as is. However, if the image exhibits stark differences between multiple regions, e.g. different facial features, this often leads to artifacts. To avoid this without painstakingly fine-tuning the hyperparameters of an algorithm that is supposed to automate the task at hand, one can instead guide the algorithm with segmentation masks of the image. Albeit not for portraits, this technique was used by Gatys et al. to improve the results of their NST algorithm and even to mix different styles per region of the stylized image.
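The guidance idea above can be sketched as a region-restricted style statistic. Below is a minimal NumPy sketch, not the authors' implementation, of a Gram matrix computed only over the pixels selected by a boolean region mask; the feature shapes, the toy data, and the `masked_gram` helper are illustrative assumptions:

```python
import numpy as np

def masked_gram(features: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Gram matrix of CNN features restricted to one region.

    features: (C, H, W) feature maps, mask: (H, W) boolean region mask.
    """
    C, H, W = features.shape
    flat = features.reshape(C, H * W)            # (C, N)
    m = mask.reshape(-1).astype(features.dtype)  # (N,)
    masked = flat * m                            # zero out pixels outside the region
    n = max(m.sum(), 1.0)                        # number of pixels in the region
    return masked @ masked.T / n                 # (C, C) region-wise style statistics

# Toy example: random "features" and a left-half mask.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8, 8))
mask = np.zeros((8, 8), dtype=bool)
mask[:, :4] = True
G = masked_gram(feats, mask)
print(G.shape)  # (4, 4)
```

With one such Gram matrix per region and per style image, the style loss can be accumulated region by region, which is what allows different styles to be mixed within a single output.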

Unfortunately, no segmentation masks have been available so far for the standard datasets detailed above. As its name implies, NPRportrait-segmentation fills this gap for the portrait benchmark datasets.

Methodology

There are a number of datasets out there that provide segmentation masks for portraits, e.g. FASSEG or CelebAMask-HQ. However, they are often suboptimal for ST algorithms:

  1. Some datasets are too coarse and lump features like the lips and teeth together in one region, although they have vastly different styles.
  2. Some datasets are too fine-grained and differentiate between left and right facial features, although this has no impact on the ST.

For NPRportrait-segmentation we settled on 14 regions, excluding the background, that we found to be the most important during our research:

  • skin
  • clothing
  • eyeballs
  • nose
  • ears
  • hair
  • eyebrows
  • facial hair
  • lips
  • teeth
  • mouth cavity
  • accessories
  • headgear
  • glasses

Because the division is fine-grained, a coarser division of the regions is possible at any time by merging masks. This makes the segmentations usable for a variety of applications.
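As a sketch, merging masks is just a pixel-wise OR. The region names follow the list above, but the toy boolean arrays below stand in for the actual files in the masks/ directory:

```python
import numpy as np

# Stand-ins for fine-grained boolean masks loaded from the masks/ directory.
rng = np.random.default_rng(0)
lips = rng.random((8, 8)) > 0.5
teeth = rng.random((8, 8)) > 0.5
mouth_cavity = rng.random((8, 8)) > 0.5

# A coarser "mouth" region is the pixel-wise OR of its fine regions.
mouth = lips | teeth | mouth_cavity
```

The same pattern works for any coarser grouping, e.g. folding accessories, headgear, and glasses into a single "non-face" region.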

The segmentations for all 100 images were created manually. Apart from the inherent annotation bias, in some images it was difficult to draw a clear boundary between the hair and the background region (cf. Figure 1). We opted to label a region as hair only if it has a significant hair density, without attempting to make "significant" more concrete.

Figure 1: An image (left), its corresponding segmentation (middle), and the segmentation without background (right). Areas of low hair density were not annotated as belonging to the hair, but rather to the background region.

Usage

NPRportrait-segmentation provides the segmentation in two formats:

  1. As a single segmentation image which separates the regions by color. The colormap is detailed below. You can find these images inside the segmentations/ directory.
  2. As multiple boolean masks per image where each mask is white for the respective region and black everywhere else. You can find these images inside the masks/ directory.

Colormap

region        RGB code
------------  ---------------
background    (  0,   0,   0)
skin          ( 49, 130, 189)
clothing      (107, 174, 214)
eyeballs      (230,  85,  13)
nose          (253, 141,  60)
ears          (253, 174, 107)
hair          ( 49, 163,  84)
eyebrows      (116, 196, 118)
facial_hair   (161, 217, 155)
lips          (117, 107, 177)
teeth         (158, 154, 200)
mouth_cavity  (188, 189, 220)
accessories   ( 99,  99,  99)
headgear      (150, 150, 150)
glasses       (189, 189, 189)
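Assuming NumPy and the RGB codes above, a segmentation image can be split into per-region boolean masks as follows; the `split_segmentation` helper and the toy two-pixel image are illustrative, not part of the repository:

```python
import numpy as np

# RGB codes from the colormap above.
COLORMAP = {
    "background": (0, 0, 0),
    "skin": (49, 130, 189),
    "clothing": (107, 174, 214),
    "eyeballs": (230, 85, 13),
    "nose": (253, 141, 60),
    "ears": (253, 174, 107),
    "hair": (49, 163, 84),
    "eyebrows": (116, 196, 118),
    "facial_hair": (161, 217, 155),
    "lips": (117, 107, 177),
    "teeth": (158, 154, 200),
    "mouth_cavity": (188, 189, 220),
    "accessories": (99, 99, 99),
    "headgear": (150, 150, 150),
    "glasses": (189, 189, 189),
}

def split_segmentation(segmentation: np.ndarray) -> dict:
    """Convert an (H, W, 3) uint8 RGB segmentation into per-region boolean masks."""
    return {
        region: np.all(segmentation == np.array(rgb, dtype=np.uint8), axis=-1)
        for region, rgb in COLORMAP.items()
    }

# Toy 1x2 "segmentation": one background pixel and one skin pixel.
seg = np.array([[[0, 0, 0], [49, 130, 189]]], dtype=np.uint8)
masks = split_segmentation(seg)
print(masks["skin"])  # [[False  True]]
```

In practice `seg` would be read from a file in the segmentations/ directory, e.g. with Pillow via `np.asarray(Image.open(path).convert("RGB"))`; the result is then equivalent to the pre-split images in the masks/ directory.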

Region distribution

Figure 2: Distribution of regions throughout NPRportrait-segmentation. The regions background, skin, nose, and lips are not shown since they are present in every image regardless of version or level.

Overview

Version v0.1

Level 1

01-12866391563_7312cae13a_o image 01-12866391563_7312cae13a_o segmentation 02-19879809258_3de6bc082b_o image 02-19879809258_3de6bc082b_o segmentation 03-25018815523_80542cf933_o image 03-25018815523_80542cf933_o segmentation 04-wikimedia image 04-wikimedia segmentation

05-34783750806_954ea83724_o image 05-34783750806_954ea83724_o segmentation 06-19876234959_388a248a3e_k image 06-19876234959_388a248a3e_k segmentation 07-pixabay-1869761 image 07-pixabay-1869761 segmentation 08-14717382472_3119d35971_o image 08-14717382472_3119d35971_o segmentation

09-20052719652_d69fe9705c_k image 09-20052719652_d69fe9705c_k segmentation 10-2233221399_3600271088_o image 10-2233221399_3600271088_o segmentation 11-10733769754_d8b219de27_o image 11-10733769754_d8b219de27_o segmentation 12-4727363362_42d266fc1d_o image 12-4727363362_42d266fc1d_o segmentation

13-4727365078_655e6cd0f5_o image 13-4727365078_655e6cd0f5_o segmentation 14-pixabay-1436663 image 14-pixabay-1436663 segmentation 15-4013774662_6abb307517_o image 15-4013774662_6abb307517_o segmentation 16-24898722630_e8d16050f1_o image 16-24898722630_e8d16050f1_o segmentation

17-34589588971_cc1b6b99f7_o image 17-34589588971_cc1b6b99f7_o segmentation 18-20356338610_54af76bcf3_k image 18-20356338610_54af76bcf3_k segmentation 19-7581750218_9ba155cd23_o image 19-7581750218_9ba155cd23_o segmentation 20-15350967367_f9df8a8f06_o image 20-15350967367_f9df8a8f06_o segmentation

Level 2

01-wikimedia image 01-wikimedia segmentation 02-8115174843_7abf278359_o image 02-8115174843_7abf278359_o segmentation 03-34694233905_6f774ba164_o image 03-34694233905_6f774ba164_o segmentation 04-33916867564_c6af07fd52_o image 04-33916867564_c6af07fd52_o segmentation

05-3217118459_7952198317_o image 05-3217118459_7952198317_o segmentation 06-15158628486_c93546d21d_o image 06-15158628486_c93546d21d_o segmentation 07-9467963321_63d0375465_o image 07-9467963321_63d0375465_o segmentation 08-16022313531_5aa50d89d3_o image 08-16022313531_5aa50d89d3_o segmentation

09-306039906_24c5e9600c_o image 09-306039906_24c5e9600c_o segmentation 10-27552868751_703ad00cde_k image 10-27552868751_703ad00cde_k segmentation 11-2389874253_759d8164ab_o image 11-2389874253_759d8164ab_o segmentation 12-14452459445_1bb06c6302_o image 12-14452459445_1bb06c6302_o segmentation

13-pexels-355020 image 13-pexels-355020 segmentation 14-230501453_85963e7ee3_o image 14-230501453_85963e7ee3_o segmentation 15-15339171978_a04047dc31_k image 15-15339171978_a04047dc31_k segmentation 16-14610754861_8fd4939744_k image 16-14610754861_8fd4939744_k segmentation

17-wikimedia image 17-wikimedia segmentation 18-8047854084_76a56e69bb_k image 18-8047854084_76a56e69bb_k segmentation 19-7741742246_502b485404_h image 19-7741742246_502b485404_h segmentation 20-3900823607_9f9d376bb8_o image 20-3900823607_9f9d376bb8_o segmentation

Version v1.0

Level 1

01-Kimberly-Howell image 01-Kimberly-Howell segmentation 02-22636558997-80ee36b602-o image 02-22636558997-80ee36b602-o segmentation 03-Rosin image 03-Rosin segmentation 04-Rosin image 04-Rosin segmentation

05-Major-General-Edward-Rice image 05-Major-General-Edward-Rice segmentation 06-Team-Karim image 06-Team-Karim segmentation 07-24930764475-3bfbc008f8-o image 07-24930764475-3bfbc008f8-o segmentation 08-Moid-Rasheedi image 08-Moid-Rasheedi segmentation

09-Aswini-Phy-ALC image 09-Aswini-Phy-ALC segmentation 10-Saira-Shah-Halim image 10-Saira-Shah-Halim segmentation 11-Michelle-Chan image 11-Michelle-Chan segmentation 12-38753438484-f65be5b961-o image 12-38753438484-f65be5b961-o segmentation

13-6884042760-1ee2b00829-o image 13-6884042760-1ee2b00829-o segmentation 14-Rosin image 14-Rosin segmentation 15-Shimizu-kurumi image 15-Shimizu-kurumi segmentation 16-Yoon-Yeoil image 16-Yoon-Yeoil segmentation

17-4891358118-32c20e9d8e-o image 17-4891358118-32c20e9d8e-o segmentation 18-Huw-Evans image 18-Huw-Evans segmentation 19-Zboralski-waldemar-2012 image 19-Zboralski-waldemar-2012 segmentation 20-8552893573-1473209795-o image 20-8552893573-1473209795-o segmentation

Level 2

01-wikimedia image 01-wikimedia segmentation 02-8115174843_7abf278359_o image 02-8115174843_7abf278359_o segmentation 03-34694233905_6f774ba164_o image 03-34694233905_6f774ba164_o segmentation 04-33916867564_c6af07fd52_o image 04-33916867564_c6af07fd52_o segmentation

05-Tom_Selleck_at_PaleyFest_2014 image 05-Tom_Selleck_at_PaleyFest_2014 segmentation 06-15158628486_c93546d21d_o image 06-15158628486_c93546d21d_o segmentation 07-9467963321_63d0375465_o image 07-9467963321_63d0375465_o segmentation 08-16022313531_5aa50d89d3_o image 08-16022313531_5aa50d89d3_o segmentation

09-306039906_24c5e9600c_o image 09-306039906_24c5e9600c_o segmentation 10-3761108471_08e3f9f80d_o image 10-3761108471_08e3f9f80d_o segmentation 11-2389874253_759d8164ab_o image 11-2389874253_759d8164ab_o segmentation 12-14452459445_1bb06c6302_o image 12-14452459445_1bb06c6302_o segmentation

13-noah-buscher-_E-ogRrpM0s-unsplash image 13-noah-buscher-_E-ogRrpM0s-unsplash segmentation 14-230501453_85963e7ee3_o image 14-230501453_85963e7ee3_o segmentation 15-pexels-355020 image 15-pexels-355020 segmentation 16-14610754861_8fd4939744_k image 16-14610754861_8fd4939744_k segmentation

17-wikimedia image 17-wikimedia segmentation 18-Bjornar_Moxnes image 18-Bjornar_Moxnes segmentation 19-7741742246_502b485404_h image 19-7741742246_502b485404_h segmentation 20-3900823607_9f9d376bb8_o image 20-3900823607_9f9d376bb8_o segmentation

Level 3

01-johanna-buguet-9GOAzu0G4oM-unsplash image 01-johanna-buguet-9GOAzu0G4oM-unsplash segmentation 02-29881657997_24106d0716_o image 02-29881657997_24106d0716_o segmentation 03-felipe-sagn-2xd0_6wEj6k-unsplash image 03-felipe-sagn-2xd0_6wEj6k-unsplash segmentation 04-nathan-dumlao-cibBnDQ9hcQ-unsplash image 04-nathan-dumlao-cibBnDQ9hcQ-unsplash segmentation

05-old-youth-tAJog0uJkT0-unsplash image 05-old-youth-tAJog0uJkT0-unsplash segmentation 06-olesya-yemets-AjilVpkggN8-unsplash image 06-olesya-yemets-AjilVpkggN8-unsplash segmentation 07-azamat-zhanisov-h9Oo45soK_0-unsplash image 07-azamat-zhanisov-h9Oo45soK_0-unsplash segmentation 08-claudia-owBcefxgrIE-unsplash image 08-claudia-owBcefxgrIE-unsplash segmentation

09-8079036040_6e5d7798f5_o image 09-8079036040_6e5d7798f5_o segmentation 10-calvin-lupiya-Mx4auh5zO4w-unsplash image 10-calvin-lupiya-Mx4auh5zO4w-unsplash segmentation 11-artyom-kim-zEa75CRX88M-unsplash image 11-artyom-kim-zEa75CRX88M-unsplash segmentation 12-jordan-bauer-Is3VRzUaXVk-unsplash image 12-jordan-bauer-Is3VRzUaXVk-unsplash segmentation

13-alex-iby-470eBDOc8bk-unsplash image 13-alex-iby-470eBDOc8bk-unsplash segmentation 14-andrew-robinson-4ar-CSxLcMg-unsplash image 14-andrew-robinson-4ar-CSxLcMg-unsplash segmentation 15-8717570008_edc9120e59_o image 15-8717570008_edc9120e59_o segmentation 16-gabriel-silverio-u3WmDyKGsrY-unsplash image 16-gabriel-silverio-u3WmDyKGsrY-unsplash segmentation

17-6262243021_47792d9ca0_o image 17-6262243021_47792d9ca0_o segmentation 18-16585637733_67b0e18bcf_o image 18-16585637733_67b0e18bcf_o segmentation 19-1824233430_59f1a20f0d_o image 19-1824233430_59f1a20f0d_o segmentation 20-3683799501_052eb48752_o image 20-3683799501_052eb48752_o segmentation

License

The Creative Commons Attribution 4.0 International License applies solely to the segmentation images that we have created. It does not apply to any third-party data or information that we may have used in the creation of our segmentation. We make no claims or guarantees regarding the accuracy, completeness, or legality of any third-party data or information that may have been used in our data, and we disclaim any liability for any damages or losses that may result from the use or reliance on such third-party data or information.

Citation

If you use the NPRportrait-segmentation dataset provided in this repository, please cite it as below:

@misc{Bultemeier_NPRportrait-segmentation,
  author = {Bültemeier, Julian and Meier, Philip and Lohweg, Volker},
  doi    = {10.5281/zenodo.7852139},
  title  = {{NPRportrait-segmentation}},
  url    = {https://github.com/pystiche/NPRportrait-segmentation}
}

Please don't forget to cite the original works by Rosin et al. as well:

@inproceedings{10.1145/3092919.3092921,
  author     = {Rosin, Paul L. and Mould, David and Berger, Itamar and Collomosse, John and Lai, Yu-Kun and Li, Chuan and Li, Hua and Shamir, Ariel and Wand, Michael and Wang, Tinghuai and Winnem\"{o}ller, Holger},
  title      = {Benchmarking Non-Photorealistic Rendering of Portraits},
  year       = {2017},
  isbn       = {9781450350815},
  publisher  = {Association for Computing Machinery},
  address    = {New York, NY, USA},
  url        = {https://doi.org/10.1145/3092919.3092921},
  doi        = {10.1145/3092919.3092921},
  abstract   = {We present a set of images for helping NPR practitioners evaluate their image-based portrait stylisation algorithms. Using a standard set both facilitates comparisons with other methods and helps ensure that presented results are representative. We give two levels of difficulty, each consisting of 20 images selected systematically so as to provide good coverage of several possible portrait characteristics. We applied three existing portrait-specific stylisation algorithms, two general-purpose stylisation algorithms, and one general learning based stylisation algorithm to the first level of the benchmark, corresponding to the type of constrained images that have often been used in portrait-specific work. We found that the existing methods are generally effective on this new image set, demonstrating that level one of the benchmark is tractable; challenges remain at level two. Results revealed several advantages conferred by portrait-specific algorithms over general-purpose algorithms: portrait-specific algorithms can use domain-specific information to preserve key details such as eyes and to eliminate extraneous details, and they have more scope for semantically meaningful abstraction due to the underlying face model. Finally, we provide some thoughts on systematically extending the benchmark to higher levels of difficulty.},
  booktitle  = {Proceedings of the Symposium on Non-Photorealistic Animation and Rendering},
  articleno  = {11},
  numpages   = {12},
  keywords   = {image stylisation, evaluation, portraits, non-photorealistic rendering},
  location   = {Los Angeles, California},
  series     = {NPAR '17},
}
@article{Rosin2022,
  title    = {NPRportrait 1.0: A three-level benchmark for non-photorealistic rendering of portraits},
  author   = {Rosin, Paul L and Lai, Yu-Kun and Mould, David and Yi, Ran and Berger, Itamar and Doyle, Lars and Lee, Seungyong and Li, Chuan and Liu, Yong-Jin and Semmo, Amir and others},
  journal  = {Computational Visual Media},
  volume   = {8},
  number   = {3},
  pages    = {445--465},
  year     = {2022},
  abstract = {Recently, there has been an upsurge of activity in image-based non-photorealistic rendering (NPR), and in particular portrait image stylisation, due to the advent of neural style transfer (NST). However, the state of performance evaluation in this field is poor, especially compared to the norms in the computer vision and machine learning communities. Unfortunately, the task of evaluating image stylisation is thus far not well defined, since it involves subjective, perceptual, and aesthetic aspects. To make progress towards a solution, this paper proposes a new structured, three-level, benchmark dataset for the evaluation of stylised portrait images. Rigorous criteria were used for its construction, and its consistency was validated by user studies. Moreover, a new methodology has been developed for evaluating portrait stylisation algorithms, which makes use of the different benchmark levels as well as annotations provided by user studies regarding the characteristics of the faces. We perform evaluation for a wide variety of image stylisation methods (both portrait-specific and general purpose, and also both traditional NPR approaches and NST) using the new benchmark dataset.},
  url      = {https://doi.org/10.1007/s41095-021-0255-3},
  doi      = {10.1007/s41095-021-0255-3},
}