Generating Accurate Human Face Sketches from Text Descriptions

Shorya Sharma (1)
(1) Indian Institute of Technology Bhubaneswar, Bhubaneswar, Odisha, 751013, India
How to cite (IJASEIT) :
Sharma, S. (2024). Generating Accurate Human Face Sketches from Text Descriptions. International Journal of Advanced Science Computing and Engineering, 6(1), 20–26.

Drawing a suspect's face based solely on eyewitness descriptions is a difficult task. State-of-the-art methods exist for generating images from text, but only a few studies address generating face images from text, and almost none address generating sketches from text. As a result, no dataset is available for this task. In this paper, we create a new text-to-sketch dataset for this novel task and provide two attention-based, state-of-the-art, end-to-end GAN models, Attn_LSTM_256 and Attn_GRU_512, trained on the dataset. They achieve Inception Scores of 1.868 and 1.902 and FIDs of 175.46 and 176.98, respectively. We further propose possible future improvements by applying different model architectures or by preserving performance with simplified architectures for real-world applications.
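
For readers unfamiliar with the Inception Score reported above, the following is a minimal sketch of how the metric is commonly computed from classifier outputs. It is not the paper's evaluation code: the function name `inception_score` and the plain-list input format are assumptions made for illustration, and the pretrained classifier that would produce the class probabilities is not included.

```python
import math


def inception_score(probs):
    """Inception Score from per-image class probabilities.

    probs: list of rows, each row a probability distribution p(y|x)
    over classes for one generated image. The rows are assumed to
    come from a pretrained classifier (hypothetical input here).
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(row[k] for row in probs) / n for k in range(num_classes)]
    # Mean KL divergence KL(p(y|x) || p(y)); zero-probability terms add 0.
    mean_kl = 0.0
    for row in probs:
        kl = sum(p * math.log(p / m) for p, m in zip(row, marginal) if p > 0)
        mean_kl += kl / n
    # The score is the exponential of the mean KL divergence.
    return math.exp(mean_kl)
```

Under this definition the score lies between 1 (every image receives the same class distribution) and the number of classes (confident predictions spread evenly across all classes).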
