1. Kingma DP, Rezende DJ, Mohamed S, Welling M. Semi-supervised learning with deep generative models. In: Advances in neural information processing systems; 2014. pp. 3581–3589.
2. Kulkarni TD, Whitney W, Kohli P, Tenenbaum JB. Deep convolutional inverse graphics network. In: Advances in neural information processing systems; 2015. pp. 2539–2547.
3. Yan X, Yang J, Sohn K, Lee H. Attribute2image: conditional image generation from visual attributes. In: European conference on computer vision; 2016. pp. 776–791.
4. Larsen ABL, Sønderby SK, Larochelle H, Winther O. Autoencoding beyond pixels using a learned similarity metric. arXiv preprint. 2015. arXiv:1512.09300.
5. Li Y, Ouyang W, Zhou B, Shi J, Zhang C, Wang X. Factorizable Net: an efficient subgraph-based framework for scene graph generation. In: European conference on computer vision; 2018. pp. 335–351.