1. Referring object manipulation of natural images with conditional classifier-free guidance;Choi,2022
2. 3D-r2n2: A unified approach for single and multi-view 3d object reconstruction;Choy,2016
3. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2021). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International conference on learning representations.
4. Fan, H., Su, H., & Guibas, L. J. (2017). A point set generation network for 3d object reconstruction from a single image. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 605–613).
5. Fouhey, D. F., Gupta, A., & Hebert, M. (2013). Data-driven 3D primitives for single image understanding. In Proceedings of the IEEE international conference on computer vision (pp. 3392–3399).