1. Tulsiani, S., Zhou, T., Efros, A. A., & Malik, J. (2015). Multi-view supervision for single-view reconstruction via differentiable ray consistency. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2623-2631).
2. Pauwels, K., Van Hulle, M. M., & de Baets, B. (2003). Survey of computational stereo vision algorithms: An overview. Journal of Image and Vision Computing, 21(4), 285-310.
3. Zhou, X., Xu, K., & Zhu, J. Y. (2021). Deformable convolutional networks for object detection in a video. IEEE Transactions on Image Processing, 30, 1865-1877.
4. Rahmati, A., & Lu, J. (2020). Pose-estimation-free object tracking via attentive feature extraction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 6910-6919).
5. He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2980-2988).