1. Berscheid, L., Meißner, P., Kröger, T.: Self-supervised learning for precise pick-and-place without object model. IEEE Robot. Autom. Lett. 5(3), 4828–4835 (2020)
2. Dai, Q., Zhu, Y., Geng, Y., Ruan, C., Zhang, J., Wang, H.: GraspNeRF: multiview-based 6-DoF grasp detection for transparent and specular objects using generalizable NeRF. arXiv:2210.06575 (2022)
3. Devin, C., Rowghanian, P., Vigorito, C., Richards, W., Rohanimanesh, K.: Self-supervised goal-conditioned pick and place. arXiv:2008.11466 (2020)
4. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al.: An image is worth 16x16 words: transformers for image recognition at scale. arXiv:2010.11929 (2020)
5. Florence, P.R., Manuelli, L., Tedrake, R.: Dense object nets: learning dense visual object descriptors by and for robotic manipulation. arXiv:1806.08756 (2018)