1. A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, I. Sutskever. Learning transferable visual models from natural language supervision. In Proceedings of the 38th International Conference on Machine Learning, pp. 8748–8763, 2021.
2. A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W. Y. Lo, P. Dollar, R. Girshick. Segment anything, [Online], Available: https://arxiv.org/abs/2304.02643, 2023.
3. T. Y. Ko, S. H. Lee. Novel method of semantic segmentation applicable to augmented reality. Sensors, vol. 20, no. 6, Article number 1737, 2020. DOI: https://doi.org/10.3390/s20061737.
4. B. Wang, A. Aboah, Z. Y. Zhang, U. Bagci. GazeSAM: What you see is what you segment, [Online], Available: https://arxiv.org/abs/2304.13844, 2023.
5. A. Borji, M. M. Cheng, Q. B. Hou, H. Z. Jiang, J. Li. Salient object detection: A survey. Computational Visual Media, vol. 5, no. 2, pp. 117–150, 2019. DOI: https://doi.org/10.1007/s41095-019-0149-9.