Author
Georgiou Theodoros, Liu Yu, Chen Wei, Lew Michael
Abstract
Higher dimensional data such as video and 3D are the leading edge of multimedia retrieval and computer vision research. In this survey, we give a comprehensive overview and key insights into the state of the art of higher dimensional features from deep learning and also traditional approaches. Current approaches are frequently using 3D information from the sensor or are using 3D in modeling and understanding the 3D world. With the growth of prevalent application areas such as 3D games, self-driving automobiles, health monitoring and sports activity training, a wide variety of new sensors have allowed researchers to develop feature description models beyond 2D. Although higher dimensional data enhance the performance of methods on numerous tasks, they can also introduce new challenges and problems. The higher dimensionality of the data often leads to more complicated structures which present additional problems in both extracting meaningful content and in adapting it for current machine learning algorithms. Due to the major importance of the evaluation process, we also present an overview of the current datasets and benchmarks. Moreover, based on more than 330 papers from this study, we present the major challenges and future directions.
Funder
Nederlandse Organisatie voor Wetenschappelijk Onderzoek
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences, Media Technology, Information Systems
Reference
336 articles.
Cited by
104 articles.