1. Multi-view 3D Object Detection Network for Autonomous Driving
2. The Cityscapes Dataset for Semantic Urban Scene Understanding
3. ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes
4. ImageNet: A large-scale hierarchical image database
5. Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. 2010. The pascal visual object classes (voc) challenge. International journal of computer vision 88, 2 (2010), 303--338.