Evaluation of the use of box size priors for 6D plane segment tracking from point clouds with applications in cargo packing
Authors:
Camacho-Muñoz, Guillermo Alberto 1; Nope-Rodríguez, Sandra Esperanza 1; Loaiza-Correa, Humberto 1; Lima, João Paulo Silva do Monte 2; Roberto, Rafael Alves 3
Affiliations:
1. Universidad del Valle 2. Universidade Federal Rural de Pernambuco 3. Universidade Federal de Pernambuco, Centro de Informática
Abstract
Available solutions to assist human operators in cargo packing processes offer alternatives to maximize the spatial occupancy of containers used in intralogistics. However, these solutions consist of sequential instructions for picking each box and positioning it in the containers, which are challenging for an operator to interpret and force them to alternate between reading the instructions and executing the task. A potential solution to these issues lies in a tool that naturally communicates each box's initial and final location in the desired sequence to the operator. While 6D visual object tracking systems have demonstrated good performance, they have yet to be evaluated in real-world manual box packing scenarios. They also fail to exploit the prior knowledge available about the packing operation, such as the number of boxes, box sizes, and the physical packing sequence. This study explores the inclusion of box size priors in 6D plane segment tracking systems driven by images from moving cameras and quantifies their contribution to tracker performance when assessed in manual box packing operations. To do this, it compares the performance of a plane segment tracking system under variations of two factors: the tracking algorithm and the speed of the camera (mounted on the packing operator) during the mapping of a manual cargo packing process. The tracking algorithm varies at two levels: algorithm Awpk, which integrates prior knowledge of box sizes in the scene, and algorithm Awoutpk, which assumes ignorance of box properties. Camera speed is also evaluated at two levels: low speed (Slow) and high speed (Shigh). The study analyzes the impact of these factors on the precision, recall, and F1-score of the plane segment tracking system. An ANOVA on the precision and F1-score results determined that neither camera speed nor the camera speed-algorithm interaction has a significant effect on the precision of the tracking system.
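The experimental design described above is a balanced 2x2 factorial (algorithm x camera speed) analyzed with ANOVA. The sketch below implements a plain two-way ANOVA for such a design; the precision replicates are hypothetical, chosen only to illustrate the pattern the abstract reports (a large algorithm effect, negligible speed and interaction effects), and are not the paper's measurements.

```python
# Minimal two-way ANOVA for a balanced 2x2 design, pure Python.
# Data are illustrative, NOT the paper's results.

def two_way_anova(cells):
    """cells: dict {(a_level, b_level): [replicates]} for a balanced design.
    Returns F statistics for factor A, factor B, and the A x B interaction."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))          # replicates per cell
    mean = lambda xs: sum(xs) / len(xs)

    cell_mean = {k: mean(v) for k, v in cells.items()}
    row_mean = {a: mean([cell_mean[(a, b)] for b in b_levels]) for a in a_levels}
    col_mean = {b: mean([cell_mean[(a, b)] for a in a_levels]) for b in b_levels}
    grand = mean(list(cell_mean.values()))

    # Sums of squares for main effects, interaction, and residual error.
    ss_a = n * len(b_levels) * sum((row_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((col_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum((cell_mean[(a, b)] - row_mean[a] - col_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    ss_e = sum((y - cell_mean[k]) ** 2 for k, v in cells.items() for y in v)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab = df_a * df_b
    df_e = len(a_levels) * len(b_levels) * (n - 1)
    mse = ss_e / df_e
    return {"algorithm": (ss_a / df_a) / mse,
            "speed": (ss_b / df_b) / mse,
            "interaction": (ss_ab / df_ab) / mse}

# Hypothetical precision replicates per (algorithm, speed) cell.
data = {("Awpk", "Slow"):     [0.91, 0.89, 0.90],
        ("Awpk", "Shigh"):    [0.88, 0.90, 0.89],
        ("Awoutpk", "Slow"):  [0.72, 0.70, 0.71],
        ("Awoutpk", "Shigh"): [0.69, 0.71, 0.70]}

f_stats = two_way_anova(data)
# A large F for "algorithm" and small F values for "speed" and "interaction"
# mirror the finding that only the algorithm factor is significant.
```

In a full analysis each F statistic would be compared against the F distribution with the corresponding degrees of freedom to obtain a p-value (e.g. via `scipy.stats.f.sf`); the sketch stops at the F statistics to stay dependency-free.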
The only factor with a significant effect is the tracking algorithm. Tukey's pairwise comparisons concluded that the precision and F1-score of the two algorithm levels differ significantly, with algorithm Awpk superior in each evaluation. This superiority peaks in the tracking of top plane segments: 22 and 14 percentage points for the precision and F1-score metrics, respectively. The recall metric, however, remains similar with and without the addition of prior knowledge. The contribution of including prior knowledge of box sizes in 6D plane segment tracking algorithms therefore lies in reducing false positives, a reduction associated with significant increases in the tracking system's precision and F1-score. Future work will investigate whether these benefits extend to the tracking of objects composed of plane segments, such as cubes or boxes.
Publisher
Springer Science and Business Media LLC