Key Concepts, Weakness and Benchmark on Hash Table Data Structures

Authors

Tapia-Fernández Santiago, García-García Daniel, García-Hernandez Pablo

Abstract

Most computer programs or applications need fast data structures. The performance of a data structure is necessarily influenced by the complexity of its common operations; thus, any data structure that exhibits a theoretical complexity of amortized constant time in several of its main operations should draw a lot of attention. Such is the case of a family of data structures called hash tables. However, what is the real efficiency of these hash tables? That is an interesting question with no simple answer, and there are several issues to consider. There is no single hash table; in fact, there are several sub-groups of hash tables, and not all programming languages use the same variety of hash table in their default implementation, nor do they offer the same interface. Nevertheless, all hash tables share a common issue: they have to resolve hash collisions. That is a potential weakness, and it also induces a classification of hash tables according to the strategy used to resolve collisions. In this paper, some key concepts about hash tables are presented, and some definitions of those key concepts are reviewed and clarified, especially in order to study the characteristics of the main strategies for implementing hash tables and how they deal with hash collisions. Then, some benchmark cases are designed and presented to assess the performance of hash tables. The cases have been designed to be randomized, to be self-tested, to be representative of real use cases, and to expose and analyze the impact of different factors on performance across different hash tables and programming languages. All cases have then been programmed in C++, Java and Python and analyzed in terms of interfaces and efficiency (time and memory). The benchmark yields important results about the performance of these structures and their (lack of) relationship with complexity analysis.
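As a rough illustration of two ideas named in the abstract, collision resolution and a randomized, self-tested benchmark case, the following Python sketch may help. It is not the authors' code or benchmark: the class `ChainedHashTable`, the function `self_checked_run`, the resize threshold and all parameter values are hypothetical choices made for this example. It assumes separate chaining as the collision strategy, which is only one of the strategies the paper classifies, and it checks every lookup against Python's built-in `dict`.

```python
import random


class ChainedHashTable:
    """Toy hash table that resolves collisions by separate chaining.

    Illustrative only: names and the resize threshold are assumptions,
    not taken from the paper.
    """

    def __init__(self, num_buckets=8):
        self._buckets = [[] for _ in range(num_buckets)]
        self._size = 0

    def _bucket(self, key):
        # Keys whose hashes agree modulo the bucket count collide
        # and end up in the same list (the "chain").
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self._size += 1
        # Keep chains short: grow once the load factor exceeds 2.
        if self._size > 2 * len(self._buckets):
            self._resize()

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

    def _resize(self):
        old_items = [item for bucket in self._buckets for item in bucket]
        self._buckets = [[] for _ in range(2 * len(self._buckets))]
        for k, v in old_items:
            self._buckets[hash(k) % len(self._buckets)].append((k, v))

    def __len__(self):
        return self._size


def self_checked_run(num_ops=10_000, key_range=2_000, seed=0):
    """Apply random inserts and lookups, checking each lookup against dict."""
    rng = random.Random(seed)  # fixed seed keeps the random run reproducible
    table, reference = ChainedHashTable(), {}
    for _ in range(num_ops):
        key = rng.randrange(key_range)
        if rng.random() < 0.5:
            value = rng.random()
            table.put(key, value)
            reference[key] = value
        else:
            # Self-test: the toy table must agree with the built-in dict.
            assert table.get(key) == reference.get(key)
    assert len(table) == len(reference)
    return len(table)
```

Calling `self_checked_run()` raises an `AssertionError` if the toy table ever disagrees with `dict`; wrapping the call with `time.perf_counter()` would turn the same run into a simple timing measurement.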

Funder

Fundación para el Fomento de la Innovación Industrial

Publisher

MDPI AG

Subject

Computational Mathematics, Computational Theory and Mathematics, Numerical Analysis, Theoretical Computer Science

Cited by 1 article.
