Abstract
With the continuous development of information technology, distributed databases have become a research hotspot. Because distributed databases based on the NoSQL architecture offer limited SQL support and have shortcomings in transaction processing and consistency, NewSQL databases built on the LSM-Tree, such as TiDB and OceanBase, are gradually becoming mainstream in applications. The distributed LSM-Tree storage architecture divides data into baseline data and incremental data. Through the compaction operation, the incremental data of different partitions is continuously merged with the baseline data and stored on disk, thereby reducing memory pressure. However, compaction occupies a large amount of system resources and seriously affects system availability. This paper proposes an asynchronous compaction mechanism based on the LSM-Tree architecture. By subdividing the compaction process, data merging is performed asynchronously, which effectively shortens the time of a single compaction operation. Experiments show that the proposed asynchronous compaction mechanism significantly shortens data merging time and improves the robustness and usability of the system in high-frequency write scenarios.
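
To make the baseline/incremental split concrete, the following is a minimal, generic sketch of per-partition compaction. It is not the mechanism proposed in this paper: the names `merge_partition` and `compact_in_steps`, the use of in-memory dictionaries, and the convention that `None` marks a deleted key are all assumptions made for illustration. The generator only illustrates how one large merge can be divided into smaller per-partition steps.

```python
from typing import Dict, Iterator, Optional, Tuple

Baseline = Dict[str, str]
# Assumed convention for this sketch: a value of None marks a deleted key.
Incremental = Dict[str, Optional[str]]


def merge_partition(baseline: Baseline, incremental: Incremental) -> Baseline:
    """Merge one partition's incremental data into its baseline data."""
    merged = dict(baseline)
    for key, value in incremental.items():
        if value is None:
            merged.pop(key, None)  # apply a deletion
        else:
            merged[key] = value    # newer value overrides the baseline
    # Return the new baseline in key order, as an on-disk run would be sorted.
    return dict(sorted(merged.items()))


def compact_in_steps(
    baselines: Dict[int, Baseline],
    increments: Dict[int, Incremental],
) -> Iterator[Tuple[int, Baseline]]:
    """Yield one merged partition at a time instead of merging everything at once."""
    for pid in sorted(set(baselines) | set(increments)):
        yield pid, merge_partition(baselines.get(pid, {}), increments.get(pid, {}))


if __name__ == "__main__":
    baselines = {0: {"a": "1", "b": "2"}, 1: {"x": "9"}}
    increments = {0: {"b": "3", "c": "4"}, 1: {"x": None, "y": "7"}}
    for pid, new_baseline in compact_in_steps(baselines, increments):
        print(pid, new_baseline)
```

Because each step touches only one partition, a caller could interleave these steps with foreground work. How the steps are actually scheduled is outside the scope of this sketch.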