Parallel Implementation of K-Best Quadrature Amplitude Modulation Detection for Massive Multiple Input Multiple Output Systems

Author:

Gokalgandhi Bhargav 1, Ling Jonathan 1, Latinović Zoran 1, Samardzija Dragan 1, Seskar Ivan 2

Affiliation:

1. Nokia Bell Labs, 600 Mountain Ave Bldg 5, New Providence, NJ 07974, USA

2. WINLAB, Rutgers University, 671 US-1, North Brunswick, NJ 08902, USA

Abstract

Massive MIMO (Multiple Input Multiple Output) systems impose significant processing burdens along with strict latency requirements. The combination of large-scale antenna arrays and the wide bandwidths required by next-generation wireless systems creates an exponential increase in front-end to back-end data. Balancing processing latency and reliability is critical for baseband processing tasks such as QAM detection. Linear detection algorithms have low computational complexity, but in Massive MIMO scenarios their error performance degrades heavily. Nonlinear detection methods such as Maximum Likelihood (ML) and Sphere Decoding have good error performance, but they suffer from high, variable, and uncontrollable computational complexity. In such cases, the K-best QAM detection algorithm can provide the required control over system performance while maintaining near-ML error performance. In this paper, both hard-output and soft-output K-best QAM detection are implemented on a CPU by using multiple cores combined with vector processing. Hard-output detection is also implemented on a GPU by leveraging its SIMD (Single Instruction, Multiple Data) architecture and warp-based execution model. The processing time per bit and the energy consumption per bit of the CPU and GPU implementations are compared across QAM constellation densities and MIMO array sizes. For typical QAM constellations such as 4, 16, and 64 QAM, the GPU implementation shows up to a 5× improvement in processing latency per bit and up to a 120× improvement in energy consumption per bit over the CPU implementation. For larger MIMO configurations such as 24 × 24 and 32 × 32, the GPU implementation shows up to a 125× improvement in energy consumption per bit over the CPU implementation. Finally, the soft-output detector is combined with an LDPC (Low-Density Parity Check) decoder to obtain the FER (Frame Error Rate) performance of the CPU implementation. The FER is then combined with the frame processing latency to form a Goodput metric that demonstrates the tradeoff between latency and reliability.
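
To illustrate the controllable complexity mentioned in the abstract, the following is a minimal serial sketch of generic hard-output K-best detection: a breadth-first tree search that keeps only the K lowest-metric partial candidates at each layer. It is not the authors' multi-core, vectorized, or GPU implementation. The function names (`qam_constellation`, `qr_gram_schmidt`, `k_best_detect`), the use of Gram-Schmidt for the QR step, and the assumption of a square, non-singular channel matrix are all illustrative choices made here, not details taken from the paper.

```python
def qam_constellation(m_side):
    """Square QAM points on odd-integer levels (m_side=2 gives 4-QAM)."""
    levels = [2 * i - (m_side - 1) for i in range(m_side)]
    return [complex(re, im) for re in levels for im in levels]


def qr_gram_schmidt(H):
    """Modified Gram-Schmidt QR of a square complex matrix (list of rows).

    Returns Q as a list of orthonormal column vectors and R as an
    upper-triangular list of rows. Assumes H is non-singular.
    """
    n = len(H)
    cols = [[H[r][c] for r in range(n)] for c in range(n)]
    Q = []
    R = [[0j] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for i in range(j):
            # Project the running residual onto the i-th orthonormal column.
            R[i][j] = sum(q.conjugate() * x for q, x in zip(Q[i], v))
            v = [x - R[i][j] * q for x, q in zip(v, Q[i])]
        norm = sum(abs(x) ** 2 for x in v) ** 0.5
        R[j][j] = complex(norm)
        Q.append([x / norm for x in v])
    return Q, R


def k_best_detect(H, y, constellation, K):
    """Hard-output K-best detection for y = H x + noise (illustrative)."""
    n = len(H)
    Q, R = qr_gram_schmidt(H)
    # Rotate the received vector: z = Q^H y, so that z = R x + rotated noise.
    z = [sum(q.conjugate() * v for q, v in zip(Q[i], y)) for i in range(n)]
    # Each survivor is (accumulated metric, symbols for layers i+1 .. n-1).
    survivors = [(0.0, [])]
    for i in range(n - 1, -1, -1):  # detect the last layer first
        candidates = []
        for metric, tail in survivors:
            # Contribution of the already-decided layers to row i.
            interference = sum(
                R[i][j] * s for j, s in zip(range(i + 1, n), tail)
            )
            for s in constellation:
                err = z[i] - interference - R[i][i] * s
                candidates.append((metric + abs(err) ** 2, [s] + tail))
        # The K-best step: a fixed survivor count bounds the work per layer.
        candidates.sort(key=lambda c: c[0])
        survivors = candidates[:K]
    return survivors[0][1]


if __name__ == "__main__":
    # Noise-free 3 x 3 example with 4-QAM; the channel values are made up.
    H = [
        [1 + 0.5j, 0.3 - 0.2j, 0.1j],
        [0.2 + 0.1j, 0.9 - 0.4j, -0.3 + 0.2j],
        [-0.1 + 0.3j, 0.25j, 1.1 + 0.2j],
    ]
    x = [1 + 1j, -1 + 1j, 1 - 1j]
    y = [sum(h * s for h, s in zip(row, x)) for row in H]
    print(k_best_detect(H, y, qam_constellation(2), K=4))
```

In this sketch the work per layer is at most K times the constellation size, regardless of the channel or noise realization, which is the fixed-complexity property that distinguishes K-best from Sphere Decoding. Larger K moves the result closer to ML at the cost of more candidates to expand and sort.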

Publisher

MDPI AG
