1. Boost Linear Algebra Computation Performance via Efficient VNNI Utilization;Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3;2024-04-27
2. PHCG: Optimizing Simulink Code Generation for Embedded System With SIMD Instructions;IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems;2023-04
3. Co-Utilizing SIMD and Scalar to Accelerate the Data Analytics Workloads;2023 IEEE 39th International Conference on Data Engineering (ICDE);2023-04
4. Custom High-Performance Vector Code Generation for Data-Specific Sparse Computations;Proceedings of the International Conference on Parallel Architectures and Compilation Techniques;2022-10-08
5. COX: Exposing CUDA Warp-level Functions to CPUs;ACM Transactions on Architecture and Code Optimization;2022-09-16