1. Leon Adams. 2002. Choosing the right architecture for real-time signal processing designs. Texas Instruments, Dallas, Texas, USA.
2. Ariful Azad, Aydin Buluç, and John Gilbert. 2015. Parallel Triangle Counting and Enumeration Using Matrix Algebra. In 2015 IEEE International Parallel and Distributed Processing Symposium Workshop. IEEE, Hyderabad, India, 804–811. https://doi.org/10.1109/IPDPSW.2015.75
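3. Aydin Buluç and John R. Gilbert. 2011. The Combinatorial BLAS: design, implementation, and applications. The International Journal of High Performance Computing Applications 25, 4 (2011), 496–509.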
4. Ruiqi Chen, Haoyang Zhang, Shun Li, Enhao Tang, Jun Yu, and Kun Wang. 2023. Graph-OPU: A Highly Integrated FPGA-Based Overlay Processor for Graph Neural Networks. In 2023 33rd International Conference on Field-Programmable Logic and Applications (FPL). IEEE, Gothenburg, Sweden, 228–234. https://doi.org/10.1109/FPL60245.2023.00039
5. Ruiqi Chen, Haoyang Zhang, Yuhanxiao Ma, Jianli Chen, Jun Yu, and Kun Wang. 2023. eSSpMV: An Embedded-FPGA-based Hardware Accelerator for Symmetric Sparse Matrix-Vector Multiplication. In 2023 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, Monterey, CA, USA, 1–5. https://doi.org/10.1109/ISCAS46773.2023.10181734