1. Level-3 BLAS on the TI C6678 Multi-core DSP
2. 26 PFLOPS Stencil Computations for Atmospheric Modeling on Sunway TaihuLight
3. Krste Asanovic, Ras Bodik, Bryan Christopher Catanzaro, Joseph James Gebis, Parry Husbands, Kurt Keutzer, David A Patterson, William Lester Plishker, John Shalf, and Samuel Webb Williams. 2006. The Landscape of Parallel Computing Research: A View from Berkeley. eScholarship, University of California.
4. Deshun Bi, Shengguo Li, Yichen Zhang, Xiaojian Yang, and Dezun Dong. 2023. Efficiently Running SpMV on Multi-core DSPs for Banded Matrix. In International Conference on Algorithms and Architectures for Parallel Processing. Springer, 201–220.
5. Diamond Tiling: Tiling Techniques to Maximize Parallelism for Stencil Computations