1. Aila, T., & Laine, S. (2009). Understanding the efficiency of ray traversal on gpus. In Proceedings of the conference on high performance graphics 2009 (pp. 145–149): ACM.
2. Aluru, M., Zola, J., Nettleton, D., & Aluru, S. (2012). Reverse engineering and analysis of large genome-scale gene networks. Nucleic acids research (p. gks904).
3. Asanovic, K., Bodik, R., Catanzaro, B. C., Gebis, J. J., Husbands, P., Keutzer, K., Patterson, D. A., Plishker, W. L., Shalf, J., Williams, S. W., & et al. (2006). The landscape of parallel computing research: A view from berkeley. Tech. rep., Technical Report UCB/EECS-2006-183, EECS Department, University of California, Berkeley.
4. Ashari, A., Sedaghati, N., Eisenlohr, J., Parthasarath, S., & Sadayappan, P. (2014). Fast sparse matrix-vector multiplication on gpus for graph applications. In Proceedings of the international conference for high performance computing, networking, storage and analysis (pp. 781–792): IEEE.
5. Ashari, A., Sedaghati, N., Eisenlohr, J., & Sadayappan, P. (2014). An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on gpus. In Proceedings of the 28th ACM international conference on supercomputing (pp. 273–282): ACM.