Heterogeneous Memory Integration and Optimization for Energy-Efficient Multi-Task NLP Edge Inference

Published: 2024-08-05 Volume: 35 Pages: 1-6
Container-title: Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design

Author:

Fu Zirui¹, Avaliani Aleksandre¹, Donato Marco¹

Affiliation:

1. Tufts University, Medford, MA, United States

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3665314.3672281

References: 23 articles.

1. H. Cai, C. Gan, L. Zhu, and S. Han. 2020. TinyTL: reduce memory, not parameters for efficient on-device learning. In Proceedings of the 34th International Conference on Neural Information Processing Systems.

2. Z. Carmichael, H. F. Langroudi, C. Khazanov, J. Lillie, J. L. Gustafson, and D. Kudithipudi. 2019. Performance-Efficiency Trade-off of Low-Precision Numerical Formats in Deep Neural Networks. In Proceedings of the Conference for Next Generation Arithmetic 2019.

3. M. Chang, S. D. Spetalnick, B. Crafton, W.-S. Khwa, Y.-D. Chih, M.-F. Chang, and A. Raychowdhury. 2022. A 40nm 60.64TOPS/W ECC-Capable Compute-in-Memory/Digital 2.25MB/768KB RRAM/SRAM System with Embedded Cortex M3 Microprocessor for Edge Recommendation Systems. In 2022 IEEE International Solid-State Circuits Conference (ISSCC).

4. J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In North American Chapter of the Association for Computational Linguistics.

5. M. Donato, L. Pentecost, D. Brooks, and G.-Y. Wei. 2019. MEMTI: Optimizing On-Chip Nonvolatile Storage for Visual Multitask Inference at the Edge. IEEE Micro 39 (Nov. 2019), 73-81.