In:
Frontiers in Computational Neuroscience, Frontiers Media SA, Vol. 15 (2021-07-05)
Abstract:
Recent advances in deep learning have been driven by ever-increasing model sizes, with networks growing to millions or even billions of parameters. Such enormous models call for fast and energy-efficient hardware accelerators. We study the potential of Analog AI accelerators based on Non-Volatile Memory, in particular Phase Change Memory (PCM), for software-equivalent inference accuracy in natural language processing applications. We demonstrate a path to software-equivalent accuracy on the GLUE benchmark with BERT (Bidirectional Encoder Representations from Transformers) by combining noise-aware training, which combats inherent PCM drift and noise sources, with reduced-precision digital attention-block computation down to INT6.
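To illustrate the two techniques the abstract names, here is a minimal PyTorch sketch of noise-aware training via multiplicative Gaussian weight noise and of INT6 fake quantization. The names `NoisyLinear` and `fake_quant_int6`, the noise scale `noise_std=0.05`, and the quantization scheme are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quant_int6(t: torch.Tensor) -> torch.Tensor:
    # Symmetric per-tensor fake quantization to 6-bit integers (-31..31);
    # a stand-in for the paper's reduced-precision digital attention compute.
    scale = t.abs().max().clamp(min=1e-8) / 31.0
    return torch.clamp(torch.round(t / scale), -31, 31) * scale

class NoisyLinear(nn.Linear):
    """Linear layer with multiplicative Gaussian weight noise at training
    time, a common proxy for PCM programming noise in noise-aware training."""

    def __init__(self, in_features, out_features, noise_std=0.05, **kwargs):
        super().__init__(in_features, out_features, **kwargs)
        self.noise_std = noise_std  # illustrative scale, not from the paper

    def forward(self, x):
        weight = self.weight
        if self.training and self.noise_std > 0:
            # Perturbation scales with weight magnitude, loosely mimicking
            # conductance-dependent noise on an analog PCM crossbar.
            weight = weight * (1.0 + torch.randn_like(weight) * self.noise_std)
        return F.linear(x, weight, self.bias)

# Usage: swap nn.Linear for NoisyLinear in a model, and wrap attention-block
# activations with fake_quant_int6 to emulate INT6 digital computation.
layer = NoisyLinear(768, 768)
x = torch.randn(2, 768)
out = fake_quant_int6(layer(x))
```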
Type of Medium:
Online Resource
ISSN:
1662-5188
DOI:
10.3389/fncom.2021.675741
Language:
English
Publisher:
Frontiers Media SA
Publication Date:
2021
ZDB ID:
2452964-3