Accelerating machine learning  at the edge with approximate computing on FPGAs

doi:10.18845/tm.v35i9.6491

PDF HTML (Español (España))

Published: Nov 28, 2022

DOI: https://doi.org/10.18845/tm.v35i9.6491

Keywords:

Approximate computing, edge computing, machine learning, neural networks, linear algebra

Instituto Tecnológico de Costa Rica

https://orcid.org/0000-0002-3263-7853

Instituto Tecnológico de Costa Rica

Abstract

Performing inference of complex machine learning (ML) algorithms at the edge is becoming important to unlink the system functionality from the cloud. However, the ML models increase complexity faster than the available hardware resources. This research aims to accelerate machine learning by offloading the computation to low-end FPGAs and using approximate computing techniques to optimise resource usage, taking advantage of the inaccurate nature of machine learning models. In this paper, we propose a generic matrix multiply-add processing element design, parameterised in datatype, matrix size, and data width. We evaluate the resource consumption and error behaviour while varying the matrix size and the data width given a fixed-point data type. We determine that the error scales with the matrix size, but it can be compensated by increasing the data width, posing a trade-off between data width and matrix size with respect to the error.

How to Cite

Luis Gerardo, Eduardo, & Jorge. (2022). Accelerating machine learning at the edge with approximate computing on FPGAs. Tecnología En Marcha Journal, 35(9), Pág. 39–45. https://doi.org/10.18845/tm.v35i9.6491

Issue

2022: Vol. 35 special issue. IEEE International Conference on Bioinspired Processing

Section

Artículo científico

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Los autores conservan los derechos de autor y ceden a la revista el derecho de la primera publicación y pueda editarlo, reproducirlo, distribuirlo, exhibirlo y comunicarlo en el país y en el extranjero mediante medios impresos y electrónicos. Asimismo, asumen el compromiso sobre cualquier litigio o reclamación relacionada con derechos de propiedad intelectual, exonerando de responsabilidad a la Editorial Tecnológica de Costa Rica. Además, se establece que los autores pueden realizar otros acuerdos contractuales independientes y adicionales para la distribución no exclusiva de la versión del artículo publicado en esta revista (p. ej., incluirlo en un repositorio institucional o publicarlo en un libro) siempre que indiquen claramente que el trabajo se publicó por primera vez en esta revista.

References

C. J. Wu, D. Brooks, K. Chen, D. Chen, et al., “Machine learning at facebook: Understanding inference at the edge” Proceedings 25th IEEE International Symposium on High Performance Computer Architecture, HPCA 2019, pp. 331–344, 2019. https://doi.org/10.1109/HPCA.2019.00048

B. C. Schafer and Z. Wang, “High-Level Synthesis Design Space Exploration: Past, Present, and Future” in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 10, pp. 2628-2639, Oct. 2020, https://doi.org/10.1109/TCAD.2019.2943570.

Z. Wang and B. C. Schafer, “Learning from the Past: Efficient High-Level Synthesis Design Space Exploration for FPGAs” in ACM Transactions on Design Automation of Electronic Systems, vol. 27, no. 4, Jul. 2022, https://doi.org/10.1145/3495531.

T. Liang, J. Glossner, L. Wang, S. Shi and X. Zhang, “Pruning and quantization for deep neural network acceleration: A survey”, Neurocomputing, vol. 461, pp. 370-403, 2021. https://doi.org/10.1016/j.neucom.2021.07.045

T. González, J. Castro-Godínez. “Improving Performance of Error-Tolerant Applications: A Case Study of Approximations on an Off-the-Shelf Neural Accelerator” in V Jornadas Costarricenses de Investigación en Computación e Informática (JoCICI 2021), Virtual Event, Oct. 2021.

Intel, “Intel® Architecture Instruction Set Extensions and Future Features”, Intel Corporation, May 2021. [Online]. Available: https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

NVIDIA Corporation, “NVIDIA TESLA V100 GPU ARCHITECTURE” 2017. [Online]. Available: https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf

Andrew Lavin and Scott Gray. 2016. Fast algorithms for convolutional neural networks. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition. 4013–4021. https://doi.org/10.1109/CVPR.2016.435

Salazar-Villalobos, Eduardo, and Leon-Vega, Luis G. (2022). Flexible Accelerator Library: Approximate Matrix Accelerator (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.6272004

Article Sidebar

Main Article Content

Abstract

Article Details

References

Most read articles by the same author(s)