NanoNAS

A small hardware-aware neural architecture search (HW NAS) algorithm targeting low-RAM microcontrollers. Given its low search cost, it can be executed on laptops without a GPU.

It produces small CNNs that fit within the constraints of low-RAM microcontrollers.

If you want a similar HW NAS that is more repeatable and accurate, at the cost of requiring a GPU, check out ColabNAS.

News

An overview of its performance

This section shows models obtained by running NanoNAS on several targets using the Visual Wake Words dataset, a standard TinyML benchmark. The table below reports each model's accuracy on the test set (the mini-val split), its RAM and FLASH occupancy, and the corresponding hardware target, together with the search cost (search time + training time). The search cost was measured on a laptop featuring an 11th Gen Intel(R) Core(TM) i7-11370H CPU @ 3.30GHz with 16 GB of RAM and a 512 GB SSD, without using a GPU.

| Target | Model Name | Accuracy | RAM occupancy | FLASH occupancy | Resolution | Search Cost | GPU |
| --- | --- | --- | --- | --- | --- | --- | --- |
| NUCLEO-L010RB | vww_nucleo_l010rb | 72.3% | 20 kiB | 10.66 kiB | 50x50 rgb | 1:50h | no |
| Arduino Nano 33 IoT | vww_arduino_nano_33_iot | 74.6% | 26 kiB | 19.73 kiB | 50x50 rgb | 2:01h | no |
| NUCLEO-L412KB | vww_nucleo_l412kb | 77.2% | 31 kiB | 28.48 kiB | 50x50 rgb | 3:53h | no |

Here is a script for building the Visual Wake Words dataset.

Considering the smallest models offered by two state-of-the-art HW NAS frameworks targeting microcontrollers, Micronets and MCUNET, and running NanoNAS under the constraints of the larger of the two models, NanoNAS delivers significantly smaller networks, in terms of both RAM and FLASH occupancy, while achieving competitive accuracy on the Visual Wake Words dataset, as shown in the table below.

| Project | Model Name | Accuracy | RAM occupancy | FLASH occupancy | Resolution | Search Cost | GPU |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Micronets | MicroNet VWW-2 INT8 | 76.8% | 70.50 kiB | 273.81 kiB | 50x50 grayscale | n/a | yes |
| NanoNAS | vww-PRIME23 | 77% | 28.50 kiB | 23.65 kiB | 50x50 rgb | 3:37h | no |
| MCUNET | mcunet-vww0 | 87.4% | 168.5 kiB | 530.52 kiB | 64x64 rgb | 300h | yes |

For further details of the comparison, refer to "A hardware-aware neural architecture search algorithm targeting low-end microcontrollers" by A. M. Garavagno et al., published in the proceedings of the 18th Conference on Ph. D Research in Microelectronics and Electronics (PRIME), 2023.

How to use

  • In search.py, modify:
    • the "path_to_training_set" and "path_to_test_set" variables to point to your own dataset (a small-dataset suggestion)
    • "input_size" to set the network's input size (50x50 is a good starting point for microcontrollers)
    • "RAM_upper_bound" and "Flash_upper_bound" according to your microcontroller
    • "MACC_upper_bound" according to the maximum desired number of multiply-accumulate operations (MACCs); a good starting point is the CoreMark score of your microcontroller multiplied by 10000
  • Run search.py (a configuration sketch is shown below)
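
As a concrete reference, here is a minimal sketch of how those variables might look in search.py. Every value below (paths, bounds, and the assumed units) is a placeholder derived from the list above, not the script's actual defaults; check search.py for the exact names, units, and expected types.

```python
# Hypothetical values for the variables listed above (placeholders only).

path_to_training_set = "datasets/my_dataset/train"  # assumed layout: one sub-folder per class
path_to_test_set = "datasets/my_dataset/test"

input_size = 50  # 50x50 input resolution; a good starting point for microcontrollers

# Memory constraints of the target microcontroller
# (the unit is an assumption here; verify what search.py expects).
RAM_upper_bound = 40 * 1024      # e.g. 40 kiB of SRAM, expressed in bytes
Flash_upper_bound = 128 * 1024   # e.g. 128 kiB of flash, expressed in bytes

# Compute budget: CoreMark score x 10000 is the suggested starting point,
# e.g. a CoreMark score of 273 gives 273 * 10000 = 2,730,000 MACCs.
MACC_upper_bound = 273 * 10000
```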

In the "results" folder, you will find a copy of the trained Keras model and the corresponding fully quantized (uint8) TFLite model, ready to run on a microcontroller.
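
To sanity-check the quantized model on the desktop before flashing it, the standard TensorFlow Lite interpreter can be used. This is only a sketch: the file name below is a placeholder for whichever .tflite file the search writes to "results".

```python
import numpy as np
import tensorflow as tf

# Load the fully quantized (uint8) model produced by the search.
# The file name is a placeholder; use the one found in the "results" folder.
interpreter = tf.lite.Interpreter(model_path="results/model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Feed a dummy uint8 image with the input shape chosen during the search.
dummy = np.random.randint(0, 256, size=input_details["shape"], dtype=np.uint8)
interpreter.set_tensor(input_details["index"], dummy)
interpreter.invoke()

scores = interpreter.get_tensor(output_details["index"])
print("Predicted class:", int(np.argmax(scores)))
```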

Hint: try multiple runs to find the best result.

How to use on Raspberry Pi and Jetson Nano

  • Enable multiarch support for amd64 (needed to run the STM32 script used to measure the Flash and RAM occupancy of candidate networks)
  • Install libstdc++6 for amd64
  • Using zram is also recommended

Requirements

  • Python 3.9.15 or later
  • Tensorflow 2.11.0 or later

Citation

If you find the project useful, please consider citing our works:

@article{garavagno2024affordable,
  title={An Affordable Hardware-aware Neural Architecture Search for Deploying Convolutional Neural Networks on Ultra-low-power Computing Platforms},
  author={Garavagno, Andrea Mattia and Ragusa, Edoardo and Frisoli, Antonio and Gastaldo, Paolo},
  journal={IEEE Sensors Letters},
  year={2024},
  publisher={IEEE}
}

@inproceedings{garavagno2024running,
    title={Running hardware-aware neural architecture search on embedded devices under 512MB of RAM},
    author={Garavagno, Andrea Mattia and Ragusa, Edoardo and Frisoli, Antonio and Gastaldo, Paolo},
    booktitle={2024 IEEE International Conference on Consumer Electronics (ICCE)},
    pages={1--2},
    year={2024},
    organization={IEEE}
}

@inproceedings{garavagno2023hardware,
    title={A hardware-aware neural architecture search algorithm targeting low-end microcontrollers},
    author={Garavagno, Andrea Mattia and Ragusa, Edoardo and Frisoli, Antonio and Gastaldo, Paolo},
    booktitle={2023 18th Conference on Ph. D Research in Microelectronics and Electronics (PRIME)},
    pages={281--284},
    year={2023},
    organization={IEEE}
}
