Commit 56a2b64

agilenn: Fix abstract
hosiet committed Jul 27, 2023
1 parent fc5d43d commit 56a2b64
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion content/publication/2022-agilenn/index.md
@@ -19,7 +19,7 @@ publication_types: ['1']
publication: In *the 28th Annual International Conference on Mobile Computing And Networking (MobiCom'22)*
publication_short: In *MobiCom'22*

-abstract: On-device training is essential for neural networks (NNs) to continuously adapt to new online data, but can be time-consuming due to the device's limited computing power. To speed up on-device training, existing schemes select trainable NN portion offline or conduct unrecoverable selection at runtime, but the evolution of trainable NN portion is constrained and cannot adapt to the current need for training. Instead, runtime adaptation of on-device training should be fully elastic, i.e., every NN substructure can be freely removed from or added to the trainable NN portion at any time in training. In this paper, we present _ElasticTrainer_, a new technique that enforces such elasticity to achieve the required training speedup with the minimum NN accuracy loss. Experiment results show that ElasticTrainer achieves up to 3.5× more training speedup in wall-clock time and reduces energy consumption by 2×-3× more compared to the existing schemes, without noticeable accuracy loss.
+abstract: With the wide adoption of AI applications, there is a pressing need to enable real-time neural network (NN) inference on small embedded devices, but deploying NNs and achieving high performance of NN inference on these small devices is challenging due to their extremely weak capabilities. Although NN partitioning and offloading can contribute to such deployment, they are incapable of minimizing the local costs at embedded devices. Instead, we suggest addressing this challenge via agile NN offloading, which migrates the required computations in NN offloading from online inference to offline learning. In this paper, we present AgileNN, a new NN offloading technique that achieves real-time NN inference on weak embedded devices by leveraging eXplainable AI techniques, so as to explicitly enforce feature sparsity during the training phase and minimize the online computation and communication costs. Experimental results show that AgileNN's inference latency is >6x lower than that of the existing schemes, ensuring that sensory data on embedded devices can be consumed in a timely manner. It also reduces the local device's resource consumption by >8x, without impairing the inference accuracy.

# Summary. An optional shortened abstract.
summary: AgileNN is the first work that achieves real-time inference (<20ms) of mainstream neural network models (e.g., ImageNet) on extremely weak MCUs (e.g., STM32 series with <1MB of memory), without impairing the inference accuracy. The usage of eXplainable AI (XAI) techniques allows >6x improvement of feature compressibility during offloading and >8x reduction of the local device’s resource consumption.
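
For context on the corrected abstract: its core idea is a split, in which an XAI-style importance estimator routes a few high-importance features to a tiny on-device NN and aggressively compresses the rest for remote inference. Below is a minimal NumPy sketch of that split-inference flow; every function name, the weight shapes, the top-k ratio, and the quantization step are illustrative assumptions, not AgileNN's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in weights; the real system would train these jointly, offline,
# together with the XAI-guided importance skew described in the abstract.
W_extract = rng.standard_normal((32, 16))  # small on-device extractor
W_local = rng.standard_normal((4, 10))     # tiny on-device predictor
W_remote = rng.standard_normal((12, 10))   # larger server-side predictor

def importance_scores(features):
    # Placeholder for an XAI attribution method; training would skew
    # importance toward a small feature subset so the split stays cheap.
    return np.abs(features)

def split_inference(x, k=4):
    features = np.tanh(x @ W_extract)
    order = np.argsort(-importance_scores(features))
    top, rest = features[order[:k]], features[order[k:]]
    # Low-importance features tolerate aggressive quantization before
    # they are sent to the server, which keeps communication cheap.
    compressed = np.round(rest * 4) / 4
    local_logits = top @ W_local
    remote_logits = compressed @ W_remote
    return local_logits + remote_logits  # fused prediction

print(split_inference(rng.standard_normal(32)).argmax())
```

The point the abstract emphasizes is that this split is learned offline, so no expensive importance reasoning runs on the device at inference time.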
