Expert Commentary: Advancements in Edge Device Deep Learning
Recent advances in research and technology have made on-device computation of deep learning tasks practical, bringing advanced AI capabilities to edge devices and microcontroller units (MCUs). This has opened up new possibilities for deploying deep neural network (DNN) models on battery-less intermittent devices, which were once ruled out by their limited power and resources.
A key approach to enabling deep learning on edge devices is optimizing the DNN models themselves, using techniques such as weight sharing, pruning, and neural architecture search (NAS) to tailor a model to a specific device. By shrinking the model and optimizing its architecture, these techniques make it possible to run DNNs on devices with very limited resources, such as MCUs with under 256 KB of SRAM.
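As a concrete illustration, the sketch below prunes a small PyTorch CNN by magnitude and checks the resulting weight footprint against a 256 KB SRAM budget. The model, the 50% sparsity level, and the one-byte-per-weight (int8) estimate are illustrative assumptions, not the exact optimizations used in our system.

```python
# Minimal sketch: magnitude-prune a small CNN and check whether its
# weights fit a 256 KB SRAM budget. Model and sparsity are illustrative.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

SRAM_BUDGET_BYTES = 256 * 1024

model = nn.Sequential(
    nn.Conv2d(1, 8, 3), nn.ReLU(),
    nn.Conv2d(8, 16, 3), nn.ReLU(),
    nn.Flatten(), nn.LazyLinear(10),
)
model(torch.zeros(1, 1, 28, 28))  # materialize the lazy linear layer

# L1 (magnitude) pruning: zero out the 50% smallest weights per layer.
for module in model.modules():
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the zeros into the tensor

# Rough footprint: nonzero parameters stored as 8-bit quantized values.
nonzero = sum(int(p.count_nonzero()) for p in model.parameters())
footprint = nonzero * 1  # 1 byte per value after assumed int8 quantization
print(f"~{footprint / 1024:.1f} KB of {SRAM_BUDGET_BYTES / 1024:.0f} KB budget")
```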
However, previous optimization techniques did not account for intermittent execution or power constraints during NAS: they assumed continuous execution without power loss. Conversely, existing intermittent-execution designs considered only data reuse and the costs of intermittent inference, often sacrificing accuracy. This gap motivates a new approach that produces DNN models optimized for under 256 KB of SRAM while keeping them schedulable and runnable under intermittent power.
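To make the missing constraint concrete, here is a minimal sketch of the feasibility check such an approach implies: a NAS candidate is admissible only if its peak SRAM use fits the 256 KB budget and no atomic unit of work needs more energy than a single power cycle supplies. The energy figures, profiles, and names here are hypothetical, not measurements from our system.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    peak_sram_bytes: int   # working memory this layer needs (assumed profile)
    energy_joules: float   # energy to run this layer once (assumed profile)

SRAM_BUDGET = 256 * 1024   # bytes
CYCLE_ENERGY = 2.0e-3      # joules per harvested power cycle (assumed)

def feasible(candidate):
    """Admit a NAS candidate only if it fits SRAM and every atomic layer
    can finish within the energy one power cycle supplies."""
    return (max(l.peak_sram_bytes for l in candidate) <= SRAM_BUDGET and
            max(l.energy_joules for l in candidate) <= CYCLE_ENERGY)

# Example: the second layer's 3 mJ exceeds the 2 mJ cycle budget -> rejected.
print(feasible([Layer(180_000, 0.8e-3), Layer(200_000, 3.0e-3)]))  # False
```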
Accelerated Intermittent Deep Inference: Overcoming Limitations
Our research team has proposed a novel solution called Accelerated Intermittent Deep Inference, which addresses the limitations of previous approaches. Our main contributions are:
- Scheduling the tasks performed during on-device inference into intermittent execution cycles and optimizing for latency.
- Building a system that achieves low end-to-end latency while maintaining higher accuracy than existing baseline models optimized for edge devices.
By carefully scheduling deep inference tasks within intermittent execution cycles, we use the available power more efficiently and minimize latency. This is crucial for achieving real-time responsiveness on edge devices while running resource-intensive DNN models.
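A minimal sketch of this scheduling idea, assuming per-layer energy costs are known from profiling: pack layer tasks greedily into power cycles, reserving enough energy in each cycle to checkpoint at its boundary. Fewer cycles mean fewer checkpoints, which is what drives down end-to-end latency. All energy values below are assumed, not measured.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    energy: float          # joules to execute this layer (assumed profile)

CYCLE_ENERGY = 2.0e-3      # joules harvested per power cycle (assumed)
CHECKPOINT_COST = 0.2e-3   # joules to persist state to non-volatile memory

def schedule(tasks):
    """Greedily group tasks into power cycles; each cycle reserves enough
    energy to checkpoint at its boundary before power is lost."""
    cycles, current = [], []
    budget = CYCLE_ENERGY - CHECKPOINT_COST
    for t in tasks:
        if t.energy > budget and current:
            cycles.append(current)             # end cycle, checkpoint here
            current = []
            budget = CYCLE_ENERGY - CHECKPOINT_COST
        current.append(t)
        budget -= t.energy
    if current:
        cycles.append(current)
    return cycles

layers = [Task("conv1", 0.6e-3), Task("conv2", 0.9e-3),
          Task("pool", 0.1e-3), Task("fc", 0.5e-3)]
for i, cycle in enumerate(schedule(layers)):
    print(f"cycle {i}: {[t.name for t in cycle]}")
# cycle 0: ['conv1', 'conv2', 'pool']
# cycle 1: ['fc']
```

Tasks too large for even a full cycle are screened out earlier by the feasibility check sketched above, so the packer never has to split an atomic task.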
In addition to efficient scheduling, our system accounts for the intermittent nature of power availability. By optimizing DNN models specifically for under 256 KB of SRAM and designing the runtime to tolerate intermittent execution, we achieve substantially higher accuracy than previous approaches.
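The sketch below illustrates one way such an intermittent-safe runtime can behave, assuming each layer is an atomic task and a JSON file stands in for the device's non-volatile memory (e.g., FRAM): after each task completes, progress is persisted so inference resumes from the last checkpoint after a power loss. The names and storage format are illustrative, not our actual implementation.

```python
# Hedged sketch: checkpoint inference progress after every atomic task so
# a power loss costs at most one task's worth of recomputation.
import json
import os

STATE_FILE = "progress.json"   # stand-in for non-volatile memory

def save_checkpoint(layer_idx, activation):
    with open(STATE_FILE, "w") as f:
        json.dump({"layer": layer_idx, "activation": activation}, f)

def load_checkpoint():
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            state = json.load(f)
        return state["layer"], state["activation"]
    return 0, None             # fresh start: no prior progress

def run_inference(layers, inputs):
    start, activation = load_checkpoint()
    x = activation if activation is not None else inputs
    for i in range(start, len(layers)):
        x = layers[i](x)               # execute one atomic task
        save_checkpoint(i + 1, x)      # survive a power loss after this layer
    return x

# Toy usage: activations are plain lists so they JSON-serialize.
layers = [lambda x: [v * 2 for v in x], lambda x: [v + 1 for v in x]]
print(run_inference(layers, [1.0, 2.0]))  # resumes mid-run if restarted
```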
The Accelerated Intermittent Deep Inference approach not only overcomes the limitations of existing techniques but also opens up new possibilities for deploying deep learning on battery-less intermittent devices. This has tremendous implications for various applications, including IoT devices, wearables, and edge computing.
Overall, the advances in edge device deep learning are promising, and the proposed Accelerated Intermittent Deep Inference approach represents a significant step forward. By optimizing DNN models and designing systems that tolerate intermittent execution, we can bring high-accuracy deep learning to resource-constrained edge devices, fueling further innovation in AI across the IoT and edge computing domains.