Waist Tightening of CNNs: A Case study on Tiny YOLOv3 for Distributed IoT Implementations
Authors:
Isaac Sanchez Leal, Eiraj Saqib, Irida Shallari, Axel Jantsch, Silvia Krug and Mattias O'Nils
Keywords:
Abstract:
"Computer vision systems based on Deep Learning (DL) are demanding to deploy in sensor nodes of the Internet of Things (IoT) because DL models are memory- and computation-hungry, while the nodes often come with tight constraints on energy, latency, and memory. Consequently, work has been done to reduce the model size or to distribute part of the work to other nodes. This raises the question of how such measures impact the energy consumption at the node and the inference time of the system. In this work, we perform a case study to explore the impact of partitioning a Convolutional Neural Network (CNN) such that one part is implemented on the IoT node, while the rest is implemented on an edge device. The goal is to explore how the choice of partition point, quantization method, and communication technology affects the IoT system. We identify possible partitioning points between layers in Tiny YOLOv3, where we transform the feature maps passed between layers by applying quantization and compression to reduce the data sent over the communication channel between the two partitions. The results show that a reduction of transmitted data by 99.8% reduces the network accuracy by 3 percentage points. Furthermore, the evaluation of various IoT communication protocols shows that quantizing the data facilitates CNN partitioning with a significant reduction of overall latency and node energy consumption."
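
The idea of shrinking a feature map before it crosses the partition point can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 13 x 13 x 256 feature-map shape, the synthetic ReLU-like activations, the uniform min-max 8-bit quantizer, and the use of zlib as the compressor. The abstract does not specify which quantization or compression methods the authors evaluated, and the numbers this sketch produces are unrelated to the reported 99.8% reduction.

```python
import random
import zlib
from array import array


def quantize_uint8(values):
    """Uniform min-max quantization of floats to 8-bit codes (assumed scheme)."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for a constant map
    codes = bytes(round((v - lo) / scale) for v in values)
    return codes, lo, scale


def dequantize_uint8(codes, lo, scale):
    """Map 8-bit codes back to approximate float activations."""
    return [lo + c * scale for c in codes]


# Hypothetical feature map: synthetic ReLU-like activations, mostly zeros.
random.seed(0)
feature_map = [max(0.0, random.gauss(-1.0, 1.0)) for _ in range(13 * 13 * 256)]

# Node side: what would be sent without and with the transformation.
raw = array("f", feature_map).tobytes()   # unquantized float32 payload
codes, lo, scale = quantize_uint8(feature_map)
payload = zlib.compress(codes, 9)         # quantized and compressed payload

# Edge side: decompress and dequantize before running the remaining layers.
restored = dequantize_uint8(zlib.decompress(payload), lo, scale)

max_error = max(abs(a - b) for a, b in zip(feature_map, restored))
reduction = 1 - len(payload) / len(raw)
print(f"raw: {len(raw)} B, sent: {len(payload)} B, reduction: {reduction:.1%}")
print(f"max reconstruction error: {max_error:.4f}")
```

In this sketch the node would transmit `payload` together with `lo` and `scale`, and the edge device would reconstruct the activations to within half a quantization step. How such a reconstruction error translates into detection accuracy depends on the network and the partition point, which is the trade-off the case study examines.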