TY - JOUR
T1 - REDS: Resource-Efficient Deep Subnetworks for Dynamic Resource Constraints
T2 - IEEE Transactions on Mobile Computing
AU - Corti, Francesco
AU - Maag, Balz
AU - Schauer, Joachim
AU - Pferschy, Ulrich
AU - Saukh, Olga
PY - 2026
Y1 - 2026
N2 - Deep learning models deployed on edge devices frequently encounter resource variability, which arises from fluctuating energy levels, timing constraints, or prioritization of other critical tasks within the system. State-of-the-art machine learning pipelines generate resource-agnostic models that cannot adapt at runtime. In this work, we introduce Resource-Efficient Deep Subnetworks (REDS) to tackle model adaptation to variable resources. In contrast to the state of the art, REDS leverages structured sparsity constructively by exploiting the permutation invariance of neurons, which allows for hardware-specific optimizations. Specifically, REDS achieves computational efficiency by (1) skipping sequential computational blocks identified by a novel iterative knapsack optimizer, and (2) taking advantage of the data cache by re-arranging the order of operations in the REDS computational graph. REDS supports conventional deep networks frequently deployed on the edge and provides computational benefits even for small and simple networks. We evaluate REDS on eight benchmark architectures trained on the Visual Wake Words, Google Speech Commands, Fashion-MNIST, CIFAR-10 and ImageNet-1K datasets, and test them on four off-the-shelf mobile and embedded hardware platforms. We provide a theoretical result and empirical evidence demonstrating REDS’ outstanding performance in terms of submodels’ test set accuracy, and demonstrate an adaptation time in response to dynamic resource constraints of under 40 μs, using a fully-connected network on an Arduino Nano 33 BLE.
AB - Deep learning models deployed on edge devices frequently encounter resource variability, which arises from fluctuating energy levels, timing constraints, or prioritization of other critical tasks within the system. State-of-the-art machine learning pipelines generate resource-agnostic models that cannot adapt at runtime. In this work, we introduce Resource-Efficient Deep Subnetworks (REDS) to tackle model adaptation to variable resources. In contrast to the state of the art, REDS leverages structured sparsity constructively by exploiting the permutation invariance of neurons, which allows for hardware-specific optimizations. Specifically, REDS achieves computational efficiency by (1) skipping sequential computational blocks identified by a novel iterative knapsack optimizer, and (2) taking advantage of the data cache by re-arranging the order of operations in the REDS computational graph. REDS supports conventional deep networks frequently deployed on the edge and provides computational benefits even for small and simple networks. We evaluate REDS on eight benchmark architectures trained on the Visual Wake Words, Google Speech Commands, Fashion-MNIST, CIFAR-10 and ImageNet-1K datasets, and test them on four off-the-shelf mobile and embedded hardware platforms. We provide a theoretical result and empirical evidence demonstrating REDS’ outstanding performance in terms of submodels’ test set accuracy, and demonstrate an adaptation time in response to dynamic resource constraints of under 40 μs, using a fully-connected network on an Arduino Nano 33 BLE.
KW - adaptive neural network compression
KW - Deep learning
KW - dynamic resource constraints
UR - https://www.scopus.com/pages/publications/105012386347
U2 - 10.1109/TMC.2025.3594214
DO - 10.1109/TMC.2025.3594214
M3 - Article
SN - 1536-1233
VL - 25
SP - 451
EP - 465
JO - IEEE Transactions on Mobile Computing
JF - IEEE Transactions on Mobile Computing
IS - 1
ER -