Energy-aware deep learning for real-time video analysis through pruning, quantization, and hardware optimization


İsenkul M. E.

JOURNAL OF REAL-TIME IMAGE PROCESSING, vol.22, pp.1-16, 2025 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 22
  • Publication Date: 2025
  • DOI: 10.1007/s11554-025-01703-0
  • Journal Name: JOURNAL OF REAL-TIME IMAGE PROCESSING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, PASCAL, Compendex, INSPEC
  • Page Numbers: pp.1-16
  • İstanbul Üniversitesi-Cerrahpaşa Affiliated: Yes

Abstract

Artificial intelligence has dramatically changed real-time video applications across a variety of fields, from surveillance to autonomous systems. However, these AI models have high computational demands, and thus high energy consumption, which limits their scalability in power-constrained environments. This paper systematically addresses energy optimization in real-time video inference by exploring energy-performance trade-offs for sophisticated AI models, namely YOLOv8 and MobileNet. A variety of techniques, including model pruning, quantization, and hardware-aware optimizations, were rigorously evaluated. Experimental results indicate that model pruning reduced energy expenditure by up to 35% while maintaining detection accuracy at 92%. Quantization yielded a further 18% energy saving and accelerated inference by 25% with minimal loss of accuracy. Moreover, reducing the frame rate from 30 frames per second (FPS) to 15 FPS cut power consumption by 40% at the cost of only a 3% drop in detection performance. Benchmarking across different hardware setups showed that optimized lightweight models running on edge devices achieved power savings of up to 50% compared to GPUs. These results highlight practical directions for designing energy-efficient AI systems for real-time video applications, which are particularly useful in edge computing and Internet of Things environments.
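
The abstract does not describe the implementation, but the two core techniques it names, pruning and post-training quantization, can be illustrated with a minimal sketch. The sketch below assumes PyTorch, a torchvision MobileNetV2 as a stand-in model, L1 unstructured pruning via torch.nn.utils.prune, and dynamic INT8 quantization; the 35% sparsity level is illustrative only and is not taken from the paper's experimental setup.

```python
# Minimal sketch (not the paper's exact pipeline): unstructured L1 pruning
# followed by post-training dynamic quantization, using PyTorch and a
# torchvision MobileNetV2 as a stand-in for the models in the abstract.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision import models

model = models.mobilenet_v2(weights=None).eval()

# Prune 35% of the smallest-magnitude weights in every convolution
# (an assumed sparsity level chosen only to mirror the 35% energy
# figure quoted in the abstract).
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.35)
        prune.remove(module, "weight")  # bake the sparsity into the weights

# Post-training dynamic quantization of the linear layers to INT8.
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Example inference on a dummy 224x224 RGB frame.
with torch.no_grad():
    dummy_frame = torch.randn(1, 3, 224, 224)
    logits = quantized_model(dummy_frame)
print(logits.shape)  # torch.Size([1, 1000])
```

In practice, a pruned detector would normally be fine-tuned to recover accuracy, and realizing energy savings on edge hardware generally requires structured pruning or a runtime that exploits sparsity, rather than unstructured zeroing alone.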