JOURNAL OF REAL-TIME IMAGE PROCESSING, vol. 22, pp. 1-16, 2025 (SCI-Expanded, Scopus)
Artificial intelligence has dramatically changed real-time video applications across a variety of fields, from surveillance to autonomous systems. However, AI models impose high computational demands, and thus high energy consumption, which limits scalability in power-constrained environments. This paper systematically addresses energy optimization for real-time video inference by exploring energy-performance trade-offs in state-of-the-art AI models such as YOLOv8 and MobileNet. Techniques including model pruning, quantization, and hardware-aware optimization were rigorously evaluated. Experimental results indicate that model pruning reduced energy consumption by up to 35% while maintaining detection accuracy at 92%. Quantization yielded a further 18% energy saving and accelerated inference by 25% with minimal loss of accuracy. In addition, reducing the frame rate from 30 frames per second (FPS) to 15 FPS cut power consumption by 40% at the cost of only a 3% drop in detection performance. Benchmarking across hardware setups showed that optimized lightweight models running on edge devices consumed as much as 50% less power than GPU deployments. These results highlight practical directions for energy-efficient AI system design in real-time video applications, particularly in edge computing and Internet of Things environments.
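The model pruning evaluated in the paper can be illustrated with a minimal sketch. This is not the authors' implementation: it shows only the core idea of magnitude-based pruning (zeroing the fraction of weights with the smallest absolute values) on a flat weight list; the function name and values are illustrative.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest |value|.

    Illustrative sketch of magnitude pruning; real frameworks prune
    per-layer tensors and usually fine-tune the model afterwards.
    """
    n_prune = int(len(weights) * sparsity)
    # Rank weight indices by absolute magnitude; prune the smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]
print(prune_by_magnitude(w, 0.5))  # → [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

Sparse weights reduce both memory traffic and, on hardware that exploits sparsity, the number of multiply-accumulate operations, which is where the energy saving comes from.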
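The quantization technique can likewise be sketched in a few lines. The snippet below shows symmetric per-tensor INT8 quantization (float values mapped to integers in [-128, 127] via a single scale factor), which is one common scheme; the paper does not specify its exact quantization recipe, so this is an assumed, simplified variant.

```python
def quantize_int8(x):
    """Symmetric per-tensor INT8 quantization: returns (int8 values, scale)."""
    scale = max(abs(v) for v in x) / 127.0
    # Map each float to the nearest integer step, clamped to the INT8 range.
    q = [max(-128, min(127, round(v / scale))) for v in x]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the integer representation."""
    return [v * scale for v in q]

q, scale = quantize_int8([12.7, -5.0, 2.0])
print(q)  # → [127, -50, 20]
```

Storing and multiplying 8-bit integers instead of 32-bit floats cuts memory bandwidth by 4x and enables faster integer arithmetic, which is consistent with the reported 18% energy saving and 25% speedup at minimal accuracy cost.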