Powering the AI Era: Sustainable Approaches for Intelligent Computing Across HPC and Embedded Systems

Hajar OUAAROUCH; Safae DAHMANI; Kaouthar BOUSSELAM; Mouhcine CHAMI

doi:https://doi.org/10.14445/22315381/IJETT-V74I5P120

Research Article | Open Access

Volume 74 | Issue 5 | Year 2026 | Article Id. IJETT-V74I5P120

Received: 13 Jan 2026 | Revised: 10 Feb 2026 | Accepted: 10 Mar 2026 | Published: 30 May 2026

Citation :

Hajar OUAAROUCH, Safae DAHMANI, Kaouthar BOUSSELAM, Mouhcine CHAMI, "Powering the AI Era: Sustainable Approaches for Intelligent Computing Across HPC and Embedded Systems," International Journal of Engineering Trends and Technology (IJETT), vol. 74, no. 5, pp. 295-310, 2026. Crossref, https://doi.org/10.14445/22315381/IJETT-V74I5P120

Abstract

Modern computing has seen rapid growth in performance and scalability in recent years. This progress has delivered unprecedented computational capacity, while the demand for energy efficiency has grown in parallel, especially for embedded systems. In this context, intelligent techniques such as Machine Learning (ML) have been explored as a promising way to improve performance and reduce energy consumption in computationally intensive applications. This survey assesses recent trends in energy-aware high-performance computing, with a focus on intelligent optimization techniques. Drawing on recent advances in architectural innovation, energy-efficient design techniques, and predictive learning methods, the paper discusses the opportunities and challenges on the path toward green and sustainable high-performance systems. The aim of this work is to inspire and guide future research toward energy-efficient, scalable computing infrastructures driven by intelligent learning frameworks.

Keywords

Energy Efficiency, Embedded Computing, High-Performance Computing (HPC), Heterogeneous Systems, AI Workloads, Processing-In-Memory, Processing-In-Network.
