Research and Development of Key Technologies, Applications and Systems for Carbon Neutrality
As artificial intelligence is increasingly deployed in computer vision, speech recognition, natural language processing (NLP), and related fields, the computation demand in data centers for AI model training and inference has grown exponentially, and AI computing cost and energy consumption have grown correspondingly. According to a University of Massachusetts study of training several common NLP models, the carbon footprint of training the GPU-based BERT model is about 1,400 pounds of carbon dioxide, equivalent to the emissions of a trans-American flight [1]. Moreover, the environmental cost of training an AI model is almost proportional to its size. When the additional costs of model architecture search and accuracy retraining are included, the overall process may emit as much as 626,000 pounds of CO2, nearly five times the lifetime emissions of an average car.
Throughout the life cycle of an AI application, from algorithm design, model training, inference customization, and model deployment to final execution on data-center computing resources, each phase offers many optimization strategies and techniques. For example, numerous computation-graph optimizations are performed inside the TensorFlow and PyTorch frameworks. Similarly, the Alibaba PAI platform used 512 GPUs to train the 10-trillion-parameter M6 model to a practically usable level in 10 days [2]. Heavy optimizations based on the underlying M6 model structure reduce its energy consumption drastically, to only 1% of that of a GPT-3-scale large model with the same parameter count. Likewise, research at MIT used neural architecture search (NAS) to build a supermodel with shared parameters, so that the model is trained only once and can be deployed in many different scenarios [3]. Although the energy consumption of the single training session increases, model customization and retraining become very efficient, and the carbon dioxide emissions of each deployment can be reduced by more than 16 times.
An important goal for Alibaba's data centers in FY2023 is to improve the energy efficiency of heterogeneous computing infrastructure and explore green computing technology for achieving carbon neutrality. In our data centers, while we promote cutting-edge immersion liquid cooling to improve power usage effectiveness (PUE), we also focus on optimizing computing efficiency at the source, including improving computing resources through large-scale resource pooling and hardware-software co-optimization to boost performance per watt (perf/watt). However, there is no systematic thermal-efficiency analysis across the full stack of AI execution on heterogeneous computing pools. At the same time, many existing optimization strategies emphasize performance, which is not necessarily consistent with optimal energy efficiency. We therefore need comprehensive profiling and assessment of the carbon footprint, from chips (CPU/GPU/NPU) and memory (DRAM) to the system level, in order to find a globally optimal energy-consumption strategy. Systematic research on optimizing the thermal efficiency and energy consumption of large-scale heterogeneous resource pools thus has practical significance and strategic value for Alibaba Cloud.
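As a rough illustration of the accounting such a carbon-footprint assessment starts from, the sketch below estimates operational CO2e from measured IT power draw, PUE, and grid carbon intensity. All constants, names, and the example figures are illustrative assumptions for this sketch, not Alibaba measurements.

```python
# Hedged sketch: estimate operational CO2e for a training job from
# average IT power draw, runtime, facility PUE, and grid carbon intensity.
# All constants below are illustrative assumptions, not measured figures.

def training_co2e_kg(avg_power_w: float, hours: float,
                     pue: float = 1.1,
                     carbon_intensity_kg_per_kwh: float = 0.5) -> float:
    """Operational CO2e (kg) = IT energy (kWh) * PUE * grid carbon intensity."""
    it_energy_kwh = avg_power_w * hours / 1000.0
    return it_energy_kwh * pue * carbon_intensity_kg_per_kwh

# Example: 8 accelerators drawing 300 W each for 24 hours.
print(training_co2e_kg(avg_power_w=8 * 300, hours=24))  # ≈ 31.7 kg CO2e
```

In practice the per-component breakdown (CPU/GPU/NPU, DRAM, system) would come from telemetry rather than a single average-power figure, but the same energy-to-emissions conversion applies at each level.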
[1] Emma Strubell, Ananya Ganesh, and Andrew McCallum. Energy and Policy Considerations for Deep Learning in NLP. In ACL, 2019.
[2] J. Lin, R. Men, A. Yang, C. Zhou, Y. Zhang, P. Wang, J. Zhou, J. Tang, and H. Yang. M6: Multi-Modality-to-Multi-Modality Multitask Mega-transformer for Unified Pretraining. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021.
[3] Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, and Song Han. Once-for-All: Train One Network and Specialize It for Efficient Deployment. In ICLR, 2020.
Deliverables
- Core technology of carbon-neutral-friendly scheduling for resource pools and optimization strategies
- The carbon footprint assessment report
- Prototype of system optimizations
- Top conference papers in computer architecture, HPC, or AI fields
Related Research Topics
This research aims to analyze and evaluate the carbon footprint of large-scale heterogeneous resource pools throughout the full stack of AI applications. We expect the research to explore carbon-neutral-friendly technologies for heterogeneous computing pools and lead to globally optimal strategies that greatly improve energy efficiency at data centers and contribute to the carbon neutrality goal of Alibaba Cloud's infrastructure.
- Comprehensive profiling of thermal efficiency and systematic assessment of the carbon footprint of heterogeneous computing pools. From the cloud infrastructure's perspective (system, chip, DRAM, etc.), propose viable optimization strategies and executable roadmaps toward future carbon neutrality goals.
- Research heterogeneous-computing optimization techniques driven by optimal thermal efficiency that can greatly reduce the carbon emissions of computing systems
- Research thermal-efficiency-based strategies (such as resource pool load balancing, data layout and transfer, and approximate computing) and explore carbon-neutral-friendly scheduling for resource pools
- Dynamic optimization and context-sensitive DVFS power management of accelerators based on AI workload characteristics
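To make the last topic concrete, here is a minimal sketch of a workload-aware DVFS policy. The thresholds, frequency table, and metric names are hypothetical, chosen only for illustration; a production controller would act on real accelerator telemetry (e.g. vendor power-management APIs) rather than this toy model.

```python
# Hedged sketch: a toy workload-aware DVFS policy for an accelerator.
# Frequency steps and thresholds are hypothetical assumptions.

FREQ_STEPS_MHZ = [600, 900, 1200, 1500]  # assumed available clock levels

def pick_frequency(utilization: float, mem_bound_ratio: float) -> int:
    """Choose a clock level from recent workload characteristics.

    utilization: fraction of cycles compute units were busy (0..1).
    mem_bound_ratio: fraction of stalls attributable to memory (0..1).
    Memory-bound phases gain little from higher clocks, so the policy
    down-clocks there to save power with minimal performance loss.
    """
    if mem_bound_ratio > 0.6:      # memory-bound: lowest useful clock
        return FREQ_STEPS_MHZ[0]
    if utilization > 0.9:          # compute-bound: run at maximum clock
        return FREQ_STEPS_MHZ[-1]
    if utilization > 0.5:          # moderately busy: mid-high clock
        return FREQ_STEPS_MHZ[2]
    return FREQ_STEPS_MHZ[1]       # lightly loaded: mid-low clock

print(pick_frequency(0.95, 0.2))  # compute-bound phase -> 1500
print(pick_frequency(0.30, 0.8))  # memory-bound phase  -> 600
```

The interesting research questions lie in making such decisions context-sensitive, e.g. predicting phase transitions from AI workload structure (layer types, batch boundaries) instead of reacting to lagging utilization counters.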