Research Focus

Service-oriented OS

Cloud-native technologies are gaining traction and are poised to become the foundation of next-generation cloud computing. This departure from traditional technologies poses significant challenges to the original design and services of operating systems. OS services and interfaces need to be redesigned around cloud-native serverless computing and modern application architectures. Our research on a service-oriented OS focuses on the OS's software isolation mechanisms and on the security and verification mechanisms of managed runtimes and bytecode. The lab aspires to improve OS architecture and services, meet the requirements of cloud-edge-end applications, and build a new OS ecosystem.

Hardware/Software Co-design System

Research in this area focuses on improving the compute capabilities of an OS running on heterogeneous chips and on developing performance analysis capabilities for the OS. The research centers on improving the performance of hardware and software algorithms in the compute, storage, and network fields. Feasibility studies are also performed on heterogeneous processor programming frameworks and toolchain solutions. We aim to design the next-generation cloud-native serverless execution unit through hardware/software co-design, building an ecosystem in which heterogeneous hardware is better coordinated with software and which meets the requirements of cloud-edge-end services.


Research Team
Jiangwei Jiang, Head of DAMO OS Lab

He is a distinguished engineer at Alibaba Group and is also in charge of the Basic Products Business Unit, Alibaba Cloud Intelligence Business Group. Jiangwei joined Taobao in 2008 and built a highly available architecture for the e-commerce platform. Starting in 2012, he led the high availability team and middleware product line. In December 2017, Jiangwei started to lead the Basic Products team of Alibaba Cloud with a focus on the R&D of Apsara OS.


Academic Achievements
Publications and Presentations
  • Zhe Wang, Teng Ma, Linghe Kong, Zhenzao Wen, Jingxuan Li, Zhuo Song, Yang Lu, Guihai Chen, Wei Cao. Zero Overhead Monitoring for Cloud-native Infrastructure using RDMA. ATC 2022
  • Zijun Li, Jiagan Cheng, Quan Chen, Eryu Guan, Zizheng Bian, Yi Tao, Bin Zha, Qiang Wang, Weidong Han, and Minyi Guo. RunD: A Lightweight Secure Container Runtime for High-Density Deployment and High-Concurrency Startup in Serverless Computing. ATC 2022
  • Zijun Li, Linsong Guo, Quan Chen, Jiagan Cheng, Chuhao Xu, Deze Zeng, Zhuo Song, Tao Ma, Yong Yang, Chao Li, and others. Help Rather Than Recycle: Alleviating Cold Startup in Serverless Computing Through Inter-Function Container Sharing. ATC 2022
  • Zihan Wang, Chengcheng Wan, Yuting Chen, Ziyi Lin, He Jiang, Lei Qiao. Hierarchical Memory-constrained Operator scheduling of Neural Architecture Search Networks. DAC 2022
  • Runzhe Wang, Qinglong Wang, Yuxi Hu, Heyuan Shi, Yuheng Shen, Yu Zhan, Ying Fu, Zheng Liu, Xiaohai Shi, Yu Jiang. Industry Practice of Configuration Auto-Tuning for Cloud Applications and Services. ESEC/FSE 2022
  • Weihao Cui, Han Zhao, Quan Chen, Ningxin Zheng, Jingwen Leng, Jieru Zhao, Zhuo Song, Tao Ma, Yong Yang, Chao Li, Minyi Guo. Enable Simultaneous DNN Services Based on Deterministic Operator Overlap and Precise Latency Prediction. SC 2021
  • Teng Ma, Kang Chen, Shaonan Ma, Zhuo Song, Yongwei Wu. Thinking More about RDMA Memory Semantics. CLUSTER 2021
  • Shuai Xue, Shang Zhao, Quan Chen, Zhuo Song, Tao Ma, Shanpei Chen, Yong Yang, Wenli Zheng, and Minyi Guo. Kronos: Towards Bus Contention Aware Job Scheduling in Public Clouds. FCS 2021
  • Shuai Xue, Shang Zhao, Quan Chen, Gang Deng, Zheng Liu, Jie Zhang, Zhuo Song, Tao Ma, Yong Yang, Zhou Yanbo, Niu Keqiang, Sun Sijie, Minyi Guo. Spool: Reliable Virtualized NVMe Storage Pool in Public Cloud Infrastructure. ATC 2020.
  • Mingyu Wu, Ziming Zhao, Yanfei Yang, Haoyu Li, Haibo Chen, Binyu Zang, Haibing Guan, Sanhong Li, Chuansheng Lu, Tongbao Zhang. Platinum: A CPU-Efficient Concurrent Garbage Collector for Tail-Reduction of Interactive Services. ATC 2020.
  • Quan Chen, Shuai Xue, Shang Zhao, Shanpei Chen, Zhuo Song, Huan Ding, Yu Xu, Tao Ma, Yong Yang, Minyi Guo. Alita: Comprehensive Performance Isolation through Bias Resource Management for Public Clouds. SC 2020.
  • Zijun Li, Quan Chen, Shuai Xue, Tao Ma, Yong Yang, Zhuo Song, Minyi Guo. Amoeba: QoS-Awareness and Reduced Resource Usage of Microservices with Serverless Computing. IPDPS 2020.
  • Teng Ma, Mingxing Zhang, Kang Chen, Zhuo Song, Yongwei Wu, Xuehai Qian. AsymNVM: An Efficient Framework for Implementing Persistent Data Structures on Asymmetric NVM Architecture. ASPLOS 2020.
  • Wei Zhang, Ningxin Zheng, Quan Chen, Yong Yang, Zhuo Song, Tao Ma, Jingwen Leng, Minyi Guo. URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds. ICPP 2020.
  • Sa Wang, Yanhai Zhu, Shanpei Chen, Tian-Ze Wu, Wenjie Li, Xusheng Zhan, Haiyang Ding, Weisong Shi, Yungang Bao. A Case for Adaptive Resource Management in Alibaba Datacenter Using Neural Networks. JCST 2020.
  • Teng Ma, Tao Ma, Zhuo Song, Jingxuan Li, Huaixin Chang, Kang Chen, Hai Jiang, Yongwei Wu. X-RDMA: Effective RDMA Middleware in Large-scale Production Environments. CLUSTER 2019.
  • Heyuan Shi, Runzhe Wang, Ying Fu, Mingzhe Wang, Xiaohai Shi, Xun Jiao, Houbing Song, Yu Jiang, Jiaguang Sun. Industry Practice of Coverage-Guided Enterprise Linux Kernel Fuzzing. ESEC/FSE 2019.
  • Shiyou Huang, Jianmei Guo, Sanhong Li, Xiang Li, Yumin Qi, Kingsum Chow, Jeff Huang. SafeCheck: Safety Enhancement of Java Unsafe API. ICSE 2019.
  • Fangxi Yin, Denghui Dong, Chuansheng Lu, Tongbao Zhang, Sanhong Li, Jianmei Guo, Kingsum Chow. Cloud-Scale Java Profiling at Alibaba. ICPE 2018.
  • Fangxi Yin, Denghui Dong, Sanhong Li, Jianmei Guo, Kingsum Chow. Java Performance Troubleshooting and Optimization at Alibaba. ICSE 2018 SEIP.