Hardware Resource Quality-of-Service (QoS) for Workload Co-location in Cloud Data Centers

Research Themes

Research on Frontier Technologies in Data Center and Server

Background

Workloads running in cloud data centers exhibit varied behaviors in processing user requests, resulting in different CPU utilization patterns. Latency-critical (LC) workloads, e.g., e-commerce and online search, are highly sensitive to query latency requirements, whereas best-effort (BE) workloads, such as big data analytics, consume as many hardware resources as they can obtain to improve system throughput. Cloud data centers therefore commonly co-locate LC and BE workloads on the same server to improve CPU utilization and reduce machine costs.


The key requirement for workload co-location is to maximize the BE workload's throughput without violating the LC workload's service-level objectives (SLOs). Achieving this requires proper allocation of the server's hardware resources so that each workload receives adequate quality of service (QoS). While some hardware resources (e.g., CPU cores, memory capacity) are easy to partition among workloads, others (e.g., memory bandwidth, last-level caches) are not. Intel CPUs provide Resource Director Technology (RDT), which includes Memory Bandwidth Allocation (MBA) and Cache Allocation Technology (CAT), among other features. Similarly, Arm provides Memory System Resource Partitioning and Monitoring (MPAM), the counterpart of Intel RDT in the Arm architecture.
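
On Linux, both RDT and MPAM allocations are typically driven through the resctrl filesystem. The sketch below is a minimal illustration of that interface, not part of the research call itself: the group names ("lc"/"be"), cache-way masks, bandwidth percentages, and PIDs are illustrative assumptions, and the commands require root with resctrl mounted at /sys/fs/resctrl.

```python
# Minimal sketch: partitioning LLC ways (CAT) and memory bandwidth (MBA) via Linux resctrl.
# Requires root and: mount -t resctrl resctrl /sys/fs/resctrl
import os

RESCTRL = "/sys/fs/resctrl"

def create_group(name, schemata_lines, pids):
    """Create a resctrl control group, set its allocations, and attach tasks."""
    group = os.path.join(RESCTRL, name)
    os.makedirs(group, exist_ok=True)
    for line in schemata_lines:
        # e.g. "L3:0=0x00f" = 4 LLC ways on cache domain 0 (CAT),
        #      "MB:0=30"    = throttle memory bandwidth on domain 0 to ~30% (MBA).
        with open(os.path.join(group, "schemata"), "w") as f:
            f.write(line + "\n")
    for pid in pids:
        with open(os.path.join(group, "tasks"), "w") as f:
            f.write(f"{pid}\n")

# Illustrative allocations: LC gets most LLC ways and full bandwidth, BE is confined.
create_group("lc", ["L3:0=0xff0", "MB:0=100"], pids=[1234])
create_group("be", ["L3:0=0x00f", "MB:0=30"], pids=[5678])
```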


However, the effectiveness of RDT/MPAM in production cloud data center environments remains questionable. Furthermore, RDT and MPAM only expose resource allocation interfaces; a systematic approach that makes use of these interfaces is needed to achieve high resource QoS for cloud workload co-location. Many prior works tackle the QoS problem purely at the software level by calling these hardware interfaces, and may not be effective in production cloud data centers.
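
As a concrete example of "making use of these interfaces", a software-level approach is often a feedback loop that monitors the LC workload's tail latency and tightens or relaxes the BE group's allocations. The sketch below assumes the "be" resctrl group from the previous example and a hypothetical get_lc_p99_latency_ms() latency probe; the SLO value, step size, and control interval are placeholders, not values from this document.

```python
# Hypothetical feedback controller: adjust the BE group's MBA throttle based on LC tail latency.
import time

BE_SCHEMATA = "/sys/fs/resctrl/be/schemata"
SLO_MS = 50.0                     # assumed LC tail-latency SLO
MB_MIN, MB_MAX, STEP = 10, 100, 10

def set_be_bandwidth(percent):
    # MBA throttle values are written in steps of 10% for memory domain 0 here.
    with open(BE_SCHEMATA, "w") as f:
        f.write(f"MB:0={percent}\n")

def control_loop(get_lc_p99_latency_ms):
    mb = MB_MAX
    while True:
        p99 = get_lc_p99_latency_ms()          # hypothetical latency probe
        if p99 > SLO_MS and mb > MB_MIN:
            mb -= STEP                         # LC under pressure: throttle BE harder
        elif p99 < 0.8 * SLO_MS and mb < MB_MAX:
            mb += STEP                         # slack available: give BE more bandwidth
        set_be_bandwidth(mb)
        time.sleep(1.0)                        # assumed control interval
```

Prior works that stop at this kind of software loop are exactly what the research call argues is insufficient on its own; the intent here is to show the interface usage such approaches rely on.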

Target

  • A thorough understanding of how well existing hardware resource allocation technologies (Intel RDT, Arm MPAM, etc.) provide QoS for cloud workload co-location
  • Proposals and evaluations of chip enhancements to existing hardware resource allocation technologies that provide high QoS
  • A systematic approach that uses hardware resource allocation technologies to enable cloud workload co-location

Related Research Topics

  • Chip enhancements to existing resource allocation technologies across architectures (x86/Arm/RISC-V)
  • Dynamic and hardware-assisted management of resource allocations, including memory bandwidth and last-level caches
  • Coordinated management of multiple resources (memory bandwidth and LLC) and multiple workloads (one or more LC plus one or more BE)
  • Workload co-location under simultaneous multithreading (SMT)
  • Any other techniques related to resource QoS in cloud data centers, especially at the chip hardware level or via hardware-software co-design
