Research Focus
  • Online Transaction Processing (OLTP) and Hybrid Transactional/Analytical Processing (HTAP) Engines

In single-node architectures, the lab uses storage and state sharing technologies to store data across multiple nodes, allowing single-node systems to scale up for transactional processing. In multi-node cluster architectures, the lab allows users to create distributed databases by means of sharding, enabling a cluster to scale out for transactional processing. The HTAP engine uses a global transaction manager (GTM) to regulate transaction concurrency and keep data reads and writes consistent, so that transactional and analytical processing can run on the same data.
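
As a rough illustration of scaling out by sharding, the sketch below routes each row to one of several shards by hashing its key. It is a minimal in-memory example, not the lab's implementation: the names `shard_for` and `ShardedTable` are hypothetical, and hash-based routing is assumed here as one common sharding scheme.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a row key to a shard index with a stable hash.

    hashlib is used instead of the built-in hash(), which is salted
    per process and would route the same key differently across runs.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedTable:
    """Hypothetical key-value table split across in-memory shards."""

    def __init__(self, num_shards: int) -> None:
        if num_shards <= 0:
            raise ValueError("num_shards must be positive")
        # Each dict stands in for one database node.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, row: dict) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = row

    def get(self, key: str):
        # Returns None when the key is absent from its shard.
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

Because every key maps to exactly one shard, single-key reads and writes touch one node only; transactions that span shards are the case that needs additional coordination.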

  • Multi-modal Online Analytical Processing (OLAP), NoSQL, and NewSQL Database Systems

In scenarios that involve complex, content-rich multi-modal data, database systems must fuse queries over, integrate, and cleanse structured, semi-structured, and unstructured data in order to extract and process structured features. To meet this need, the lab continuously improves the applicability, performance, and efficiency of NoSQL, NewSQL, and OLAP systems.

  • Data Security and Database System Security

Traditional security measures, such as access control and SQL injection prevention, face the challenge of strengthening system security and data protection without compromising database performance. To improve both the security and the efficiency of database systems, the lab continuously improves key technologies such as querying and updating encrypted data (based on homomorphic encryption), oblivious random access, and differential privacy. The rapid development of security hardware has also brought new opportunities to improve database security. For example, security hardware such as Intel SGX can be used to build a new kind of encrypted database system.
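
To make one of the techniques named above concrete, the following minimal sketch answers a counting query with differential privacy by adding Laplace noise. It is an assumption-laden illustration rather than the lab's method: the function names `laplace_noise` and `private_count` are hypothetical, and the Laplace mechanism is chosen here simply as the textbook way to privatize a count, whose sensitivity is 1.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(rows, predicate, epsilon: float, rng=None) -> float:
    """Return a noisy count of the rows that satisfy `predicate`.

    Adding or removing one row changes the true count by at most 1,
    so Laplace noise with scale 1/epsilon gives epsilon-differential
    privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

A smaller `epsilon` gives stronger privacy and a noisier answer. Repeating the query consumes additional privacy budget, which this sketch does not track.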

  • Autonomous and Intelligent Databases

The lab analyzes system operating status and log data as the basis for modeling systems with machine learning. These technologies can dynamically tune system parameters and optimize systems, reducing O&M costs for DBAs. Applied to key database modules such as the query optimizer, they make it possible to evolve from rule-based optimization to cost-based optimization, and then to machine learning-based optimization. Machine learning can also support more accurate and efficient online warning and real-time monitoring mechanisms, which help to manage the O&M tasks performed by DBAs and to allocate resources intelligently. In addition, the analytical modeling of large amounts of structured, semi-structured, and unstructured data calls for research on intelligent database systems for deep data analysis.

  • New Hardware Acceleration and Data Storage

To maximize the performance of database systems, the lab focuses on the R&D of a heterogeneous computing architecture that combines the benefits of CPUs, GPUs, and FPGAs. When optimizing multi-core, high-concurrency data query and analysis tasks, developers must take the system hardware architecture (such as NUMA) into account, reduce data transfer, and shift storage systems from a computing-centric to a data-centric paradigm. New hardware such as non-volatile memory (NVM) and remote direct memory access (RDMA) has also spawned new data storage and management technologies that require system designers to consider the separation of storage and computing.

  • Fundamental Database Algorithms and Structures

Database system designs face challenges at various levels in fundamental algorithms and data structures, such as concurrency control, data processing, system scheduling, approximate query processing (AQP), unstructured data analysis, and feature extraction. Addressing these challenges requires considering both algorithm design and the operating status and characteristics of database systems. This in turn gives rise to new challenges and requirements in constructing fundamental algorithms and data structures.

Products and Applications
  • Applications in Areas Such as Postal Services and Real Estate

    The lab has helped Vanke and China Post substantially improve the overall storage and computing capacity of their databases through core capabilities such as the scalability of distributed databases. The lab has also supported upgrades of both companies' core business systems by deploying the horizontal and vertical fragmentation provided by the distributed transaction processing engine. As a result, database operation and maintenance costs have been significantly reduced.

  • Technical Support for Major National Projects

    The Database and Storage Lab supports major national projects in the public and private cloud domains, such as Shanghai City Brain and National Tax projects.

Research Team
Feifei Li, Head of Database and Storage Lab

Feifei Li is a tenured professor of Computer Science at the University of Utah. He has received numerous awards and honors from ACM, IEEE, Visa, Google, HP, and Huawei, including the IEEE ICDE 2014 10-Year Most Influential Paper Award, ACM SIGMOD 2016 Best Paper Award, ACM SIGMOD 2015 Best System Presentation Award, IEEE ICDE 2004 Best Paper Award, US NSF CAREER Award, and NSFC Oversea Collaboration Grant, and he was named an ACM Distinguished Member in 2018. He has served on the editorial boards of, and as a chair for, many leading international academic journals and conferences.

Academic Achievements
  • AnalyticDB-V: A Hybrid Analytical Engine towards Query Fusion for Structured and Unstructured Data, by C. Wei, B. Wu, S. Wang, R. Lou, C. Zhan, F. Li, Y. Cai. VLDB 2020
  • LedgerDB: A Centralized Ledger Database for Universal Audit and Verification, by X. Yang, Y. Zhang, S. Wang, B. Yu, F. Li, Y. Li, W. Yan. VLDB 2020
  • Diagnosing Root Causes of Intermittent Slow Queries in Cloud Databases, by M. Ma, Z. Yin, S. Zhang, S. Wang, C. Zheng, X. Jiang, H. Hu, C. Luo, Y. Li, N. Qiu, F. Li, C. Chen, D. Pei. VLDB 2020
  • Timon: A Timestamped Event Database for Efficient Telemetry Data Processing and Analytics, by W. Cao, Y. Gao, F. Li, S. Wang, B. Lin, K. Xu, X. Feng, Y. Wang, Z. Liu, G. Zhang. SIGMOD 2020
  • Two-Level Data Compression using Machine Learning in Time Series Database, by X. Yu, Y. Peng, F. Li, S. Wang, X. Shen, H. Mai, Y. Xie. ICDE 2020
  • FPGA-Accelerated Compactions for LSM-based Key-Value Store, by T. Zhang, J. Wang, X. Cheng, H. Xu, N. Yu, G. Huang, T. Zhang, D. He, F. Li, W. Cao, Z. Huang, J. Sun. FAST 2020
  • HotRing: A Hotspot-Aware In-Memory Key-Value Store, by J. Chen, L. Chen, S. Wang, G. Zhu, Y. Sun, H. Liu, F. Li. FAST 2020
  • POLARDB Meets Computational Storage: Efficiently Support Analytical Workloads in Cloud-Native Relational Database, by W. Cao, Y. Liu, Z. Cheng, N. Zheng, W. Li, W. Wu, L. Ouyang, P. Wang, Y. Wang, R. Kuan, Z. Liu, F. Zhu, T. Zhang. FAST 2020
  • Leaper: A Learned Prefetcher for Cache Invalidation in LSM-tree based Storage Engines, by L. Yang, H. Wu, T. Zhang, X. Cheng, F. Li, L. Zou, Y. Wang, R. Chen, J. Wang, G. Huang. VLDB 2020
  • Realization of the Low Cost and High Performance MySQL Cloud Database, by W. Cao, F. Yu, J. Xie. VLDB 2014
  • TcpRT: Instrument and Diagnostic Analysis System for Service Quality of Cloud Databases at Massive Scale in Real-time, by W. Cao, Y. Gao, B. Lin, X. Feng, Y. Xie, X. Lou, P. Wang. SIGMOD 2018
  • PolarFS: An Ultra-low Latency and Failure Resilient Distributed File System for Shared Storage Cloud Database, by W. Cao, Z. Liu, P. Wang, S. Chen, C. Zhu, S. Zheng, Y. Wang, G. Ma. VLDB 2018
  • X-Engine: An Optimized Storage Engine for Large-Scale E-Commerce Transaction Processing, by G. Huang, X. Cheng, J. Wang, Y. Wang, D. He, T. Zhang, F. Li, S. Wang, W. Cao, Q. Li. SIGMOD 2019
  • iBTune: Individualized Buffer Tuning for Large-scale Cloud Databases, by J. Tan, T. Zhang, F. Li, J. Chen, Q. Zheng, P. Zhang, H. Qiao, Y. Shi, W. Cao, R. Zhang. VLDB 2019
  • AnalyticDB: Real-time OLAP Database System at Alibaba Cloud, by C. Zhan, M. Su, C. Wei, X. Peng, L. Lin, S. Wang, Z. Chen, F. Li, Y. Pan, F. Zheng, C. Chai. VLDB 2019
  • Cloud Native Database Systems: Challenges and Opportunities, by F. Li. VLDB 2019
