Research Focus
  • Environmental Perception

We study multi-sensor cooperative sensing and fusion technologies to collect key information and extract knowledge for autonomous vehicles. This includes detecting and identifying drivable areas, lane lines, traffic signs, traffic lights, other vehicles, pedestrians, and obstacles, as well as predicting their position and motion. Our technology uses a multi-sensor layout that provides 360-degree coverage, with enhanced coverage of key areas. We also design the software system with safety redundancy to ensure accurate perception of the surroundings and, ultimately, safe and stable driving.
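The idea of a layout with 360-degree coverage and extra coverage in key areas can be illustrated with a small check. This is a minimal sketch only: the sensor yaw angles, fields of view, and the function names are hypothetical and do not describe the lab's actual vehicles or software.

```python
def angular_diff(a_deg, b_deg):
    """Smallest signed difference a - b, wrapped to [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0


def sensors_seeing(layout, heading_deg):
    """Count sensors whose horizontal field of view contains the heading."""
    return sum(
        1
        for yaw, fov in layout
        if abs(angular_diff(heading_deg, yaw)) <= fov / 2.0
    )


def coverage_gaps(layout, min_sensors=1):
    """List integer headings (0..359) seen by fewer than min_sensors sensors."""
    return [h for h in range(360) if sensors_seeing(layout, h) < min_sensors]


# Hypothetical layout as (mount_yaw_deg, horizontal_fov_deg) pairs:
# four wide sensors around the vehicle plus one narrow forward sensor
# that adds redundancy in the key area directly ahead.
LAYOUT = [(0, 100), (90, 100), (180, 100), (270, 100), (0, 30)]

if __name__ == "__main__":
    print("uncovered headings:", coverage_gaps(LAYOUT))
    print("sensors seeing straight ahead:", sensors_seeing(LAYOUT, 0))
```

Calling `coverage_gaps(LAYOUT, min_sensors=2)` lists the headings that lack redundant coverage, which is one simple way to reason about where a second sensor would matter most.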

  • High-precision Localization

High-precision intelligent localization, based on high-precision maps and exact mappings between the physical world and the digital world, provides autonomous vehicles with accurate traffic elements, points of interest (POI), and other surrounding information. We therefore study new cloud-based multi-source fusion positioning technology, build a cloud platform for information fusion and processing, and ultimately establish a dynamic location network that connects “vehicle to human” and “vehicle to vehicle”. This network not only provides location services for autonomous driving, but also supports collecting and backhauling accurately located data to update the high-precision maps online, which dramatically improves the accuracy, reliability, and intelligence of the localization system.
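The text does not say which fusion method is used. As a textbook stand-in, the sketch below fuses independent position estimates by inverse-variance weighting, so that more certain sources count for more. The function name, the example sources, and their variances are assumptions for illustration.

```python
def fuse_positions(estimates):
    """Fuse independent (x, y, variance) position estimates.

    Each estimate is weighted by 1 / variance. Returns (x, y, fused_variance).
    This is a generic technique, not the lab's documented algorithm.
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    total_weight = sum(1.0 / var for _, _, var in estimates)
    x = sum(px / var for px, _, var in estimates) / total_weight
    y = sum(py / var for _, py, var in estimates) / total_weight
    return x, y, 1.0 / total_weight


if __name__ == "__main__":
    # Hypothetical sources: a coarse satellite fix and a tighter map-matching fix.
    satellite_fix = (10.0, 5.0, 4.0)
    map_match_fix = (12.0, 5.0, 1.0)
    print(fuse_positions([satellite_fix, map_match_fix]))
```

The fused variance is never larger than the smallest input variance, which reflects why combining several sources can improve accuracy over any single one.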

  • Decision and Planning

Research on decision and planning is divided into two parts: the driving decision-making system and the motion planning system. The driving decision-making system explores the underlying mechanisms of human driving behavior, trains optimal behavior models and parameters, and optimizes online driving behavior by analyzing large volumes of real-world, high-precision traffic data. The motion planning system, working within limited computing resources, searches thousands of possible trajectories and chooses the best one based on information about the surrounding environment, which yields a smoother and more stable driving experience.
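Searching many candidate trajectories under a compute budget can be sketched as sample, score, and select. Everything below is an assumption for illustration: the straight-line trajectory shape, the cost weights, the candidate grid, and the `max_candidates` budget are hypothetical and are not the lab's planner.

```python
import itertools
import math


def candidate_trajectories(offsets, speeds, horizon_s=3.0, steps=30):
    """Yield (offset, speed, points) for every offset/speed combination."""
    for offset, speed in itertools.product(offsets, speeds):
        points = []
        for i in range(1, steps + 1):
            t = horizon_s * i / steps
            # Move forward at constant speed while easing linearly
            # from the lane centre to the target lateral offset.
            points.append((speed * t, offset * i / steps))
        yield offset, speed, points


def trajectory_cost(offset, speed, points, obstacles, desired_speed,
                    safe_radius=1.5):
    """Infinite cost on collision, else lateral plus speed deviation."""
    for px, py in points:
        for ox, oy in obstacles:
            if math.hypot(px - ox, py - oy) < safe_radius:
                return math.inf
    return abs(offset) + 0.5 * abs(desired_speed - speed)


def plan(obstacles, desired_speed, max_candidates=1000):
    """Return (offset, speed, cost) of the cheapest candidate, or None."""
    offsets = [-3.0, -1.5, 0.0, 1.5, 3.0]
    speeds = [2.0, 5.0, 8.0, 10.0]
    best = None
    candidates = candidate_trajectories(offsets, speeds)
    # islice caps the number of evaluated candidates (the compute budget).
    for offset, speed, points in itertools.islice(candidates, max_candidates):
        cost = trajectory_cost(offset, speed, points, obstacles, desired_speed)
        if cost < math.inf and (best is None or cost < best[2]):
            best = (offset, speed, cost)
    return best


if __name__ == "__main__":
    print(plan(obstacles=[], desired_speed=10.0))
    print(plan(obstacles=[(15.0, 0.4)], desired_speed=10.0))
```

A real planner would use richer trajectory shapes and cost terms (comfort, traffic rules, predicted motion of other agents); the point here is only the structure of evaluating a bounded set of candidates and keeping the cheapest collision-free one.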

  • Smart Control

Smart control consists of lateral (horizontal) control and longitudinal (vertical) control. Lateral control improves an autonomous vehicle’s path-tracking ability, i.e., it steers the vehicle safely and steadily along the planned path. Longitudinal control manages the vehicle’s speed-tracking ability so that it cruises at the predefined speed.
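Path tracking and speed tracking can each be illustrated with a standard controller. The sketch below uses pure-pursuit steering for path tracking and a clamped proportional law for speed tracking; these are common textbook techniques chosen for illustration, and the gains, limits, and wheelbase are hypothetical values rather than the lab's controllers.

```python
import math


def pure_pursuit_steering(pose, target, wheelbase_m):
    """Steering angle (rad) that arcs the vehicle toward a lookahead point.

    pose is (x, y, heading_rad); target is a point (x, y) on the planned path.
    """
    x, y, heading = pose
    dx, dy = target[0] - x, target[1] - y
    lookahead = math.hypot(dx, dy)
    if lookahead == 0.0:
        return 0.0
    # Angle between the vehicle heading and the line to the target.
    alpha = math.atan2(dy, dx) - heading
    return math.atan2(2.0 * wheelbase_m * math.sin(alpha), lookahead)


def speed_command(current_speed, target_speed, gain=0.5, max_accel=2.0):
    """Proportional acceleration command, clamped to +/- max_accel (m/s^2)."""
    accel = gain * (target_speed - current_speed)
    return max(-max_accel, min(max_accel, accel))


if __name__ == "__main__":
    pose = (0.0, 0.0, 0.0)
    print(pure_pursuit_steering(pose, (5.0, 1.0), wheelbase_m=2.8))
    print(speed_command(current_speed=8.0, target_speed=10.0))
```

Keeping the two controllers separate mirrors the split described above: one tracks the planned path, the other tracks the predefined speed.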

  • Autonomous Driving Simulation Platform

The simulation platform consists of four parts: (1) The intelligent traffic system extracts realistic behavioral models from large-scale traffic data and simulates the behavior of vehicles, pedestrians, and other objects. (2) The scene editing and generation system edits various extreme scenarios and generates similar ones, which makes it possible to stress-test the autonomous driving algorithms. (3) The virtual-world rendering system visualizes the simulated scenarios and makes problem analysis more intuitive. (4) The large-scale server deployment system improves the scalability and efficiency of simulation, enabling simulation tests covering hundreds of millions of kilometers per year.

The simulation platform saves substantial time and cost in autonomous driving tests and greatly reduces test risks. It can randomly simulate a wide range of extreme scenarios to broaden test coverage.
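Generating scenarios similar to an edited extreme scenario can be pictured as randomly perturbing the parameters of a seed scenario. The sketch below is an assumption about one simple way to do this; the parameter names, the 10% spread, and the function name are hypothetical and not taken from the platform.

```python
import random


def generate_similar(seed_scenario, count, spread=0.1, rng=None):
    """Return `count` variants of a scenario, each parameter scaled by a
    random factor in [1 - spread, 1 + spread]."""
    rng = rng or random.Random(0)  # fixed default seed for repeatable runs
    variants = []
    for _ in range(count):
        variant = {
            name: value * (1.0 + rng.uniform(-spread, spread))
            for name, value in seed_scenario.items()
        }
        variants.append(variant)
    return variants


if __name__ == "__main__":
    # Hypothetical hand-edited extreme scenario: a close cut-in ahead.
    cut_in = {"cut_in_gap_m": 8.0, "lead_speed_mps": 12.0, "reaction_delay_s": 1.5}
    for variant in generate_similar(cut_in, count=3):
        print(variant)
```

Passing an explicit `random.Random` instance makes a batch of variants reproducible, which helps when a failing simulated case needs to be replayed.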

  • Data Platform

The data platform collects abundant data for perception to improve the robustness of autonomous driving. Our studies focus on the following aspects: (1) how to build an efficient labeling system that reduces labeling cost and improves labeling efficiency and accuracy; (2) how to effectively monitor vehicle status and problems and visualize the data appropriately; (3) how to combine real and simulated data. The objective is to provide integration and indexing of structured, semi-structured, and unstructured data.

  • Cooperative Vehicle-Infrastructure Systems (CVIS)

The cooperative vehicle-infrastructure system constructs a new “vehicle-road-cloud” model. We therefore study frontier roadside technologies, namely multi-sensor perception, multi-modal data fusion and processing, low-power edge computing, and short- and medium-range direct communications, and we are implementing a brand-new intelligent device that provides real-time data services to vehicles and traffic information to the cloud.

Products and Applications
  • Intelligent Driving

    The Autonomous Driving Lab participates in building national intelligent logistics backbone networks. It works on a large logistics network that connects cities and in-transit warehouses and maximizes the use of road resources across different time periods, ultimately enabling smart, accurate, fast, safe, and green circulation and delivery of goods.


Research Team
Li Cheng
Vice President of DAMO Academy / Head of Autonomous Driving Laboratory of DAMO Academy

Li Cheng has served as the Chief Technology Officer of Alibaba Group since December 2019. Previously, he served as the Chief Technology Officer of Ant Group and Chief Operating Officer of Ant Group International Business Group.

Academic Achievements
Publications and Presentations
  • H Ding, X Jiang, B Shuai, AQ Liu, G Wang. Semantic segmentation with context encoding and multi-path decoding. TIP, 2020.
  • C Lin, J Lu, G Wang, J Zhou. Graininess-aware deep feature learning for robust pedestrian detection. TIP, 2020.
  • K Yuan, Z Guo, and ZJ Wang. RGGNet: Tolerance Aware LiDAR-Camera Online Calibration With Geometric Deep Learning and Generative Model. RAL, 2020.
  • M Zhang, X Xu, Y Chen, M Li. A Lightweight and Accurate Localization Algorithm Using Multiple Inertial Measurement Units. RAL, 2020.
  • J Liu, A Shahroudy, ML Perez, G Wang, LY Duan, AK Chichung. NTU RGB+D 120: A large-scale benchmark for 3d human activity understanding. TPAMI, 2019.
  • J Liu, A Shahroudy, G Wang, LY Duan, AK Chichung. Skeleton-based online action prediction using scale selection network. TPAMI, 2019.
  • J Liu, H Ding, A Shahroudy, LY Duan, X Jiang, G Wang, AK Chichung. Feature boosting network for 3D pose estimation. TPAMI, 2019.
  • H Ding, X Jiang, B Shuai, AQ Liu, G Wang. Semantic correlation promoted shape-variant context for segmentation. CVPR, 2019.
  • J Gu, S Joty, J Cai, H Zhao, X Yang, G Wang. Unpaired image captioning via scene graph alignments. ICCV, 2019.
  • Jiuxiang Gu, Jianfei Cai, Shafiq Joty, Li Niu, Gang Wang. Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models. CVPR, 2018. Spotlight
  • Ping Hu, Gang Wang, Xiangfei Kong, Jason Kuen, Yap-Peng Tan. Motion-Guided Cascaded Refinement Network for Video Object Segmentation. CVPR, 2018. Poster
  • Jason Kuen, Xiangfei Kong, Zhe Lin, Gang Wang, Jianxiong Yin, Simon See, Yap-Peng Tan. Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks. CVPR, 2018. Poster
  • Jianlou Si, Honggang Zhang, Chun-Guang Li, Jason Kuen, Xiangfei Kong, Alex C. Kot, Gang Wang. Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification. CVPR, 2018. Poster
  • Jun Liu, Amir Shahroudy, Gang Wang, Ling-Yu Duan, Alex C. Kot. SSNet: Scale Selection Network for Online 3D Action Prediction. CVPR, 2018. Spotlight
  • Henghui Ding, Xudong Jiang, Bing Shuai, Ai Qun Liu, Gang Wang. Context Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation. CVPR, 2018. Oral
  • Yicheng Wang, Zhenzhong Chen, Feng Wu, Gang Wang. Person Re-identification with Cascaded Pairwise Convolutions. CVPR, 2018. Poster
  • Lu Zhang, Ju Dai, Huchuan Lu, You He, Gang Wang. A Bi-directional Message Passing Model for Salient Object Detection. CVPR, 2018. Poster
  • Xiaoning Zhang, Tiantian Wang, Jinqing Qi, Huchuan Lu, Gang Wang. Progressive Attention Guided Recurrent Network for Salient Object Detection. CVPR, 2018. Poster
  • Jiuxiang Gu, Jianfei Cai, Gang Wang, Tsuhan Chen. Stack-Captioning: Coarse-to-Fine Learning for Image Captioning. AAAI, 2018. Oral
  • Rana Hanocka, Noa Fish, Zhenhua Wang, Raja Giryes, Shachar Fleishman, Daniel Cohen-Or. ALIGNet: Partial-Shape Agnostic Alignment via Unsupervised Learning. ACM Transactions on Graphics (TOG), 2018.
  • Bin Wang, Guofeng Wang, Andrei Sharf, Yangyan Li, Fan Zhong, Xueying Qin, Daniel Cohen-Or, Baoquan Chen. Active Assembly Guidance with Online Video Parsing. IEEE VR, 2018.
  • Gang Zhang, Hu Han, Shiguang Shan, Xingguang Song, Xilin Chen. Face Alignment across Large Pose via MT-CNN based 3D Shape Reconstruction. FG, 2018.
  • Gang Zhang, Meina Kan, Shiguang Shan, Xilin Chen. Generative Adversarial Network with Spatial Attention for Face Attribute Editing. ECCV, 2018.
  • Jun Liu and Gang Wang. Global Context-Aware LSTM Networks for 3D Action Recognition. CVPR, 2017.
  • Ping Hu, Bing Shuai, Gang Wang. Deep Level Sets for Salient Object Detection. CVPR, 2017.
  • Abrar Abdulnabi, Bing Shuai, Gang Wang. Episodic CAMN: Contextual Attention-based Memory Networks for Scene Labeling. CVPR, 2017.
  • Jiuxiang Gu, Gang Wang, Jianfei Cai, and Tsuhan Chen. An Empirical Study of Language CNN for Image Captioning. ICCV, 2017.
  • Zhenhua Wang, Jiuxiang Gu, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang and Gang Wang. Recent Advances in Convolutional Neural Networks. Pattern Recognition (PR), 2017.
  • Zhenhua Wang, Xingxing Wang, Gang Wang. Learning Fine-grained features via a CNN tree for Large-scale Classification. Neurocomputing, 2017.
  • Zhenwei Miao, Kim-Hui Yap, Xudong Jiang, Subbhuraam Sinduja, Zhenhua Wang. Laplace Gradient based Discriminative and Contrast Invertible Descriptor. ICASSP, March 2017.
  • Amir Shahroudy, Tian-Tsong Ng, Yihong Gong, and Gang Wang. Deep Multimodal Feature Analysis for Action Recognition in RGB+D Videos. IEEE.
  • Mingliang Chen, Qingxiong Yang, Qing Li, Gang Wang, and Ming-Hsuan Yang. Spatiotemporal Background Subtraction. IEEE.
  • Bing Shuai, Zhen Zuo, Bing Wang, and Gang Wang. Scene Segmentation with DAG-Recurrent Neural Networks. IEEE.
  • Jun Liu, Amir Shahroudy, Dong Xu, Alex Kot Chichung, and Gang Wang. Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates. IEEE.

Contact Us

Scan the QR code to follow the Ali Technology WeChat account.