Research Focus
  • Environmental Perception

We study multi-sensor cooperative sensing and fusion technologies to collect key information and extract knowledge for autonomous vehicles. This includes detecting and identifying drivable areas, lane lines, traffic signs, traffic lights, other vehicles, pedestrians, and obstacles, as well as predicting their position and motion. Our technology uses a suitable multi-sensor layout to provide 360-degree coverage, with enhanced coverage of key areas. We also build safety redundancy into the software system to ensure accurate perception of the surroundings and, ultimately, safe and stable driving.
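As a rough illustration of what fusing readings from several sensors can mean, the sketch below combines independent distance estimates by inverse-variance weighting. This is a textbook technique chosen for illustration, not a description of the lab's actual fusion pipeline; the function name, the sensor readings, and the variances are all hypothetical.

```python
def fuse_measurements(measurements):
    """Fuse independent (value, variance) estimates of the same quantity.

    Each estimate is weighted by the inverse of its variance, so the more
    certain sensor contributes more to the fused value.
    """
    weights = [1.0 / variance for _, variance in measurements]
    total_weight = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, measurements)) / total_weight
    fused_variance = 1.0 / total_weight
    return fused_value, fused_variance


# Hypothetical readings of the distance to one obstacle, in meters:
# a lidar estimate (low variance) and a radar estimate (higher variance).
distance, variance = fuse_measurements([(20.0, 0.04), (20.6, 0.16)])
```

The fused estimate lands closer to the lower-variance reading, and its variance is smaller than that of either input, which is the basic reason redundant sensors can improve perception accuracy.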

  • High-precision Localization

High-precision intelligent localization, based on high-precision maps and exact mappings between the physical world and the digital world, provides autonomous vehicles with accurate traffic elements, points of interest (POI), and other surrounding information. We therefore study new cloud-based positioning technology with multi-source fusion, build a cloud platform for information fusion and processing, and ultimately aim to establish a dynamic location network that connects “vehicle to human” and “vehicle to vehicle”. This network not only provides location services for autonomous driving, but also supports collecting accurately located data and sending it back to update the high-precision maps online, which improves the accuracy, reliability, and intelligence of the localization system.

  • Decision and Planning

Our work on decision and planning is divided into two parts: the driving decision-making system and the motion planning system. The driving decision-making system explores the underlying mechanisms of human driving behavior, trains optimal behavior models and parameters, and optimizes online driving behavior by analyzing large volumes of real-world, high-precision traffic data. The motion planning system, working within limited computing resources, searches thousands of candidate trajectories and selects the best one based on information about the surrounding environment, which yields a smoother and more stable driving experience.
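The idea of searching candidate trajectories and choosing the best one can be sketched as a sample-and-score loop. The example below is a minimal sketch under stated assumptions, not the lab's planner: the candidate shapes, the cost terms, the weights, and every function name are hypothetical, and a real planner would evaluate far more candidates and many more criteria.

```python
import math


def generate_candidates(lateral_offsets, horizon=30.0, steps=10):
    """Build one candidate path per target lateral offset (all values in meters).

    Each path moves forward `horizon` meters and blends smoothly from the
    lane center (y = 0) to its target offset.
    """
    candidates = []
    for target in lateral_offsets:
        points = []
        for i in range(1, steps + 1):
            s = i / steps
            blend = 3 * s**2 - 2 * s**3  # smoothstep from 0 to 1
            points.append((s * horizon, blend * target))
        candidates.append(points)
    return candidates


def trajectory_cost(points, obstacles, safe_dist=2.0, w_offset=1.0, w_obstacle=10.0):
    """Score a path: lower is better.

    Penalizes the mean deviation from the lane center and any point that
    comes closer than `safe_dist` to an obstacle.
    """
    offset_cost = sum(abs(y) for _, y in points) / len(points)
    obstacle_cost = 0.0
    for px, py in points:
        for ox, oy in obstacles:
            distance = math.hypot(px - ox, py - oy)
            if distance < safe_dist:
                obstacle_cost += safe_dist - distance
    return w_offset * offset_cost + w_obstacle * obstacle_cost


def select_best(candidates, obstacles):
    """Return the candidate path with the lowest cost."""
    return min(candidates, key=lambda points: trajectory_cost(points, obstacles))


# Hypothetical scene: one obstacle in the lane center, 15 m ahead.
obstacles = [(15.0, 0.0)]
candidates = generate_candidates([-3.0, -1.5, 0.0, 1.5, 3.0])
best = select_best(candidates, obstacles)
```

With these assumed weights, the straight path through the obstacle scores worse than a path that moves away from it. Changing the weights changes the trade-off between staying centered and keeping clearance, which is where tuning against real driving data would come in.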

  • Smart Control

Smart control consists of lateral (horizontal) control and longitudinal (vertical) control. Lateral control improves an autonomous vehicle’s path-tracking ability, i.e., it steers the vehicle so that it drives safely and steadily along the planned path. Longitudinal control manages the vehicle’s speed-tracking ability so that it cruises at the predefined speed.
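The two control tasks described above, path tracking and speed tracking, can be illustrated with two small textbook controllers: pure pursuit for steering and a clamped proportional controller for speed. These are illustrative choices, not the lab's controllers; the wheelbase, gain, and acceleration limit are assumed values, and the function names are hypothetical.

```python
import math


def pure_pursuit_steering(pose, target, wheelbase=2.7):
    """Path tracking: steering angle (radians) that arcs the vehicle toward `target`.

    `pose` is (x, y, yaw) of the rear axle and `target` is an (x, y) point on
    the planned path, both in the same world frame. Positive means steer left.
    """
    x, y, yaw = pose
    dx, dy = target[0] - x, target[1] - y
    lookahead = math.hypot(dx, dy)
    alpha = math.atan2(dy, dx) - yaw  # bearing of the target relative to heading
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)


def speed_command(current_speed, target_speed, kp=0.5, max_accel=2.0):
    """Speed tracking: acceleration command (m/s^2) proportional to the speed error.

    The command is clamped to +/- max_accel to keep it within assumed limits.
    """
    accel = kp * (target_speed - current_speed)
    return max(-max_accel, min(max_accel, accel))


# Hypothetical usage: vehicle at the origin heading along +x at 10 m/s,
# tracking a path point slightly to the left and a cruise speed of 15 m/s.
steering = pure_pursuit_steering((0.0, 0.0, 0.0), (10.0, 2.0))
accel = speed_command(10.0, 15.0)
```

Keeping the two controllers separate mirrors the split in the text: steering depends only on geometry relative to the path, while the speed command depends only on the speed error.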

  • Autonomous Driving Simulation Platform

The simulation platform consists of four parts:
(1) The traffic intelligence system extracts realistic behavioral models from large-scale traffic data and simulates the behavior of vehicles, pedestrians, and other objects.
(2) The scene editing and generation system edits extreme scenarios and generates similar ones, enabling stress tests of autonomous driving algorithms.
(3) The rendering system for the virtual world visualizes the simulated scenarios, making problem analysis more intuitive.
(4) The large-scale server deployment system improves the scalability and efficiency of simulation, supporting simulation tests of hundreds of millions of kilometers per year.

The simulation platform saves substantial time and cost in autonomous driving tests and greatly reduces test risks. It can randomly simulate a wide variety of extreme scenarios, with the aim of covering the full range of tests.

  • Data Platform

The data platform collects abundant data for perception to improve the robustness of autonomous driving. Our studies focus on the following aspects: (1) how to build an efficient labeling system that reduces labeling cost and improves labeling efficiency and accuracy; (2) how to effectively monitor the vehicle’s status and problems and how to visualize the data appropriately; (3) how to combine real data with simulated data. The objective is to provide integration and indexing of structured, semi-structured, and unstructured data.

  • Cooperative Vehicle-Infrastructure Systems (CVIS)

The cooperative vehicle-infrastructure system builds a new “vehicle-road-cloud” model. We therefore study frontier roadside technologies, i.e., multi-sensor perception, multi-modal data fusion and processing, low-power edge computing, and direct short- and medium-range communications, and we are implementing a new intelligent device that provides real-time data services to vehicles and traffic information to the cloud.

Products and Applications
  • Intelligent Driving

    The Intelligent Transportation Lab participates in the construction of national intelligent logistics backbone networks. It works on building a large logistics network that connects cities and in-transit warehouses and maximizes the utilization of road resources across different time periods, ultimately enabling smart, accurate, fast, safe, and green circulation and delivery of goods.


Research Team
Gang Wang, Head of Intelligent Transportation Lab

He holds a Ph.D. from the University of Illinois at Urbana-Champaign and is a leading expert in machine learning and computer vision. His research fields include deep learning and its applications in computer vision, natural language processing, and speech recognition. Before joining Alibaba, he was a professor at Nanyang Technological University in Singapore. He is a global MIT TR35 winner, a member of the editorial board of IEEE TPAMI, a leading artificial intelligence journal, and an area chair of ICCV 2017 and CVPR 2018.

Ying Chen, Principal Engineer of Intelligent Transportation Lab

He received his Ph.D. in Computing and Electrical Engineering from Tampere University of Technology, Finland, and both his B.S. and M.S. from Peking University. Dr. Chen has served as a co-editor of MPEG video coding standard specifications (the extensions of H.264/AVC and H.265/HEVC). He is a senior member of IEEE and has been a member of several technical committees of the IEEE Circuits and Systems (CAS) Society. Before joining Alibaba, Dr. Chen was a principal engineer and manager (director level) at Qualcomm, San Diego, CA, USA, working on multimedia coding and transmission, computer vision, and edge computing. His publications have received more than 6,000 citations. He is a winner of the Qualcomm IP Achievement Award.

Academic Achievements
  • Jiuxiang Gu, Jianfei Cai, Shafiq Joty, Li Niu, Gang Wang. Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models. CVPR, 2018. Spotlight
  • Ping Hu, Gang Wang, Xiangfei Kong, Jason Kuen, Yap-Peng Tan. Motion-Guided Cascaded Refinement Network for Video Object Segmentation. CVPR, 2018. Poster
  • Jason Kuen, Xiangfei Kong, Zhe Lin, Gang Wang, Jianxiong Yin, Simon See, Yap-Peng Tan. Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks. CVPR, 2018. Poster
  • Jianlou Si, Honggang Zhang, Chun-Guang Li, Jason Kuen, Xiangfei Kong, Alex C. Kot, Gang Wang. Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification. CVPR, 2018. Poster
  • Jun Liu, Amir Shahroudy, Gang Wang, Ling-Yu Duan, Alex C. Kot. SSNet: Scale Selection Network for Online 3D Action Prediction. CVPR, 2018. Spotlight
  • Henghui Ding, Xudong Jiang, Bing Shuai, Ai Qun Liu, Gang Wang. Context Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation. CVPR, 2018. Oral
  • Yicheng Wang, Zhenzhong Chen, Feng Wu, Gang Wang. Person Re-identification with Cascaded Pairwise Convolutions. CVPR, 2018. Poster
  • Lu Zhang, Ju Dai, Huchuan Lu, You He, Gang Wang. A Bi-directional Message Passing Model for Salient Object Detection. CVPR, 2018. Poster
  • Xiaoning Zhang, Tiantian Wang, Jinqing Qi, Huchuan Lu, Gang Wang. Progressive Attention Guided Recurrent Network for Salient Object Detection. CVPR, 2018. Poster
  • Jiuxiang Gu, Jianfei Cai, Gang Wang, Tsuhan Chen. Stack-Captioning: Coarse-to-Fine Learning for Image Captioning. AAAI, 2018. Oral
  • Rana Hanocka, Noa Fish, Zhenhua Wang, Raja Giryes, Shachar Fleishman, Daniel Cohen-Or. ALIGNet: Partial-Shape Agnostic Alignment via Unsupervised Learning. ACM Transactions on Graphics (TOG), 2018.
  • Bin Wang, Guofeng Wang, Andrei Sharf, Yangyan Li, Fan Zhong, Xueying Qin, Daniel Cohen-Or, Baoquan Chen. Active Assembly Guidance with Online Video Parsing. IEEE VR, 2018.
  • Gang Zhang, Hu Han, Shiguang Shan, Xingguang Song, Xilin Chen. Face Alignment across Large Pose via MT-CNN based 3D Shape Reconstruction. FG, 2018.
  • Gang Zhang, Meina Kan, Shiguang Shan, Xilin Chen. Generative Adversarial Network with Spatial Attention for Face Attribute Editing. ECCV, 2018.
  • Jun Liu and Gang Wang. Global Context-Aware LSTM Networks for 3D Action Recognition. CVPR, 2017.
  • Ping Hu, Bing Shuai, Gang Wang. Deep Level Sets for Salient Object Detection. CVPR, 2017.
  • Abrar Abdulnabi, Bing Shuai, Gang Wang. Episodic CAMN: Contextual Attention-based Memory Networks for Scene Labeling. CVPR, 2017.
  • Jiuxiang Gu, Gang Wang, Jianfei Cai, and Tsuhan Chen. An Empirical Study of Language CNN for Image Captioning. ICCV, 2017.
  • Zhenhua Wang, Jiuxiang Gu, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang and Gang Wang. Recent Advances in Convolutional Neural Networks. Pattern Recognition (PR), 2017.
  • Zhenhua Wang, Xingxing Wang, Gang Wang. Learning Fine-grained features via a CNN tree for Large-scale Classification. Neurocomputing, 2017.
  • Zhenwei Miao, Kim-Hui Yap, Xudong Jiang, Subbhuraam Sinduja, Zhenhua Wang. Laplace Gradient based Discriminative and Contrast Invertible Descriptor. ICASSP, March 2017.
  • Amir Shahroudy, Tian-Tsong Ng, Yihong Gong, and Gang Wang. Deep Multimodal Feature Analysis for Action Recognition in RGB+D Videos. IEEE.
  • Mingliang Chen, Qingxiong Yang, Qing Li, Gang Wang, and Ming-Hsuan Yang. Spatiotemporal Background Subtraction. IEEE.
  • Bing Shuai, Zhen Zuo, Bing Wang, and Gang Wang. Scene Segmentation with DAG-Recurrent Neural Networks. IEEE.
  • Jun Liu, Amir Shahroudy, Dong Xu, Alex Kot Chichung, and Gang Wang. Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates. IEEE.

Contact Us

Scan QR code
Follow the Ali Technology WeChat account