A Method of Video Compression Based on Machine Vision


Image and Video




Traditional video compression technologies have been developed over decades with the aim of improving compression efficiency. However, their signal-fidelity-driven coding pipelines limit the ability of existing video coding frameworks to serve both human vision and machine vision. In the existing framework, pixels are first reconstructed by decoding a bitstream, and the pixel data are then fed into machine-learning-based systems that extract key information, such as visual features, for intelligent analysis and machine tasks. The emergence of numerous deep-learning-based applications motivates us to incorporate this “key information” used by machines directly into the video bitstream, reducing run time and complexity.


  • Design a new video compression framework that incorporates the key information used by machines into the video bitstream, and explore more collaborative operations between pixel data and feature data. The new encoder and decoder in this framework are expected to achieve a compression ratio comparable to that of VVC.
  • Provide an example bitstream that demonstrates support for machine vision tasks. After decoding this bitstream, extracting only some pieces of key information, rather than reconstructing all pixel data, should be sufficient for vision tasks such as object recognition, object detection and image retrieval.
  • Provide source code and documentation detailing the implementation of the encoder and decoder
  • At least one domestic/foreign patent application
  • At least one qualified conference/journal paper
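
As a rough illustration of the second item above, the sketch below shows one way a bitstream could carry feature data and pixel data in separate sections, so that a machine vision task reads only the feature section and never touches the pixel payload. Everything here is an assumption made for illustration: the magic number VCM0, the 12-byte header layout and the function names are hypothetical and do not come from VVC or any other standard, and the payloads are treated as opaque bytes rather than real coded data.

```python
import struct

# Hypothetical container layout (an assumption, not part of any standard):
#   4-byte magic | 4-byte feature length | 4-byte pixel length | features | pixels
MAGIC = b"VCM0"
HEADER = struct.Struct(">4sII")  # big-endian: magic, feature length, pixel length


def pack_bitstream(feature_payload: bytes, pixel_payload: bytes) -> bytes:
    """Concatenate a header, the feature section and the pixel section."""
    header = HEADER.pack(MAGIC, len(feature_payload), len(pixel_payload))
    return header + feature_payload + pixel_payload


def _read_header(bitstream: bytes) -> tuple[int, int]:
    """Validate the magic number and return (feature length, pixel length)."""
    magic, feature_len, pixel_len = HEADER.unpack_from(bitstream, 0)
    if magic != MAGIC:
        raise ValueError("unrecognised bitstream header")
    return feature_len, pixel_len


def extract_features(bitstream: bytes) -> bytes:
    """Return only the feature section; the pixel section is never read."""
    feature_len, _ = _read_header(bitstream)
    start = HEADER.size
    return bitstream[start:start + feature_len]


def extract_pixels(bitstream: bytes) -> bytes:
    """Return the pixel section, for viewers that need full reconstruction."""
    feature_len, pixel_len = _read_header(bitstream)
    start = HEADER.size + feature_len
    return bitstream[start:start + pixel_len]


if __name__ == "__main__":
    stream = pack_bitstream(b"feature-bytes", b"pixel-bytes")
    # A detection or retrieval task would stop after this call.
    print(extract_features(stream))
```

In this layout a machine-only consumer pays for parsing the header and the feature section alone, while a human-facing player can still reach the pixel section. A real design would also have to address the collaborative coding between the two sections that the first item calls for, which this sketch does not attempt.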

Related Research Topics

  • Design machine-vision-based key information for different tasks, for example, object detection, behavior detection or track description.
  • Develop a video compression standard for machine vision
  • Design an encoder dedicated to machine vision that achieves higher compression efficiency than traditional compression methods.


Suggested Collaboration Method

AIR (Alibaba Innovative Research), one-year collaboration project. 

