[CCF-AIR Youth Fund] Nationally Differentiated Search Algorithms and Technologies for Cross-Border E-Commerce

Research Themes



With the rapid development of cross-border e-commerce in recent years, nationally differentiated search algorithms and technologies have become increasingly important. Take AliExpress (AE) as an example: owned by the Alibaba Group, it is one of the biggest cross-border business-to-consumer (B2C) e-commerce platforms in China, serving users from more than 200 countries and regions around the world. Consumers in different countries behave differently because of differences in geography, language, culture, politics, and economy. Among the four countries with the highest traffic volumes on AE (Russia, the United States, France, and Spain), the overlap rate of displayed products is 31.8%. However, the overlap rate is only 17.1% for clicked products and 5.1% for purchased products, confirming that user behavior patterns differ across countries. Even for Poland and France, which are geographically close within Europe, the daily overlap of the top 10,000 English queries is only about 20%. At the same time, because of shared historical, cultural, and linguistic origins, consumers in different countries remain inherently correlated, and this correlation can be exploited when designing nationally differentiated search algorithms and technologies.
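The text does not say how the overlap rates quoted above were computed. As a minimal sketch, assuming the rate is the number of items common to every country divided by the number of distinct items seen in any of them, it could be calculated as follows. The function name `overlap_rate` and the toy product IDs are illustrative, not taken from AE.

```python
from typing import Dict, Set


def overlap_rate(items_by_country: Dict[str, Set[str]]) -> float:
    """Share of items common to every country, relative to all items seen.

    Assumed definition (intersection over union); the source does not
    specify the formula behind its reported figures.
    """
    sets = list(items_by_country.values())
    if not sets:
        return 0.0
    common = set.intersection(*sets)
    union = set.union(*sets)
    return len(common) / len(union) if union else 0.0


# Toy data: product IDs purchased in each country (hypothetical values).
purchased = {
    "RU": {"p1", "p2", "p3", "p4"},
    "US": {"p1", "p5", "p6"},
    "FR": {"p1", "p2", "p7"},
    "ES": {"p1", "p8"},
}

# Only "p1" is shared by all four countries, out of 8 distinct products.
rate = overlap_rate(purchased)  # 1 / 8 = 0.125
```

The same function can be applied separately to displayed, clicked, and purchased product sets to compare how the overlap changes along the funnel.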


Meanwhile, modeling and capturing consumers' shopping interests and preferences is a major challenge, since most overseas consumers have very few online shopping behavior records. Among AE's daily active users, about 20% have not visited the site in the past month and about 40% have viewed fewer than six product pages in their entire history. Worse still, about 50% of users have no order record in the past month. Designing and building a cross-border, nationally differentiated algorithm model on such sparse, low-activity data is not only of theoretical value for algorithm research, but is also useful for guiding domestic e-commerce companies as they open up new lower-tier markets (such as domestic second-tier and third-tier cities).


As cross-border e-commerce develops, and especially as the business model of local sellers selling to local consumers expands rapidly, we face the following problems:

1) How to store and index both the local sellers' products and the global sellers' products in the search engine?

2) How to efficiently and intelligently allocate quota between locally sold products and globally sold products in the retrieval (recall) phase of the search ranking process?

3) How to balance and control the exposure of locally sold products and globally sold products?

Solving these problems well would support the long-term business development of both the global main site and the local country sites.
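One way to picture the quota and exposure questions in problems 2) and 3) is a merge step that reserves a minimum share of result slots for local products and fills the rest purely by score. The sketch below is only an illustration under that assumption; the function `allocate_quota`, the `min_local_share` parameter, and the toy scores are hypothetical and do not describe AE's actual system.

```python
import heapq
from typing import List, Tuple

Candidate = Tuple[float, str]  # (relevance score, product id)


def allocate_quota(local: List[Candidate], global_: List[Candidate],
                   total: int, min_local_share: float) -> List[Tuple[str, str]]:
    """Pick up to `total` candidates, reserving a minimum share for local products.

    Returns (origin, product id) pairs: reserved local slots first, then the
    best-scoring remaining candidates from either pool.
    """
    # Number of slots guaranteed to local products (capped by availability).
    reserved = min(len(local), int(total * min_local_share))
    local_sorted = sorted(local, reverse=True)
    picked = [("local", pid) for _, pid in local_sorted[:reserved]]

    # Remaining slots go to the highest-scoring leftovers from both pools.
    rest = [(score, "local", pid) for score, pid in local_sorted[reserved:]]
    rest += [(score, "global", pid) for score, pid in global_]
    for _, origin, pid in heapq.nlargest(total - reserved, rest):
        picked.append((origin, pid))
    return picked


# Toy candidate pools (hypothetical scores and IDs).
local_pool = [(0.9, "L1"), (0.4, "L2"), (0.3, "L3")]
global_pool = [(0.95, "G1"), (0.8, "G2"), (0.7, "G3"), (0.6, "G4")]

slots = allocate_quota(local_pool, global_pool, total=4, min_local_share=0.5)
```

With these inputs, two of the four slots are reserved for local products even though "L2" scores below every global candidate. Tuning `min_local_share` per country is one simple lever for controlling exposure; a production system would more likely learn this trade-off than fix it by hand.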


Most importantly, given the EU's privacy protection policy, Russia's data export restrictions, and other local overseas policies and regulations, how can we provide efficient personalized search and recommendation services while protecting user privacy, when some core user identification information and online behavior data are unavailable? Machine learning approaches such as federated learning and few-shot learning (FSL) have been successfully applied to similar problems in industry. For example, federated learning can train a machine learning model, such as a deep neural network, on multiple local datasets held on local nodes without explicitly exchanging data samples. In light of these developments in machine learning, we wish to adopt and design novel nationally differentiated search algorithms that protect user privacy.
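To make the federated learning idea concrete, here is a minimal sketch of federated averaging on a one-parameter linear model. It assumes each node (for example, one country's data center) keeps its samples private and sends back only its updated weight. The model, learning rate, and toy datasets are illustrative assumptions; a real system would use a deep network and add protections such as secure aggregation.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one node


def local_update(w: float, data: Dataset, lr: float = 0.05, epochs: int = 5) -> float:
    """Run a few gradient-descent steps on one node's data for the model y = w * x."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(w: float, nodes: List[Dataset], rounds: int = 50) -> float:
    """Federated averaging loop: nodes return weights only, never raw samples."""
    total = sum(len(d) for d in nodes)
    for _ in range(rounds):
        # Each node trains locally, starting from the current global weight.
        local_ws = [local_update(w, d) for d in nodes]
        # The server averages the weights, weighted by each node's sample count.
        w = sum(lw * len(d) for lw, d in zip(local_ws, nodes)) / total
    return w


# Toy per-node datasets, all generated from y = 3x (hypothetical values).
nodes = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5), (3.0, 9.0)],
]

w = federated_average(0.0, nodes)  # converges towards the true slope of 3
```

The key property is visible in `federated_average`: the server only ever sees `local_ws`, so the raw `(x, y)` samples never leave the node that holds them.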


  • An effective nationally differentiated search algorithm for modeling and predicting users' shopping interests and preferences, which can be used in AE's online ranking system.
  • Novel large-scale product indexing and recall algorithms and technologies that are easy to implement, run stably, and use memory economically.
  • A user privacy protection prototype system.

Related Research Topics

  • Learning to rank in E-commerce
  • Multi-task learning in E-commerce
  • Deep Neural Network for CTR/CVR prediction
  • Document indexing and data compression technology
  • Federated learning in E-commerce
  • Transfer learning for CTR/CVR prediction
  • Zero-shot learning and few-shot learning (FSL)
