Natural Language Processing
Domain Knowledge Graph Construction and Reasoning Techniques on Multi-Source Data
In recent years, the demand for knowledge graphs has shifted from simple data support to complex decision support, which requires knowledge graphs to evolve from single-source to multi-source.
Many types of raw material can now be used for knowledge graph construction, including tables, text, voice, images, and other structured, semi-structured, and unstructured data. In most cases, however, the data are scattered, each source follows different standards, and errors or conflicts exist among them. For example, aligning and mapping knowledge attributes across an enterprise's various sub-platforms is a common problem, and it can lead to missing information between the demand-insight side and the supply side.
Many methods can help build a domain knowledge graph from multi-source data. For example, extraction and fusion techniques can be applied to attribute completion, alignment, and normalization, connecting the CPV attributes with the downstream retail attributes on each side and thereby better linking the demand side with the supply side. Meanwhile, relation reasoning can establish a clear industrial chain that enriches our knowledge graph, improving the platform's planning and sourcing abilities and helping customers find more efficient sources of goods.
- A methodical knowledge extraction model that accurately extracts key domain entities and attributes from our multimodal large-scale unstructured data
- A reliable fusion mechanism for entities and attributes extracted from the multi-source data
- A reliable relationship reasoning model that is compatible with large-scale computation and real-time reasoning tasks, such as similarity evaluation and deep search
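As a rough illustration of what attribute alignment, normalization, and fusion could look like, the minimal sketch below maps source-specific attribute names to canonical ones and resolves conflicting values by majority vote. The alias table, the record layout, the platform names, and the voting rule are all assumptions made for this example; they are not a description of any existing system.

```python
from collections import Counter, defaultdict

# Hypothetical alias table: source-specific attribute name -> canonical name.
ATTRIBUTE_ALIASES = {
    "colour": "color",
    "color": "color",
    "fabric": "material",
    "material": "material",
}


def normalize(name, value):
    """Map an attribute name to its canonical form and tidy the value."""
    key = name.strip().lower()
    canonical = ATTRIBUTE_ALIASES.get(key, key)
    return canonical, value.strip().lower()


def fuse(records):
    """Fuse (entity, attribute, value, source) records by majority vote.

    Returns {entity: {attribute: (value, support)}}, where support is the
    fraction of records for that attribute that agree with the chosen value.
    """
    votes = defaultdict(Counter)
    for entity, name, value, _source in records:
        attr, val = normalize(name, value)
        votes[(entity, attr)][val] += 1

    fused = defaultdict(dict)
    for (entity, attr), counter in votes.items():
        value, count = counter.most_common(1)[0]
        fused[entity][attr] = (value, count / sum(counter.values()))
    return dict(fused)


# Illustrative records from three hypothetical sources.
records = [
    ("sku-1", "Colour", "Red", "demand_platform"),
    ("sku-1", "color", "red ", "supply_platform"),
    ("sku-1", "color", "crimson", "catalog"),
    ("sku-1", "Fabric", "Cotton", "supply_platform"),
]
result = fuse(records)
print(result)
```

The support score gives a simple credibility signal for each fused attribute. A production mechanism would likely weight sources by reliability instead of counting them equally.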
Related Research Topics
- Domain keyword and information extraction from our large-scale unstructured data
- Automatic construction of a bottom-up concept graph from multi-source data and its integration with the top-down manual concept graph
- Construction of a quality evaluation system for knowledge graphs, such as credibility evaluation for entities, attributes, and relationships extracted from multiple data sources
- Denoising techniques for knowledge graphs containing noise and conflicts
- A noise-insensitive knowledge representation framework to mine and filter potential noise and conflicts in the knowledge graph
- Attribute fusion and entity linking across multiple domain knowledge graphs
- Relationship extraction and reasoning from our multimodal large-scale unstructured data
- Graph embedding and storage methods that are compatible with large-scale computing and real-time search
- A knowledge system prototype including knowledge extraction, knowledge representation, knowledge fusion, and knowledge reasoning for our business
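To make the relation reasoning topic more concrete, the following minimal sketch infers an industrial chain by searching for a multi-hop path over one relation type in a set of triples. The triple format, the `supplies` relation, and the entity names are hypothetical, and breadth-first search stands in here for whatever reasoning model a proposal would actually develop.

```python
from collections import defaultdict, deque


def build_graph(triples, relation):
    """Index (head, relation, tail) triples of one relation type by head."""
    graph = defaultdict(list)
    for head, rel, tail in triples:
        if rel == relation:
            graph[head].append(tail)
    return graph


def find_chain(triples, start, goal, relation="supplies"):
    """Return the shortest chain from start to goal, or None if none exists."""
    graph = build_graph(triples, relation)
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


# Illustrative triples; entity and relation names are made up.
triples = [
    ("cotton_farm", "supplies", "yarn_mill"),
    ("yarn_mill", "supplies", "fabric_factory"),
    ("fabric_factory", "supplies", "garment_maker"),
    ("yarn_mill", "located_in", "zhejiang"),
]
chain = find_chain(triples, "cotton_farm", "garment_maker")
print(chain)
```

Explicit path search of this kind is exact but can become expensive on very large graphs, which is one reason embedding-based approaches are of interest for real-time settings.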
Suggested Collaboration Method
ARF (Alibaba Research Fellowship), three months to one year onsite.