Machine Learning (algorithm)
Explainable Semi-Supervised Anomaly Detection for Urban Computing
With rapid urbanization, cities increasingly rely on technology to manage resources more efficiently and improve citizens’ quality of life, an approach known as the “Smart City”. At every moment, enormous amounts of data are produced in various forms, including text, images, and video. The vast majority of these data are normal; our main concern is the outliers that may cause trouble. Anomaly detection (AD) is typically treated as an unsupervised learning problem, with Isolation Forest as a representative method. In most urban computing applications, however, people are interested only in specific kinds of anomalies. In practice, a small set of samples is labeled as “anomaly”, alongside an extremely large amount of unlabeled data.
Recently, many deep semi-supervised anomaly detection methods have been proposed that learn the distribution of normal data and then flag samples far from that distribution as anomalies; examples include nearest-neighbor-based techniques, clustering approaches, and one-class classification approaches. These methods have the following disadvantages:
- Insufficient use of labeled anomalies. Most of the methods above use only labeled normal samples and not labeled anomalies, which leads to under-fitting the distribution of anomalies. Some existing methods do use labeled anomaly samples, but most of them are domain-specific.
- Few-shot learning problem. In practice, anomaly samples are labeled by domain experts and are always rare. Existing methods cannot handle this core issue well.
- Lack of explainability. People need to understand why a sample was flagged before acting on an anomaly. But because of the black-box nature of deep learning methods, it is inherently difficult to understand which aspects of the input data drive the decisions.
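To make the three issues above concrete, here is a minimal, non-deep sketch of the general idea: learn a profile of normal data from unlabeled samples, use a handful of labeled anomalies to calibrate the decision threshold, and report which feature drives each decision. It is an illustrative toy under stated assumptions, not a method proposed in this project. The feature names, the sample values, and the midpoint threshold heuristic are all hypothetical.

```python
from statistics import mean, median, pstdev

# Hypothetical feature names, chosen only for illustration.
FEATURES = ["avg_speed_kmh", "vehicle_count"]


def fit_normal_profile(unlabeled):
    """Estimate per-feature mean and standard deviation from unlabeled data,
    assuming the vast majority of it is normal."""
    columns = list(zip(*unlabeled))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def contributions(x, means, stds):
    """Squared standardized deviation of each feature from the normal profile."""
    return [((v - m) / s) ** 2 for v, m, s in zip(x, means, stds)]


def score(x, means, stds):
    """Anomaly score: the sum of the per-feature contributions."""
    return sum(contributions(x, means, stds))


def calibrate_threshold(unlabeled, labeled_anomalies, means, stds):
    """Hypothetical heuristic: place the threshold midway between the median
    unlabeled score and the lowest score among the few labeled anomalies."""
    typical = median(score(x, means, stds) for x in unlabeled)
    weakest = min(score(x, means, stds) for x in labeled_anomalies)
    return (typical + weakest) / 2


def explain(x, means, stds):
    """Name the feature that contributes most to the anomaly score."""
    contrib = contributions(x, means, stds)
    return FEATURES[contrib.index(max(contrib))]


# Made-up data: many unlabeled samples, a single expert-labeled anomaly.
unlabeled = [(50, 100), (52, 98), (48, 102), (51, 101), (49, 99)]
labeled_anomalies = [(10, 100)]

means, stds = fit_normal_profile(unlabeled)
threshold = calibrate_threshold(unlabeled, labeled_anomalies, means, stds)

suspect = (12, 101)
is_anomaly = score(suspect, means, stds) > threshold
reason = explain(suspect, means, stds)
print(is_anomaly, reason)
```

Because the score is a plain sum of per-feature terms, the explanation comes for free. A deep model offers no such additive decomposition, which is exactly the difficulty the last item describes.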
We hope to address these challenges together with experienced researchers. The expected outcomes are:
- Semi-supervised anomaly detection methods for urban computing that outperform existing approaches.
- A model or framework that can be applied effectively to different tasks.
- Publication of impactful research papers.
Related Research Topics
- Few-shot learning.
- Anomaly detection.
- Urban computing.
- Explainable deep learning.
Suggested Collaboration Method
AIR (Alibaba Innovative Research), one-year collaboration project.