Deep anomaly detection methods have become increasingly popular in recent years, with methods like Stacked Autoencoders, Variational Autoencoders, and Generative Adversarial Networks greatly improving the state of the art. Other methods augment classical models (such as the One-Class Support Vector Machine) by learning an appropriate kernel function with neural networks. Recent developments in representation learning by self-supervision have proven highly beneficial in the context of anomaly detection. Inspired by advances in self-supervised anomaly detection in the field of computer vision, this thesis develops a method for detecting anomalies by exploiting pretext tasks tailored to text corpora. This approach greatly improves the state of the art on two datasets, 20Newsgroups and AG News, for both semi-supervised and unsupervised anomaly detection, demonstrating the potential of self-supervised anomaly detectors in the field of natural language processing.

Deep anomaly detection methods have gained significant popularity in recent years due to advances in techniques such as Stacked Autoencoders, Variational Autoencoders, and Generative Adversarial Networks. These methods have advanced the state of the art in anomaly detection by learning complex representations and capturing subtle patterns in data.
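To make the reconstruction-based family concrete, here is a minimal sketch of autoencoder-based anomaly scoring: the model is trained to reconstruct normal data only, so a high reconstruction error at test time signals an anomaly. The architecture, dimensions, and random inputs below are illustrative placeholders, not the thesis's model.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=300, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, input_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def anomaly_score(model, x):
    # Trained on normal data only, the model reconstructs normal inputs
    # well; high reconstruction error therefore signals an anomaly.
    with torch.no_grad():
        return ((model(x) - x) ** 2).mean(dim=1)

model = Autoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
normal_batch = torch.randn(64, 300)  # stand-in for real document vectors
for _ in range(200):
    optimizer.zero_grad()
    loss = ((model(normal_batch) - normal_batch) ** 2).mean()
    loss.backward()
    optimizer.step()

scores = anomaly_score(model, torch.randn(8, 300))  # per-example scores
```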

In addition to these deep learning approaches, researchers have explored augmenting traditional models like the One-Class Support Vector Machine with neural networks that learn an appropriate kernel function. This fusion of classical one-class models with deep learning has further improved the performance of anomaly detection systems.
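A well-known method in this spirit is Deep SVDD, which replaces a fixed kernel with a learned feature map and pulls embeddings of normal data toward a center in feature space. The sketch below illustrates that objective with placeholder dimensions and random data; it is a simplified illustration, not the thesis's model.

```python
import torch
import torch.nn as nn

# Bias-free layers help avoid the trivial constant solution that the
# Deep SVDD paper warns about.
net = nn.Sequential(
    nn.Linear(300, 128, bias=False), nn.ReLU(),
    nn.Linear(128, 32, bias=False))
normal_batch = torch.randn(256, 300)  # stand-in for real feature vectors

# Fix a center c in feature space from an initial forward pass.
with torch.no_grad():
    c = net(normal_batch).mean(dim=0)

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(200):
    optimizer.zero_grad()
    # Pull embeddings of normal data toward c; the squared distance to c
    # is then used as the anomaly score at test time.
    loss = ((net(normal_batch) - c) ** 2).sum(dim=1).mean()
    loss.backward()
    optimizer.step()

scores = ((net(torch.randn(8, 300)) - c) ** 2).sum(dim=1)
```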

One of the key recent breakthroughs in representation learning is self-supervision, where models are trained on pretext tasks that require them to predict properties of the input data (for example, which transformation was applied to it) without human-provided labels. This self-supervised learning paradigm has shown great potential in the field of computer vision.
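A concrete instance from the vision literature is rotation prediction: a classifier is trained to identify which of four 90-degree rotations was applied to an image of normal data, and its loss on a test image's rotations serves as the anomaly score. The following is a minimal sketch of that idea with a toy network and random stand-in images; it illustrates the paradigm rather than any specific published pipeline.

```python
import torch
import torch.nn as nn

def rotated_views(x):
    # x: (N, C, H, W) -> all four 90-degree rotations, with labels 0..3.
    views = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)])
    labels = torch.arange(4).repeat_interleave(x.size(0))
    return views, labels

clf = nn.Sequential(nn.Flatten(),
                    nn.Linear(1 * 32 * 32, 256), nn.ReLU(),
                    nn.Linear(256, 4))
opt = torch.optim.Adam(clf.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

normal_images = torch.randn(64, 1, 32, 32)  # stand-in for real images
for _ in range(200):
    views, labels = rotated_views(normal_images)
    opt.zero_grad()
    loss = loss_fn(clf(views), labels)
    loss.backward()
    opt.step()

def anomaly_score(x):
    # The classifier recognizes rotations of normal data reliably, so a
    # high loss on a test input's rotations indicates an anomaly.
    views, labels = rotated_views(x)
    with torch.no_grad():
        return loss_fn(clf(views), labels)
```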

This thesis builds on these advances in self-supervised learning and develops a method specifically tailored to detecting anomalies in text corpora. By leveraging pretext tasks adapted for text data, the proposed approach significantly improves the state of the art in both semi-supervised and unsupervised anomaly detection on two benchmark datasets: 20Newsgroups and AG News.
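The summary does not spell out the thesis's exact pretext tasks, so the following is a hypothetical illustration of the general recipe for text: apply one of several sentence-level transformations and train a classifier to recognize which one was applied; its per-example loss on held-out text can then serve as an anomaly score. All transformations below are invented for illustration and may differ from those used in the thesis.

```python
import random

# Each transformation permutes a sentence's words in a recognizable way.
TRANSFORMS = {
    0: lambda toks: toks,                  # identity
    1: lambda toks: toks[::-1],            # reverse word order
    2: lambda toks: sorted(toks),          # sort words alphabetically
    3: lambda toks: toks[1:] + toks[:1],   # rotate left by one word
}

def make_pretext_example(sentence):
    # Produce a (transformed_text, label) training pair for a classifier.
    toks = sentence.split()
    label = random.randrange(len(TRANSFORMS))
    return " ".join(TRANSFORMS[label](toks)), label

text, label = make_pretext_example(
    "the quick brown fox jumps over the lazy dog")
# A classifier trained on such pairs over in-distribution documents tends
# to misidentify transformations of out-of-distribution text, so its
# per-example loss can serve as an anomaly score.
```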

This research highlights the multi-disciplinary nature of anomaly detection, bridging concepts from deep learning, natural language processing, and self-supervised learning to tackle the challenge of detecting abnormal text. Combining these fields yields new insights and advances anomaly detection in the domain of natural language processing.
