Deep neural networks have shown remarkable performance in image
classification. However, their performance significantly deteriorates with
corrupted input data. Domain generalization methods have been proposed to train
robust models against out-of-distribution data. Data augmentation in the
frequency domain is one such approach that enables a model to learn phase
features to establish domain-invariant representations. This approach changes
the amplitudes of the input data while preserving the phases. However, using
fixed phases leads to susceptibility to phase fluctuations because both amplitude
and phase fluctuations commonly occur in out-of-distribution data. In this study, to
address this problem, we introduce an approach using finite variation of the
phases of input data rather than maintaining fixed phases. Based on the
assumption that the degree of domain-invariant features varies for each phase,
we propose a method to distinguish phases based on this degree. In addition, we
propose a method called vital phase augmentation (VIPAug) that applies the
variation to the phases differently according to the degree of domain-invariant
features of given phases. The model depends more on the vital phases that
contain more domain-invariant features for attaining robustness to amplitude
and phase fluctuations. We present experimental evaluations of our proposed
approach, which exhibited improved performance for both clean and corrupted
data. VIPAug achieved SOTA performance on the benchmark CIFAR-10 and CIFAR-100
datasets, as well as near-SOTA performance on the ImageNet-100 and ImageNet
datasets. Our code is available at https://github.com/excitedkid/vipaug.
Improving Robustness of Deep Neural Networks with Vital Phase Augmentation
Deep neural networks have achieved remarkable performance in image classification. However, these models are highly sensitive to corrupted or out-of-distribution input data, which poses a significant challenge in real-world scenarios. To address this issue, domain generalization methods have been proposed to train models that are robust to such data.
Data augmentation is a common technique used to enhance the generalization ability of models. In the context of image data, frequency domain augmentation has emerged as an effective approach. This technique allows models to learn phase features, which are essential for establishing domain-invariant representations. By altering the amplitudes of the input data while preserving the phases, models can learn robust features that are invariant to changes in amplitude.
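The amplitude-only augmentation described above can be sketched with a 2D Fourier transform. This is a minimal illustration, not the authors' implementation: the function name `amplitude_augment`, the `strength` parameter, and the choice of uniform multiplicative noise are assumptions made for this example, and it handles a single grayscale image.

```python
import numpy as np


def negate_frequencies(values):
    """Return the array re-indexed so that entry k holds values[-k] on both axes."""
    return np.roll(np.flip(values, axis=(0, 1)), shift=1, axis=(0, 1))


def amplitude_augment(image, strength=0.5, rng=None):
    """Randomly rescale Fourier amplitudes of a 2D image while keeping all phases.

    `strength` is a hypothetical knob: each amplitude is multiplied by a factor
    drawn from [1 - strength, 1 + strength].
    """
    rng = np.random.default_rng() if rng is None else rng
    spectrum = np.fft.fft2(image)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Draw a positive scale per frequency, then make it symmetric under
    # frequency negation so the modified spectrum still maps to a real image.
    scale = rng.uniform(1.0 - strength, 1.0 + strength, size=image.shape)
    scale = 0.5 * (scale + negate_frequencies(scale))

    # Recombine the perturbed amplitudes with the untouched phases.
    new_spectrum = (amplitude * scale) * np.exp(1j * phase)
    return np.real(np.fft.ifft2(new_spectrum))
```

Because every scale factor is positive and real, the phase at each frequency is the same before and after the transform; only the amplitudes move.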
However, a limitation of existing frequency domain augmentation methods is their reliance on fixed phases. This can make the models susceptible to phase fluctuations, which commonly occur in out-of-distribution data. To overcome this limitation, the authors of this study propose an innovative approach that introduces finite variation in the phases of the input data.
The key idea behind this approach is that the degree of domain-invariant features may vary for each phase. By distinguishing and analyzing each phase based on this degree, the authors propose a method to determine the vital phases that contain more domain-invariant features. This information is used to guide the variation applied to the phases in the vital phase augmentation (VIPAug) method.
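The idea of varying phases by different amounts can be sketched as follows. This is a hedged illustration under stated assumptions, not the VIPAug algorithm itself: the text above does not say how vital phases are identified or how large each variation is, so `vital_mask` is a hypothetical boolean array supplied by the caller, Gaussian noise stands in for the variation, and the defaults `sigma_vital` and `sigma_other` (which perturb the vital phases less) are arbitrary choices for this example.

```python
import numpy as np


def negate_frequencies(values):
    """Return the array re-indexed so that entry k holds values[-k] on both axes."""
    return np.roll(np.flip(values, axis=(0, 1)), shift=1, axis=(0, 1))


def phase_augment(image, vital_mask, sigma_vital=0.01, sigma_other=0.1, rng=None):
    """Add phase noise to a 2D image, with a separate noise scale for vital phases.

    `vital_mask` (hypothetical) is a boolean array of the same shape as the
    spectrum; True marks frequencies treated as vital. It is assumed to be
    symmetric under frequency negation.
    """
    rng = np.random.default_rng() if rng is None else rng
    spectrum = np.fft.fft2(image)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Per-frequency noise scale chosen by the mask (assumed Gaussian noise).
    sigma = np.where(vital_mask, sigma_vital, sigma_other)
    noise = sigma * rng.standard_normal(image.shape)

    # Make the noise antisymmetric under frequency negation so the perturbed
    # spectrum stays conjugate-symmetric and the output image stays real.
    noise = 0.5 * (noise - negate_frequencies(noise))

    new_spectrum = amplitude * np.exp(1j * (phase + noise))
    return np.real(np.fft.ifft2(new_spectrum))
```

For a quick experiment, a placeholder mask such as a low-frequency disc built from `np.fft.fftfreq` satisfies the symmetry assumption. That choice is purely illustrative and says nothing about which phases the method actually selects.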
By making the model rely more on the vital phases, which are expected to be more robust to amplitude and phase fluctuations, the proposed approach aims to improve the model’s overall performance on both clean and corrupted data.
The experimental evaluations presented in this study demonstrate the effectiveness of the proposed approach. VIPAug achieved state-of-the-art (SOTA) performance on benchmark datasets such as CIFAR-10 and CIFAR-100. Moreover, it achieved near-SOTA performance on the challenging ImageNet-100 and ImageNet datasets.
The research draws on both deep learning and signal processing and applies them to image classification. It highlights the importance of considering both amplitude and phase information when training robust models. By exploiting domain-invariant features in the frequency domain, the proposed approach shows how ideas from signal processing can help address a fundamental challenge in machine learning.
The code is available on GitHub (https://github.com/excitedkid/vipaug), which supports reproducibility. Researchers and practitioners can use it to implement VIPAug and explore its applications in their own projects.