Parameter-efficient tuning (PET) methods such as LoRA, Adapter, and Visual
Prompt Tuning (VPT) have found success in enabling adaptation to new domains by
tuning small modules within a transformer model. However, the number of domains
encountered during test time can be very large, and the data is usually
unlabeled. Thus, adaptation to new domains is challenging; it is also
impractical to generate customized tuned modules for each such domain. Toward
addressing these challenges, this work introduces PLUTO: a Plug-and-pLay
modUlar Test-time domain adaptatiOn strategy. We pre-train a large set of
modules, each specialized for different source domains, effectively creating a
“module store”. Given a target domain with few-shot unlabeled data, we
introduce an unsupervised test-time adaptation (TTA) method to (1) select a
sparse subset of relevant modules from this store and (2) create a weighted
combination of selected modules without tuning their weights. This
plug-and-play nature enables us to harness several of the most relevant source
domains in a single inference call. Comprehensive evaluations demonstrate that
PLUTO uniformly outperforms alternative TTA methods and that selecting $\leq 5$
modules suffices to extract most of the benefit. At a high level, our method
equips pre-trained transformers with the capability to dynamically adapt to new
domains, motivating a new paradigm for efficient and scalable domain
adaptation.

PLUTO: A Plug-and-Play Modular Test-Time Domain Adaptation Strategy

Domain adaptation is a crucial task for modern transformer models, especially when a model must handle new, unlabeled domains at test time. Parameter-efficient tuning (PET) methods like LoRA, Adapter, and Visual Prompt Tuning (VPT) have shown promise in enabling adaptation to new domains by fine-tuning small modules within a transformer model. However, these methods run into limitations when the number of domains encountered at test time is large and the data is unlabeled: it becomes impractical to train and store a customized tuned module for every such domain.

To address these challenges, researchers have proposed a new strategy called PLUTO. The goal of PLUTO is to create a plug-and-play modular test-time domain adaptation approach that overcomes the limitations of existing methods. The key idea behind PLUTO is to pre-train a large set of modules, each specialized for different source domains, effectively creating a “module store”.
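
To make the module-store idea concrete, the sketch below shows how one might pre-train one small module per source domain while keeping the backbone frozen. This is a minimal sketch under stated assumptions: the `LoRALinear` class, the `train_module_for_domain` helper, and the `source_domains` loader dictionary are illustrative names rather than the paper's implementation, and the frozen transformer is reduced to a single linear layer for brevity.

```python
# Minimal sketch: build a "module store" by tuning one LoRA-style module
# per source domain while the backbone stays frozen. (Illustrative, not
# the paper's exact implementation.)
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank (LoRA-style) update."""
    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # the backbone is never tuned
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T

def train_module_for_domain(base_layer, domain_loader, rank=4, epochs=1):
    """Tune a fresh low-rank module on one labeled source domain."""
    lora = LoRALinear(base_layer, rank)
    opt = torch.optim.Adam([lora.A, lora.B], lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in domain_loader:
            opt.zero_grad()
            loss_fn(lora(x), y).backward()
            opt.step()
    return {"A": lora.A.detach().clone(), "B": lora.B.detach().clone()}

# The "module store" maps each source domain to its tuned module weights.
# `source_domains` is a hypothetical dict of per-domain labeled loaders:
# module_store = {name: train_module_for_domain(base_layer, loader)
#                 for name, loader in source_domains.items()}
```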

When faced with a target domain with few-shot unlabeled data, PLUTO uses an unsupervised test-time adaptation (TTA) method. This method has two main steps: (1) selecting a sparse subset of relevant modules from the pre-trained module store and (2) creating a weighted combination of these selected modules without tuning their weights. This plug-and-play nature allows PLUTO to leverage multiple relevant source domains in a single inference call.
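
The two steps can be sketched in the same spirit. The snippet below assumes average prediction entropy on the few-shot unlabeled target batch as the unsupervised relevance score; the paper's actual selection criterion may differ. It reuses the hypothetical `LoRALinear` parameters and `module_store` from the previous sketch.

```python
# Hedged sketch of the two TTA steps: (1) pick a sparse subset of stored
# modules, (2) combine them with softmax weights; no gradient tuning occurs.
import torch
import torch.nn.functional as F

@torch.no_grad()
def module_entropy(base_layer, params, x_unlabeled):
    """Average prediction entropy of one plugged-in module on target data."""
    logits = base_layer(x_unlabeled) + x_unlabeled @ params["A"].T @ params["B"].T
    probs = F.softmax(logits, dim=-1)
    return -(probs * probs.clamp_min(1e-8).log()).sum(-1).mean()

@torch.no_grad()
def select_and_combine(base_layer, module_store, x_unlabeled, k=5):
    names = list(module_store)
    scores = torch.stack([module_entropy(base_layer, module_store[n], x_unlabeled)
                          for n in names])
    top = scores.topk(min(k, len(names)), largest=False).indices  # sparse subset
    weights = F.softmax(-scores[top], dim=0)                      # fixed weights
    return {key: sum(w * module_store[names[i]][key]
                     for w, i in zip(weights, top.tolist()))
            for key in ("A", "B")}

# Inference with the merged module (a single call, no weight updates):
# combined = select_and_combine(base_layer, module_store, x_target_unlabeled)
# y_hat = base_layer(x) + x @ combined["A"].T @ combined["B"].T
```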

Comprehensive evaluations show that PLUTO consistently outperforms alternative TTA methods. Notably, the experiments reveal that selecting at most five modules suffices to extract most of the benefit, which makes PLUTO both efficient and scalable in real-world scenarios involving a large number of domains.

At a high level, PLUTO equips pre-trained transformers with the capability to dynamically adapt to new domains, motivating a new paradigm for efficient and scalable domain adaptation. By combining unsupervised learning, sparse module selection, and weighted module combination, PLUTO also illustrates how ideas from different research areas can be integrated to tackle complex adaptation challenges.
