Two-Level Test-Time Adaptation in Multimodal Learning

Research output: Chapter in Book/Report/Conference proceeding, Conference paper, peer-review

Abstract

Test-time adaptation (TTA) aims to adjust the parameters of a pre-trained source model using samples from the target domain, without requiring access to the source data. While recent studies have shown the potential of TTA across various computer vision tasks, most TTA methods are limited to unimodal adaptation, and the domain shift caused by unimodal data corruption in multimodal tasks is not adequately addressed. Although some recent approaches have reduced cross-modal information discrepancy through modality-sharing modules, domain adaptation of the modality-specific modules has been overlooked. In this paper, we introduce a two-level test-time adaptation method (2LTTA) that accounts for both intra-modal distribution shifts and cross-modal reliability bias in multimodal learning (MML). Unlike conventional TTA methods, which focus primarily on fine-tuning normalization layers, 2LTTA modulates all normalization layers, the self-attention modules of the encoder for the corrupted modality, and the modality-sharing block. Additionally, we design a two-level objective function that addresses both intra-modal distribution shift and cross-modal reliability bias in the modality fusion block. First, Shannon entropy with sample reweighting is used to mitigate intra-modal distribution shifts caused by data corruption. Second, a diversity-promoting loss is incorporated to reduce cross-modal information discrepancy. Our experiments show that 2LTTA outperforms baseline methods across various datasets.
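
The abstract names the two terms of the objective but does not give their formulas, so the following is only a minimal Python sketch of what such an objective could look like, not the paper's implementation. Everything specific here is an assumption made for illustration: the weighting rule exp(margin - H), the use of the negative entropy of the batch-mean prediction as the diversity term, and the names margin, lam, and two_level_objective.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one sample's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def reweighted_entropy_loss(batch_probs, margin):
    """Mean per-sample entropy, each sample scaled by a confidence weight.

    ASSUMPTION: weight = exp(margin - H), so low-entropy (confident)
    samples count more. The abstract does not specify the actual rule.
    In a real training loop the weights would be treated as constants
    (no gradient flows through them).
    """
    ents = [entropy(p) for p in batch_probs]
    weights = [math.exp(margin - h) for h in ents]
    return sum(w * h for w, h in zip(weights, ents)) / len(ents)


def diversity_loss(batch_probs):
    """Negative entropy of the batch-mean prediction.

    ASSUMPTION: minimizing this term pushes the average prediction
    toward a spread over classes, which discourages collapse onto a
    single class. The paper's diversity term may differ.
    """
    n = len(batch_probs)
    k = len(batch_probs[0])
    mean = [sum(p[c] for p in batch_probs) / n for c in range(k)]
    return -entropy(mean)


def two_level_objective(batch_logits, margin, lam=1.0):
    """Hypothetical combination: entropy term plus lam times diversity term."""
    probs = [softmax(z) for z in batch_logits]
    return reweighted_entropy_loss(probs, margin) + lam * diversity_loss(probs)


if __name__ == "__main__":
    # Toy batch of fused logits: 3 samples, 4 classes (made-up values).
    logits = [[2.0, 0.1, 0.1, 0.1], [0.2, 1.5, 0.3, 0.1], [0.5, 0.4, 0.6, 0.5]]
    print(two_level_objective(logits, margin=0.4 * math.log(4)))
```

In this sketch the diversity term is lowest when the batch-mean prediction is uniform and highest (zero) when every sample predicts the same class with full confidence, which is the behavior a diversity-promoting loss is meant to reward and penalize respectively.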
Original language: English
Title of host publication: International Joint Conference on Neural Networks, IJCNN 2025 - Proceedings
Publisher: IEEE
ISBN (Electronic): 979-8-3315-1042-8
DOIs
Publication status: Published - 14 Nov 2025
Event: 2025 International Joint Conference on Neural Networks, IJCNN 2025 - Rome, Italy
Duration: 30 Jun 2025 to 5 Jul 2025
https://2025.ijcnn.org/

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks
ISSN (Print): 2161-4393
ISSN (Electronic): 2161-4407

Conference

Conference: 2025 International Joint Conference on Neural Networks, IJCNN 2025
Country/Territory: Italy
City: Rome
Period: 30/06/25 to 5/07/25

Keywords

  • fine-tuning
  • multimodal learning (MML)
  • reliability bias
  • test-time adaptation

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

Fields of Expertise

  • Information, Communication & Computing
