arXiv:2501.14728v1 Announce Type: new
Abstract: While large generative artificial intelligence (GenAI) models have achieved significant success, they also raise growing concerns about online information security due to their potential misuse for generating deceptive content. Out-of-context (OOC) multimodal misinformation detection, which often retrieves Web evidence to identify the repurposing of images in false contexts, faces the issue of reasoning over GenAI-polluted evidence to derive accurate predictions. Existing works simulate GenAI-powered pollution at the claim level with stylistic rewriting to conceal linguistic cues, and ignore evidence-level pollution for such information-seeking applications. In this work, we investigate how polluted evidence affects the performance of existing OOC detectors, revealing a performance degradation of more than 9 percentage points. We propose two strategies, cross-modal evidence reranking and cross-modal claim-evidence reasoning, to address the challenges posed by polluted evidence. Extensive experiments on two benchmark datasets show that these strategies can effectively enhance the robustness of existing out-of-context detectors amidst polluted evidence.

The Impact of Artificial Intelligence on Online Information Security

The rise of generative artificial intelligence (GenAI) models has brought significant advancements across many fields, but it has also raised concerns about their potential misuse for generating deceptive content. In particular, out-of-context (OOC) multimodal misinformation detection, which identifies images repurposed in false contexts, has become increasingly challenging because the Web evidence retrieved to verify a claim may itself be polluted by GenAI.

Existing works in this area have simulated GenAI-powered pollution only at the claim level, using stylistic rewriting to conceal linguistic cues. They have largely overlooked evidence-level pollution, which is crucial for information-seeking applications that rely on retrieved Web content. This work fills that gap by investigating how polluted evidence affects the performance of existing OOC detectors.

The researchers conducted extensive experiments on two benchmark datasets to assess the impact of polluted evidence on the performance of OOC detectors. The results revealed a significant performance degradation of over 9 percentage points when polluted evidence was present. This highlights the urgent need to address this issue and develop strategies to enhance the robustness of existing detectors.

Cross-Modal Evidence Reranking

One strategy proposed in this work is cross-modal evidence reranking. Rather than treating retrieved evidence as uniformly trustworthy, this approach re-evaluates the relevance and reliability of each evidence item by considering multiple modalities. By incorporating visual information, such as checking the consistency between textual claims and the accompanying image, the authors aim to demote polluted evidence before it reaches the detector. The strategy underscores the importance of integrating different data modalities for accurate analysis, rather than ranking on text similarity alone.
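The reranking idea can be illustrated with a minimal sketch. The paper does not publish its scoring function, so the following is an assumption-laden toy: each evidence item carries a precomputed text embedding (e.g., from a CLIP-like encoder), and we score it by a weighted combination of its similarity to the claim text and to the claim image, so that evidence consistent with both modalities rises to the top.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def rerank_evidence(claim_text_emb, claim_image_emb, evidence, alpha=0.5):
    """Re-score evidence by agreement with BOTH claim modalities.

    `evidence` is a list of dicts with a precomputed `text_emb`.
    Evidence that matches only the claim text (a typical signature of
    text-side GenAI pollution) is demoted relative to evidence that is
    also consistent with the claim image.
    """
    scored = []
    for ev in evidence:
        text_sim = cosine(claim_text_emb, ev["text_emb"])
        image_sim = cosine(claim_image_emb, ev["text_emb"])
        scored.append((alpha * text_sim + (1 - alpha) * image_sim, ev))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ev for _, ev in scored]

# Toy embeddings: the claim's text and image point in different directions.
claim_text = np.array([1.0, 0.0, 0.0])
claim_image = np.array([0.0, 1.0, 0.0])
evidence = [
    {"id": "text_only", "text_emb": np.array([1.0, 0.0, 0.0])},
    {"id": "both_modalities", "text_emb": np.array([1.0, 1.0, 0.0])},
    {"id": "unrelated", "text_emb": np.array([0.0, 0.0, 1.0])},
]
ranked = rerank_evidence(claim_text, claim_image, evidence)
# Evidence consistent with both modalities outranks text-only matches.
```

In practice the embeddings would come from a shared text-image encoder, and `alpha` would be tuned on a validation split; the names here are illustrative only.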

Cross-Modal Claim-Evidence Reasoning

The second strategy proposed is cross-modal claim-evidence reasoning. This approach exploits the correlations between claims and evidence across modalities to improve detection accuracy. By jointly modeling textual and visual information, the authors enable more comprehensive reasoning and inference, effectively addressing the challenges posed by polluted evidence in OOC detection. Combining linguistic and visual cues in this way makes the detector harder to fool with evidence that is plausible in one modality but inconsistent with the other.
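To make the joint-reasoning idea concrete, here is a hedged sketch under the same toy-embedding assumption as above: instead of checking whether evidence supports the text or the image separately, each evidence item is scored by how well it supports the *pairing* of the two (the minimum of its similarity to each modality), and a low aggregate consistency flags the claim as out-of-context. The threshold and aggregation are illustrative, not the authors' method.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def predict_out_of_context(claim_text_emb, claim_image_emb, evidence_embs,
                           threshold=0.5):
    """Jointly reason over claim and evidence.

    For each evidence embedding, take the min of its similarity to the
    claim text and to the claim image: evidence must support the text
    AND the image pairing to count. Polluted evidence that echoes the
    false text but contradicts the image contributes little support.
    Returns an (label, consistency) pair.
    """
    support = [
        min(cosine(claim_text_emb, ev), cosine(claim_image_emb, ev))
        for ev in evidence_embs
    ]
    consistency = float(np.mean(support)) if support else 0.0
    label = "out-of-context" if consistency < threshold else "pristine"
    return label, consistency

claim_text = np.array([1.0, 0.0, 0.0])
claim_image = np.array([0.0, 1.0, 0.0])
# Evidence aligned with both modalities -> the pairing looks genuine.
label_ok, _ = predict_out_of_context(
    claim_text, claim_image, [np.array([1.0, 1.0, 0.0])])
# Evidence matching only the text -> the image was likely repurposed.
label_ooc, _ = predict_out_of_context(
    claim_text, claim_image, [np.array([1.0, 0.0, 0.0])])
```

A trained detector would replace the fixed threshold with a learned classifier over such cross-modal features, but the min-over-modalities intuition is the core of why joint reasoning resists single-modality pollution.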

Overall, this study offers valuable insights into how GenAI-polluted evidence degrades the performance of OOC detectors and proposes effective strategies to mitigate the problem. By integrating techniques from multimodal retrieval, natural language processing, and computer vision, researchers can build more robust systems to combat deceptive content generated with GenAI models.
