Attention-Based Multimodal Entity Linking with High-Quality Images

Li Zhang, Zhixu Li, Qiang Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Multimodal entity linking (MEL) is an emerging research field that uses both textual and visual information to map an ambiguous mention to an entity in a knowledge base (KB). However, images do not always help and may even backfire when they are irrelevant to the textual content. Moreover, existing efforts mainly focus on learning representations of mentions and entities from their textual and visual contexts, without considering the negative impact of noisy, irrelevant images, which occur frequently in social media posts. In this paper, we propose a novel MEL model that not only removes the negative impact of noisy images but also uses multiple attention mechanisms to better capture the connection between a mention representation and its corresponding entity representation. Our empirical study on a large real-world data collection demonstrates the effectiveness of our approach.
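The two ideas the abstract names, attending over visual context and gating out noisy images, can be illustrated with a minimal sketch. This is not the paper's actual model: the scaled dot-product attention, the scalar relevance gate, and all function names and dimensions here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over attention scores.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, keys, values):
    # Scaled dot-product attention: weight image features by their
    # similarity to the mention's text vector.
    scores = keys @ query / np.sqrt(query.shape[-1])
    return softmax(scores) @ values

def fuse_mention(text_vec, image_vecs, relevance_gate):
    # relevance_gate in [0, 1] down-weights irrelevant images
    # (a hypothetical stand-in for the paper's noise-removal idea):
    # gate = 0 ignores the images entirely, gate = 1 trusts them fully.
    img_ctx = attend(text_vec, image_vecs, image_vecs)
    return text_vec + relevance_gate * img_ctx

rng = np.random.default_rng(0)
text_vec = rng.normal(size=8)          # mention's textual representation
image_vecs = rng.normal(size=(3, 8))   # features of 3 attached images

fused = fuse_mention(text_vec, image_vecs, relevance_gate=1.0)
text_only = fuse_mention(text_vec, image_vecs, relevance_gate=0.0)
print(np.allclose(text_only, text_vec))  # gate 0 falls back to text alone
```

With the gate at zero, the fused representation reduces exactly to the textual one, so irrelevant images cannot distort the subsequent mention-entity matching.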
Original language: English (US)
Title of host publication: Database Systems for Advanced Applications
Publisher: Springer International Publishing
Pages: 533-548
Number of pages: 16
ISBN (Print): 9783030731960
DOIs
State: Published - Apr 6 2021
