Towards Efficient Cross-Modal Visual Textual Retrieval using Transformer-Encoder Deep Features

BibTeX entry:

@inproceedings { messina:cbmi2021,
    author = { Nicola Messina and Giuseppe Amato and Fabrizio Falchi and Claudio Gennaro and Stephane Marchand-Maillet },
    title = { Towards Efficient Cross-Modal Visual Textual Retrieval using Transformer-Encoder Deep Features },
    booktitle = { 18th International Conference on Content-Based Multimedia Indexing, {CBMI} 2021, Lille, France, June 28-30, 2021 },
    pages = { 1--6 },
    publisher = { {IEEE} },
    year = { 2021 },
    url = { https://doi.org/10.1109/CBMI50038.2021.9461890 },
}
--

Keywords: machine learning, information geometry, data mining, Big Data, affective information retrieval, information visualisation, content-based image and video retrieval (CBIR, CBR, CBVR, CBMR, CBMIR), information mining, classification, multimedia and multimodal information management, semantic web, knowledge base (RDF, OWL, XML, metadata, auto-annotation, description), multimodal information fusion