Online Multimodal Knowledge Discovery

Welcome

New social technologies and widespread access to the internet have enabled new forms of content creation, connectivity, and information sharing. With vast amounts of unstructured data and few labels, organizing and reconciling information from different sources and modalities under limited supervision is a current challenge in machine learning. This tutorial focuses on using multimodal representations for graph-regularized or semi-supervised learning. As case studies, it uses two real-world multi-domain datasets that call for understanding fine-grained visual and linguistic semantics.

Venue

The Online Multimodal Knowledge Discovery tutorial will be held virtually at ICDM 2020: 20th IEEE International Conference on Data Mining on November 18th, 2020, from 14:30 to 16:30 CET.

Outline

Section	Subsection	Duration (min)
Introduction	The landscape of online content	10
Introduction	A case for multimodal knowledge reconciliation	5
Natural Language Processing	From word embeddings to contextualized representations	10
Natural Language Processing	Fine-tuning pretrained models on downstream tasks	5
Natural Language Processing	The textual entailment problem	5
Structured Data	Semi-structured and tabular text	5
Structured Data	Knowledge graphs	5
Neural Graph Learning	Leveraging structured signals with Neural Structured Learning	10
Break	-	5
Multimodal Learning	Learning joint representations for visual and language tasks	20
Multimodal Learning	Self-Supervised Multimodal Versatile Networks	20
Multimodal Learning	Multimodal representations for knowledge reconciliation	10
Final considerations	Closing notes	5
Final considerations	Q&A	5
Total	-	120

Slides

Reading list

Tutors

Cesar Ilharco
Senior Research Engineer,
Google Research

Ricardo Marino
Data Scientist,
Google Research

Jannis Bulian
Senior Software Engineer,
Google Research

Arsha Nagrani
Research Scientist,
Google Research

Lucas Smaira
Senior Research Engineer,
DeepMind

Afsaneh Shirazi
Senior Staff Software Engineer,
Google Research

Acknowledgements

We would like to thank Gabriel Ilharco, Abe Ittycheriah, Thomas Leung, Felipe Ferreira, Mor Naaman, Isabelle Augenstein, Arkaitz Zubiaga, Elena Kochkina, Arjun Gopalan, Da-Cheng Juan, Jordan Boyd-Graber, Chen Sun, Cong Yu, Tania Bedrax-Weiss, Cordelia Schmid, Chris Bregler and Rahul Sukthankar.
