Draw Me Like My Triples: Leveraging Generative AI for Wikidata Image Completion (Poster)
Abstract
We leverage generative AI to create images for Wikidata items that lack them. Our approach takes the knowledge contained in the Wikidata triples of items describing fictional characters and uses a T5 model fine-tuned on the WDV dataset to generate natural-language descriptions of those fictional-character items whose images are missing. We then use these descriptions as prompts for a transformer-based text-to-image model, Stable Diffusion (SD) v2.1, to generate plausible candidate images for Wikidata image completion. We motivate this choice by the fact that querying Wikidata shows that only 7% of the 83.7K instances of the fictional character class have an image.
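The coverage figure quoted above could be obtained with a query along the following lines. This is only a minimal sketch, not the query used by the authors: it assumes that the fictional character class is Wikidata item Q95074, that only direct `instance of` (P31) statements are counted, and that "has an image" means having an `image` (P18) statement. The function names are hypothetical, and the query is built but not sent.

```python
# Public Wikidata SPARQL endpoint (shown for reference; this sketch does not call it).
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_coverage_query(class_qid: str = "Q95074") -> str:
    """Build a SPARQL query that counts the instances of a class and
    how many of them have at least one image (P18) statement.

    Assumption: Q95074 is the 'fictional character' class, and only
    direct P31 statements are considered (no subclass traversal).
    """
    return f"""
SELECT (COUNT(?item) AS ?total) (SUM(?hasImage) AS ?with_image)
WHERE {{
  ?item wdt:P31 wd:{class_qid} .
  BIND(IF(EXISTS {{ ?item wdt:P18 ?img }}, 1, 0) AS ?hasImage)
}}
""".strip()


def coverage_percent(with_image: int, total: int) -> float:
    """Return the percentage of items that have an image (0.0 for an empty class)."""
    if total == 0:
        return 0.0
    return 100.0 * with_image / total
```

The string returned by `build_coverage_query()` can be pasted into the Wikidata Query Service, and the two counts it returns can be passed to `coverage_percent` to obtain the share of items with an image.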
Our work addresses the following Research Questions (RQs):
- RQ1: To what extent can different types of prompts based on triples be used in text-to-image models to produce high-quality images?
- RQ2: To what extent can the output of generative AI be used for Wikidata image completion?
- RQ3: How can generative text-to-image models be evaluated?
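To make the notion of "different types of prompts based on triples" in RQ1 concrete, the sketch below builds two prompt variants from the same triples. It is a simplified stand-in: the two styles are hypothetical and are not necessarily those compared in the poster, the plain string templates replace the fine-tuned T5 verbalisation model, and the example triples are chosen purely for illustration.

```python
from typing import List, Tuple

# A triple is (subject label, property label, object label).
Triple = Tuple[str, str, str]


def keyword_prompt(triples: List[Triple]) -> str:
    """Hypothetical prompt style 1: subject followed by the object values only."""
    if not triples:
        raise ValueError("at least one triple is required")
    subject = triples[0][0]
    return ", ".join([subject] + [obj for _, _, obj in triples])


def verbose_prompt(triples: List[Triple]) -> str:
    """Hypothetical prompt style 2: subject followed by 'property object' pairs."""
    if not triples:
        raise ValueError("at least one triple is required")
    subject = triples[0][0]
    return ", ".join([subject] + [f"{prop} {obj}" for _, prop, obj in triples])


# Illustrative triples only; not taken from the poster's data.
example_triples: List[Triple] = [
    ("Sherlock Holmes", "occupation", "private investigator"),
    ("Sherlock Holmes", "creator", "Arthur Conan Doyle"),
]

print(keyword_prompt(example_triples))
# Sherlock Holmes, private investigator, Arthur Conan Doyle
print(verbose_prompt(example_triples))
# Sherlock Holmes, occupation private investigator, creator Arthur Conan Doyle
```

Either string could then be passed as the prompt to a text-to-image model such as Stable Diffusion v2.1, and the resulting images compared across prompt styles.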
Origin: Files produced by the author(s)