Abstract:
In recent years, researchers have focused on developing and training visual question generation
models based on deep neural networks. These models have a wide range of
applications across various domains. However, no specialized work has been conducted
on visual question generation in the Arabic language.
Our work aims to automate the generation of Arabic educational questions
from visual content. We propose a multi-modal Arabic visual question generation model that
integrates two distinct components. The first is a fine-tuned Arabic image captioning
model, obtained by fine-tuning the Google Vision Transformer and the AraBERT transformer
on a newly collected dataset. The second is a fine-tuned Arabic natural question
generation model.
Our proposed multi-modal model has been evaluated using the Transparent Human benchmark
protocol, and the results demonstrate its ability to generate relevant captions: 51%
of the captions received a rating between 2 and 4 on a 5-point scale, indicating their
relevance. Additionally, the model produced relevant questions based on these captions,
achieving an average rating of 3.33 out of 5 in terms of relevance.
Keywords: Visual question generation, Arabic image captioning, Transformers, Vision Transformer, deep learning.