Topical classification of Wikipedia images - Master Thesis

About this project

Wikipedia is full of articles… and images! With over 53 million articles in 299 languages containing 11.5 million unique images, there is a great need for automated organization of all this data. Inspired by ORES, an ensemble of machine learning systems in Wikipedia that provides, among other things, automated labeling of articles, the semester project aimed at automated topic labeling of images in Wikipedia. In this project, experiments were run on images labeled in two ways: with the ORES labels of the articles in which they appear, and with custom labels assigned by a heuristic in the taxonomy part of the semester project. Two models (EfficientNetB0 and EfficientNetB2) were trained on this data using 10 or 20 labels. The main insights were:

The master thesis that I am currently working on focuses on:

Structure of the designed network.

Class distribution.