An excess of precision is one of the main reasons why AI fails to recognise objects: an AI only recognises exactly what it already knows. Abstraction and variation are not its strengths. This is one of the findings of the master’s thesis ‘Interpreting YOLO: A Study on Explainability in Multi-Class Object Detection Models’ by Gideon Antwi, a working student at innomatik AG. In his thesis, Antwi investigates how object recognition works in artificial intelligence and which procedure lastingly improves the quality of the results.
This Is How AI Works
What happens inside an AI? And how do you shed light on the black box that AI is for most people? Antwi set out to investigate these questions in his master’s thesis and chose image recognition as his topic. For his study, he trained the model (YOLO) on a dataset of around 10,000 images, each showing a different arrangement of 12 objects. The objects came from a set of 81 playing cards with different patterns, which also defined the 81 classes used for recognising and classifying the results.
Understanding AI
In a first step, Antwi arranged the cards in an even grid, four cards across and three down, on a dark background. A light-coloured background followed later, and Antwi also ran a test on a wood-grain tabletop. He also varied the lighting and the arrangement of the cards, which were later placed partly overlapping each other or diagonally.
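The setup described above can be written down as a capture plan. The following is a minimal sketch only: the 81 classes, the 12 cards per image and the 4 x 3 grid come from the article, while the condition labels, the lighting levels, the function names and the idea of combining every background with every lighting level and layout are assumptions made for illustration. The thesis may have combined the conditions differently.

```python
import itertools
import random

NUM_CLASSES = 81          # one class per card in the deck (from the article)
CARDS_PER_IMAGE = 12      # cards visible in each image (from the article)
GRID_COLS, GRID_ROWS = 4, 3

# Hypothetical labels for the conditions the article mentions.
BACKGROUNDS = ["dark", "light", "wood_grain"]
LIGHTING = ["dim", "normal", "bright"]          # assumed levels
LAYOUTS = ["grid", "overlapping", "diagonal"]


def plan_scene(background, lighting, layout, rng):
    """Describe one scene: which card sits in which grid slot, under which conditions."""
    cards = rng.sample(range(NUM_CLASSES), CARDS_PER_IMAGE)  # 12 distinct classes
    slots = [(row, col) for row in range(GRID_ROWS) for col in range(GRID_COLS)]
    return {
        "background": background,
        "lighting": lighting,
        "layout": layout,
        "cards": dict(zip(slots, cards)),
    }


def plan_dataset(scenes_per_condition, seed=0):
    """Plan the same number of scenes for every combination of conditions."""
    rng = random.Random(seed)
    plan = []
    for background, lighting, layout in itertools.product(BACKGROUNDS, LIGHTING, LAYOUTS):
        for _ in range(scenes_per_condition):
            plan.append(plan_scene(background, lighting, layout, rng))
    return plan
```

Calling `plan_dataset(2)` yields two scene descriptions for each of the 27 combinations. A plan like this makes it easy to see which combinations of background, lighting and layout are still missing from the training images.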
Variation Is Key
‘Essentially, the variations offered in the dataset images should be as wide as possible so that, in the best-case scenario, all eventualities are already taken into account during training and the AI works reliably in everyday life,’ explains Antwi. This is because an AI can only recognise exactly what it already knows. This applies not only to the actual object, but also to all accompanying factors. Unlike humans, an AI cannot abstract from basic knowledge but has to learn each piece of knowledge separately.

Thinking Like An AI
‘Basically, you have to try to put yourself in the AI’s shoes and ask yourself: what am I thinking when I recognise the object, and why might I fail?’ says Antwi about his approach, which gradually improved the recognition performance. The key here was to analyse which objects were not recognised and why. Causes included light reflections, low contrast, barely visible boundaries between individual cards and a heavily patterned background that was mistaken for another card. Antwi identified these sources of error and systematically retrained the model on them so that it could overcome them.
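The first half of this analysis, finding out which objects were not recognised, can be sketched in a few lines. This is an illustration under assumptions, not the procedure used in the thesis: boxes are taken to be `(x1, y1, x2, y2)` tuples, an object counts as recognised if a prediction of the same class overlaps it with an intersection over union of at least 0.5, and the `condition` field and all function names are hypothetical.

```python
from collections import Counter


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def missed_objects(ground_truth, predictions, iou_threshold=0.5):
    """Return labelled objects that no same-class prediction overlaps sufficiently.

    Both arguments are lists of (class_id, box); each prediction is matched at most once.
    """
    missed, used = [], set()
    for gt_class, gt_box in ground_truth:
        best_idx, best_iou = None, iou_threshold
        for idx, (pred_class, pred_box) in enumerate(predictions):
            if idx in used or pred_class != gt_class:
                continue
            overlap = iou(gt_box, pred_box)
            if overlap >= best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is None:
            missed.append((gt_class, gt_box))
        else:
            used.add(best_idx)
    return missed


def misses_per_condition(images, iou_threshold=0.5):
    """Count missed objects per capture condition (hypothetical 'condition' label)."""
    counts = Counter()
    for image in images:
        missed = missed_objects(image["ground_truth"], image["predictions"], iou_threshold)
        counts[image["condition"]] += len(missed)
    return counts
```

Grouping the misses by condition shows where failures cluster, for example on a patterned background. Why they cluster there still has to be worked out by inspecting the images themselves.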
Heatmaps As Indicators
To understand exactly what the AI recognised in each run and where there were gaps in its recognition performance, Antwi used a program he developed himself that displayed the AI’s activity in the form of a heat map. This showed that, even in the same training run, the AI processed identical images with very different intensity, for example focusing only on the centre area in one pass and on the entire image in a subsequent one.
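The difference between a heat map that lights up only in the centre and one that covers the whole image can be expressed as a single number. The sketch below is not Antwi's program, whose workings the article does not describe. It assumes a heat map is available as a 2D list of non-negative values and defines the centre, arbitrarily, as what remains after trimming a quarter of the rows and columns from each edge. The function name `centre_share` is hypothetical.

```python
def centre_share(heatmap, margin=0.25):
    """Fraction of the total activation that falls inside the central region.

    heatmap: 2D list of non-negative numbers, all rows the same length.
    margin:  share of rows and columns trimmed from each edge to define the centre.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    top, bottom = int(rows * margin), rows - int(rows * margin)
    left, right = int(cols * margin), cols - int(cols * margin)
    total = sum(sum(row) for row in heatmap)
    if total == 0:
        return 0.0  # nothing activated at all
    centre = sum(sum(row[left:right]) for row in heatmap[top:bottom])
    return centre / total


# Two made-up 4 x 4 heat maps for the same image.
focused = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
spread = [[1, 1, 1, 1] for _ in range(4)]

print(centre_share(focused))  # 1.0
print(centre_share(spread))   # 0.25
```

A value near 1 means the activity is concentrated in the middle of the image. A value close to the centre's share of the area (a quarter in this 4 x 4 example) means the activity is spread evenly across the image.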
AI Learns Step By Step
The study shows that an AI learns step by step and that each cause of error must be identified and understood so that the model can be systematically retrained with matching content. ‘Intensive training is the only way to guarantee reliable detection in practice,’ concludes Antwi from his observations. The time invested in detailed error analysis and targeted training on specific details quickly pays off in productive use through the high quality of the results.

Picture: innomatik AG/Gideon Antwi