We are delighted to share a very special success story from our team today: our colleague Ernesto Curmen has been honored for his outstanding master’s thesis. In the contribution below, he offers a personal insight into the fascinating topics he explored and tells us what inspired him most along the way. Enjoy reading!
Retrieval Augmented Generation, also known as RAG, has permanently changed the landscape of document search. The basic idea is quite straightforward: documents are divided into smaller sections (so-called chunks). These chunks are converted into embeddings (vector representations that capture the semantic meaning of the text) and stored in a database. When a user later asks a question, an embedding of the query is created in the same way and compared against the stored chunk embeddings. The most semantically similar chunks are retrieved and then used to generate the final answer.
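To make the pipeline concrete, here is a minimal sketch in Python. It assumes the sentence-transformers library and a toy in-memory vector store; the model name, the chunking strategy, and the example data are illustrative and not taken from the thesis:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# A small, publicly available embedding model (the name is illustrative).
model = SentenceTransformer("all-MiniLM-L6-v2")

def chunk(text: str, size: int = 300) -> list[str]:
    # Naive fixed-size chunking; real systems usually split on sentence
    # or paragraph boundaries instead.
    return [text[i:i + size] for i in range(0, len(text), size)]

# Indexing: embed every chunk once and keep the vectors in memory.
document = "... full text of a document ..."
chunks = chunk(document)
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, k: int = 3) -> list[str]:
    # Embed the query the same way, then rank chunks by cosine
    # similarity (a dot product, since all vectors are L2-normalized).
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]
```

In a production system the vectors would live in a dedicated vector database rather than a NumPy array, but the retrieval logic stays the same.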
However, RAG comes with a major challenge: traditional implementations ignore the broader context of each chunk. As a result, the system often fails to find relevant information in the knowledge base. Imagine, for example, that a user asks the system: “What decisions were made regarding the next steps at Max Mustermann School?” The user wants to know about deadlines, upcoming actions, and agreements related to the construction of the school. But the chunk that actually contains the answer might never mention “Max Mustermann School” explicitly, so it is not retrieved and the answer cannot be generated.

In my master’s thesis, I addressed this problem by implementing the “Contextual-RAG” approach on the plenary protocol corpus of the DisLab research group at TU Darmstadt. Contextualization happens after the text is split into chunks but before the chunks are converted into embeddings. Each chunk is enriched with relevant context, drawn either from preceding or following passages or from the main idea of the entire document. In this way, each chunk gains semantic hints that were not explicitly present before but that can now match a user’s query. The contextualization was performed with publicly available, locally run language models. The results show that Contextual-RAG improves recall by up to 18% compared to standard RAG when searching political corpora; in other words, more relevant matches appear among the top-k results.
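As a rough sketch, the enrichment step could look like the following. Here generate_with_local_llm is a hypothetical placeholder for whatever locally hosted model is used (for example one served via Ollama or llama.cpp), and the prompt wording is illustrative rather than the exact one from the thesis:

```python
def contextualize(chunk_text: str, full_document: str) -> str:
    """Prepend an LLM-generated context to a chunk before embedding it."""
    prompt = (
        f"<document>\n{full_document}\n</document>\n\n"
        "Here is a chunk from the document above:\n"
        f"<chunk>\n{chunk_text}\n</chunk>\n\n"
        "In one or two sentences, situate this chunk within the overall "
        "document so that it can be retrieved more easily. "
        "Answer with the context only."
    )
    # generate_with_local_llm is a placeholder for a locally hosted
    # model; it is not a real library call.
    context = generate_with_local_llm(prompt)
    # The embedding is later computed over context + chunk, so the
    # stored vector carries semantic hints the bare chunk lacked.
    return f"{context}\n\n{chunk_text}"

# Continuing the earlier sketch: enrich each chunk, then embed as before.
# enriched = [contextualize(c, document) for c in chunk(document)]
# enriched_vecs = model.encode(enriched, normalize_embeddings=True)
```

Passing the full document into the prompt is just one variant; depending on the model’s context window, the surrounding passages or a document summary can be used instead.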

The insights gained and the methods developed in this work will flow directly into the business software Inno:docs. Inno:docs is an AI-powered chatbot that makes company data searchable in natural language and provides fact-based answers. Thanks to the Contextual-RAG approach, the system finds more relevant documents, because each chunk carries the broader context of its document instead of sitting in the database in isolation.
Picture: Innomatik AG
