Hooray! We are hosting the Munich Datageek Meetup on October 26.
Our data scientist Jan will present a use case and our lessons learned on Retrieval-Augmented Generation (RAG). We are looking forward to an interesting evening and to meeting the Munich Datageeks!
Jan's talk in detail:
Talk 1: Jan Hauffa - A Case Study on Retrieval-Augmented Generation for Document Q&A: Experiences and Future Perspectives
Abstract: Neural language models based on the Transformer architecture have been successfully applied to a wide range of Natural Language Generation tasks, but are held back by their limited context length, that is, their inability to simultaneously “pay attention” to all parts of a long document. Retrieval-Augmented Generation (RAG) is currently the most promising approach to overcome this limitation. By means of semantic similarity search, one can identify the parts of a document that are most relevant to the task at hand, and use only those parts as input to the language model. In this talk, I demonstrate how RAG can be used to build a system for answering arbitrary questions, posed in natural language, about the content of documents (“document Q&A”). I discuss the challenges we faced when implementing document Q&A at NorCom, how to improve the performance of a document Q&A system, and how to reliably measure the performance in the first place.
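The retrieval step described in the abstract can be sketched roughly as follows. This is a minimal illustration under loud assumptions, not the system presented in the talk: plain bag-of-words counts stand in for the neural embeddings a real semantic similarity search would use, the function names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the sample chunks are hypothetical, and no language model is actually called.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector.

    Assumption: a real system would use a neural embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question, best first."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the language model input from the retrieved chunks only."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical document, already split into chunks.
chunks = [
    "The warranty covers battery defects for two years.",
    "Shipping takes five business days within Germany.",
    "To reset the device, hold the power button for ten seconds.",
]
question = "How long is the battery warranty?"

top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Only the retrieved chunk ends up in the prompt, so the input stays short no matter how long the full document is. The quality of the final answer then depends heavily on whether retrieval found the right parts, which is one reason measuring performance is a topic of its own.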