Answering natural language questions using knowledge graphs with a human-like, modular approach to question interpretation and answer generation
Aleksandr Perevalov and Prof. Dr. Andreas Both from the Web & Software Engineering (WSE) research group in the Computer Science department of the Faculty of Computer Science and Media at HTWK Leipzig scored a notable success at the TEXT2SPARQL Challenge, held at the beginning of June at the renowned Extended Semantic Web Conference (ESWC 2025) in Slovenia.
With their system mKGQAgent and the associated publication "Text-to-SPARQL Goes Beyond English: Multilingual Question Answering Over Knowledge Graphs through Human-Inspired Reasoning", the team took first place in the "Overall Performance" and "DBpedia - Spanish Language" categories.
The TEXT2SPARQL Challenge is dedicated to the research field of Knowledge Graph Question Answering (KGQA). The task is to automatically convert users' natural-language factual questions into formal SPARQL queries, which can then be run against structured databases known as knowledge graphs. The particular difficulty is that, in contrast to traditional search engines, the system must return exactly one correct answer rather than a list of possible results.
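To make the task concrete, the following minimal Python sketch pairs a factual question with a SPARQL query that could answer it on DBpedia. It is an illustration only and is not taken from the challenge benchmark or from mKGQAgent: the questions, the class `Text2SparqlExample` and the helper `capital_query` are assumptions made for this example. Only the `dbo:` and `dbr:` namespaces are the standard DBpedia prefixes.

```python
from dataclasses import dataclass

# Standard DBpedia namespaces for ontology properties and resources.
DBPEDIA_PREFIXES = (
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
    "PREFIX dbr: <http://dbpedia.org/resource/>\n"
)


@dataclass(frozen=True)
class Text2SparqlExample:
    """Hypothetical container pairing a question with its target query."""
    question: str
    language: str
    sparql: str


def capital_query(country: str) -> str:
    """Build a query asking for the capital of a DBpedia country resource."""
    return (
        DBPEDIA_PREFIXES
        + f"SELECT ?capital WHERE {{ dbr:{country} dbo:capital ?capital }}"
    )


# The same fact asked in two languages should map to the same formal query.
EXAMPLES = [
    Text2SparqlExample(
        "What is the capital of Slovenia?", "en", capital_query("Slovenia")
    ),
    Text2SparqlExample(
        "¿Cuál es la capital de Eslovenia?", "es", capital_query("Slovenia")
    ),
]

if __name__ == "__main__":
    for example in EXAMPLES:
        print(f"[{example.language}] {example.question}")
        print(example.sparql)
```

The sketch only builds query strings and does not contact a SPARQL endpoint. It shows why multilingual KGQA is demanding: differently worded questions in different languages have to resolve to the same entity and relation in the knowledge graph.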
Numerous international teams took part in the competition. The jury did not assess theoretical concepts, but concrete, executable implementations. All submitted systems were automatically evaluated against a benchmark that had been kept secret until then, so that every system was tested under realistic and comparable conditions. When the results were compared, HTWK Leipzig's approach came out ahead both across the board ("Overall Performance") and, in particular, on Spanish questions over the DBpedia knowledge base ("DBpedia - Spanish Language").
At the centre of the successful approach is "mKGQAgent", a novel AI agent for answering natural language questions using knowledge graphs. It follows a human-like, modular approach to question interpretation and answer generation: instead of relying solely on purely neural or rule-based methods, mKGQAgent breaks down the complex task of SPARQL generation into comprehensible, interpretable sub-steps, including planning, entity linking and query refinement. The system coordinates large language model (LLM) components with an experience pool of previously solved examples, which allows efficient in-context adaptation without the underlying model having to be retrained repeatedly.
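The modular idea described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the mKGQAgent implementation: every name here (`ExperiencePool`, `plan`, `link_entities`, `refine_query`, `answer_question`) is hypothetical, the "LLM" is any callable that maps a prompt string to a draft query, entity linking is reduced to a dictionary lookup, and refinement is reduced to adding missing prefix declarations.

```python
from dataclasses import dataclass, field
from typing import Callable

KNOWN_PREFIXES = {
    "dbo": "http://dbpedia.org/ontology/",
    "dbr": "http://dbpedia.org/resource/",
}


@dataclass
class ExperiencePool:
    """Hypothetical store of solved (question, query) pairs for in-context use."""
    items: list = field(default_factory=list)

    def add(self, question: str, sparql: str) -> None:
        self.items.append((question, sparql))

    def most_similar(self, question: str, k: int = 1) -> list:
        # Rank stored examples by simple word overlap with the new question.
        words = set(question.lower().split())
        ranked = sorted(
            self.items,
            key=lambda pair: len(words & set(pair[0].lower().split())),
            reverse=True,
        )
        return ranked[:k]


def plan(question: str) -> list:
    """Return a fixed step plan; a real agent would derive it per question."""
    return ["link entities", "draft query", "refine query"]


def link_entities(question: str, lexicon: dict) -> dict:
    """Toy entity linking: match known labels inside the question text."""
    text = question.lower()
    return {label: iri for label, iri in lexicon.items() if label in text}


def refine_query(draft: str) -> str:
    """Toy refinement: declare any known prefix the draft uses but omits."""
    missing = [
        f"PREFIX {name}: <{iri}>"
        for name, iri in KNOWN_PREFIXES.items()
        if f"{name}:" in draft and f"PREFIX {name}:" not in draft
    ]
    return "\n".join(missing + [draft])


def answer_question(
    question: str,
    lexicon: dict,
    pool: ExperiencePool,
    llm: Callable[[str], str],
) -> str:
    """Run the sub-steps in order and return a refined SPARQL query string."""
    steps = plan(question)
    entities = link_entities(question, lexicon)
    examples = pool.most_similar(question)
    prompt = "\n".join(
        [f"Plan: {', '.join(steps)}"]
        + [f"Example: {q} => {s}" for q, s in examples]
        + [f"Entities: {entities}", f"Question: {question}"]
    )
    return refine_query(llm(prompt))
```

Because the pool is consulted at prompt time, adding a new solved example changes later behaviour without any retraining, which is the in-context adaptation the text refers to. Each sub-step can also be inspected or replaced on its own, which is what makes such a pipeline interpretable.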
The strong performance in the multilingual part of the competition is particularly noteworthy. While many approaches in AI language processing focus primarily on English, the WSE research group has deliberately pursued a language-agnostic approach for years, one that also offers suitable solutions for less widely used languages. This strategic orientation has now been impressively confirmed by the success in an international competition. "The award is not only a confirmation of our many years of research, but also proof that future-proof AI systems can be flexible, transparent, linguistically inclusive and inexpensive," says Prof. Dr. Andreas Both.
Aleksandr Perevalov, a doctoral student in the research group, will complete his doctorate on the topic of "Multilingual Question Answering over Knowledge Graphs" in summer 2025. The success of the mKGQAgent approach builds largely on the results of this doctoral research; it is an example of excellent early career research at HTWK Leipzig and a confirmation of his personal achievements. The work is embedded in the WSE research group at HTWK Leipzig, which focuses on modern software development methods, web technologies, data processing and AI methods. The group's aim is to develop innovative and practice-relevant solutions at the interface of software technology and intelligent information processing.
The research group's results provide important impetus for the further development of multilingual question answering systems and demonstrate the potential of human-like reasoning processes in semantic information processing.
