BioHackEU25 report: Mining the potential of knowledge graphs for metadata on training
Published: 09 December 2025
Tags: BHEU2025 Preprint BioHackrXiv
We are pleased to announce another BioHackrXiv preprint from 2025. It reports on the work carried out at BioHackathon Europe 2025 by the project group "Mining the potential of knowledge graphs for metadata on training".
Abstract:
Training metadata in the life-science community is increasingly standardized through Bioschemas, yet remains fragmented and under-utilized. In this work we harvested training records from ELIXIR’s TeSS platform and the Galaxy Training Network, converting them into a unified knowledge graph. A dedicated pipeline parses RDF/Turtle dumps, deduplicates entries, and builds rich indexes (keyword, provider, location, date, topic) that power a Model Context Protocol (MCP) server. The MCP server offers live and offline search tools (keyword, provider, location, date, topic, and SPARQL queries), enabling natural-language access to training resources via LLM-driven clients. User-story driven evaluations demonstrate the system’s ability to generate custom learning paths, assemble trainer profiles, and link training data to external repositories. Findings highlight gaps in persistent identifiers (ORCID, ROR) and location granularity, informing recommendations for metadata providers. The project showcases how knowledge-graph-backed metadata can enhance discoverability, interoperability, and AI-assisted exploration of scientific training materials.
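To make the deduplicate-and-index step of the abstract more concrete, here is a minimal Python sketch. It is not the project's code: the record fields (`url`, `name`, `keywords`, `provider`), the sample records, and the helper names `normalize_url`, `deduplicate`, `build_index`, and `search` are all assumptions made for illustration, and the records are plain dictionaries rather than parsed RDF/Turtle.

```python
from collections import defaultdict


def normalize_url(url):
    """Crude URL normalization (an assumption): trim, drop trailing slash, lowercase."""
    return url.strip().rstrip("/").lower()


def deduplicate(records):
    """Keep the first record seen for each normalized URL."""
    seen = {}
    for rec in records:
        key = normalize_url(rec["url"])
        if key not in seen:
            seen[key] = rec
    return list(seen.values())


def build_index(records, field):
    """Build an inverted index: lowercased field value -> positions in `records`."""
    index = defaultdict(set)
    for pos, rec in enumerate(records):
        values = rec.get(field, [])
        if isinstance(values, str):
            values = [values]  # treat a single string as a one-element list
        for value in values:
            index[value.strip().lower()].add(pos)
    return index


def search(index, records, term):
    """Return the records whose indexed field contains `term` (case-insensitive)."""
    return [records[i] for i in sorted(index.get(term.strip().lower(), ()))]


# Hypothetical sample records; real entries would come from the harvested dumps.
sample = [
    {"url": "https://example.org/t/1", "name": "Intro to RNA-seq",
     "keywords": ["RNA-seq", "Transcriptomics"], "provider": "Provider A"},
    {"url": "https://example.org/t/1/", "name": "Intro to RNA-seq (copy)",
     "keywords": ["RNA-seq"], "provider": "Provider A"},
    {"url": "https://example.org/t/2", "name": "Variant calling basics",
     "keywords": ["Genomics"], "provider": "Provider B"},
]

unique = deduplicate(sample)
keyword_index = build_index(unique, "keywords")
provider_index = build_index(unique, "provider")
hits = search(keyword_index, unique, "rna-seq")
```

One index per field (keyword, provider, and so on) keeps each lookup a simple dictionary access, which is the kind of structure a search tool could expose. Lowercasing whole URLs is a simplification, since URL paths can be case-sensitive.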
Citation and link:
D. Panouris, H. Gupta, V. Emonet, J. Miranda, J. Bolleman, P. Reed, F. Bacall, G. van Geest, Mining the potential of knowledge graphs for metadata on training, (2025). doi:10.37044/osf.io/gv2ac_v1.