Bioschemas Tutorial at SWAT4HCLS Leiden

Dates: 10 January 2022 - 13 January 2022

Venue: Virtual and Fletcher Wellness-Hotel, Leiden, The Netherlands

Contacts:

Meeting page: https://www.swat4ls.org/workshops/leiden2022/

During the main conference, a paper will be presented on the construction of the Intrinsically Disordered Protein Knowledge Graph (IDP-KG), which is generated using Bioschemas markup.

Bioschemas – Deploying and Harvesting Markup

Bioschemas makes life sciences resources more discoverable by embedding machine-readable markup within web pages. The markup uses the Schema.org vocabulary, which has been extended to include life-sciences-specific types such as Gene, MolecularEntity, and Taxon. The vocabulary allows a high-level overview of the content of each page, e.g. basic information about a Gene, Protein, or Drug, to be provided in an interoperable, machine-processable form.
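As a rough illustration of what embedded markup can look like, the sketch below builds a small JSON-LD description of a Gene and wraps it in a script element of the kind commonly placed in a page. It is a minimal sketch, not a Bioschemas profile: the function names, the identifier, and the example.org URL are hypothetical placeholders, and only generic Schema.org properties (identifier, name, url) are used.

```python
import json


def build_gene_markup(identifier, name, url):
    """Return a JSON-LD dict describing a gene with generic Schema.org terms.

    Hypothetical helper for illustration; a real deployment would follow
    the properties required by the relevant Bioschemas profile.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Gene",
        "@id": url,
        "identifier": identifier,
        "name": name,
        "url": url,
    }


def embed_in_script_tag(markup):
    """Serialise the markup and wrap it in a JSON-LD script element."""
    payload = json.dumps(markup, indent=2)
    # Escape "</" so the payload cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Placeholder values, not taken from any real resource.
example = build_gene_markup(
    "example:0001", "Example gene", "https://example.org/gene/0001"
)
snippet = embed_in_script_tag(example)
```

The resulting snippet would typically be placed in the head or body of the page that describes the gene, so that each page carries its own summary.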

Embedded markup can be harvested by search engines and other applications without needing to understand a separate API for each resource. The extracted markup can be integrated and used to power specialised search portals (e.g. TeSS), fed into global knowledge graphs (e.g. OpenAIRE), or used to form domain-specific knowledge graphs (e.g. IDP-KG).
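To make the harvesting step concrete, the following sketch pulls JSON-LD blocks out of an HTML page using only the Python standard library. It assumes the markup is embedded as JSON-LD in script elements; it does not fetch pages, and it ignores other embedding syntaxes such as RDFa or Microdata. The class and function names are hypothetical, and this is not the harvester used by any of the portals mentioned above.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed markup rather than fail the harvest
            # A block may hold one object or a list of objects.
            if isinstance(parsed, list):
                self.items.extend(parsed)
            else:
                self.items.append(parsed)


def extract_jsonld(html):
    """Return all JSON-LD objects found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Illustrative page content, not taken from a real site.
sample_page = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Taxon", "name": "Example taxon"}'
    "</script></head><body><script>var x = 1;</script></body></html>"
)
harvested = extract_jsonld(sample_page)
```

The harvested objects could then be filtered by their "@type" and merged across sites, which is where questions of provenance and data quality arise.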

In this tutorial you will be given an overview of Bioschemas, covering the types that have been included in Schema.org, the usage profiles that have been agreed for these types, and the new types and profiles that the community are working on. We will then cover how to deploy markup within a web page so that the page and the whole site become more discoverable on the Web. Finally, we will discuss how to harvest data from websites and what needs to be considered when reusing that data for search portals or knowledge graph construction.

Attendees: