What is Bioschemas?
Bioschemas aims to improve the Findability of data in the life sciences. It does this by encouraging people in the life sciences to use Schema.org markup in their websites so that the sites can be indexed by search engines and other services. Bioschemas also encourages consistent use of that markup, so that it can be consumed in the same way across many sites. This structured information then makes it easier to discover, collate, and analyse distributed data.
Bioschemas is making two main contributions:
- Proposing new types and properties to Schema.org to allow for the description of life science resources.
- Defining profiles over Schema.org types that identify the essential properties to use when describing a resource.
Schema.org is a community effort supported by the main search engines, and is already widely implemented across the web.
Schema.org provides a way to add semantic markup to web pages. It describes ‘types’ of information, which then have ‘properties’. The types are things that we can talk about and the properties are the things that we can say about the type. For example, Event is a type that has properties like startDate, endDate, and description.
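As a rough illustration of the type and property idea, the sketch below builds Schema.org markup for an Event as JSON-LD, one of the syntaxes commonly used to embed Schema.org markup in a web page. The event values and the helper name `to_jsonld_script` are made up for this example; only the type name Event and the properties startDate, endDate, and description come from the text above.

```python
import json

# Event is a Schema.org type; startDate, endDate, and description are
# properties of that type. All values below are invented sample data.
event = {
    "@context": "https://schema.org",
    "@type": "Event",
    "name": "Example life science workshop",
    "description": "A hypothetical two-day workshop used only as sample data.",
    "startDate": "2030-01-15",
    "endDate": "2030-01-16",
}


def to_jsonld_script(markup: dict) -> str:
    """Wrap a markup dictionary in the script tag used to embed JSON-LD in HTML.

    The function name is hypothetical; it is not part of any Bioschemas tool.
    """
    body = json.dumps(markup, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_jsonld_script(event))
```

The printed script element could be placed in the HTML of the page that describes the event, where search engines and other services can read it.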
If types or properties needed in the life sciences are missing, Bioschemas develops proposals for new types and properties to be included in Schema.org.
To simplify the marking up of web resources, and to provide consistency of markup within the life sciences community, Bioschemas defines profiles over types that state which properties must be used (minimum), should be used (recommended), and could be used (optional). The profiles also state the cardinality of each property, and identify domain ontologies to use for property values.
For example, the schema.org/Dataset type has over 90 properties available to use. The Bioschemas profile over Dataset brings this down to a more manageable number, with 5 mandatory properties and 8 recommended properties. Many of the other properties have little relevance for a Dataset. Providing the properties that Bioschemas specifies as mandatory will also make a dataset findable by Google's Dataset Search tool.
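To show how a profile might be applied in practice, here is a minimal sketch that checks a piece of Dataset markup against a profile with minimum and recommended properties and a simple cardinality rule. The `Profile` class, the `check_markup` function, and the property lists in `example_profile` are all assumptions made for illustration. They are not the actual Bioschemas Dataset profile, which should be taken from the published profile itself.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """A simplified, hypothetical stand-in for a profile over a Schema.org type."""
    schema_type: str
    minimum: set = field(default_factory=set)        # properties that must be used
    recommended: set = field(default_factory=set)    # properties that should be used
    single_valued: set = field(default_factory=set)  # properties limited to one value


def check_markup(markup: dict, profile: Profile) -> dict:
    """Report which profile requirements a markup dictionary does not meet."""
    # JSON-LD keywords such as @context and @type are not properties.
    present = {key for key in markup if not key.startswith("@")}
    too_many = sorted(
        prop for prop in profile.single_valued
        if isinstance(markup.get(prop), list) and len(markup[prop]) > 1
    )
    return {
        "type_matches": markup.get("@type") == profile.schema_type,
        "missing_minimum": sorted(profile.minimum - present),
        "missing_recommended": sorted(profile.recommended - present),
        "cardinality_violations": too_many,
    }


# Placeholder property lists: NOT the real Bioschemas Dataset profile.
example_profile = Profile(
    schema_type="Dataset",
    minimum={"name", "description", "identifier"},
    recommended={"license", "url"},
    single_valued={"name"},
)

# Invented sample markup for a dataset.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example dataset",
    "description": "Sample data for illustration only.",
    "url": "https://example.org/dataset/1",
}

report = check_markup(dataset, example_profile)
print(report)
```

In this sketch the report lists `identifier` as a missing minimum property and `license` as a missing recommended one. A real check would also need to handle the ontology terms that a profile identifies for property values, which this example leaves out.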
The Bioschemas community are defining profiles over relevant existing Schema.org types, e.g. DataCatalog, Course, and SoftwareApplication, and over the new types being defined for the life sciences, e.g. Gene, Protein, Taxon.
The Bioschemas community has received funding through the ELIXIR-EXCELERATE grant and ELIXIR Implementation Studies. Full details of funding can be found on our funding page.