What is Bioschemas?

Bioschemas aims to improve the Findability on the Web of life sciences resources such as datasets, software, and training materials. It does this by encouraging people in the life sciences to use Schema.org markup in their websites so that they are indexable by search engines and other services. Bioschemas encourages the consistent use of markup to ease the consumption of the contained markup across many sites. This structured information then makes it easier to discover, collate, and analyse distributed resources.

Bioschemas is making two main contributions:

  1. Proposing new types and properties to Schema.org to allow for the description of life science resources.
  2. Defining usage profiles over the Schema.org types that identify the essential properties to use in describing a resource.

Endorsement of Bioschemas

Including Bioschemas markup within a web resource is a simple first step to making your data Findable (see the FAIR Principles). In particular, search engines index markup from webpages to populate their registries, e.g. Google Dataset Search.

Use of Bioschemas to make resources more discoverable has been endorsed by the European Research Council in their Open Research Data and Data Management Plans policy ('metadata' section, page 11). Including Bioschemas markup in a resource's metadata means that you meet some of the Findability criteria of the FAIR Data Principles.

Bioschemas is a flagship policy of ELIXIR (the European life-sciences Infrastructure for biological Information), and a key component of their 2024-2028 Scientific Programme.

The use of Bioschemas markup is also recommended by the International Society for Biocuration in order to help make resources more discoverable.

Bioschemas Community

Bioschemas started as a community effort in November 2015. It operates as an open community initiative with representatives from a wide variety of institutions. You are welcome to join the community.

For details about related community efforts, please see our related communities page.

Schema.org

Schema.org is a community effort supported by the main search engines, and is already widely implemented across the web.

Schema.org provides a way to add semantic markup to web pages. It describes ‘types’ of information, which then have ‘properties’. The types are things that we can talk about and the properties are the things that we can say about the type. For example, Event is a type that has properties like startDate, endDate, and description.
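To make the type and property idea concrete, here is a minimal sketch in Python that builds a description of an Event as JSON-LD, one of the common syntaxes for Schema.org markup, and wraps it in the script element that is typically embedded in a web page. The helper names (event_markup, as_script_tag) and all example values are assumptions made for illustration; they are not part of Schema.org or Bioschemas.

```python
import json


def event_markup(name, start_date, end_date, description):
    """Describe one Event using the Schema.org type and a few of its properties."""
    return {
        "@context": "https://schema.org",  # vocabulary the property names come from
        "@type": "Event",                  # the Schema.org type being described
        "name": name,
        "startDate": start_date,           # ISO 8601 date string
        "endDate": end_date,
        "description": description,
    }


def as_script_tag(markup):
    """Serialise the markup as a JSON-LD script element for embedding in HTML."""
    body = json.dumps(markup, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical event, used only to show the shape of the markup.
example = event_markup(
    name="Example training workshop",
    start_date="2024-11-18",
    end_date="2024-11-19",
    description="A made-up two-day workshop used for illustration.",
)

print(as_script_tag(example))
```

The printed element would be placed in the page's HTML, where a crawler that understands Schema.org can read the type and its properties.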

Where types or properties needed in the life sciences are missing, Bioschemas develops proposals for new types and properties to be included in Schema.org.

Bioschemas Profiles

To simplify the marking up of web resources, and to provide consistency of markup within the life sciences community, Bioschemas are defining profiles over types that state which properties must be used (minimum), should be used (recommended), and could be used (optional). The profiles also state the cardinality of usage of a property, and identify domain ontologies to use for the value of properties.

For example, the schema.org/Dataset type has over 100 properties available to use. The Bioschemas profile over Dataset brings this down to a more manageable number, with 5 mandatory properties and 8 recommended properties; many of the other properties have little relevance for a Dataset. Datasets marked up with the properties that Bioschemas specifies as mandatory will also be findable by Google's Dataset Search tool.
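The way a profile partitions properties into minimum, recommended, and optional, and constrains cardinality, can be sketched as a small checker. This is only an illustration under stated assumptions: the Profile class, the check_markup function, and the toy profile below are hypothetical, and the toy profile is deliberately not the real Bioschemas Dataset profile. Checking property values against domain ontologies is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical in-memory form of a usage profile over one Schema.org type."""
    schema_type: str
    minimum: frozenset        # properties that must be used
    recommended: frozenset    # properties that should be used
    optional: frozenset = frozenset()       # properties that could be used
    single_valued: frozenset = frozenset()  # properties limited to one value


def check_markup(markup, profile):
    """Compare a JSON-LD style dictionary against a profile and report gaps."""
    present = {key for key in markup if not key.startswith("@")}
    known = profile.minimum | profile.recommended | profile.optional
    too_many = [
        prop for prop in profile.single_valued & present
        if isinstance(markup[prop], list) and len(markup[prop]) > 1
    ]
    return {
        "type_matches": markup.get("@type") == profile.schema_type,
        "missing_minimum": sorted(profile.minimum - present),
        "missing_recommended": sorted(profile.recommended - present),
        "cardinality_errors": sorted(too_many),
        "outside_profile": sorted(present - known),
    }


# Toy profile for illustration only; NOT the actual Bioschemas Dataset profile.
TOY_PROFILE = Profile(
    schema_type="Dataset",
    minimum=frozenset({"name", "description", "identifier"}),
    recommended=frozenset({"license", "keywords"}),
    optional=frozenset({"version"}),
    single_valued=frozenset({"name"}),
)

# Made-up markup that omits some properties on purpose.
markup = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example dataset",
    "description": "A made-up dataset used for illustration.",
    "keywords": ["example", "illustration"],
    "creator": "Example Lab",
}

report = check_markup(markup, TOY_PROFILE)
for field_name, value in report.items():
    print(f"{field_name}: {value}")
```

In this sketch a missing minimum property would be treated as an error and a missing recommended property as a warning; properties outside the profile are reported rather than rejected, since they remain valid Schema.org markup.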

The Bioschemas community are defining profiles over relevant existing Schema.org types, e.g. DataCatalog, Course, and SoftwareApplication, and over the new types being defined for the life sciences, e.g. Gene, Protein, and Taxon.

Funding

The Bioschemas Community have received funding through the ELIXIR-EXCELERATE grant and ELIXIR Implementation Studies. Full details of funding can be found on our funding page.

News

Read more news on our news page.

Upcoming Meetings

Monthly Call

The Bioschemas Community/Technical Call takes place on the 3rd Monday of each month at 16:00 UTC (see it in your own time zone). We alternate between community and technical discussions. Call-in details are available in the agenda.

  • Next call: 18 November, 2024 16:00 UTC
  • Agenda - Community call
See more meetings.

Latest Presentation

Click here to watch a webinar about Bioschemas.

See more presentations and publications.