
DataCatalog DRAFT Profile

Version: 0.2-DRAFT-2018_11_13 (30 November 2018)


If you spot any errors or omissions in this type, please file an issue in our GitHub repository.


Contributors

The following people have been involved in the creation of this specification document. They are all members of the Data Repositories group.

Group Leader(s)

Henning Hermjakob

Other team members

Schema.org hierarchy

This Profile fits into the schema.org hierarchy as follows:

Thing > CreativeWork > DataCatalog

Description

A guide to describing data catalogs and repositories in the life sciences using Schema.org-like annotation.

Latest profiles

Latest release: 0.3-RELEASE-2019_07_01

Previous profiles

Previous version: 0.1-DRAFT-2018_04_25



You can read the release version of this specification here.




Key to specification table

Schema.org properties where the Expected Types have been changed, or new (i.e., Bioschemas created) properties/types are green.

Schema.org properties/types are red.

Pending Schema.org properties/types are blue.

External (i.e., from 3rd party ontology) properties/types are black.


CD = Cardinality


Property Expected Type Description CD Controlled Vocabulary Example
Marginality: Minimum.
description Text
Schema:

A description of the item.


Bioschemas:

A short summary describing a dataset.

ONE
identifier PropertyValue
Text
URL
Schema:

The identifier property represents any kind of identifier for any kind of Thing, such as ISBNs, GTIN codes, UUIDs etc. Schema.org provides dedicated properties for representing many of these, either as textual strings or as URL (URI) links. See background notes for more details.


Bioschemas:

Identifier of the DataCatalog in CURIE form, e.g. prefix:accession.

MANY
keywords Text
Schema:

Keywords or tags used to describe this content. Multiple entries in a keywords list are typically delimited by commas.


Bioschemas:

These keywords provide a summary of the dataset.

MANY
name Text
Schema:

The name of the item.


Bioschemas:

A descriptive name of the dataset.

ONE
url URL
Schema:

URL of the item.


Bioschemas:

The location of a page describing the dataset.

ONE
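As an illustration of the five Minimum properties listed above, the following Python sketch builds a JSON-LD record for a DataCatalog and serializes it. It is only a sketch under stated assumptions: the catalog name, the CURIE prefix `examplecat`, the URL, and the helper `to_jsonld` are all hypothetical placeholders and are not taken from this profile. Representing the MANY-cardinality properties as JSON arrays is one reasonable choice, not something the profile mandates.

```python
import json

# Hypothetical catalog record; every value is a placeholder for illustration.
catalog = {
    "@context": "https://schema.org",
    "@type": "DataCatalog",
    # Minimum properties from the specification table:
    "description": "A placeholder summary of an example life-science catalog.",
    "identifier": ["examplecat:0001"],  # CURIE form (prefix:accession), cardinality MANY
    "keywords": ["proteomics", "mass spectrometry"],  # cardinality MANY
    "name": "Example Catalog",
    "url": "https://example.org/catalog",
}


def to_jsonld(record: dict) -> str:
    """Serialize a record to a JSON-LD string suitable for embedding in a page."""
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(to_jsonld(catalog))
```

The resulting string could be embedded in a page inside a `script` element of type `application/ld+json`, which is the usual way Schema.org JSON-LD markup is published.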
Marginality: Recommended.
about Thing
Schema:

The subject matter of the content. Inverse property: subjectOf.


MANY
citation CreativeWork
Text
Schema:

A citation or reference to another creative work, such as another publication, web page, scholarly article, etc.


Bioschemas:

A citation for a publication that describes the dataset.

MANY
creator Organization
Person
Schema:

The creator/author of this CreativeWork. This is the same as the Author property for CreativeWork.


Bioschemas:

The name of the dataset creator (person or organization).

MANY
dataset Dataset
Schema:

A dataset contained in this catalog. Inverse property: includedInDataCatalog.


ONE
license CreativeWork
URL
Schema:

A license document that applies to this content, typically indicated by URL.


Bioschemas:

A license under which the dataset is distributed.

ONE
measurementTechnique Text
URL
Schema:

A technique or technology used in a Dataset (or DataDownload, DataCatalog), corresponding to the method used for measuring the corresponding variable(s) (described using variableMeasured). This is oriented towards scientific and scholarly dataset publication but may have broader applicability; it is not intended as a full representation of measurement, but rather as a high level summary for dataset discovery. For example, if variableMeasured is: molecule concentration, measurementTechnique could be: “mass spectrometry” or “nmr spectroscopy” or “colorimetry” or “immunofluorescence”. If the variableMeasured is “depression rating”, the measurementTechnique could be “Zung Scale” or “HAM-D” or “Beck Depression Inventory”. If there are several variableMeasured properties recorded for some given data object, use a PropertyValue for each variableMeasured and attach the corresponding measurementTechnique.


MANY
version Number
Text
Schema:

The version of the CreativeWork embodied by a specified resource.


Bioschemas:

The version number for this dataset.

ONE
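The marginality and cardinality columns of the table above lend themselves to a simple automated check. The sketch below is one possible reading, not an official Bioschemas validator: it assumes that every Minimum property must be present, and that a property with cardinality ONE should not carry more than one value. The function name `check_catalog` and the dictionary layout are hypothetical.

```python
# Property -> cardinality (CD), transcribed from the specification table.
MINIMUM = {
    "description": "ONE",
    "identifier": "MANY",
    "keywords": "MANY",
    "name": "ONE",
    "url": "ONE",
}
RECOMMENDED = {
    "about": "MANY",
    "citation": "MANY",
    "creator": "MANY",
    "dataset": "ONE",
    "license": "ONE",
    "measurementTechnique": "MANY",
    "version": "ONE",
}


def check_catalog(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # Assumption: Minimum marginality means the property must be present.
    for prop in MINIMUM:
        if prop not in record:
            problems.append(f"missing minimum property: {prop}")
    # Assumption: cardinality ONE means at most one value.
    for prop, cd in {**MINIMUM, **RECOMMENDED}.items():
        value = record.get(prop)
        if cd == "ONE" and isinstance(value, list) and len(value) > 1:
            problems.append(
                f"{prop} has cardinality ONE but {len(value)} values were given"
            )
    return problems
```

For example, calling `check_catalog` on a record that lacks `url` would report that property as missing, while Recommended properties are only checked for cardinality when they are present.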
