
DataRecord Profile

Version: 0.1 (25 February 2018)

Bioschemas specification describing a record in a dataset.


Contributors

The following people have been involved in the creation of this specification document. They are all members of the Datasets group.

Group Leader(s)
Other team members

Schema.org hierarchy

This Profile fits into the schema.org hierarchy as follows:

Thing > CreativeWork > Dataset

Description

A Record is itself a dataset, but it denotes the smallest compact, complete and self-describing unit within a dataset, i.e., a record.

Bioschemas usage

In Life Sciences, records will represent a BioChemEntity.
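As an illustration of the description above, the following is a minimal sketch of a record expressed as JSON-LD and built from Python. It is not taken from the specification: the function name `build_record`, the identifier `EX:0001` and the `example.org` URL are hypothetical, and typing the record as a schema.org `Dataset` is an assumption based on the hierarchy shown earlier. The `rdf:type` value that this profile requires is not reproduced here and would still need to be added.

```python
import json


def build_record(identifier, entity_url):
    """Build a JSON-LD style dict for one record (illustrative only)."""
    return {
        "@context": "https://schema.org",
        # Assumption: the record is typed as Dataset, following the
        # hierarchy Thing > CreativeWork > Dataset.
        "@type": "Dataset",
        # identifier: exactly one value, per the specification table.
        "identifier": identifier,
        # mainEntity: link to the BioChemEntity represented by this record.
        "mainEntity": {"@type": "BioChemEntity", "@id": entity_url},
    }


# Hypothetical values, used only to show the shape of the output.
record = build_record("EX:0001", "https://example.org/entity/0001")
print(json.dumps(record, indent=2))
```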





Key to specification table

Schema.org properties where the Expected Types have been changed, or new (i.e., Bioschemas created) properties/types are green.

Schema.org properties/types are red.

Pending Schema.org properties/types are blue.

External (i.e., from 3rd party ontology) properties/types are black.


CD = Cardinality


Property | Expected Type | Description | CD | Controlled Vocabulary | Example
Marginality: Minimum.
identifier PropertyValue
Text
URL
Schema:

The identifier property represents any kind of identifier for any kind of Thing, such as ISBNs, GTIN codes, UUIDs etc. Schema.org provides dedicated properties for representing many of these, either as textual strings or as URL (URI) links. See background notes for more details.


ONE
mainEntity Thing
Schema:

Indicates the primary entity described in some page or other CreativeWork. Inverse property: mainEntityOfPage.


Bioschemas:

Link to the BioChemEntity represented by this record.

ONE
rdf:type URL
Bioschemas:

This is used by validation tools to identify the profile used. You must use the value specified in the Controlled Vocabulary column.

ONE
Marginality: Recommended.
additionalType URL
Schema:

An additional type for the item, typically used for adding more specific types from external vocabularies in microdata syntax. This is a relationship between something and a class that the thing is in. In RDFa syntax, it is better to use the native RDFa syntax - the ‘typeof’ attribute - for multiple types. Schema.org tools may have only weaker understanding of extra types, in particular those defined externally.


Bioschemas:

Although not required, additionalType can be used to specify the nature of the record. For instance, a UniProt protein record would have UP:Protein as type.

MANY
Marginality: Optional.
additionalProperty PropertyValue
Schema:

A property-value pair representing an additional characteristic of the entity, e.g. a product feature or another characteristic for which there is no matching property in schema.org. Note: Publishers should be aware that applications designed to use specific schema.org properties (e.g. http://schema.org/width, http://schema.org/color, http://schema.org/gtin13, …) will typically expect such data to be provided using those properties, rather than using the generic property/value mechanism.


Bioschemas:

In addition to using name and description to describe this property in a human-readable way, additionalType should be used to specify the nature of the property/relation. For instance, if the property refers to a gene/protein-disease association, you could use SIO:000983 (gene-disease association) as the additionalType for the additionalProperty.

MANY
citation CreativeWork
Text
Schema:

A citation or reference to another creative work, such as another publication, web page, scholarly article, etc.


MANY
dateCreated Date
DateTime
Schema:

The date on which the CreativeWork was created or the item was added to a DataFeed.


ONE
dateModified Date
DateTime
Schema:

The date on which the CreativeWork was most recently modified or when the item’s entry was modified within a DataFeed.


ONE
datePublished Date
Schema:

Date of first broadcast/publication.


ONE
distribution DataDownload
Schema:

A downloadable form of this dataset, at a specific location, in a specific format.


MANY
image ImageObject
URL
Schema:

An image of the item. This can be a URL or a fully described ImageObject.


MANY
isBasedOn CreativeWork
Product
URL
Schema:

A resource that was used in the creation of this resource. This term can be repeated for multiple sources. For example, http://example.com/great-multiplication-intro.html. Supersedes isBasedOnUrl.


Bioschemas:

Whenever possible, use Evidence Codes (ECO).

MANY

ECO

isBasisFor CreativeWork
Product
URL
Bioschemas:

A resource for which this resource is a basis. Inverse property: isBasedOn.

MANY

ECO

keywords Text
Schema:

Keywords or tags used to describe this content. Multiple entries in a keywords list are typically delimited by commas.


ONE
sameAs URL
Schema:

URL of a reference Web page that unambiguously indicates the item’s identity. E.g. the URL of the item’s Wikipedia page, Wikidata entry, or official website.


MANY
seeAlso Thing
URL
Bioschemas:

A pointer to any related Thing. Use it when the nature of the relation is unclear; otherwise, use more precise terms from existing Controlled Vocabularies.

MANY
url URL
Schema:

URL of the item.


MANY
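
The marginality and cardinality (CD) rules in the table can be checked mechanically. The following is a minimal sketch, not an official Bioschemas validator: `SPEC` and `check_record` are hypothetical names, and the sketch assumes a record is a plain Python mapping keyed by the property names exactly as the table spells them, with multiple values given as a list. It does not check expected types or controlled vocabularies.

```python
# Marginality and cardinality (CD) per property, transcribed from the table.
SPEC = {
    "identifier": ("Minimum", "ONE"),
    "mainEntity": ("Minimum", "ONE"),
    "rdf:type": ("Minimum", "ONE"),
    "additionalType": ("Recommended", "MANY"),
    "additionalProperty": ("Optional", "MANY"),
    "citation": ("Optional", "MANY"),
    "dateCreated": ("Optional", "ONE"),
    "dateModified": ("Optional", "ONE"),
    "datePublished": ("Optional", "ONE"),
    "distribution": ("Optional", "MANY"),
    "image": ("Optional", "MANY"),
    "isBasedOn": ("Optional", "MANY"),
    "isBasisFor": ("Optional", "MANY"),
    "keywords": ("Optional", "ONE"),
    "sameAs": ("Optional", "MANY"),
    "seeAlso": ("Optional", "MANY"),
    "url": ("Optional", "MANY"),
}


def check_record(record):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for prop, (marginality, cardinality) in SPEC.items():
        if prop not in record:
            # Only Minimum properties are reported when absent.
            if marginality == "Minimum":
                problems.append(f"missing Minimum property: {prop}")
            continue
        value = record[prop]
        # A ONE property given as a list must hold exactly one value.
        if cardinality == "ONE" and isinstance(value, (list, tuple)) and len(value) != 1:
            problems.append(
                f"{prop} has cardinality ONE but {len(value)} values were given"
            )
    return problems
```

Calling `check_record` on a mapping that lacks `rdf:type`, or that supplies two values for `dateCreated`, returns one message per problem. Recommended and Optional properties that are absent are not reported.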
