
Protein Profile

Version: 0.6 (10 November 2018)

Bioschemas profile describing a Protein in Life Sciences.


Contributors

The following people have been involved in the creation of this specification document. They are all members of the Proteins group.

Group Leader(s)

Maria Martin

Leyla Garcia

Other team members

Schema.org hierarchy

This Profile fits into the schema.org hierarchy as follows:

Thing > BioChemEntity > Protein

Description

This Protein profile specification presents the markup to use when describing a Protein.





Key to specification table

Schema.org properties whose Expected Types have been changed, and new (i.e., Bioschemas-created) properties/types, are green.

Schema.org properties/types are red.

Pending Schema.org properties/types are blue.

External (i.e., from 3rd party ontology) properties/types are black.


CD = Cardinality


Property Expected Type Description CD Controlled Vocabulary Example
Marginality: Minimum.
identifier PropertyValue
Text
URL
Schema:

The identifier property represents any kind of identifier for any kind of Thing, such as ISBNs, GTIN codes, UUIDs etc. Schema.org provides dedicated properties for representing many of these, either as textual strings or as URL (URI) links. See background notes for more details.


ONE
name Text
Schema:

The name of the item.


ONE
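
As an illustration of the two Minimum properties, the sketch below builds a JSON-LD description of a protein in Python. The `@context` URL, the use of `Protein` as the `@type`, and the sample accession and name are assumptions chosen for the example; the profile table itself does not prescribe them.

```python
import json

# Hypothetical minimal markup: only the two Minimum properties are set.
# The @context value and the sample identifier/name are assumptions.
minimal_protein = {
    "@context": "https://schema.org",
    "@type": "Protein",
    "identifier": "P05067",  # cardinality ONE (illustrative UniProt accession)
    "name": "Amyloid-beta precursor protein",  # cardinality ONE
}

# Serialise to the JSON-LD text that would be embedded in a web page.
minimal_json = json.dumps(minimal_protein, indent=2)
print(minimal_json)
```

Both properties have cardinality ONE, so each holds a single value rather than a list.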
Marginality: Recommended.
associatedDisease MedicalCondition
URL
Schema:

Disease associated with this BioChemEntity.


Bioschemas:

Disease associated with this protein, if any.

MANY
description Text
Schema:

A description of the item.


Bioschemas:

Protein function. We recommend starting the description with “Function: […]”.

ONE
image ImageObject
URL
Schema:

An image of the item. This can be a URL or a fully described ImageObject.


MANY
isEncodedBy BioChemEntity
Gene
URL
Schema:

BioChemEntity from which this protein was encoded.


Bioschemas:

Gene(s) from which this protein was encoded.

MANY

Any suitable ontology

isPartOfBioChemEntity BioChemEntity
URL
Schema:

Indicates a BioChemEntity that is (in some sense) a part of this BioChemEntity. Inverse property: hasBioChemEntityPart


Bioschemas:

Bioschemas Protein: For proteins, it can be used to link to protein sequence annotations such as domains, sites, regions, etc.

MANY

Any suitable ontology

taxonomicRange Taxon
Text
URL
Schema:

The taxonomic grouping of the organism that expresses, encodes, or is in some way related to the BioChemEntity.


Bioschemas:

Bioschemas Protein: For proteins, it is recommended to use this property to specify the taxon/organism whose genome includes an expressed gene that can be translated to this protein. For the taxon/organism, it is good practice to use hasCategoryCode to point to a controlled vocabulary such as NCBI Taxonomy or UniProt Taxonomy.

MANY

Taxonomies or any suitable controlled vocabulary

url URL
Schema:

URL of the item.


Bioschemas:

Link to the official webpage associated with this entity.

ONE
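
The Recommended properties can be layered on top of the Minimum ones. The following is a minimal sketch, not an official Bioschemas example: the `taxon` helper, the `codeValue`/`inCodeSet` keys used inside `hasCategoryCode`, the sample accession, gene name and URL, and the placeholder description text are all assumptions made for illustration.

```python
import json


def taxon(ncbi_id, name):
    """Build a Taxon node that points at a controlled vocabulary term.

    Hypothetical helper: the nested CategoryCode layout is an assumption.
    """
    return {
        "@type": "Taxon",
        "name": name,
        "hasCategoryCode": {
            "@type": "CategoryCode",
            "codeValue": ncbi_id,
            "inCodeSet": "NCBI Taxonomy",
        },
    }


protein = {
    "@context": "https://schema.org",
    "@type": "Protein",
    "identifier": "P05067",  # illustrative accession
    "name": "Amyloid-beta precursor protein",
    # Cardinality ONE; the profile recommends the "Function: " prefix.
    "description": "Function: <free-text summary of the protein function>",
    # Cardinality ONE; official page for the entity (sample URL).
    "url": "https://www.uniprot.org/uniprot/P05067",
    # Cardinality MANY, so lists are used even for a single value.
    "taxonomicRange": [taxon("9606", "Homo sapiens")],
    "isEncodedBy": [{"@type": "Gene", "name": "APP"}],
}

print(json.dumps(protein, indent=2))
```

Using lists for the MANY properties keeps the shape stable when further taxa or genes are added later.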
Marginality: Optional.
additionalProperty PropertyValue
Schema:

A property-value pair representing an additional characteristic of the entity, e.g. a product feature or another characteristic for which there is no matching property in schema.org.

Note: Publishers should be aware that applications designed to use specific schema.org properties (e.g. http://schema.org/width, http://schema.org/color, http://schema.org/gtin13, …) will typically expect such data to be provided using those properties, rather than using the generic property/value mechanism.


Bioschemas:

Whenever possible, please use a property coined in a well-known third-party vocabulary. For instance, you can directly use RO ObjectProperty: enables as a property to express how a protein or gene enables some GO molecular function. If you still want or need to use additionalProperty, please use (i) property name to specify the name of the property, (ii) additionalType (if possible) to better specify the nature of the property, and (iii) value to link to the object/range of this property. We recommend looking at the OBO Relations Ontology (RO) or the Semanticscience Integrated Ontology (SIO) as starting points.

Bioschemas Protein: If no suitable property exists in this profile, use any ontology term coined as a property and suitable for your needs. For instance sio:SIO_000095 (is member of) could be used to model the relation between a protein and a protein clan.

MANY
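
The three-part pattern described above (name, additionalType, value) can be sketched as follows. The helper name `additional_property`, the expansion of `sio:SIO_000095` to a full IRI, and the clan URL are assumptions for illustration only.

```python
import json


def additional_property(name, property_type, value):
    """Hypothetical helper that wraps a relation in a PropertyValue node."""
    return {
        "@type": "PropertyValue",
        "name": name,                     # (i) name of the property
        "additionalType": property_type,  # (ii) ontology term for the property
        "value": value,                   # (iii) object/range of the property
    }


# Relation between a protein and a protein clan, modelled with SIO "is member of".
member_of_clan = additional_property(
    name="is member of",
    property_type="http://semanticscience.org/resource/SIO_000095",
    value="https://example.org/clan/CL0000",  # placeholder clan URL
)

protein = {
    "@type": "Protein",
    "name": "Example protein",  # placeholder
    "additionalProperty": [member_of_clan],  # cardinality MANY
}

print(json.dumps(protein, indent=2))
```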
additionalType URL
Schema:

An additional type for the item, typically used for adding more specific types from external vocabularies in microdata syntax. This is a relationship between something and a class that the thing is in. In RDFa syntax, it is better to use the native RDFa syntax - the ‘typeof’ attribute - for multiple types. Schema.org tools may have only weaker understanding of extra types, in particular those defined externally.


Bioschemas:

Any ontology term describing the protein concept. This is in addition to the official type used in Bioschemas to describe a protein.

The official type for the Protein profile is PR 000000001

MANY

wikidata:protein SIO:010043

alternateName Text
Schema:

An alias for the item.


MANY
enablesMF CategoryCode
PropertyValue
DataRecord
ProteinAnnotation
Bioschemas:

RO:0002327 (enables). GO molecular function enabled by the gene/protein. Recommended range: BioChemEntity or CategoryCode, ProteinAnnotation if evidence should be included.

MANY

Gene Ontology (GO)

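
As a rough sketch of the CategoryCode range for enablesMF, the snippet below links a protein to a GO molecular function term. The helper `go_category_code`, the choice of `codeValue`, `name`, `url` and `inCodeSet` as keys, and the pairing of this placeholder protein with this GO term are assumptions; a ProteinAnnotation would be used instead if evidence had to be recorded.

```python
import json


def go_category_code(go_id, label):
    """Hypothetical helper: describe a GO term as a CategoryCode node."""
    return {
        "@type": "CategoryCode",
        "codeValue": go_id,
        "name": label,
        # OBO PURL form of the term identifier, e.g. GO:0005515 -> GO_0005515
        "url": "http://purl.obolibrary.org/obo/" + go_id.replace(":", "_"),
        "inCodeSet": "Gene Ontology",
    }


protein = {
    "@type": "Protein",
    "name": "Example protein",  # placeholder, not a real annotation
    "enablesMF": [go_category_code("GO:0005515", "protein binding")],
}

print(json.dumps(protein, indent=2))
```

The same shape would apply to involvedInBP, with a GO biological process term in place of the molecular function.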
hasBioChemEntityPart BioChemEntity
URL
Schema:

Indicates a BioChemEntity that (in some sense) has this BioChemEntity as a part. Inverse property: isPartOfBioChemEntity


Bioschemas:

Bioschemas Protein: This property can be used to include GO cellular locations; for cellular locations it is good practice to use hasCategoryCode to point to a GO Cellular Location term.

MANY

Any suitable ontology

hasCategoryCode CategoryCode
Schema:

A Category code contained in this code set.


MANY

Any suitable controlled vocabulary

hasRepresentation PropertyValue
Text
URL
Schema:

A common representation such as a protein sequence or chemical structure for this entity. For images use schema.org/image.


Bioschemas:

Bioschemas Protein: This property could be used, for instance, to record a protein sequence, as it is a representation of the protein. If you want to better define the nature of the representation, use a PropertyValue as described in additionalProperty.

MANY
involvedInBP CategoryCode
PropertyValue
DataRecord
ProteinAnnotation
Bioschemas:

RO:0002331 (is involved in). GO biological process this gene/protein is involved in. Recommended range: BioChemEntity or CategoryCode, ProteinAnnotation if evidence should be included.

MANY

Gene Ontology (GO)

mainEntityOfPage CreativeWork
URL
Schema:

Indicates a page (or other CreativeWork) for which this thing is the main entity being described. See background notes for details. Inverse property: mainEntity.


Bioschemas:

Link via DataRecord to the main DataRecord representing this entity in a dataset.

MANY
sameAs URL
Schema:

URL of a reference Web page that unambiguously indicates the item’s identity. E.g. the URL of the item’s Wikipedia page, Wikidata entry, or official website.


Bioschemas:

Link to any resource other than the Record and the official webpage, for instance a Wikipedia page.

MANY
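
The marginality and cardinality columns of the table lend themselves to a simple automated check. The sketch below is one possible approach, not an official Bioschemas validator: the function name `check_markup` and the message wording are invented for illustration, while the property sets are taken from the table above.

```python
from typing import List

# Properties with marginality "Minimum" in the table above.
MINIMUM = {"identifier", "name"}
# Properties whose cardinality (CD) is ONE in the table above.
ONE = {"identifier", "name", "description", "url"}


def check_markup(markup: dict) -> List[str]:
    """Return a list of human-readable problems found in a markup dict."""
    problems = []
    for prop in sorted(MINIMUM):
        if prop not in markup:
            problems.append(f"missing minimum property: {prop}")
    for prop in sorted(ONE):
        value = markup.get(prop)
        # A list with more than one entry violates cardinality ONE.
        if isinstance(value, list) and len(value) > 1:
            problems.append(
                f"{prop} has cardinality ONE but holds {len(value)} values"
            )
    return problems


ok = {"identifier": "P05067", "name": "Amyloid-beta precursor protein"}
bad = {"name": ["first name", "second name"]}

print(check_markup(ok))
print(check_markup(bad))
```

Properties with cardinality MANY are not checked here, since any number of values is acceptable for them.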
