Dates: 14 April 2021 - 16 April 2021
Meeting page: ELIXIR-UK Hackathon
Agenda: Meeting Notes
The hackathon will focus on collaborations across existing ELIXIR-UK services and projects, such as Bioschemas, FAIRsharing, Galaxy, InterMine, ISAtools, Jalview, KnetMiner, RDMKit, TeSS, WorkflowHub and more.
The Bioschemas activities will centre around the projects detailed below. Notes from these projects can be found here.
Bioschemas Live Deployments
Project details: The Bioschemas live deploys page lists the deployments known to the Bioschemas community. Currently the data for the page is represented in YAML and is organised by page type rather than by resource. As a result, a lot of information is repeated (e.g. resource name, URL, node), and the count of marked-up resources is inaccurate.
This project would migrate the data to a JSON representation and update the webpages that display it. The data fields should be extended to include the resource sitemap, and possibly hints about the deployment technology (e.g. single-page application) that can make consuming the content easier. The project should also provide a mechanism that makes it easier to register new deployments. The outcome should make the list of live deploys easier to consume for aggregators such as FAIRsharing.
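As an illustration of the move from page-type-oriented to resource-oriented data, the following is a minimal sketch in Python. The field names (`profile`, `resource`, `url`, `node`, `sitemap`, `technology`) and the example entries are assumptions made for this sketch, not the actual structure of the Bioschemas live deploys data.

```python
import json

# Hypothetical page-type-oriented entries: one record per (resource, profile)
# pair, so the resource name, URL and node are repeated.
page_type_entries = [
    {"profile": "Dataset", "resource": "ExampleDB",
     "url": "https://example.org", "node": "UK"},
    {"profile": "DataCatalog", "resource": "ExampleDB",
     "url": "https://example.org", "node": "UK"},
    {"profile": "Dataset", "resource": "OtherDB",
     "url": "https://other.example.org", "node": "UK"},
]


def to_resource_oriented(entries):
    """Group entries by resource URL so each resource appears exactly once."""
    resources = {}
    for entry in entries:
        resource = resources.setdefault(entry["url"], {
            "name": entry["resource"],
            "url": entry["url"],
            "node": entry["node"],
            "sitemap": None,      # assumed new field: URL of the sitemap
            "technology": None,   # assumed new field: e.g. "single-page application"
            "profiles": [],
        })
        if entry["profile"] not in resource["profiles"]:
            resource["profiles"].append(entry["profile"])
    return list(resources.values())


live_deploys = to_resource_oriented(page_type_entries)
print(json.dumps(live_deploys, indent=2))
print("Number of marked-up resources:", len(live_deploys))
```

With one record per resource, the resource count is simply the length of the list, and an aggregator can read the JSON without having to de-duplicate it first.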
UK Bioschemas Knowledge Graph: Scrape markup deployment and make available for exploitation
Project details: The promise of Bioschemas is that it makes consuming data from multiple resources more straightforward. This project would explore scraping some of the known Bioschemas deployments, converting the page-centric markup into a concept-oriented form, and loading the result into a triplestore for further exploration.
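The scraping and alignment steps could look roughly like the sketch below, which uses only the Python standard library. It assumes that the markup is embedded as JSON-LD in `<script type="application/ld+json">` elements, that each block is a single object with an explicit `@id`, and it skips context expansion and nodes without an `@id`. The example pages and URLs are invented; a real pipeline would more likely use a proper JSON-LD/RDF library before loading into a triplestore.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.documents.append(json.loads("".join(self._buffer)))


def to_triples(node, triples):
    """Flatten one JSON-LD node into (subject, property, value) tuples."""
    subject = node["@id"]
    for prop, value in node.items():
        if prop in ("@id", "@context"):
            continue
        for item in value if isinstance(value, list) else [value]:
            if isinstance(item, dict) and "@id" in item:
                triples.add((subject, prop, item["@id"]))  # link to another concept
                to_triples(item, triples)                  # keep any nested detail
            elif not isinstance(item, dict):
                triples.add((subject, prop, item))         # literal value
    return triples


# Two invented pages, each describing one concept and pointing at the other.
PAGES = [
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@id": "https://example.org/dataset/1",'
    ' "@type": "Dataset", "name": "Example dataset",'
    ' "includedInDataCatalog": {"@id": "https://example.org/catalog"}}'
    '</script></head></html>',
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@id": "https://example.org/catalog",'
    ' "@type": "DataCatalog", "name": "Example catalog"}'
    '</script></head></html>',
]

triples = set()
for page in PAGES:
    extractor = JsonLdExtractor()
    extractor.feed(page)
    for document in extractor.documents:
        to_triples(document, triples)

for triple in sorted(triples):
    print(triple)
```

Because the triples are keyed on `@id` rather than on the page they came from, statements about the same concept that appear on different pages end up attached to the same subject.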
As the project progresses, we can verify that the existing tutorials are sufficient to support developers to deploy their markup.
Bioschemas infrastructure for profile development
Project details: Bioschemas profiles are community recommendations specifying the Schema.org properties to use for describing a particular type of resource. Profiles are currently developed in GSheets and then converted to YAML for display on the Bioschemas website using GoWeb. We’d like to move to a more machine-readable format for the profiles, such as JSON-Schema or ShEx, with the canonical form of the profile being the GitHub version rather than the GSheet. Some initial requirements for the infrastructure have been identified; the two crucial ones are that the format should be machine-processable and that it should be editable by non-technical users. We would also like to align the profiles more closely with their underlying Schema.org type. Some initial work has been done in this regard.
One possibility for the non-technical user interface could be JSONschema.net. We would need to see how this can be integrated with GitHub.
We would also like to explore the use of GitHub Actions to verify the correctness of the markup within the profiles, providing a level of validation for deployment.
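To make the idea of a machine-processable profile concrete, here is a minimal sketch of a check that an automated job (for instance one triggered from GitHub Actions) could run against example markup. The profile structure, the property list and the `check_markup` helper are all hypothetical and do not reproduce any actual Bioschemas profile; only the Minimum/Recommended marginality levels follow Bioschemas terminology.

```python
# Hypothetical machine-readable profile: each property is tagged with a
# marginality level. This is NOT the real Bioschemas Dataset profile.
PROFILE = {
    "name": "ExampleProfile",
    "schema_type": "Dataset",
    "properties": {
        "name": "Minimum",
        "description": "Minimum",
        "url": "Minimum",
        "keywords": "Recommended",
        "license": "Recommended",
    },
}


def check_markup(markup, profile):
    """Return (errors, warnings) for one JSON-LD object checked against a profile."""
    errors, warnings = [], []
    if markup.get("@type") != profile["schema_type"]:
        errors.append(f"expected @type {profile['schema_type']!r}, "
                      f"found {markup.get('@type')!r}")
    for prop, marginality in profile["properties"].items():
        if prop in markup:
            continue
        if marginality == "Minimum":
            errors.append(f"missing minimum property: {prop}")
        elif marginality == "Recommended":
            warnings.append(f"missing recommended property: {prop}")
    return errors, warnings


# Invented example markup with one minimum and one recommended property missing.
example_markup = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example dataset",
    "description": "A dataset used to illustrate profile checking.",
    "keywords": ["example"],
}

errors, warnings = check_markup(example_markup, PROFILE)
print("errors:", errors)
print("warnings:", warnings)
```

A continuous integration job could fail when `errors` is non-empty and merely report `warnings`, which would give contributors feedback on a pull request without requiring them to run anything locally.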
Extending Bioschemas to Agri-food applications
Project details: This project builds on draft work that has already been done to use Bioschemas in the domains of plant biology, agriculture and food, extending its types or profiles where necessary. Possible activities for the hackathon include reviewing the existing draft modelling, considering new use cases (e.g. BrAPI/MIAPPE, WheatIS, PHI-BASE), and converting real datasets.
RO-Crate, Bioschemas, WorkflowHub, Common Workflow Language (Stian Soiland-Reyes, Stuart Owen)
Project details: RO-Crate is a community-led set of recommendations for packaging research outputs as Research Objects with rich metadata, based on schema.org. WorkflowHub.eu is a registry of computational workflows for the life sciences, agnostic to workflow languages and platforms. It uses RO-Crate to package workflows with their metadata for import, export and publishing. Entry pages in the WorkflowHub are annotated with Bioschemas markup, including the new ComputationalWorkflow profile, which is now being aligned with the ComputationalTool profile. The WorkflowHub currently exposes the packaged RO-Crate as a single downloadable zip file.
Goals of the project:
- Adding an RO-Crate preview page to the WorkflowHub, which will include Bioschemas Dataset JSON-LD markup exposed for Google Dataset Search. This may use the GA4GH TRS API, which is already used for the UseGalaxy.* integration.
- The RO-Crate preview page and JSON-LD will be made available as separate Web resources, linked from the Workflow entry page
- The RO-Crate preview page will also include a visual overview of the RO-Crate package, and the ability to browse and download individual components (e.g. using makehtml)
- Test and improve the prototype RO-Crate validator https://github.com/KockataEPich/CheckMyCrate for the Workflow RO-Crate profile
- Implement more of Marc Portier’s thoughts on RO-Crate profiles
- See https://www.researchobject.org/ro-crate/profiles.html
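As a rough illustration of the first goals, the sketch below derives Dataset JSON-LD for a preview page from an RO-Crate metadata document. It assumes an RO-Crate 1.1 style `ro-crate-metadata.json`, in which the metadata descriptor points to the root dataset via `about`. The example crate, the `preview_url` value and the helper names are invented for this sketch, and it does not reflect how the WorkflowHub is actually implemented.

```python
import json

# Invented, heavily trimmed RO-Crate metadata document (RO-Crate 1.1 style).
RO_CRATE_METADATA = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
         "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
         "about": {"@id": "./"}},
        {"@id": "./", "@type": "Dataset",
         "name": "Example workflow crate",
         "description": "An example workflow packaged as an RO-Crate.",
         "license": {"@id": "https://spdx.org/licenses/Apache-2.0"},
         "mainEntity": {"@id": "workflow.cwl"},
         "hasPart": [{"@id": "workflow.cwl"}, {"@id": "README.md"}]},
        {"@id": "workflow.cwl",
         "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
         "name": "Example workflow"},
        {"@id": "README.md", "@type": "File", "name": "README"},
    ],
}


def find_root_dataset(crate):
    """Follow the metadata descriptor's 'about' link to the root dataset entity."""
    entities = {entity["@id"]: entity for entity in crate["@graph"]}
    descriptor = entities["ro-crate-metadata.json"]
    return entities[descriptor["about"]["@id"]]


def dataset_markup(crate, preview_url):
    """Build schema.org Dataset JSON-LD suitable for embedding in a preview page."""
    root = find_root_dataset(crate)
    markup = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": root["name"],
        "description": root["description"],
        "url": preview_url,  # assumed address of the preview page
    }
    if "license" in root:
        markup["license"] = root["license"]["@id"]
    return markup


def list_components(crate):
    """Return the @id of each part of the root dataset, e.g. for a browse view."""
    return [part["@id"] for part in find_root_dataset(crate).get("hasPart", [])]


markup = dataset_markup(RO_CRATE_METADATA, "https://example.org/workflows/1/ro_crate")
print(json.dumps(markup, indent=2))
print(list_components(RO_CRATE_METADATA))
```

The resulting object could be served as a standalone JSON-LD resource or embedded in the preview page inside a `<script type="application/ld+json">` element, and the component list gives a starting point for browsing and downloading individual files.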