FAIR track metadata - FAIRification of Genomic Tracks started in 2018 as an ELIXIR Implementation Study focused on “FAIRifying” the metadata of genomic annotation track files contained in track hubs. To achieve this, we have developed a common data model and technical solutions compatible with the existing TrackHub exchange format for genome browser tracks, and implemented demonstrators to show the feasibility of this proposal across systems and programming languages.

News and updates

May 31, 2021:
The ELIXIR Interoperability Platform will be organising a workshop in autumn 2021 on the FAIRification of genomic tracks using FAIRtracks. We are currently running a survey to collect input from potential users. We invite the broad community of researchers, data providers, developers, data curators, and other interested parties to participate. Please forward the link through your local channels if relevant. Please click here to take the survey.

Apr 27, 2021:
The 7th ELIXIR All Hands meeting will be held virtually 1-11 June 2021. Visit the official event page for more information and registration. As part of this event, the FAIRtracks team is organising a workshop in collaboration with FAIRplus and the FAIR Cookbook, titled "Tooling up for FAIR: Documenting the FAIRtracks use case in the FAIR Cookbook". We will demonstrate the FAIRification process for genomic track metadata, discuss the general state of FAIR tooling, and invite the audience to discuss the implementation of a comprehensive data processing framework for custom FAIRification pipelines. The event will take place on June 2nd at 13:00.

Apr 1, 2021:
The first version of the manuscript "Recommendations for the FAIRification of genomic track metadata", describing FAIRtracks and its ecosystem, has been published on F1000Research and is currently undergoing peer review.

Feb 15, 2021:
The FAIRtracks ecosystem has been included in the ELIXIR portfolio as a Recommended Interoperability Resource (RIR). Read the official news item here. RIRs are ELIXIR services considered vital for the interoperability of databases and metadata. These resources include registries, tools and standards that ensure data is Findable, Accessible, Interoperable and Reusable (FAIR).

Note: These web pages will be updated shortly to a new design.

Overview of the implemented FAIRtracks infrastructure

Publications and presentations

FAIRtracks is a JSON Schema defining a draft standard for minimal genomic track metadata. FAIRtracks is supported by the TrackHub registry, the TrackFind metadata search and curation engine and the downstream track analysis tools GSuite HyperBrowser and EPICO.

GitHub repository of the FAIRtracks draft standard:

https://github.com/fairtracks/fairtracks_standard

FAIRtracks validator service

The FAIRtracks validator is a JSON Schema validator that can also check additional constraints specific to the FAIRtracks draft standard. These extra constraints are declared using reference extensions to the JSON Schema vocabulary.
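To illustrate the general idea of layering extra constraints on top of ordinary schema checks, here is a minimal sketch in Python using only the standard library. It is not the FAIRtracks validator and makes no claim about which constraints that validator implements: the `x-foreignKey` keyword, the field names and the toy schema are all invented for illustration. The example constraint is a cross-document reference, which a schema that only looks at one document at a time cannot verify on its own.

```python
# ASSUMPTION: the schema, the field names and the "x-foreignKey" keyword are
# hypothetical; they are not taken from the FAIRtracks draft standard.
TRACK_SCHEMA = {
    "required": ["local_id", "experiment_ref"],
    "properties": {
        "experiment_ref": {"type": "string", "x-foreignKey": "experiments"},
    },
}


def check_required(schema, doc):
    """Ordinary schema-level check: every required property is present."""
    return [
        f"missing property: {name}"
        for name in schema.get("required", [])
        if name not in doc
    ]


def check_foreign_keys(schema, doc, known_ids):
    """Extension-level check: referenced identifiers must exist elsewhere."""
    errors = []
    for name, spec in schema.get("properties", {}).items():
        target = spec.get("x-foreignKey")
        if target is None or name not in doc:
            continue
        if doc[name] not in known_ids.get(target, set()):
            errors.append(f"{name}={doc[name]!r} not found in {target}")
    return errors


def validate(schema, doc, known_ids):
    """Run the ordinary checks first, then the extension checks."""
    return check_required(schema, doc) + check_foreign_keys(schema, doc, known_ids)


# Identifiers collected from other (hypothetical) documents.
known = {"experiments": {"exp1"}}
ok_errors = validate(TRACK_SCHEMA, {"local_id": "t1", "experiment_ref": "exp1"}, known)
bad_errors = validate(TRACK_SCHEMA, {"local_id": "t1", "experiment_ref": "exp9"}, known)
```

Keeping the extension check in its own function means the ordinary schema rules stay untouched, and a validator that does not know the extra keyword can simply ignore it.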

The FAIRtracks validator is hosted online as a REST service:

http://fairtracks.bsc.es/api/

The FAIRtracks validator can also be installed locally from this GitHub repository:

https://github.com/fairtracks/fairtracks_validator

Screencasts showing how to use the FAIRtracks validator:

TrackFind is a track search engine and metadata FAIRification service. TrackFind supports crawling of the Track Hub Registry and other data portals to fetch track metadata. Crawled metadata can be accessed through hierarchical browsing or by search queries, both through a web-based user interface and through a REST API. TrackFind supports advanced SQL-based search queries that can be built in the user interface, and the search results can be browsed and exported in JSON or GSuite format. The REST API allows downstream tools and scripts to integrate TrackFind search, as demonstrated by the GSuite HyperBrowser and EPICO.

TrackFind is available with a web-based user interface from here:

https://trackfind.elixir.no/

TrackFind is also available as a REST API, as documented here:

https://app.swaggerhub.com/apis-docs/FAIRtracks/TrackFind/1.0.0
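
As a rough sketch of how a script might call a search service of this kind, the Python snippet below builds a search URL and fetches JSON using only the standard library. The route layout, the `query` and `limit` parameter names, and the repository, hub and query values are assumptions made for illustration; the actual routes and parameters should be taken from the API documentation linked above.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://trackfind.elixir.no"


def build_search_url(repository, hub, query, limit=10, base_url=BASE_URL):
    """Build a search URL.

    ASSUMPTION: the path layout and the parameter names are hypothetical
    and must be checked against the published API documentation.
    """
    path = "/api/v1/repositories/{}/hubs/{}/search".format(
        quote(repository, safe=""), quote(hub, safe="")
    )
    return base_url + path + "?" + urlencode({"query": query, "limit": limit})


def fetch_json(url, timeout=30):
    """Perform a GET request and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder values; a real query would be built in the TrackFind user interface.
example_url = build_search_url("Repo A", "hub1", "x = 'y'", limit=5)
```

Building the URL separately from sending the request makes the encoding of repository names and query strings easy to inspect before anything goes over the network.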

GitHub repository for TrackFind:

https://github.com/elixir-no-nels/trackfind

The Track Hub Registry service, maintained by EMBL-EBI, allows independent researchers to distribute their track hubs. Each track hub is a set of text files containing links to data files, display configuration for each file, and some metadata, which browsers use to dynamically create selection menus. We extended the Track Hub Registry with support for distributing FAIRtracks-formatted metadata alongside the existing Track Hub metadata content, with the future goal of better integrating these. Also, the REST endpoints were improved to better support metadata queries by outside services, e.g., TrackFind.
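
To make the track hub layout more concrete, the sketch below parses a small track description in the style of a track hub `trackDb.txt` file, where each track is a stanza of `setting value` lines and a `metadata` setting carries `key=value` pairs. The example stanza, its URL and its metadata values are invented for illustration, and the parser is deliberately simplified; it is not code from the Track Hub Registry.

```python
import shlex

# Invented example stanza; the URL and metadata values are placeholders.
EXAMPLE = """\
track sample1_signal
bigDataUrl https://example.org/sample1.bw
shortLabel Sample 1
type bigWig
metadata cell_type=monocyte treatment="10 hour"
"""


def parse_trackdb(text):
    """Split trackDb-style text into stanzas mapping setting -> value."""
    stanzas, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current stanza.
            if current:
                stanzas.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue  # skip comment lines
        key, _, value = line.partition(" ")
        current[key] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


def parse_metadata(value):
    """Turn 'key=value key2="two words"' into a dictionary."""
    pairs = {}
    for token in shlex.split(value):
        key, sep, val = token.partition("=")
        if sep:
            pairs[key] = val
    return pairs


tracks = parse_trackdb(EXAMPLE)
meta = parse_metadata(tracks[0]["metadata"])
```

Here `tracks[0]["bigDataUrl"]` holds the link to the data file, while `meta` holds the free-form metadata pairs that a browser could turn into selection menus.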

URL to The Track Hub Registry:

https://www.trackhubregistry.org/

GitHub repository for the Track Hub Registry:

https://github.com/Ensembl/trackhub-registry

TrackFind client in GSuite HyperBrowser

The GSuite HyperBrowser is a general-purpose web-based platform for rigorous statistical analysis of track data, built upon the Galaxy framework. The HyperBrowser already supports a track search mechanism (a limited prototype), which uses the GSuite format to move collections of track data (typically resulting from a track search operation) through both basic and advanced data manipulation and analysis steps. A TrackFind client has been implemented to replace the existing prototype, and has been shown to work with BLUEPRINT data.

A test version of GSuite HyperBrowser containing the TrackFind client tool is available from the following URL:

https://hyperbrowser.uio.no/trackfind_test/

Note:
The TrackFind client tool is available from the left-hand tool menu, under the header "Create a GSuite of genomic tracks". The name of the tool is: "Create a remote GSuite from a public repository (TrackFind client)".

GitHub repository for GSuite HyperBrowser:

https://github.com/hyperbrowser/genomic-hyperbrowser

TrackFind client in EPICO

EPICO is an open-access reference set of tools, libraries and APIs to develop comparative epigenomic data portals, as well as a data and metadata validator and database loader. EPICO components work with a customizable, rich data model where ontology term checks can be introduced for specific fields, as a generalization to enumerated values. EPICO has been used to implement:

The BLUEPRINT Data Analysis Portal

The BLUEPRINT Data Analysis Portal provides a virtual desktop for the comparative analysis of epigenetic features, recorded features (genes, transcripts, etc.) and pathways in the context of differentiation of hematopoietic lineages. The EPICO system has been modified to upload the FAIR metadata from TrackFind.

GitHub repositories with changes to the EPICO system: