Project details
Description
1) We will build a robust, dynamic Translator Standards and Reference Implementations
Component (SRI) that integrates the collaborations and investments that the NCATS Translator
has made to date. This component will consist of a suite of standards and products, a model for
their governance, and processes to coordinate integration and shared implementation:
● Community governance coordination will be developed with community buy-in to ensure
an effective collaborative environment and to drive consortium-wide consensus on the other
components.
● Architecture and API specifications will drive community efforts to define details of project
architecture and communication protocols across Translator Knowledge Providers (KPs),
Autonomous Relay Agents (ARAs), and the Autonomous Relay System (ARS).
● The BioLink model will define the standard entity types, relationship types, and a schema
shared by all Translator components. This includes related utility libraries and a novel
approach to accommodate multiple alternate data modeling perspectives.
● Integrated reference ontologies will provide BioLink-compliant terms and relationships. We
will draw on the ROBOKOP Ubergraph framework [1], the Monarch integrated ontologies,
and other ontologies from Open Biological and Biomedical Ontologies (OBO) [2].
● A continually updated knowledge graph and data lake will provide Translator with a
standardized and integrated global view of the whole information landscape.
● Next-generation Shared Translator Services will integrate features of ROBOKOP [3],
Monarch [4], BioLink [5], and the reasoner APIs to remove integration barriers. These
services will provide validation, lookup, and mapping functionality for use across Translator.
● A registry of Translator KPs, ARAs, and shared services will increase efficiency,
eliminate duplication of effort, and promote collaboration.
2) Our proposed SRI will address the problem of connecting different components and
data/information sources at scale, with community buy-in, and with a plan for sustainability.
3) For the development of the standards component of the SRI, our plan will begin with
accepted Translator standards, and we will work with the ARS, ARAs, and KPs to identify gaps.
We will establish a community process for contributing to the standards, based on GitHub pull
requests and voting, so that everyone can contribute effectively and fairly with clear attribution. We
will ensure rigorous documentation and testing. For the reference implementation component,
we will stand up core Translator services and will add further services only when they are useful to
more than one Translator component.
4) Consensus-building is hard. Our team has proven expertise and resources to identify needs,
refine solutions, and find agreement, thereby successfully bringing infrastructure to fruition. Our
team also has the technical and biological expertise to design and test the necessary standards,
having been at the forefront of multiple ontology, data standards, and large enterprise software
initiatives.
5) The Translator infrastructure is by nature heterogeneous, distributed, and growing;
consequently, the most significant data and infrastructure challenge is managing the validity,
currency, equivalency, and typing of entities (diseases, phenotypes, drugs, etc.). Our group has
developed several innovative algorithms for managing this and related problems; these
algorithms are in use for other integration projects and will be modified to suit Translator needs.
Status | Active |
---|---|
Start date/End date | 23/1/20 → 30/11/24 |
Links | https://reporter.nih.gov/project-details/11073187 |
ASJC Scopus Subject Areas
- Computer Science (all)