CIF21 DIBBS: EI: VIFI: Virtual Information-Fabric Infrastructure (VIFI) for Data-Driven Decisions from Distributed Data

  • Tolone, William W.J. (PI)
  • Djorgovski, Stanislav S.G. (CoPI)
  • Talukder, Ashit A. (CoPI)
  • Hadzikadic, Mirsad M. (CoPI)
  • Al-shaer, Ehab E.S. (CoPI)
  • Tao, Yong Y.X. (CoPI)

Project Details

Description

Data discovery and data analytics often rely on multiple data sources and on data residing in distributed locations. This project builds infrastructure that encourages data-driven discovery from distributed, fragmented datasets without requiring movement of massive amounts of data and without exposing sensitive raw datasets to end users. The capability will be applied to a wide range of science topics: the large sky surveys of astronomy, for which the collecting instruments are distributed nationally and internationally; classification of Earth science satellite data; management of sickle-cell disease and antimicrobial resistance surveillance studies; and integration of the highly distributed and fragmented data sources needed for multi-hazard mitigation and for sustainable and resilient human-building ecosystem research. The project outlines an ambitious outreach effort, will enable interdisciplinary training at multiple universities and institutions, and will contribute to the training of early career researchers.

The project creates a Virtual Information-Fabric Infrastructure (VIFI) that allows scientists to search, access, manipulate, and evaluate fragmented, distributed data in the information 'fabric' (the infrastructure that facilitates data sharing) without directly accessing or moving large amounts of data. The system addresses the challenges of coordinating loosely federated infrastructure, distributed data management, security, and privacy. The architecture combines a set of loosely coupled components, some representing proven capabilities and several still emerging. The VIFI infrastructure includes a novel orchestration layer for on-site analytics and hybrid-infrastructure (GPU, CPU) management, a dynamic, secure, container-based infrastructure that enables online adaptive analytics on unshareable data at distributed locations, and enhanced data and code management tools. The layer also provides improved search, access, and query through persistent identifiers and through automated semantic descriptions (metadata) of raw data produced with semantic data mining techniques. By integrating several NSF-funded components into a coherent whole, VIFI allows researchers to search, access, manipulate, and evaluate data elements without requiring detailed familiarity with the data infrastructure itself. The system contributes to and expands the set of resources serving diverse communities, and is extensible to additional communities. The project contains a substantial outreach effort, including training of early career scientists.
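
The idea of analyzing data in place, rather than moving it, can be illustrated with a minimal Python sketch. This is not VIFI code: the `Site` class, the `partial_stats` analysis, and the `federated_mean` coordinator are hypothetical names invented for illustration, and the sketch leaves out the containers, orchestration, and security that the description mentions. It only shows the pattern in which an analysis travels to each site and a small summary, not the raw records, comes back.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Site:
    """Hypothetical data site; raw records are meant to stay at the site."""
    name: str
    _records: List[float]

    def run(self, analysis: Callable[[List[float]], Dict[str, float]]) -> Dict[str, float]:
        # The analysis executes where the data lives; only its summary is returned.
        return analysis(self._records)


def partial_stats(records: List[float]) -> Dict[str, float]:
    """Analysis shipped to each site: returns aggregates, never raw values."""
    return {"count": float(len(records)), "total": float(sum(records))}


def federated_mean(sites: List[Site]) -> float:
    """Hypothetical coordinator: combines per-site summaries into a global mean."""
    partials = [site.run(partial_stats) for site in sites]
    count = sum(p["count"] for p in partials)
    total = sum(p["total"] for p in partials)
    return total / count


# Two illustrative sites with made-up numbers.
sites = [Site("site_a", [1.0, 2.0, 3.0]), Site("site_b", [5.0, 7.0])]
print(federated_mean(sites))
```

In this sketch the coordinator sees only a count and a sum from each site, so the amount of data crossing site boundaries is independent of the size of the underlying datasets.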

Status: Finished
Effective start/end date: 1/10/16 → 30/9/22

Funding

  • National Science Foundation: US$3,999,531.00

ASJC Scopus Subject Areas

  • Ecology
  • Computer Science (all)
