III: Small: Optimization Techniques for Scalable Semantic Web Data Processing in the Cloud

  • Anyanwu-ogan, Kemafor K. (Principal Investigator)

Project Details

Description

Cloud-based data processing platforms are an increasingly attractive alternative for large-scale data processing, and their use for various processing tasks on large-scale unstructured and structured data is under active investigation. At the same time, growing interest in many communities in more automatic sharing and exchange of data on the Web using Semantic Web techniques has led to a rapid surge in the availability of very large, real-world Semantic Web datasets. Such data are semi-structured and have more complex processing requirements than relational data, owing to the fine-grained modeling of data and the need for inferencing during processing. Consequently, existing optimization techniques for cloud data processing platforms, which often adapt relational optimization techniques, do not address the needs of such workloads. Further, such techniques do not adequately account for the nuances of cloud runtime platforms such as Hadoop, e.g., dataflow length as a cost metric and the absence of a priori indexes and statistics.

This project contributes insight into query optimization requirements for Semantic Web data processing on MapReduce platforms. Its contributions include a novel Nested TripleGroup data model and Algebra (NTGA); algebraic and dynamic cost query optimization techniques; inter- and intra-work sharing techniques; data representation formats; and system architecture issues of integrating Semantic Web optimization techniques into frameworks such as Apache Pig. The impact of this project will cut across the growing range of communities that are aggressively adopting Semantic Web tenets, such as scientific, business, government, and other general-purpose communities.
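
To make the "fine-grained modeling" point concrete, the following minimal Python sketch groups RDF-style triples by subject, loosely in the spirit of a triple-group representation. The group-by-subject rule, the function names `group_by_subject` and `match_star`, and the sample data are all assumptions made for illustration. The description above does not define NTGA's operators, so this should not be read as the project's actual model or algebra.

```python
from collections import defaultdict

# Hypothetical sample data: each fact is one (subject, property, object) triple.
TRIPLES = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:acme", "ex:locatedIn", "ex:raleigh"),
]


def group_by_subject(triples):
    """Collect triples that share a subject into one group (illustrative only)."""
    groups = defaultdict(list)
    for subject, prop, obj in triples:
        groups[subject].append((prop, obj))
    return dict(groups)


def match_star(groups, required_props):
    """Return subjects whose group contains every property in required_props."""
    required = set(required_props)
    return sorted(
        subject
        for subject, pairs in groups.items()
        if required <= {prop for prop, _ in pairs}
    )


if __name__ == "__main__":
    groups = group_by_subject(TRIPLES)
    # Subjects that have both a name and an employer.
    print(match_star(groups, ["ex:name", "ex:worksFor"]))
```

In this sketch, a single grouping pass answers a multi-property pattern on one subject, which over a flat triple table would otherwise take one self-join per additional property.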

Status: Finished
Start date/End date: 1/9/12 - 30/6/17

Funding

  • National Science Foundation: USD 446,942.00

ASJC Scopus Subject Areas

  • Algebra and Number Theory
  • Computer Science (all)
