Collaborative Research: CSR: Small: Cross-layer learning-based Energy-Efficient and Resilient NoC design for Multicore Systems

  • Wang, Ke (PI)

Project Details

Description

The proliferation of multiple cores on the chip has signaled the advent of communication-centric, rather than computation-centric, systems. Consequently, the design of low-latency, high-bandwidth, power-efficient, and reliable Networks-on-Chip (NoCs) is proving to be one of the most critical challenges to achieving the performance potential of future multicore systems. However, while multicores are facilitating an enormous integration capacity, rapid transistor scaling has led to a steady degradation of device and circuit reliability: unpredictable device behavior will undeniably increase and will result in a significant increase in faults (both permanent and transient) and hardware failures. The ramifications for the NoC are immense: a single fault in the NoC may paralyze the working of the entire chip. While considerable efforts are being undertaken to tackle the reliability challenge of NoCs, most current solutions concentrate on local optimizations within individual NoC abstractions (e.g., circuit, message, and network layers). These solutions tend to possess limited knowledge of the overall system and are therefore reactive in behavior, making worst-case assumptions and overprovisioning. As a result, they introduce significant power, area, and performance overheads while not completely solving the reliability challenge.

This research project tackles the critical NoC reliability challenge by developing a comprehensive, cooperative, and adaptive multi-layer approach for designing reliable NoCs from fault-susceptible components, with globally optimized power, performance, and cost. To achieve this research goal, the project is organized into four interrelated research tasks. First, the project conducts a comprehensive study of the fundamental mechanisms that underlie the reliability issues across NoC abstractions, including a detailed analysis of the dynamic interactions of NoC abstractions and design trade-offs. Second, the project develops a cross-layer NoC architecture for resilient on-chip communication with machine-learning-based optimization. Third, the project aims to incorporate the application layer and off-chip communications and develops a holistic design framework that can automatically capture and adapt to the various computation and communication requirements of different applications with optimized performance, power, and reliability. Finally, the project evaluates the designed framework by developing a cycle-accurate simulation framework and an FPGA prototype.

This project will significantly advance the fundamental understanding of the interplay between the NoC and the rest of the components on the chip (cores, memory, etc.), as well as the design trade-offs between performance, power, reliability, and cost in future massively defective nanometer technologies. The developed NoC framework will benefit future multicore architectures and computing systems with system-level performance and reliability improvements.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Status: Active
Effective start/end date: 1/10/23 to 30/9/26

Funding

  • National Science Foundation: US$225,000.00

ASJC Scopus Subject Areas

  • Computer Science(all)
  • Computer Networks and Communications
  • Engineering(all)
