A HOMOGENEOUS ACCELERATOR FOR DEEP NEURAL NETWORKS

2022 COE Engineering Design Project (FM05)

Faculty Lab Coordinator

Farah Mohammadi

Topic Category

Embedded Systems

Preamble

Deep neural network (DNN) algorithms have shown significant advantages in artificial intelligence (AI) applications such as pattern recognition, prediction, and control optimization. Because DNN algorithms rely on massively parallel processing, the performance of existing computational platforms running these algorithms is often limited by large communication overheads and storage requirements. As a result, architecting a low-overhead homogeneous framework for executing DNN techniques is a crucial step toward improving future AI-based systems.

Objective

The goal of this work is to architect a high-performance homogeneous accelerator (pure multi-CPU or pure multi-GPU) for executing DNN algorithms. Companies such as Intel and NVIDIA are looking for low-cost, high-performance hardware platforms to execute complex DNN algorithms. Homogeneous architectures for running complex algorithms are more accessible than heterogeneous platforms. Therefore, this work aims to compare existing homogeneous architectures and to propose a low-overhead, high-performance homogeneous framework for future DNN algorithms.

Partial Specifications

- Simulating multi-CPU architectures for each DNN algorithm.
- Simulating multi-GPU architectures for each DNN algorithm.
- Studying the energy, performance, and area of each DNN algorithm on multi-CPU architectures.
- Studying the energy, performance, and area of each DNN algorithm on multi-GPU architectures.
- gem5 should be installed.
- GPGPU-Sim should be installed.
- McPAT and HotSpot should be installed.

Suggested Approach

1- Developing a framework for different multi-CPU topologies.
2- Developing a framework for different multi-GPU topologies.
3- Simulating a multi-CPU architecture on gem5.
4- Simulating a multi-GPU architecture on GPGPU-Sim.
5- Comparing the energy consumption and performance of multi-CPU and multi-GPU architectures under different DNN workloads, and proposing an optimum architecture for each DNN workload.
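
The comparison in step 5 could be sketched as follows. This is only an illustration: it assumes the energy-delay product (EDP) as the figure of merit, which this brief does not prescribe, and the `SimResult` type, the workload labels, and all numbers are hypothetical placeholders rather than simulation results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimResult:
    """One simulation run; every field here is a hypothetical placeholder."""
    workload: str      # DNN workload label, e.g. "cnn"
    platform: str      # "multi-cpu" or "multi-gpu"
    runtime_s: float   # simulated execution time in seconds
    energy_j: float    # total energy in joules

    @property
    def edp(self) -> float:
        # Energy-delay product (assumed metric): lower is better.
        return self.energy_j * self.runtime_s


def best_platform_per_workload(results):
    """Return {workload: platform with the lowest EDP}."""
    best = {}
    for r in results:
        current = best.get(r.workload)
        if current is None or r.edp < current.edp:
            best[r.workload] = r
    return {w: r.platform for w, r in best.items()}


# Placeholder numbers for illustration only, not simulator output.
results = [
    SimResult("cnn", "multi-cpu", runtime_s=2.0, energy_j=50.0),
    SimResult("cnn", "multi-gpu", runtime_s=0.5, energy_j=80.0),
    SimResult("mlp", "multi-cpu", runtime_s=0.2, energy_j=4.0),
    SimResult("mlp", "multi-gpu", runtime_s=0.1, energy_j=20.0),
]
choice = best_platform_per_workload(results)
```

A different metric (energy alone, runtime alone, or a weighted combination that includes area) could be substituted by changing the `edp` property; the selection logic would stay the same.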

Group Responsibilities

- Studying the resource usage of DNN algorithms in existing simulators such as gem5 and GPGPU-Sim, which Intel and AMD are working on.
- Working on gem5.
- Working on GPGPU-Sim.
- Comparing multi-CPU and multi-GPU architectures in terms of energy consumption and performance.
- Proposing an optimum architecture for each DNN workload.
- Preparing a technical report and presenting the results at the end of the program.

Student A Responsibilities

- Designing a resource-sharing model for each DNN algorithm on multi-CPU architectures.
- Working with the gem5 simulator for the multi-CPU architectures.

Student B Responsibilities

- Designing a resource-sharing model for each DNN algorithm on multi-GPU architectures.
- Working with the GPGPU-Sim simulator for the multi-GPU architectures.

Student C Responsibilities

- Installing the GPGPU-Sim simulator for the GPUs and extracting the related performance and power logs.
- Installing the gem5 simulator for the CPUs and extracting the related performance and power logs.
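
The log-extraction step might look like the minimal sketch below. It assumes a whitespace-separated `name value # comment` text layout, such as the one gem5 writes to its statistics file. The `parse_stats` helper is hypothetical, and the statistic names and values in the sample are made up for illustration; actual names vary between simulator versions, and GPGPU-Sim and McPAT output use different layouts that would need their own parsers.

```python
def parse_stats(text):
    """Parse 'name value # comment' lines into {name: float}.

    Lines that do not fit the pattern (banners, blank lines,
    non-numeric values) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        # Drop the trailing comment, then split into name and value.
        body = line.split("#", 1)[0].split()
        if len(body) < 2:
            continue
        try:
            stats[body[0]] = float(body[1])
        except ValueError:
            continue
    return stats


# Illustrative sample only; names and values are assumptions.
sample = """
---------- Begin Simulation Statistics ----------
simSeconds     0.002500   # Number of seconds simulated
simInsts       1000000    # Number of instructions simulated
---------- End Simulation Statistics   ----------
"""
stats = parse_stats(sample)
# Derived performance figure: simulated instructions per second.
ips = stats["simInsts"] / stats["simSeconds"]
```

Collecting the parsed values into one dictionary per run keeps the later energy and performance comparison independent of each simulator's output format.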

Student D Responsibilities

- Simulating the final proposed architecture for each DNN algorithm.

Course Co-requisites

Digital Systems, Programming in C, Microprocessors and Microsystems.

To ALL EDP Students

Due to the COVID-19 pandemic, in the event the University is not open for in-class/in-lab activities during the Winter term, your EDP topic specifications, requirements, implementations, and assessment methods will be adjusted by your FLCs at their discretion.

FM05: A HOMOGENEOUS ACCELERATOR FOR DEEP NEURAL NETWORKS | Farah Mohammadi | Sunday September 11th 2022 at 01:41 PM