Parallel Computer Architecture
Credit: 3
Objective
- To understand the principles of parallel computer architecture
- To understand the design of parallel computer systems, including modern parallel architectures
- To assess the communication and computing possibilities of parallel system architecture and to predict the performance of parallel applications
Unit – I Fundamentals of Computer Design
Defining Computer Architecture – Trends in Technology – Trends in Power in Integrated Circuits – Trends in Cost – Dependability – Measuring, Reporting and Summarizing Performance – Quantitative Principles of Computer Design – Basic and Intermediate Concepts of Pipelining – Pipeline Hazards – Pipelining Implementation Issues
Unit – II Instruction-Level Parallelism and Its Exploitation
Instruction-Level Parallelism: Concepts and Challenges – Basic Compiler Techniques for Exposing ILP – Reducing Branch Costs with Prediction – Overcoming Data Hazards with Dynamic Scheduling – Dynamic Scheduling: Algorithm and Examples – Hardware-Based Speculation – Exploiting ILP Using Multiple Issue and Static Scheduling – Exploiting ILP Using Dynamic Scheduling, Multiple Issue and Speculation – Studies of the Limitations of ILP – Limitations on ILP for Realizable Processors – Hardware versus Software Speculation – Using ILP Support to Exploit Thread-Level Parallelism
Unit – III Data-Level and Thread-Level Parallelism
Vector Architecture – SIMD Instruction Set Extensions for Multimedia – Graphics Processing Units – Detecting and Enhancing Loop-Level Parallelism – Centralized Shared-Memory Architectures – Performance of Shared-Memory Multiprocessors – Distributed Shared Memory and Directory Based Coherence – Basics of Synchronization – Models of Memory Consistency – Programming Models and Workloads for Warehouse-Scale Computers – Computer Architecture of Warehouse-Scale Computers – Physical Infrastructure and Costs of Warehouse-Scale Computers
Unit – IV Memory Hierarchy Design
Cache Performance – Six Basic Cache Optimizations – Virtual Memory – Protection and Examples of Virtual Memory – Ten Advanced Optimizations of Cache Performance – Memory Technology and Optimizations – Protection: Virtual Memory and Virtual Machines – The Design of Memory Hierarchies
Unit – V Storage Systems & Case Studies
Advanced Topics in Disk Storage – Definition and Examples of Real Faults and Failures – I/O Performance, Reliability Measures and Benchmarks – Designing and Evaluating an I/O System – The Internet Archive Cluster
Case Studies / Lab Exercises: Intel i3, i5, i7 processor cores, NVIDIA GPUs, AMD, ARM processor cores – Simulators – gem5, CACTI, Simics, Multi2Sim and Intel software development tools.
Outcome
- Students become familiar with the representation of data, addressing modes, and instruction sets.
- Students are able to understand parallelism in terms of both a single processor and multiple processors.
- Students gain technical know-how of parallel hardware constructs, including instruction-level parallelism, for multicore processor design.
Text Books
- John L. Hennessy, David A. Patterson, "Computer Architecture: A Quantitative Approach", Elsevier, 5th Edition, 2012.
- K. Hwang, Naresh Jotwani, "Advanced Computer Architecture: Parallelism, Scalability, Programmability", Tata McGraw Hill, 2nd Edition, 2010.