Credit: 3
Objective
To understand the principles of parallel computer architecture
To understand the design of parallel computer systems including modern parallel architectures
To assess the communication and computing capabilities of parallel system architectures and to predict the performance of parallel applications
Unit – I Fundamentals of Computer Design
Defining Computer Architecture – Trends in Technology – Trends in Power in Integrated Circuits – Trends in Cost – Dependability – Measuring, Reporting and Summarizing Performance – Quantitative Principles of Computer Design – Basic and Intermediate Concepts of Pipelining – Pipeline Hazards – Pipelining Implementation Issues
Unit – II Instruction-Level Parallelism and Its Exploitation
Instruction-Level Parallelism: Concepts and Challenges – Basic Compiler Techniques for Exposing ILP – Reducing Branch Costs with Prediction – Overcoming Data Hazards with Dynamic Scheduling – Dynamic Scheduling: Algorithm and Examples – Hardware-Based Speculation – Exploiting ILP Using Multiple Issue and Static Scheduling – Exploiting ILP Using Dynamic Scheduling, Multiple Issue and Speculation – Studies of the Limitations of ILP – Limitations on ILP for Realizable Processors – Hardware versus Software Speculation – Using ILP Support to Exploit Thread-Level Parallelism
Unit – III Data-Level and Thread-Level Parallelism
Vector Architecture – SIMD Instruction Set Extensions for Multimedia – Graphics Processing Units – Detecting and Enhancing Loop-Level Parallelism – Centralized Shared-Memory Architectures – Performance of Shared-Memory Multiprocessors – Distributed Shared Memory and Directory Based Coherence – Basics of Synchronization – Models of Memory Consistency – Programming Models and Workloads for Warehouse-Scale Computers – Computer Architecture of Warehouse-Scale Computers – Physical Infrastructure and Costs of Warehouse-Scale Computers
Unit – IV Memory Hierarchy Design
Cache Performance – Six Basic Cache Optimizations – Virtual Memory – Protection and Examples of Virtual Memory – Ten Advanced Optimizations of Cache Performance – Memory Technology and Optimizations – Protection: Virtual Memory and Virtual Machines – The Design of Memory Hierarchies
Unit – V Storage Systems & Case Studies
Advanced Topics in Disk Storage – Definition and Examples of Real Faults and Failures – I/O Performance, Reliability Measures and Benchmarks – Designing and Evaluating an I/O System – The Internet Archive Cluster
Case Studies / Lab Exercises: Intel Core i3, i5 and i7 processor cores, NVIDIA GPUs, AMD and ARM processor cores – Simulators: gem5, CACTI, Simics, Multi2Sim and Intel software development tools.
Outcome
Students will be familiar with the representation of data, addressing modes, and instruction sets.
Students will be able to understand parallelism both within a single processor and across multiple processors.
Students will gain technical know-how of parallel hardware constructs, including instruction-level parallelism, for multicore processor design.
Text Books
John L. Hennessy, David A. Patterson, "Computer Architecture: A Quantitative Approach", Elsevier, 5th Edition, 2012.
K. Hwang, Naresh Jotwani, "Advanced Computer Architecture: Parallelism, Scalability, Programmability", Tata McGraw-Hill, 2nd Edition, 2010.