Parallel Computer Architecture
 
																									
	Credit: 3
	Objective
	- To understand the principles of parallel computer architecture
	- To understand the design of parallel computer systems including modern parallel architectures
	- To assess the communication and computing possibilities of parallel system architecture and to predict the performance of parallel applications
	 
	Unit – I Fundamentals of Computer Design
	Defining Computer Architecture – Trends in Technology – Trends in Power in Integrated Circuits – Trends in Cost – Dependability – Measuring, Reporting and Summarizing Performance – Quantitative Principles of Computer Design – Basic and Intermediate Concepts of Pipelining – Pipeline Hazards – Pipelining Implementation Issues
	 
	Unit – II Instruction-Level Parallelism and Its Exploitation
	Instruction-Level Parallelism: Concepts and Challenges – Basic Compiler Techniques for Exposing ILP – Reducing Branch Costs with Prediction – Overcoming Data Hazards with Dynamic Scheduling – Dynamic Scheduling: Algorithm and Examples – Hardware-Based Speculation – Exploiting ILP Using Multiple Issue and Static Scheduling – Exploiting ILP Using Dynamic Scheduling, Multiple Issue and Speculation – Studies of the Limitations of ILP – Limitations on ILP for Realizable Processors – Hardware versus Software Speculation – Using ILP Support to Exploit Thread-Level Parallelism
	 
	Unit – III Data-Level and Thread-Level Parallelism
	Vector Architecture – SIMD Instruction Set Extensions for Multimedia – Graphics Processing Units – Detecting and Enhancing Loop-Level Parallelism – Centralized Shared-Memory Architectures – Performance of Shared-Memory Multiprocessors – Distributed Shared Memory and Directory-Based Coherence – Basics of Synchronization – Models of Memory Consistency – Programming Models and Workloads for Warehouse-Scale Computers – Computer Architecture of Warehouse-Scale Computers – Physical Infrastructure and Costs of Warehouse-Scale Computers
	 
	Unit – IV Memory Hierarchy Design
	Cache Performance – Six Basic Cache Optimizations – Virtual Memory – Protection and Examples of Virtual Memory – Ten Advanced Optimizations of Cache Performance – Memory Technology and Optimizations – Protection: Virtual Memory and Virtual Machines – The Design of Memory Hierarchies
	 
	Unit – V Storage Systems & Case Studies
	Advanced Topics in Disk Storage – Definition and Examples of Real Faults and Failures – I/O Performance, Reliability Measures and Benchmarks – Designing and Evaluating an I/O System – The Internet Archive Cluster 
	Case Studies / Lab Exercises: Intel i3, i5 and i7 processor cores, NVIDIA GPUs, AMD and ARM processor cores – Simulators: gem5, CACTI, Simics, Multi2Sim and Intel software development tools.
	Outcome
	- Students become familiar with the representation of data, addressing modes, and instruction sets.
	- Students are able to understand parallelism in terms of both a single processor and multiple processors.
	- Students gain technical know-how of parallel hardware constructs, including instruction-level parallelism, for multicore processor design.
	 
	Text Books
	- John L. Hennessy, David A. Patterson, "Computer Architecture: A Quantitative Approach", Elsevier, 5th Edition, 2012.
	- K. Hwang, Naresh Jotwani, "Advanced Computer Architecture: Parallelism, Scalability, Programmability", Tata McGraw Hill, 2nd Edition, 2010.