Intel Multicore Lab
Equipment Details
Intel Software Development Tools
Intel Integrated Performance Primitives
Intel Threading Building Blocks
Intel VTune Amplifier
Intel Cluster Toolkit
Intel MPI library
Intel Trace Analyzer and Collector
Eucalyptus cloud computing tool
MPI programming
MPI_Send()
MPI_Recv()
SESC (SuperESCalar Simulator)
- Multiprocessor simulator used for modeling caches and out-of-order pipelines.
- Capable of simulating static and dynamic instructions.
M5 Sim
- It is an event-driven simulation tool.
- Enables users to simulate a multi-core environment.
- It models CPU cores and caches as objects.
List of M.Tech Projects in INTEL MULTICORE LAB
S. No. | Roll No. | Name | Title of the Project | Month & Year
1 | CSA 0514 | G. Pravinth | A Cache-Aware Scheduling Scheme for Real Time Tasks on Multicore Platforms | Dec-06
2 | CSA 0513 | V. SenthilKumar | Parallelization Methodology for Multicore Architecture Simulation | Dec-06
3 | CSA 0515 | M. Sivaram | A Cycle Accurate ISS for a Dynamically Reconfigurable Processor Architecture | Dec-06
4 | CSA 0501 | Sunitha P. George | A Generic Dual-Core Architecture | Dec-06
5 | CSA 0513 | V. SenthilKumar | Parallelization and Power Evaluation Methodology for Multicore Architecture Simulation | May-07
6 | 206107001 | B R Prasad | Interconnection In Multicore Architecture Design | May-09
7 | 206108019 | Srinivas Reddy A | Study of Ear Segmentation For Implementing Face Recognition | Dec-09
8 | 206108002 | Atul Baban Chavan | A Proposal Of Thread Scheduler Framework For Multi-core Platform (Phase-I) | Dec-09
9 | 206108021 | Dhawaleswar Rao | Study on The Performance of Some Web Caching Replacement Algorithms | Dec-09
10 | 206108002 | Atul Baban Chavan | A Proposal Of Thread Scheduler Framework For Multi-core Platform(Phase-II) | May-10
11 | 206109020 | Hathiram Banoth | Study of Performance Issues on Multicore Architecture | Dec-10
12 | 206109026 | Pavan Kumar Paruchuri | Study of Cache-Aware Real-Time Schedulers In Multicore Platforms | Dec-10
13 | 206109026 | Pavan Kumar Paruchuri | Global Scheduling Algorithms For Small To Medium Multicore Platforms | May-11
14 | 206109020 | Hathiram Banoth | Studies of Cache Performance Evaluation of Multicore Architectures For The ISAs | May-11
15 | 206111035 | Amit Kumar Singh | Techniques For Better Utilization Of Shared Caches In Multicore Architectures | Dec-12
16 | 206111035 | Amit Kumar Singh | Study on performance of cache coherence protocols for multicore architectures | May-13
17 | 206111008 | Tanmoy Kundu | Studies On The Impact Of Memory Management On Process Scheduling In The Context Of Multicore Architecture | Dec-12
18 | 206111008 | Tanmoy Kundu | Implementation of scheduling schemes to mitigate shared resource contention in multicore architecture | May-13
19 | 206112029 | Prabhin | BIOS Design for Thunderbolt in Next Generation Intel Platforms | Dec-13
20 | 206112029 | Prabhin | BIOS Design for Thunderbolt in Next Generation Intel Platforms | May-14
21 | 206113017 | S. Vinod kumar | A proposed schema for efficient Packet classification in network Processors | July-14
22 | 206113034 | K. Hemalatha | Memory aware task scheduling for real Time operating systems | July-14
23 | 206113017 | S Vinod Kumar | Implementing ravel and gruu feature for ims 3gpp release-11 | Dec-14
24 | 206113034 | K. Hemalatha | Adaptive bitrate transcoding for power efficient video streaming in mobile devices | Dec-14
25 | 206113017 | S Vinod Kumar | Architecture for roaming user scenario for voice over ims with local breakout | May-15
26 | 206113034 | K. Hemalatha | Pose estimation technique using modified posit method for mobile devices | May-15
27 | 206114015 | Sangeetha Vikraman | Performance comparison for Reconfigurable and partial reconfigurable SOC | Dec-15
28 | 206114015 | Sangeetha Vikraman | performance enhancement of low power mode in 3g firmware | May-15
29 | 206114015 | Sangeetha Vikraman | Implementing digrf driver host test with multiple excecution context | May-16
30 | 206115007 | Sreedeep C | Prevention of Side Channel Attacks using Hardware | May-16
31 | 206115013 | Siraj P S | Hardware Security using Bloom Filters | May-16
32 | 206115008 | Pranita Solanke | Implementation of MESI Cache Coherence Protocol using Snoop Filtering Technique | May-16
33 | 206115007 | Sreedeep C | Dynamic Partial Re-configuration of Image Processing Blocks on FPGA | Dec-16
34 | 206115013 | Siraj P S | Improving the Performance of H.265 Video Encoding using CPU+GPU Systems | Dec-16
35 | 206115008 | Pranita Solanke | Efficient Hardware Implementation of Multi-Modular Exponentiation in RSA Algorithms | Dec-16
Ongoing and Completed Projects in INTEL LAB 2014-2016
Energy efficient modular exponentiation for PKC (Completed)
Modular exponentiation and modular multiplications are two fundamental operations in various cryptographic applications, and hence the performance of public-key cryptographic algorithms is strongly influenced by the efficient implementation of these operations. Reducing the frequency of modular multiplications and the time requirements for modular multiplication will help in developing efficient modular exponential algorithms. This work proposes an energy efficient modular exponential algorithm based on bit forwarding techniques. In particular, two algorithms, Bit Forwarding 1-bit (BFW1) and Bit Forwarding 2-bits (BFW2), which are modifications of the existing binary exponential algorithm, have been developed. Hardware realizations of the proposed algorithms have been evaluated in terms of throughput, power and energy. Results show increased throughput of the order of 11.02% and 15.13%, reduction in power to 1.93% and 6.35% and energy saving of the order of 1.9% and 6.35% for BFW1 and BFW2 algorithms respectively. Xilinx ISE-14.2 on Virtex-5 evaluation board and ICARUS Verilog simulation and synthesis tool are used for hardware realization for FPGA and synthesized using Cadence for ASIC.
©2016 Elsevier B.V. All rights reserved. (http://www.sciencedirect.com/science/article/pii/S0020019016301715)
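For reference, the existing binary exponential algorithm that the abstract takes as its baseline can be sketched as follows. This is only the standard left-to-right square-and-multiply method with a counter for modular multiplications; it is not BFW1 or BFW2, whose details are not given here. The function name `binary_modexp` and the counter are assumptions made for illustration.

```python
def binary_modexp(base, exponent, modulus):
    """Left-to-right binary (square-and-multiply) modular exponentiation.

    Returns (result, multiplications), where multiplications counts every
    modular squaring and modular multiplication performed.
    This is the textbook baseline, not the BFW1/BFW2 algorithms.
    """
    if modulus <= 0 or exponent < 0:
        raise ValueError("modulus must be positive and exponent non-negative")
    result = 1 % modulus
    multiplications = 0
    for bit in bin(exponent)[2:]:               # scan exponent bits, MSB first
        result = (result * result) % modulus    # one squaring per bit
        multiplications += 1
        if bit == "1":
            result = (result * base) % modulus  # extra multiply for each 1 bit
            multiplications += 1
    return result, multiplications


if __name__ == "__main__":
    value, count = binary_modexp(4, 13, 497)
    # Cross-check against Python's built-in three-argument pow().
    assert value == pow(4, 13, 497)
    print(value, count)
```

The count grows with both the bit length of the exponent and the number of 1 bits, which is why reducing the frequency of modular multiplications is the lever the abstract describes.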
Usability aware Resource saving in handheld devices (Completed)
The emergence of new operating systems and applications for mobile phones and tablets has created a need for power optimization. Storage space has become another concern, as new operating systems have started supporting video codecs and formats originally meant for desktop applications, without compression or conversion. The proposed approach identifies the region of interest in a video by combining feature extraction with natural statistics for dynamic analysis of the scene. The portion of the original video outside the region of interest is degraded in order to increase the redundancy of pixel values within a frame. OpenCV is used to implement the saliency map. This approach is expected to reduce power consumption, file size and average CPU usage on handheld devices.
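The idea of degrading pixels outside the region of interest to increase redundancy can be illustrated with a minimal sketch. It assumes the region of interest is already known as a rectangle (the saliency map itself is not reproduced here) and uses coarse quantization as a stand-in for the degradation step, which the description above does not specify. The names `degrade_outside_roi` and `distinct_values` and the `step` parameter are hypothetical.

```python
def degrade_outside_roi(frame, roi, step=32):
    """Quantize pixels outside the region of interest to multiples of `step`.

    frame: list of rows, each a list of 0-255 grayscale values.
    roi:   (top, left, bottom, right), with bottom and right exclusive.
    Quantization is an assumed stand-in for the degradation step.
    """
    top, left, bottom, right = roi
    out = []
    for y, row in enumerate(frame):
        new_row = []
        for x, value in enumerate(row):
            inside = top <= y < bottom and left <= x < right
            # Pixels inside the ROI are kept; the rest lose fine detail.
            new_row.append(value if inside else (value // step) * step)
        out.append(new_row)
    return out


def distinct_values(frame):
    """Number of different pixel values, a rough proxy for redundancy."""
    return len({value for row in frame for value in row})


if __name__ == "__main__":
    frame = [[10, 20, 30, 40],
             [50, 60, 70, 80],
             [90, 100, 110, 120],
             [130, 140, 150, 160]]
    degraded = degrade_outside_roi(frame, roi=(1, 1, 3, 3))
    print(distinct_values(frame), distinct_values(degraded))
```

Fewer distinct values outside the region of interest means longer runs of identical pixels, which a video encoder can typically compress more cheaply.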
Adaptive Bitrate Transcoding for Power Efficient Video Streaming in Mobile Devices (Completed)
Video applications are an important part of mobile devices. Battery capacity is increasing by at most 10% per year, which is not sufficient for upcoming applications and operating systems. The power consumed by a video application depends on factors such as network load and signal quality, and it can be optimized through heuristics-based streaming. The work exploits adaptive bitrate streaming to determine the optimum bitrate for the available bandwidth. Selecting the optimum bitrate ensures high-quality video delivery as well as optimum power consumption on the device. MPEG-DASH has been used to implement switching between bitrates under fluctuating bandwidth, using JavaScript, HTML and CSS on the Android 4.0.4 operating system. The four bitrates selected for encoding are close to the mean value available for streaming. The proposed method leads to low-power, high-quality video streaming.
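The bitrate selection step can be sketched as a simple rule: choose the highest encoded bitrate that fits within the measured bandwidth. This is a minimal illustration, not the project's actual heuristic. The four ladder values, the `safety` margin and the function name `select_bitrate` are all assumptions.

```python
def select_bitrate(available_kbps, ladder_kbps, safety=0.8):
    """Pick the highest encoded bitrate that fits the usable bandwidth.

    available_kbps: measured bandwidth in kbit/s.
    ladder_kbps:    bitrates the video was encoded at (hypothetical values).
    safety:         assumed fraction of the bandwidth that may be used.
    Falls back to the lowest bitrate when none of them fits.
    """
    if not ladder_kbps:
        raise ValueError("bitrate ladder must not be empty")
    budget = available_kbps * safety
    candidates = [rate for rate in sorted(ladder_kbps) if rate <= budget]
    return candidates[-1] if candidates else min(ladder_kbps)


if __name__ == "__main__":
    ladder = [400, 800, 1200, 1600]  # four hypothetical encodings, kbit/s
    for bandwidth in (300, 1100, 5000):
        print(bandwidth, select_bitrate(bandwidth, ladder))
```

A player would re-run this selection for each segment as its bandwidth estimate changes, which is how switching under fluctuating bandwidth comes about.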
Packet classification in network processor (Completed)
Packet classification is an essential function in applications such as routers, switches and firewalls. Because of their performance and scalability limitations, current packet classification solutions are insufficient to address growing network bandwidth and the increasing number of new applications. Efficient techniques are therefore needed in software as well as hardware. The proposed work aims to reduce memory usage and increase performance. An efficient hashing technique reduces the memory needed for the predefined rule set stored in RAM. To improve performance, the rule sets are grouped using a clustering method, and simulated annealing is applied for further optimization.
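One way to picture hash-based lookup over a grouped rule set is the sketch below. It is an assumption-laden illustration rather than the project's scheme: rules are grouped by an exact (protocol, destination port) key in a hash table, and only the matching group is scanned for a source-prefix match. The clustering and simulated annealing steps are not modeled, and the class name `HashClassifier` and the rule fields are hypothetical.

```python
import ipaddress
from collections import defaultdict


class HashClassifier:
    """Toy classifier: hash on (protocol, dst_port), then scan one bucket."""

    def __init__(self, default_action="deny"):
        self.buckets = defaultdict(list)   # key -> rules in priority order
        self.default_action = default_action

    def add_rule(self, protocol, dst_port, src_prefix, action):
        network = ipaddress.ip_network(src_prefix)
        self.buckets[(protocol, dst_port)].append((network, action))

    def classify(self, protocol, dst_port, src_ip):
        address = ipaddress.ip_address(src_ip)
        # Only the rules sharing this key are examined, not the full rule set.
        for network, action in self.buckets.get((protocol, dst_port), []):
            if address in network:
                return action              # first match wins
        return self.default_action


if __name__ == "__main__":
    classifier = HashClassifier()
    classifier.add_rule("tcp", 80, "10.0.0.0/8", "allow")
    classifier.add_rule("tcp", 80, "0.0.0.0/0", "log")
    print(classifier.classify("tcp", 80, "10.1.2.3"))
    print(classifier.classify("udp", 53, "10.1.2.3"))
```

Grouping keeps each bucket short, so the per-packet cost depends on the size of one group rather than on the whole rule set.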
Security in Reconfigurable Computing
The demand for highly parallel computation and reduced heat production has opened the way for FPGA co-processors. Traditional processors are based on the fetch-and-execute model, while an FPGA implements the program as a circuit on the device. This makes FPGAs highly parallel and lowers their clock frequency requirement, which in turn reduces the heat produced. Existing processors are limited in how much parallelism and speedup they can achieve without increasing heat production. In the cloud, where computational requirements are growing day by day, FPGAs can act as accelerators. Efficient hardware implementation of algorithms, and the security of those implementations when deployed on FPGAs in the cloud, are major issues. When FPGAs are used in data centers, speed is the main factor to be considered, rather than resource usage. A hardware design deployed in a data center faces security threats such as hardware Trojans and cloning. Among other applications of reconfigurable computing, evolvable hardware based on evolutionary algorithms also comes into the picture. With the large amount of resources now available, reconfigurable hardware can open a new era of designing hardware, changing already built hardware, and providing security as needed. Whenever an application is found to be degrading the overall system, application-specific hardware can be designed and the application moved from the main CPU to the FPGA.
A study on High performance hybrid cache design for multi-core architecture using optimal cache partitioning techniques
Designers are responsible for selecting the appropriate cache according to the requirements of the system. The cache design space is large, as many variables can affect the system’s behavior and performance. These include the total size of the cache, its associativity, the size of each cache line, the policy by which lines are placed or replaced inside the cache array, and the actual placement of the cache in the architecture and its distance from the processing cores. The increasing number of transistors per chip widens this range of options, as it is now possible to bring bigger caches closer to the processor or to introduce multi-level cache hierarchies on chip. A straightforward way to increase the cache’s effectiveness, and thus improve overall performance, would be to increase the cache’s size and associativity. The cache would then be able to hold more data blocks and reduce conflicts between data lines that map to the same cache line. This approach is primarily limited by the available area on the chip. It also makes the cache slower and more power hungry, which could ultimately have a negative effect on the system. Clearly, the effectiveness of this solution is limited, and designers have been searching for alternatives.
The memory subsystem is an essential part of such architectures. We focus on the problem of cache partitioning for energy optimization on multi-core architectures (MCAs), and propose a hybrid cache architecture for optimizing partition-sharing. The architecture shows that the problem of partition-sharing is reducible to the problem of partitioning. The technique uses dynamic programming to optimize partitioning for the overall miss ratio and for two different kinds of fairness. The hybrid cache architecture contains SRAM banks, STT-RAM banks, and STT-RAM/SRAM or other hybrid banks for chip multiprocessors.
The proposed optimization-based hybrid architecture can significantly improve on equal partitioning, but not on free-for-all sharing. Each optimization result is obtained from a very large solution space covering the different ways to share the cache along with the block placement and replacement policies.
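The use of dynamic programming to partition a cache for the lowest overall miss count can be sketched as follows. This is a generic illustration under stated assumptions, not the proposed architecture: each core is described by a hypothetical miss curve giving its misses for every possible number of cache ways, and the fairness objectives and SRAM/STT-RAM bank types are not modeled. The function name `partition_cache` and the sample curves are invented for the example.

```python
def partition_cache(miss_curves, total_ways):
    """Split `total_ways` cache ways among cores to minimize total misses.

    miss_curves[i][w] is the (hypothetical) miss count of core i when it
    owns w ways, for w = 0..total_ways.
    Returns (minimum total misses, list of ways per core).
    """
    inf = float("inf")
    n = len(miss_curves)
    # best[i][w]: fewest misses for the first i cores using exactly w ways.
    best = [[inf] * (total_ways + 1) for _ in range(n + 1)]
    choice = [[0] * (total_ways + 1) for _ in range(n + 1)]
    best[0][0] = 0
    for i in range(1, n + 1):
        curve = miss_curves[i - 1]
        for w in range(total_ways + 1):
            for k in range(w + 1):           # ways given to core i-1
                cost = best[i - 1][w - k] + curve[k]
                if cost < best[i][w]:
                    best[i][w] = cost
                    choice[i][w] = k
    # Walk the recorded choices backwards to recover the allocation.
    allocation = []
    w = total_ways
    for i in range(n, 0, -1):
        allocation.append(choice[i][w])
        w -= choice[i][w]
    allocation.reverse()
    return best[n][total_ways], allocation


if __name__ == "__main__":
    curves = [[100, 90, 50, 20, 10],   # core 0 benefits from more ways
              [80, 30, 25, 24, 23]]    # core 1 saturates after one way
    print(partition_cache(curves, 4))
```

With these sample curves an equal split of two ways each would cost 50 + 25 = 75 misses, so the optimizer has room to do better by giving more ways to the core that benefits from them.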
Multi-core Cache Coherence and Related Issues
Multi-core processor architecture has become dominant in today's computing environment. Multi-core processors are used everywhere, from personal mobile devices to large-scale servers in high-performance computing environments such as the cloud. The underlying computing capacity can be utilized by writing code with parallel programming constructs such as threads and OpenMP directives. These are some of the effective ways to properly utilize shared memory architectures. Shared memory architectures face the problem of cache coherence.
There are two basic schemes that deal with the problem of cache coherence: the snoopy bus protocol and the directory protocol. The snoopy bus protocol is easy to implement but does not scale beyond a certain number of cores. The directory protocol is complex to implement but scales well with the number of cores. A survey of cache coherence schemes shows the need for a scheme that scales appropriately with the number of cores while saving energy and delivering good performance. Recent studies show that hybrid protocols, which combine characteristics of both snoopy bus and directory protocols using different key techniques, achieve good performance with low energy consumption.
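To make the snoopy bus idea concrete, here is a minimal sketch of a textbook MSI snooping protocol for a single cache line. It is a simplification chosen for illustration: real protocols such as MESI add states, and the class name `SnoopyBus` and its message counter are assumptions. Every miss or ownership request is broadcast to all cores, which is the property that limits scaling.

```python
class SnoopyBus:
    """Toy MSI snooping model for one cache line shared by several cores."""

    def __init__(self, num_cores):
        self.states = ["I"] * num_cores   # per-core state: M, S or I
        self.bus_messages = 0             # broadcasts observed by every core

    def read(self, core):
        if self.states[core] == "I":      # read miss: broadcast BusRd
            self.bus_messages += 1
            for other, state in enumerate(self.states):
                if other != core and state == "M":
                    # The owner supplies the data and keeps a shared copy.
                    self.states[other] = "S"
            self.states[core] = "S"

    def write(self, core):
        if self.states[core] != "M":      # need ownership: broadcast BusRdX
            self.bus_messages += 1
            for other in range(len(self.states)):
                if other != core:
                    self.states[other] = "I"   # other copies are invalidated
            self.states[core] = "M"


if __name__ == "__main__":
    bus = SnoopyBus(3)
    bus.read(0)
    bus.read(1)
    bus.write(0)
    bus.read(2)
    print(bus.states, bus.bus_messages)
```

A directory protocol would replace the broadcast with a lookup that contacts only the cores recorded as holding the line, trading implementation complexity for scalability.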
List of Students
S.No | Name | Roll No | Category | Year of Admission | Lab
1. | Satyanarayana | 406112003 | PhD, Full Time TEQIP | 2012 | INTEL
2. | P.S.Tamilzharasan | 406912001 | PhD, Part Time | 2012 | INTEL
3. | R. Sangeetha | 406913002 | PhD, Part Time | 2013 | INTEL
4. | Manjith B C | 406114001 | PhD, QIP, Full Time | 2014 | INTEL
5. | Praveen Kumar Yadav | 306112001 | M.S, Full Time | 2012 | INTEL
6. | Anand Prem Kumar V | 306113001 | M.S, Full Time | 2013 | INTEL