My Projects

There are two groups in this category:

Supervised Projects

These are projects that were completed as assignments.

(The following list is in reverse chronological order.)

Improving Execution of GPU Applications with Inter-Kernel Data Dependencies
Supervised By: Prof. D. Wong, Prof. N. Abu-Ghazaleh
December 2018—August 2021

—Developed software-hardware support and scheduling techniques to accelerate the concurrent execution of multiple data-dependent kernels.
—Attempted to predict data dependencies between GPU kernels using an ML model (in Python).
—Evaluated by modifying the GPGPU-Sim simulator in C++.

Wireframe: Implementing Data Dependency Awareness in GPUs
Supervised By: Prof. D. Wong, Prof. L. Bhuyan
July 2016—April 2017

—Proposed and developed cross-stack software-hardware techniques to support data-dependent parallelism in GPUs. The project proposed a new programming paradigm and hardware scheduling techniques, resulting in a 45% speedup.
—Evaluated by modifying the GPGPU-Sim simulator in C++ and modifying CUDA benchmarks.

LLVM compiler block and edge profiling

Supervised By: Prof. R. Gupta
March 2016—April 2016
—Using C++, modified LLVM to profile an arbitrary C++ program and count its basic blocks, edges, and loops.

Implementation of two-level round robin warp scheduling and dynamic warps on GPGPU-Sim

Supervised By: Prof. N. Abu-Ghazaleh
November 2015—December 2015
—Used C++ to modify the source code of GPGPU-Sim, specifically its warp scheduling mechanism, to increase performance.

A compact AES-256 encryption/decryption module on FPGA with graphic interface
Supervised By: Prof. R. Stern
April 2015

—Using VHDL, I designed all stages of AES-256, including key expansion, encryption, and decryption, and successfully implemented and tested the design on a Spartan-6 FPGA.
—I used Visual C++ to create a PC interface that controls and interacts with the module remotely by transmitting data and commands.
—A 3-pin handshaking protocol made the device compatible with its slower adjacent modules and managed the transmission and reception of data and keys.

Neural Spike Processing on Reconfigurable Hardware (MS Thesis)
Supervised By: Prof. N. Sertac Artan
September 2013—May 2014

—Implemented on a Spartan-6 FPGA, this system detects neural spikes and classifies them based on their characteristics.
—Classification assigns each new data point to the existing cluster at minimum distance from it.
—It can take its input from ADCs or external memory, and it includes a VGA visual interface that can display multiple channels in real time.
—This project was done by a group of five people. I was responsible for designing the spike detection and classification blocks, one of the ADC drivers, and the VGA interface.
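
The minimum-distance classification described above can be sketched in a few lines of Python. This is only an illustration, not the FPGA design: the two-dimensional features, the centroid values, and the use of Euclidean distance are all assumptions, since the project description does not specify them.

```python
import math

def classify_spike(features, centroids):
    """Return (index, distance) of the centroid closest to `features`.

    Euclidean distance is an assumption; the original design may have
    used a different metric.
    """
    best_index = None
    best_distance = math.inf
    for index, centroid in enumerate(centroids):
        distance = math.dist(features, centroid)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index, best_distance

# Hypothetical cluster centres in a 2-D feature space
# (for example, peak amplitude and spike width).
centroids = [(1.0, 0.5), (3.0, 2.0), (6.0, 1.0)]

label, distance = classify_spike((2.8, 1.7), centroids)
print(f"spike assigned to cluster {label} (distance {distance:.3f})")
```

In hardware, the same comparison would typically run over all clusters in parallel or in a short pipeline; the loop here only shows the logic.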

Low-area streaming image processor
Supervised By: Prof. N. Sertac Artan
April 2013—May 2013

—Two groups worked on image processors with different goals: low area and high speed. Our priority was to minimize the area and, with it, the potential cost of our processor.
—Using a paper as our source and inspiration, we created several processing units of the same architecture, each of which could perform a single function, such as edge detection or blurring, on a sample image stored in a Flash ROM.
—The image flowed from the ROM into a “window” where the first processor performed its operation. The result was then stored in a RAM, from which we could view the effect on a monitor.
—For subsequent effects, the system used the data in the RAM, which held the output of the previous stage.
—The processing time for a 640×480 image was about 0.86 s.
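
The window-based flow described above can be modeled in software as follows. This is a rough sketch under several assumptions: a 3×3 window, a box blur, a Laplacian-style edge response, and borders copied unchanged. None of these details come from the original design.

```python
def apply_window(image, operation):
    """Slide a 3x3 window over `image` and store operation(window) per pixel.

    Border pixels are copied unchanged (an assumption made for brevity).
    The returned list plays the role of the RAM that holds a stage's output.
    """
    height, width = len(image), len(image[0])
    output = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            output[y][x] = operation(window)
    return output

def blur(window):
    # Box blur: integer average of the nine pixels in the window.
    return sum(window) // 9

def edge(window):
    # Laplacian-style response: 8 * centre minus the eight neighbours,
    # clamped to the 0..255 range of an 8-bit pixel.
    centre = window[4]
    response = 8 * centre - (sum(window) - centre)
    return max(0, min(255, response))

# Tiny stand-in for the image stored in ROM.
image = [
    [0,  0,  0,  0, 0],
    [0, 90, 90, 90, 0],
    [0, 90, 90, 90, 0],
    [0, 90, 90, 90, 0],
    [0,  0,  0,  0, 0],
]

blurred = apply_window(image, blur)    # first stage reads the "ROM"
edges = apply_window(blurred, edge)    # next stage reads the previous output
```

For scale, a 640×480 image has 307,200 pixels, so 0.86 s corresponds to roughly 2.8 µs per pixel.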

Automatic reconfiguration of spares in memories
Supervised By: Prof. M. Maniatakos
March 2013—May 2013

—Using Verilog, we designed a system that reroutes data from damaged memory cells to spare ones.
—It can also clone vital memory data into spare space for stability.
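
The rerouting idea can be illustrated with a small software model. This is a hypothetical sketch, not the Verilog design: the class name, the remap table, and the first-free spare allocation policy are assumptions, and the cloning feature is not modeled.

```python
class SparedMemory:
    """Toy memory that redirects accesses to faulty cells into spare cells."""

    def __init__(self, size, spare_count):
        self.cells = [0] * size
        self.spares = [0] * spare_count
        self.remap = {}        # faulty address -> index into self.spares
        self.next_spare = 0    # spares are handed out in order (assumption)

    def mark_faulty(self, address):
        """Assign the next free spare cell to a damaged address."""
        if address in self.remap:
            return
        if self.next_spare >= len(self.spares):
            raise RuntimeError("no spare cells left")
        self.remap[address] = self.next_spare
        self.next_spare += 1

    def write(self, address, value):
        if address in self.remap:
            self.spares[self.remap[address]] = value
        else:
            self.cells[address] = value

    def read(self, address):
        if address in self.remap:
            return self.spares[self.remap[address]]
        return self.cells[address]

memory = SparedMemory(size=8, spare_count=2)
memory.mark_faulty(3)    # e.g. reported by a memory self-test
memory.write(3, 42)      # lands in a spare cell, not in cells[3]
memory.write(4, 7)       # healthy address, written normally
print(memory.read(3), memory.read(4))
```

The caller keeps using the original addresses; only the lookup in the remap table decides where the data physically lives.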

Design and layout of a 4Kb static RAM with column decoder
Supervised By: Prof. H. Li
December 2012

—Using the Cadence design environment, we created the schematic and layout of a typical SRAM.

Design and Implementation of an Automatic Programmable Spin Coater (BS Thesis)
Supervised By: Prof. M. Fardmanesh — Sharif’s Superconductor Electronics Research Lab (SERL)
December 2011—May 2012

—Using a PIC microcontroller as the processor, we designed the speed control and the user interface of a spin coater that can be used to coat silicon substrates with photoresist. This was our most challenging puzzle yet!
—This project was done in a group of two: I built the digital circuit and the computer interface, my partner took care of the motor driver and other analog circuits, and we both dealt with the mechanics!
—The mechanics were not our specialty, but we learned to design some parts ourselves, and we always had help from people who were adept at mechanical implementation. This was the most exciting part of the project, I dare say!
—In June 2012, a ceremony was held to select the best projects completed in the EE Department up to that point. This project was awarded 2nd place among more than 20 top projects.

Unsupervised Projects

These projects were completed independently.

(The following list is in reverse chronological order.)

Design of a memory game in Android using Java
March 2015

—This is my first serious Android project. The idea is based on memorizing a sequence of squares and repeating it. It is written in Java using NetBeans.

Multispectral palmprint recognition using different features
July 2014 – December 2014

—In multispectral palmprint recognition, the palmprint is imaged under more than one illumination, which yields more accurate data per sample, while samples can still be collected almost as easily as with regular palmprint methods.

—We developed precise palmprint recognition algorithms using a majority-voting classifier. We tried three different sets of features as the basis of the project (statistical-wavelet, textural, and DCT-wavelet), gathered the data, and developed the algorithms. The results were excellent and outperformed all previously known methods for multispectral palmprint recognition!

—The phase involving textural features was presented at SPMB14 in December 2014 in Philadelphia and has since been published by IEEE.
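
The majority-voting step mentioned above can be sketched as follows. This is a minimal illustration, not the published method: the idea that each vote comes from a classifier on one spectral band, and the labels used, are assumptions.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted most often.

    Ties go to the label that appears first in `predictions`,
    which is a simplification chosen for this sketch.
    """
    counts = Counter(predictions)
    label, _ = counts.most_common(1)[0]
    return label

# Hypothetical per-band predictions for one palmprint sample.
band_predictions = ["person_12", "person_12", "person_07", "person_12"]
print(majority_vote(band_predictions))
```

Combining several weak decisions this way tends to smooth out errors made under a single illumination, which is one common motivation for voting across bands.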