Projects
Some cool projects I'm working on or have worked on.
-
> yafDB
(C++, Rust)
(in-progress) Currently implementing a concurrent, write-optimized NoSQL storage engine based on
modern LSM tree designs from academic literature.
-
> ChickadeeOS
(C++, x86-64, RISC-V)
(in-progress) ChickadeeOS is a multicore x86-64 operating system that supports kernel task suspension
and scheduling, standard POSIX system calls, virtual memory, multithreading, and various process
synchronization primitives (wait queues, spinlocks, futexes). Currently working on porting Chickadee
to 64-bit RISC-V while also adding various other kernel features, such as buddy allocation and slab
allocation, per-process concurrency and true blocking, a VFS layer, an extent-based filesystem with
buffer caching, user-level multithreading, and futex support.
-
> Aviary
(Go, MongoDB, Docker)
Aviary is a research project on serverless computing, serving as a proof-of-concept DBMS-managed
serverless framework and runtime environment that allows users to easily execute MapReduce workloads on a
sharded MongoDB database set up over multiple Docker containers. Users can define arbitrary MAP
and REDUCE functions in Go, which are then compiled into a Go plugin and saved by Aviary.
Initial results showed up to an 11.3x latency improvement using Aviary as opposed to naively
using a distributed file system like GridFS.
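
As a rough illustration of the MAP/REDUCE programming model only (Aviary itself takes Go functions compiled as plugins; this is a Python sketch, and `map_reduce`, `word_count_map`, and `word_count_reduce` are hypothetical names, not part of Aviary), a single-process version might look like:

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Apply map_fn to each record, group emitted values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def word_count_map(line):
    # Emit a (word, 1) pair for every whitespace-separated word.
    return [(word, 1) for word in line.split()]


def word_count_reduce(key, values):
    # Sum the counts collected for one word.
    return sum(values)


counts = map_reduce(["a b a", "b c"], word_count_map, word_count_reduce)
```

A real deployment would run the map and reduce phases in parallel across shards; this sketch only shows the data flow.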
-
> ShardKV
(Go)
ShardKV is a distributed fault-tolerant key-value store with linearizable semantics that shards
its data for greater horizontal scalability. Implemented Raft, a replicated state machine protocol
(leader election, state snapshotting, log compaction, fast-backup RPC optimizations), for maintaining
consistent state in the configuration manager and replication groups, as well as server-side message
deduplication and shard migration in a simulated unreliable network.
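
Server-side message deduplication can be sketched roughly as below. This is a simplified Python illustration, not the ShardKV code (which is in Go): it assumes each client has at most one outstanding request and tags requests with increasing sequence numbers, and the class and method names are hypothetical.

```python
class DedupKVServer:
    """Toy key-value server that applies each client request at most once."""

    def __init__(self):
        self.store = {}
        # client_id -> (last applied sequence number, cached reply)
        self.last_seen = {}

    def append(self, client_id, seq, key, value):
        last = self.last_seen.get(client_id)
        if last is not None and seq <= last[0]:
            # Duplicate or stale retry: return the cached reply without reapplying.
            return last[1]
        self.store[key] = self.store.get(key, "") + value
        reply = self.store[key]
        self.last_seen[client_id] = (seq, reply)
        return reply
```

In a replicated setting the `last_seen` table would presumably need to be part of the replicated state (and of any snapshot), so that a new leader also recognizes retried requests.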
-
> Leiserchess AI Bot
(C)
Worked in a team of four to performance-engineer a UCI-compliant engine for Leiserchess, a board
game similar to Khet and Laser Chess. Implemented and experimented with various optimizations across
the board, such as utilizing bit hacks for game state representation, computing optimal opening book moves,
using speculative parallelism via YBWC and Lazy SMP algorithms, multithreading using a thread pool, and
writing lock-free implementations of concurrent data structures, resulting in a search speed of millions of NPS.
-
> Oat Compiler
(OCaml)
Implemented a compiler in OCaml for Oat, a C-like statically-typed imperative language with basic primitive types,
top-level, mutually-recursive functions and global variables, structs and function pointers, and null-pointer safety.
Programmed the compiler backend that translates LLVM IR to x86lite assembly and the compiler frontend that
lexes and parses Oat programs to synthesize LLVM IR. Additional features include type checking and compile-time
optimizations such as dead code elimination and constant propagation, built on dataflow and alias analysis.
-
> miniDB
(C)
miniDB is a column-oriented NoSQL database that efficiently handles millions of tuples per column,
supporting selects, fetches, joins, loads, and multiple aggregation types. Also implemented the parser, query optimizer,
primary/secondary B-tree indexes, zone maps, hash joins, and multithreaded shared scans using a pthreads-based task queue. Used Linux perf and cachegrind
to analyze performance bottlenecks and hotspots. Optimizations involved minimizing
data movement and writing cache-conscious data structures amenable to SIMD vectorization.
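
To illustrate the zone-map idea mentioned above, here is a minimal Python sketch (miniDB itself is written in C; `build_zone_map` and `select_range` are hypothetical names, and the fixed zone size is an assumption). Each zone stores the min and max of a slice of the column so that a range select can skip zones that cannot contain a match.

```python
def build_zone_map(column, zone_size):
    """Return one (start, end, min, max) tuple per fixed-size zone of the column."""
    zones = []
    for start in range(0, len(column), zone_size):
        chunk = column[start:start + zone_size]
        zones.append((start, start + len(chunk), min(chunk), max(chunk)))
    return zones


def select_range(column, zones, low, high):
    """Return positions i with low <= column[i] < high, skipping non-overlapping zones."""
    positions = []
    for start, end, zone_min, zone_max in zones:
        if zone_max < low or zone_min >= high:
            continue  # No value in this zone can qualify, so skip the scan.
        positions.extend(i for i in range(start, end) if low <= column[i] < high)
    return positions


column = [1, 2, 3, 10, 11, 12, 20, 21, 22]
zones = build_zone_map(column, 3)
matches = select_range(column, zones, 10, 21)
```

Zone maps help most when the column is sorted or clustered, since then few zones overlap any given range.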
-
> xv6 labs
(C)
Completed all of MIT's 6.1810 OS labs, which involved implementing COW fork and user-level multithreading,
writing a network driver for the E1000, improving buffer cache performance, adding large files and symlinks
to the file system, and writing mmap/munmap system calls.
-
> N-Body Simulation
(C++, Python)
Programmed a fast N-body simulation from scratch, simulating the gravitational force interactions
between planetary bodies over thousands of time-step iterations using the all-pairs algorithm and
the Barnes-Hut algorithm with a point-region quad-tree. Used OpenMPI, OpenMP, and SIMD intrinsics
to improve simulation runtime, achieving a 20.7x speedup for the all-pairs algorithm and a 12.3x
speedup for the Barnes-Hut algorithm in comparison to their sequential baselines. Visualizations
were generated with simple Python scripts.
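
For reference, the sequential all-pairs step can be sketched as follows. This is an illustrative 2D Python version, not the optimized C++ code; the function name and the small `softening` term (used here to avoid division by zero when two bodies coincide) are assumptions.

```python
import math

G = 6.674e-11  # Gravitational constant in SI units.


def all_pairs_accelerations(positions, masses, softening=1e-9):
    """Compute the 2D gravitational acceleration on each body in O(n^2) time."""
    n = len(positions)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            dist_sq = dx * dx + dy * dy + softening * softening
            inv_dist_cubed = 1.0 / (dist_sq * math.sqrt(dist_sq))
            # a_i += G * m_j * (r_j - r_i) / |r_j - r_i|^3
            acc[i][0] += G * masses[j] * dx * inv_dist_cubed
            acc[i][1] += G * masses[j] * dy * inv_dist_cubed
    return acc
```

Barnes-Hut replaces the inner loop with a quad-tree traversal that treats sufficiently distant groups of bodies as a single mass at their center of mass.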
-
> CS61CPU
(RISC-V, Logisim)
Designed and implemented a 2-stage pipelined RISC-V CPU that can execute a majority of RV32 ISA instructions.
-
> Gitlet
(Java)
Designed and implemented a local version-control system that mimics the basic features of git, featuring
init, add, commit, rm, log, branch, status, checkout, reset, and merge commands. Used Java serialization to persist
user-inputted data, files, and states of gitlet objects, and SHA-1 hashing to identify them by content.
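
The combination of serialization and SHA-1 hashing can be sketched as a small content-addressed object store. This is a Python analogue for illustration only (the project itself is in Java): `pickle` stands in for Java serialization, the store is kept in memory rather than on disk, and `ObjectStore` is a hypothetical name.

```python
import hashlib
import pickle


class ObjectStore:
    """Toy content-addressed store: objects are keyed by the SHA-1 of their serialized bytes."""

    def __init__(self):
        self.objects = {}

    def put(self, obj):
        data = pickle.dumps(obj)  # Serialize the object to bytes.
        object_id = hashlib.sha1(data).hexdigest()  # 40-character hex digest.
        self.objects[object_id] = data
        return object_id

    def get(self, object_id):
        return pickle.loads(self.objects[object_id])
```

Because the ID is derived from the content, storing identical content twice yields the same ID and a single stored copy.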