Kevin Su

Projects

Some cool projects I'm working/worked on.
> yafDB (C++, Rust)
(in-progress) Currently implementing a concurrent write-optimized NoSQL storage engine based on modern LSM tree designs from academic literature.
> ChickadeeOS (C++, x86-64, RISC-V)
(in-progress) ChickadeeOS is a multicore x86-64 operating system that supports kernel task suspension and scheduling, standard POSIX system calls, virtual memory, multithreading, and various process synchronization primitives (wait queues, spinlocks, futexes). Currently porting Chickadee to 64-bit RISC-V while adding various other kernel features, such as buddy and slab allocation, per-process concurrency and true blocking, a VFS layer, an extent-based filesystem with buffer caching, user-level multithreading, and futex support.
> Aviary (Go, MongoDB, Docker)
Aviary is a research project on serverless computing, serving as a proof-of-concept DBMS-managed serverless framework and runtime environment that allows users to easily execute MapReduce workloads on a sharded MongoDB database set up over multiple Docker containers. Users can define arbitrary MAP and REDUCE functions in Go, which are then compiled into a Go plugin and saved by Aviary. Initial results showed up to an 11.3x latency improvement using Aviary as opposed to naively using a distributed file system like GridFS.
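To give a feel for what user-defined MAP and REDUCE functions might look like, here is a minimal word-count sketch. The `KeyValue` type, the `Map`/`Reduce` signatures, and the `runLocal` helper are all assumptions made for illustration, modeled on common MapReduce teaching frameworks; they are not Aviary's actual interface.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
	"unicode"
)

// KeyValue is a hypothetical intermediate pair type (an assumption, not Aviary's API).
type KeyValue struct {
	Key   string
	Value string
}

// Map splits a document into words and emits one (word, "1") pair per occurrence.
func Map(contents string) []KeyValue {
	words := strings.FieldsFunc(contents, func(r rune) bool {
		return !unicode.IsLetter(r)
	})
	out := make([]KeyValue, 0, len(words))
	for _, w := range words {
		out = append(out, KeyValue{Key: strings.ToLower(w), Value: "1"})
	}
	return out
}

// Reduce sums the counts emitted for a single key.
func Reduce(key string, values []string) string {
	total := 0
	for _, v := range values {
		n, err := strconv.Atoi(v)
		if err != nil {
			continue // skip malformed values
		}
		total += n
	}
	return strconv.Itoa(total)
}

// runLocal is a sequential stand-in for a framework: group values by key, then reduce.
func runLocal(docs []string) map[string]string {
	grouped := make(map[string][]string)
	for _, d := range docs {
		for _, kv := range Map(d) {
			grouped[kv.Key] = append(grouped[kv.Key], kv.Value)
		}
	}
	result := make(map[string]string, len(grouped))
	for k, vs := range grouped {
		result[k] = Reduce(k, vs)
	}
	return result
}

func main() {
	counts := runLocal([]string{"the quick fox", "The lazy dog"})
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random, so sort for stable output
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, counts[k])
	}
}
```

In a real deployment the grouping step would be handled by the framework across shards; `runLocal` only shows the contract between the two user functions.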
> ShardKV (Go)
ShardKV is a distributed fault-tolerant key-value store with linearizable semantics that shards its data for greater horizontal scalability. Implemented Raft, a replicated state machine protocol (leader election, state snapshotting, log compaction, fast-backup RPC optimizations), for maintaining consistent state in the configuration manager and replication groups, as well as server-side message deduplication and shard migration in a simulated unreliable network.
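Server-side message deduplication is commonly done by remembering the highest sequence number applied per client, so that a retried request is not executed twice. The sketch below illustrates that idea only; the `Op` and `KVStore` types and their fields are hypothetical names, it assumes each client issues one request at a time with increasing sequence numbers, and it leaves out Raft and shard migration entirely.

```go
package main

import "fmt"

// Op is a hypothetical client append request (illustrative, not ShardKV's real type).
type Op struct {
	ClientID int64
	Seq      int64 // assumed to increase with every new request from a client
	Key      string
	Value    string
}

// KVStore is a toy state machine with a per-client deduplication table.
type KVStore struct {
	data    map[string]string
	lastSeq map[int64]int64 // highest Seq already applied for each client
}

// NewKVStore returns an empty store.
func NewKVStore() *KVStore {
	return &KVStore{
		data:    make(map[string]string),
		lastSeq: make(map[int64]int64),
	}
}

// Apply appends op.Value to op.Key unless the request was already applied.
// It reports whether the operation changed the store.
func (s *KVStore) Apply(op Op) bool {
	if last, ok := s.lastSeq[op.ClientID]; ok && op.Seq <= last {
		return false // duplicate or stale retry: ignore it
	}
	s.data[op.Key] += op.Value
	s.lastSeq[op.ClientID] = op.Seq
	return true
}

func main() {
	s := NewKVStore()
	op := Op{ClientID: 1, Seq: 1, Key: "k", Value: "a"}
	fmt.Println(s.Apply(op)) // first delivery
	fmt.Println(s.Apply(op)) // retry of the same request over a lossy network
	fmt.Println(s.data["k"])
}
```

In a replicated setting the deduplication table would need to be part of the replicated state (and of any snapshot), so that every replica makes the same decision about a retried request.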
> Leiserchess AI Bot (C)
Worked in a team of four to performance-engineer a UCI-compliant engine for Leiserchess, a board game similar to Khet and Laser Chess. Implemented and experimented with various optimizations across the board, such as utilizing bit hacks for game state representation, computing optimal opening book moves, using speculative parallelism via YBWC and Lazy SMP algorithms, multithreading using a thread pool, and writing lock-free implementations of concurrent data structures, resulting in a search speed of millions of NPS.
> Oat Compiler (OCaml)
Implemented a compiler in OCaml for Oat, a C-like statically-typed imperative language with basic primitive types, top-level, mutually-recursive functions and global variables, structs and function pointers, and null-pointer safety. Programmed the compiler backend that takes LLVM IR to x86lite assembly and the compiler frontend that lexes and parses Oat programs to synthesize LLVM IR. Additional features include compiler typechecking and compile-time optimizations such as dataflow analysis, alias analysis, dead code elimination, and constant propagation.
> miniDB (C)
miniDB is a column-oriented NoSQL database that efficiently handles millions of tuples per column, supporting selects, fetches, joins, loads, and multiple aggregation types. Also implemented the parser, query optimizer, and primary/secondary indexes using B-trees, zone maps, hash joins, and multithreaded shared scans using a pthreads-based task queue. Used Linux perf and cachegrind to analyze performance bottlenecks and hotspots. Optimizations involved minimizing data movement and writing cache-conscious data structures amenable to SIMD vectorization.
> xv6 labs (C)
Completed all of MIT's 6.1810 OS labs, which involved implementing COW fork, user-level multithreading, and a network driver for the E1000; improving buffer cache performance; adding large files and symlinks to the file system; and writing mmap/munmap system calls.
> N-Body Simulation (C++, Python)
Programmed a fast N-body simulation from scratch, simulating the gravitational force interactions between planetary bodies over thousands of time-step iterations using the all-pairs algorithm and the Barnes-Hut algorithm with a point-region quad-tree. Used OpenMPI, OpenMP, and SIMD intrinsics to improve simulation runtime, achieving a 20.7x speedup for the all-pairs algorithm and a 12.3x speedup for the Barnes-Hut algorithm in comparison to their sequential baselines. Visualizations were generated with simple Python scripts.
> CS61CPU (RISC-V, Logisim)
Designed and implemented a 2-stage pipelined RISC-V CPU that can execute a majority of RV32 ISA instructions.
> Gitlet (Java)
Designed and implemented a local version-control system that mimics the basic features of git, featuring init, add, commit, rm, log, branch, status, checkout, reset, and merge commands. Used Java serialization and SHA-1 hashing to identify and persist user-inputted data, files, and states of gitlet objects.