I am an MCDS (Master of Computational Data Science) student at CMU (Carnegie Mellon University). Before pursuing a master's degree, I was an analyst in the Core Engineering Department at Goldman Sachs (GS) for 2 years. My team built in-house monitoring tools on top of a graph database.
I am a Committer and Technical Steering Committee Member for JanusGraph, an open-source project that is one of the most popular graph databases. I was a maintainer of another open-source project, coala. I was a Google Summer of Code 2018 student, a Google Code-in 2018 mentor, and a Google Summer of Code 2019 mentor.
Before joining GS, I received my bachelor's degree in Computer Science from The University of Hong Kong (HKU). I was supervised by Dr. Heming Cui at the Systems and Networking Group and Dr. Reynold Cheng at the Data Engineering Group. I also spent a semester as an exchange student at the University of Toronto, where I met and worked with Prof. Peter Marbach.
Master of Computational Data Science, 2021-2022
Carnegie Mellon University
BEng in Computer Science (First Class Honors), 2015-2019
The University of Hong Kong
Exchange Student, 2018
University of Toronto
Summer Student, 2016
University of California, Berkeley
Adventure in industry
Tech Stack: Java, JanusGraph, Terraform, Hadoop, Cassandra, MongoDB, Elasticsearch, Spring
• Contributed to building a large graph-based topology monitoring system used by 30+ teams
• Developed a Terraform provider that lets users manage graph resources through an Infrastructure as Code (IAC) workflow, reducing the manual effort to compare and update the graph by 90%
• Optimized core graph queries to improve average query latency by 50%
• Implemented a Spark Streaming application to transform and ingest ~10M process telemetry records daily, enabling resiliency monitoring and quick incident troubleshooting
• Built a framework on top of Hadoop MapReduce to run OLAP queries against ~20M vertices and edges, reducing latency of analytical queries by 95%
• Built a RESTful microservice on Kubernetes that auto-scales based on event metrics, reducing hardware resources by half on average
• Refactored the entire Java codebase with Spring, enabling inversion of control and dependency injection
• Independently mentored an intern in developing a Maven plugin that generates and validates IAC resources
Tech Stack: C++, MongoDB, Python
• Developed cold and hot backup modules for a large, distributed image recognition platform with 1 billion data entries
• Built RESTful web services in C++ to process image feature extraction, storage, and retrieval requests with high QPS
• Implemented an automatic performance and failover testing pipeline using Python, reducing manual effort by 80%