I am a senior software engineer at Microsoft, working on the SCOPE query optimizer, which sits at the core of Microsoft’s in-house data lake and processes nearly one million jobs and exabytes of data every day.
I am also an independent researcher in AI infrastructure. I have years of experience in:
Master of Computational Science, 2022
LTI, Carnegie Mellon University
Bachelor in Computer Science, 2019
The University of Hong Kong
Exchange Student, 2018
University of Toronto
Summer Student, 2016
University of California, Berkeley
Adventure in industry
I work on the SCOPE query optimizer team. SCOPE is the language for Microsoft’s internal big data platform. It carries the company's data workloads, processing exabytes of data and nearly a million jobs every day.
I lead the performance improvement efforts, including building sophisticated statistics models, tuning TPC-DS benchmark performance, and developing state-of-the-art algorithms for join operations. My work saves the company millions of dollars per year.
My team builds a massive live knowledge graph that enables company-wide business intelligence, machine learning, monitoring, and data governance. My contributions include:
• Graph query optimizations that reduce average system latency by 50%.
• An Infrastructure as Code (IaC) solution that reduces data governance labor by 90%.
• A Spark streaming pipeline that ingests ~10 million production telemetry records daily.