Zhaoyuan Bi EmbroiderSnow

Hi, I'm Zhaoyuan Bi

CS undergraduate at Peking University

I like building model systems that leave the paper, touch real hardware, and survive the awkward details of latency, memory, power, kernels, and devices.

About Me

I am interested in the systems side of AI: how models are represented, lowered, scheduled, profiled, optimized, and deployed on constrained hardware.

My favorite kind of work sits between research and engineering:

turning a model idea into something runnable;
finding where the real bottleneck hides;
making benchmarks reproducible enough to trust;
building small teams that can move from uncertainty to working systems.

Things I Enjoy

Reading systems, ML, architecture, and security papers with an engineer's eye.
Profiling code until performance numbers become explainable.
Working close to hardware: CPU backends, mobile devices, heterogeneous compute, memory layout, kernels, and runtime details.
Writing clean notes, reproducible scripts, and documentation that future-me will not resent.
Exploring tools that make engineering collaboration sharper, faster, and calmer.

What I Can Work With

Model systems and inference

LLM inference pipelines, low-bit model deployment, runtime integration, benchmark design, throughput / latency / memory / power evaluation.

Systems and backend engineering

C/C++, Python, CUDA, CMake, Linux.
CPU backend debugging and optimization.
Familiar with SIMD / NEON / AVX optimization ideas.

Mobile and heterogeneous computing

Android / iOS deployment workflows.
OpenCL backend adaptation.
Exploratory work on mobile GPU kernel optimization.

Performance analysis

perf, ncu (the Nsight Compute CLI), NVIDIA Nsight.
CPU/GPU profiling, hotspot diagnosis, kernel-level performance analysis.

Collaboration

Task decomposition, code review, experiment organization, technical writing, and keeping a small engineering group pointed in the same direction.

What I Want To Explore Next

Efficient and trustworthy deployment of large models on edge devices.
Model safety from a systems perspective: evaluation, runtime behavior, deployment reliability, and hardware-aware constraints.
Better runtimes for low-bit and non-standard model representations.
Mobile GPU kernels and heterogeneous scheduling for practical AI inference.
Reproducible benchmarking infrastructure for model systems.
The boundary between AI systems, architecture, and cybersecurity evaluation.

Current Compass

I am trying to become the kind of engineer-researcher who can:

understand the model, read the kernel, measure the system, explain the tradeoff, and make the next version faster without making it less trustworthy.

Contact

Email: zybi@stu.pku.edu.cn
