  • Peking University
  • Beijing, China

EmbroiderSnow/README.md

Hi, I'm Zhaoyuan Bi

CS undergraduate at Peking University

I like building model systems that leave the paper, touch real hardware, and survive the awkward details of latency, memory, power, kernels, and devices.

Model Systems · Edge AI · Efficient Inference · Systems Security

About Me

I am interested in the systems side of AI: how models are represented, lowered, scheduled, profiled, optimized, and deployed on constrained hardware.

My favorite kind of work sits between research and engineering:

  • turning a model idea into something runnable;
  • finding where the real bottleneck hides;
  • making benchmarks reproducible enough to trust;
  • building small teams that can move from uncertainty to working systems.

Things I Enjoy

  • Reading systems, ML, architecture, and security papers with an engineer's eye.
  • Profiling code until performance numbers become explainable.
  • Working close to hardware: CPU backends, mobile devices, heterogeneous compute, memory layout, kernels, and runtime details.
  • Writing clean notes, reproducible scripts, and documentation that future-me will not resent.
  • Exploring tools that make engineering collaboration sharper, faster, and calmer.

What I Can Work With

Model systems and inference

  • LLM inference pipelines, low-bit model deployment, runtime integration, benchmark design, throughput / latency / memory / power evaluation.

Systems and backend engineering

  • C/C++, Python, CUDA, CMake, Linux.
  • CPU backend debugging and optimization.
  • Familiar with SIMD / NEON / AVX optimization ideas.

Mobile and heterogeneous computing

  • Android / iOS deployment workflows.
  • OpenCL backend adaptation.
  • Mobile GPU kernel optimization exploration.

Performance analysis

  • perf, ncu, NVIDIA Nsight.
  • CPU/GPU profiling, hotspot diagnosis, kernel-level performance analysis.

Collaboration

  • Task decomposition, code review, experiment organization, technical writing, and keeping a small engineering group pointed in the same direction.

What I Want To Explore Next

  • Efficient and trustworthy deployment of large models on edge devices.
  • Model safety from a systems perspective: evaluation, runtime behavior, deployment reliability, and hardware-aware constraints.
  • Better runtimes for low-bit and non-standard model representations.
  • Mobile GPU kernels and heterogeneous scheduling for practical AI inference.
  • Reproducible benchmarking infrastructure for model systems.
  • The boundary between AI systems, architecture, and cybersecurity evaluation.

Current Compass

I am trying to become the kind of engineer-researcher who can:

understand the model, read the kernel, measure the system, explain the tradeoff, and make the next version faster without making it less trustworthy.

Contact

Popular repositories

  1. MIT-6.828-JOS-DOC-Beautify (Public)

    HTML 1

  2. arap_deformation (Public)

    Lab for Frontiers of Geometric Computation (Spring 2025, PKU)

    C++

  3. pointconv_pytorch (Public)

    Reproduction of "PointConv: Deep Convolutional Networks on 3D Point Clouds" (CVPR 2019).

    Python

  4. Point2Mesh-via-SDF (Public)

    Lab for Frontiers of Geometric Computation (Spring 2025, PKU)

    Python

  5. RISC-V-Simulator (Public)

    A RISC-V simulator that implements RV-IM.

    C

  6. CacheSimulator (Public)

    A simple cache simulator with prefetching.

    Python