Organizations

@TiledTensor @FractalTensor

lcy-seso/README.md

Ying 🐇

Working on systems where algorithms, compilers, and hardware are designed as one.


Research Interests

I'm broadly interested in deep learning systems, compilers, and hardware-aware programming abstractions. The two questions I've been thinking about recently are:

  • What happens when the algorithm, software, and hardware layers are designed together, rather than stacked on top of each other?
  • What does it take for large-scale human–LLM collaboration — across people and across agents — to sustain delivery inside a real software stack over the long run?

Outside of research, I enjoy writing programs and building software systems — I just like making things.


🚀 Current Projects

Both projects are joint work with friends at @tile-ai.

🧠 TileRT: a take on algorithm · software · hardware co-design.

An ongoing effort that grew out of our earlier research — exploring what the layers between algorithms and hardware should look like when they are designed together, rather than stacked on top of each other.

🤖 TileOPs: operator library development in the agent era.

An exploration of how far LLM agents can go in autonomously developing an operator library — from writing kernels to testing and iterating on them — with quality good enough to actually ship.


🧩 Past & Ongoing Work

  • 🚀 TileFusion — an experimental C++ macro kernel template library that raises the abstraction level of CUDA C for tile processing, so algorithm developers can innovate on hardware-aware LLM kernels without drowning in low-level details.
  • 🧩 FractalTensor — a programming framework built around FractalTensor: nested, statically-shaped tensor lists with functional array operators (map / reduce / scan). DSL + IR work inspired by polyhedral loop analysis. [paper]
  • 🔍 VPTQ — an extreme low-bit quantization algorithm and inference library for LLMs, led by my friend @YangWang92; I contribute on the systems side.

✍️ Writing & Elsewhere

I keep a blog where I jot down ideas that catch my attention in daily work; updates are infrequent and unhurried. @haruhi55 is also me in disguise. 🐵✨

📫 Contact

lcy.seso@gmail.com · caoyingseso@126.com

Feel free to reach out — happy to talk about deep learning systems, compilers, hardware co-design, or LLM-driven engineering.

Pinned

  1. tile-ai/TileRT (Public)

    Tile-Based Runtime for Ultra-Low-Latency LLM Inference

    Python · 1.1k stars · 64 forks

  2. tile-ai/TileOPs (Public)

    High-performance LLM operator library built on TileLang.

    Python · 134 stars · 37 forks

  3. microsoft/TileFusion (Public)

    TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.

    Cuda · 111 stars · 6 forks

  4. microsoft/FractalTensor (Public)

    FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of lists of statically-shaped tensors, referred to as a Fractal…

    Python · 31 stars · 6 forks

  5. microsoft/VPTQ (Public)

    VPTQ, A Flexible and Extreme low-bit quantization algorithm

    Python · 680 stars · 52 forks

  6. LearningNotes (Public)

    Ying's notes

    TeX · 8 stars