yingchen-coding

Popular repositories

  1. agentic-misuse-benchmark Public

    Trajectory-level benchmark for prompt injection, policy erosion, intent drift, and coordinated misuse in agentic LLM systems.

    Python 1 1

  2. agentguard Public

    Security linter for AI agent definitions — catches prompt-injection and over-broad-capability holes before they ship. Deterministic, zero-dependency, CI-ready.

    Python 1

  3. when-rlhf-fails-quietly Public

    Research project evaluating silent alignment failures in LLMs under adversarial and high-stakes prompts.

    Python

  4. safety-memos Public

    Short practical memos on agent safety failures, safeguards, and evaluation design.

    Python

  5. agentic-safety-systems-whitepaper Public

    Whitepaper and system design for closed-loop agent safety: trajectory evals, safeguards, release gates, and incident feedback.

    HTML

  6. loopforge Public

    Engineering toolkit for agent loops — lint, scaffold, run, schedule, and eval autonomous loops against a six-block model.

    Python