TrustAIRLab

TrustAIRLab (Trustworthy AI Research Lab) is a research lab dedicated to trustworthy machine learning, with a focus on safety, privacy, and security. It aims to:

  • offer high-quality libraries to reduce the difficulties in algorithm reproduction

  • benchmark existing attacks and defenses on machine learning models

  • build a solid foundation for Trustworthy AI research and development

Popular repositories

  1. JailbreakRadar (Public)

    Python, 87 stars, 8 forks

  2. VoiceJailbreakAttack (Public)

    Code for Voice Jailbreak Attacks Against GPT-4o.

    Python, 38 stars, 2 forks

  3. JailbreakLLMs (Public)

    A dataset of 6,387 ChatGPT prompts collected from Reddit, Discord, websites, and open-source datasets, including 666 jailbreak prompts.

    20 stars, 2 forks

  4. HateBench (Public)

    [USENIX'25] HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns

    14 stars, 3 forks

  5. ZeroFake (Public)

    Python, 11 stars, 2 forks

  6. GPTracker (Public)

    [S&P'25] GPTracker: A Large-Scale Measurement of Misused GPTs

    Python, 11 stars, 1 fork

Repositories

Showing 10 of 33 repositories
