
Use or learn from dcgm for rapids doctor? #149

@betatim


A short note from a talk about training LLMs that I just attended. The speaker from Hugging Face mentioned https://github.com/NVIDIA/DCGM as a very useful way to debug and diagnose problems. Maybe it would be useful for rapids doctor?


Just wanted to make a note. If you already know about it or have dismissed it, feel free to close the issue.
