DeepSourceCorp/benchmarks

DeepSource Benchmarks

Benchmark dataset evaluating code review and security analysis tools on the OpenSSF CVE Benchmark.

Benchmarked Tools

Last updated: April 12, 2026

Data Format

Judged Results (benchmarks/judged-results/)

Final evaluation results in JSONL format with fields:

  • cve_id: CVE identifier
  • variant: fixed or unfixed
  • detected_issues: Issues found by the tool
  • TP, FP, TN, FN: Classification metrics (true positive, false positive, true negative, false negative)
  • judge_reasoning: Explanation of the judgment
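
As a rough illustration of how the judged results might be consumed, the sketch below parses JSONL records and totals the classification fields into precision and recall. The field names come from the list above. Everything else is an assumption: that TP, FP, TN, and FN are stored as numbers (counts or 0/1 flags), the helper names `parse_jsonl` and `summarize`, and the placeholder records, which are made up and not taken from the dataset.

```python
import json
from collections import Counter
from typing import Iterable

METRIC_FIELDS = ("TP", "FP", "TN", "FN")


def parse_jsonl(lines: Iterable[str]) -> list[dict]:
    """Parse JSONL text: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def summarize(records: list[dict]) -> dict:
    """Total the classification fields and derive precision and recall."""
    totals = Counter()
    for record in records:
        for field in METRIC_FIELDS:
            # Assumption: each metric is numeric (a count or a 0/1 flag).
            totals[field] += int(record.get(field, 0))
    tp, fp, fn = totals["TP"], totals["FP"], totals["FN"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {**totals, "precision": precision, "recall": recall}


# Placeholder records for illustration only (hypothetical CVE IDs and values).
sample_lines = [
    '{"cve_id": "CVE-0000-0001", "variant": "unfixed", "detected_issues": ["x"],'
    ' "TP": 1, "FP": 0, "TN": 0, "FN": 0, "judge_reasoning": "placeholder"}',
    '{"cve_id": "CVE-0000-0001", "variant": "fixed", "detected_issues": [],'
    ' "TP": 0, "FP": 0, "TN": 1, "FN": 0, "judge_reasoning": "placeholder"}',
]

summary = summarize(parse_jsonl(sample_lines))
```

To run this against a real file, pass an open file handle to `parse_jsonl`, for example `parse_jsonl(open(path, encoding="utf-8"))`, where `path` points at one of the files under benchmarks/judged-results/ (the exact file names are not listed here).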

Processed Results (benchmarks/processed/)

Intermediate formatted results from each tool, normalized for comparison.

Raw Output (benchmarks/raw-output/)

Original tool outputs per CVE, preserving the exact response from each tool.

Archive

The archive/ directory contains prompts and data from earlier benchmark runs.
