SWE-Perf Performance Leaderboard
This leaderboard evaluates AI models on code performance optimization tasks. Each model is assessed on three key metrics:
- Apply (%): Can the generated patch be applied cleanly to the codebase?
- Correctness (%): Do all tests still pass after applying the patch?
- Performance (%): What percentage of the original runtime is saved after the optimization is applied?
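The three metrics above form a natural funnel: a patch must apply before its tests can pass, and only correct patches can earn performance credit. The sketch below shows one plausible way to aggregate per-task results into these percentages; the field names and the exact scoring formula are assumptions for illustration, not SWE-Perf's published implementation.

```python
def performance_gain(runtime_before: float, runtime_after: float) -> float:
    """Percent of runtime saved by the patched code (assumed definition)."""
    return 100.0 * (runtime_before - runtime_after) / runtime_before

def aggregate(results: list[dict]) -> dict:
    """Roll per-task results up into leaderboard-style percentages.

    Each result dict is assumed to carry:
      patch_applied -- did the patch apply cleanly?
      tests_pass    -- do all tests still pass afterwards?
      before, after -- measured runtimes in seconds
    Tasks whose patch fails to apply or breaks tests contribute 0%
    performance, so Performance is averaged over all tasks.
    """
    n = len(results)
    applied = [r for r in results if r["patch_applied"]]
    correct = [r for r in applied if r["tests_pass"]]
    return {
        "apply_pct": 100.0 * len(applied) / n,
        "correctness_pct": 100.0 * len(correct) / n,
        "performance_pct": sum(
            performance_gain(r["before"], r["after"]) for r in correct
        ) / n,
    }

# Two tasks: one clean 20% speedup, one patch that breaks a test.
print(aggregate([
    {"patch_applied": True, "tests_pass": True, "before": 10.0, "after": 8.0},
    {"patch_applied": True, "tests_pass": False, "before": 10.0, "after": 10.0},
]))
```

Averaging performance over all tasks (rather than only the correct ones) is one reasonable design choice; it prevents a model from boosting its score by submitting a single fast but mostly broken patch.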
Results are reported under two settings:
- End-to-End: Models generate complete solutions from the task alone, without any assistance.
- Oracle: Models are additionally given the specific files that need optimization.
| Rank | Model | Apply (%) | Correctness (%) | Performance (%) |
|---|---|---|---|---|