SWE-Perf Performance Leaderboard
This leaderboard evaluates AI models on code performance optimization tasks. Each model is assessed on three key metrics:
- Apply (%): Can the generated patch be applied cleanly to the codebase?
- Correctness (%): Do all tests still pass after applying the patch?
- Performance (%): How much of the original runtime is saved after the optimization is applied?
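
To make the metric definitions concrete, below is a minimal sketch of how the three percentages could be aggregated over a set of tasks. The record fields, the zero-gain handling of failed patches, and the clipping of runtime regressions are illustrative assumptions, not the official SWE-Perf evaluation code.

```python
# Illustrative aggregation of the three leaderboard metrics.
# Field names and scoring details are assumptions for this sketch only.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TaskResult:
    applied: bool            # did the patch apply cleanly?
    tests_passed: bool       # do all tests still pass after the patch?
    runtime_before: float    # baseline runtime in seconds
    runtime_after: float     # runtime after the patch, in seconds


def leaderboard_metrics(results: List[TaskResult]) -> Dict[str, float]:
    n = len(results)
    apply_rate = 100.0 * sum(r.applied for r in results) / n
    correct_rate = 100.0 * sum(r.applied and r.tests_passed for r in results) / n
    # Performance: runtime saved relative to the baseline, averaged over all tasks.
    # Failed or incorrect patches contribute zero gain, and regressions are
    # clipped to zero; both are illustrative choices in this sketch.
    gains = [
        max(0.0, (r.runtime_before - r.runtime_after) / r.runtime_before)
        if (r.applied and r.tests_passed and r.runtime_before > 0) else 0.0
        for r in results
    ]
    performance = 100.0 * sum(gains) / n
    return {"apply": apply_rate, "correctness": correct_rate, "performance": performance}


# Example: one successful optimization (40% runtime saved), one patch that failed to apply.
print(leaderboard_metrics([
    TaskResult(True, True, 10.0, 6.0),
    TaskResult(False, False, 8.0, 8.0),
]))
```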
Results are reported under two evaluation settings:
- Realistic: Models must produce a complete solution on their own, without being told which parts of the codebase to optimize.
- Oracle: Models are given the specific files that need optimization.
| Rank | Model | Apply (%) | Correctness (%) | Performance (%) |
|---|---|---|---|---|
| Loading leaderboard data... | | | | |