This benchmark suite is designed and maintained by RISE of the University of Basel.
For more details, visit the GitHub repository.
This benchmark is the result of many contributions (see below).
You are welcome to join us! Check out our GitHub repository, especially our Contributing Guidelines.
Core development team and contact:
Current version of this frontend: 0.4.0
| Version | Date | Download | Changes |
|---|---|---|---|
| v0.1.0 | 2025-08-25 | GitHub, ZIP, TAR | Initial release. |
| v0.2.0 | 2025-08-31 | GitHub, ZIP, TAR | Added tests, fixed broken links, added global performance table. |
| v0.2.1 | 2025-09-10 | GitHub, ZIP, TAR | Added radar chart, added library cards benchmark. |
| v0.2.2 | 2025-09-19 | GitHub, ZIP, TAR | |
| v0.3.0 | 2025-10-03 | GitHub, ZIP, TAR | |
| v0.3.1 | 2025-10-29 | GitHub, ZIP, TAR | Added OpenRouter and sciCORE support, added benchmarks, added 170 new test configurations. |
| v0.4.0 | 2025-12-08 | GitHub, ZIP, TAR | Overhauled presentation system, externalized LLM functionality, added text and mixed benchmarks. |
Core Development, Domain Expert, Data Curator, Annotator, Analyst, Engineer