RISE Humanities Data Benchmark, 0.4.0

Leaderboard

Introduction

This leaderboard presents a comparative overview of model performance across all benchmarks.
It brings together results from every model and provider evaluated.

  • Benchmark Difficulty Ranking shows which tasks are most and least challenging on average.
  • The Provider and Model Performance sections compare accuracy and consistency across AI providers and individual models.
  • The Cost Effectiveness and Model Speed visualizations balance performance against efficiency, highlighting practical trade-offs between quality, speed, and cost.

Together, these graphs offer an evidence-based snapshot of how current large language models perform on data-intensive tasks in the humanities.
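
As a rough, hedged illustration of what these charts aggregate, the sketch below computes a difficulty ranking and a provider/model comparison from a hypothetical results.csv; the file name and its columns (benchmark, provider, model, accuracy) are assumptions for illustration, not the benchmark's actual schema.

    # Minimal sketch, assuming a hypothetical results.csv with one row per
    # benchmark run and columns: benchmark, provider, model, accuracy (0-1).
    import pandas as pd

    results = pd.read_csv("results.csv")

    # Benchmark difficulty ranking: mean accuracy per benchmark across all
    # models; lower mean accuracy indicates a harder task.
    difficulty = results.groupby("benchmark")["accuracy"].mean().sort_values()

    # Provider and model comparison: mean accuracy, with the standard
    # deviation as a rough consistency measure.
    by_provider = results.groupby("provider")["accuracy"].agg(["mean", "std"])
    by_model = results.groupby("model")["accuracy"].agg(["mean", "std"])

    print(difficulty)  # hardest benchmarks first
    print(by_model.sort_values("mean", ascending=False))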

On which Benchmark do models perform best

[Chart: benchmark difficulty ranking, averaged across all models]

Which Provider performs best

[Charts: accuracy and consistency by provider]

Which Model performs best

[Chart: accuracy and consistency by model]

Cost Effectiveness

[Chart: accuracy versus cost by model]
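
As a hedged sketch only: a cost-effectiveness view of this kind can be drawn as a scatter of mean accuracy against mean cost per model. It reuses the hypothetical results.csv from above with an assumed cost_usd column (cost per run in US dollars).

    # Hedged sketch; the cost_usd column is an assumption for illustration.
    import pandas as pd
    import matplotlib.pyplot as plt

    results = pd.read_csv("results.csv")
    per_model = results.groupby("model").agg(
        accuracy=("accuracy", "mean"),
        cost_usd=("cost_usd", "mean"),
    )

    fig, ax = plt.subplots()
    ax.scatter(per_model["cost_usd"], per_model["accuracy"])
    for name, row in per_model.iterrows():
        ax.annotate(name, (row["cost_usd"], row["accuracy"]))
    ax.set_xlabel("mean cost per run (USD)")
    ax.set_ylabel("mean accuracy")
    plt.show()

Models toward the upper left of such a plot deliver the most accuracy per dollar.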

Model Speed
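
A similarly hedged sketch for the speed view, assuming a hypothetical latency_s column (seconds per run) in the same results.csv:

    # Mean latency per model; latency_s is an assumed column name.
    import pandas as pd

    results = pd.read_csv("results.csv")
    speed = results.groupby("model")["latency_s"].mean().sort_values()
    print(speed)  # fastest models first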