RISE Humanities Data Benchmark, 0.5.0-pre1

Benchmark Results

Personnel Cards

Evaluates models' ability to transcribe and interpret personnel index cards of Swiss federal employees (1941–1961), containing typed and handwritten entries on job title, work location, pay grade, salary, and related notes in German and French.

Top 5 Runs

Score  Date         Provider   Model
97.69  1 week ago   alibaba    qwen3.5-122b-a10b
97.04  2 weeks ago  genai      gemini-3.1-pro-preview
97.01  2 weeks ago  anthropic  claude-sonnet-4-6
96.96  1 week ago   alibaba    qwen3.5-27b
96.77  2 weeks ago  openai     gpt-5.4-2026-03-05

Last 5 Runs

Score  Date        Provider  Model
96.67  1 week ago  alibaba   qwen3.5-397b-a17b
85.55  1 week ago  alibaba   qwen3.5-flash-2026-02-23
96.51  1 week ago  alibaba   qwen3.5-35b-a3b
97.69  1 week ago  alibaba   qwen3.5-122b-a10b
96.96  1 week ago  alibaba   qwen3.5-27b

Tags
  • Type(s): index-card
  • Benchmark task(s): transcription, document-understanding, data-correction
  • Writing: handwritten, typed, printed
  • Source creation (century): 20
  • Source Layout: table, form
  • Language(s): de, fr