RISE Humanities Data Benchmark, 0.5.1-pre1

Benchmark Results

Personnel Cards

Evaluates models' ability to transcribe and interpret personnel index cards of Swiss federal employees (1941–1961), containing typed and handwritten entries on job title, work location, pay grade, salary, and related notes in German and French.

Top 5 Runs

Score  Date         Provider    Model
97.75  4 weeks ago  openrouter  qwen/qwen3.5-27b
97.69  1 month ago  alibaba     qwen3.5-122b-a10b
97.54  4 weeks ago  openrouter  qwen/qwen3.5-flash-02-23
97.36  4 weeks ago  openrouter  qwen/qwen3.5-122b-a10b
97.09  3 weeks ago  openai      gpt-5.5-2026-04-23

Last 5 Runs

Score  Date         Provider    Model
97.09  3 weeks ago  openai      gpt-5.5-2026-04-23
83.93  4 weeks ago  openrouter  qwen/qwen3.5-9b
95.63  4 weeks ago  openrouter  qwen/qwen3.5-35b-a3b
97.75  4 weeks ago  openrouter  qwen/qwen3.5-27b
87.67  4 weeks ago  openrouter  qwen/qwen3.5-9b

Tags
  • Type(s): index-card
  • Benchmark task(s): transcription, document-understanding, data-correction
  • Writing: handwritten, typed, printed
  • Source creation (century): 20
  • Source Layout: table, form
  • Language(s): de, fr