Evaluates models' ability to extract bibliographic information, such as publication details, authors, dates, and other metadata, from digitized historical documents.

| Score | Date | Provider | Model |
|---|---|---|---|
| 71.55 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 71.43 | 1 year ago | openai | gpt-4o |
| 71.05 | 3 months ago | openai | gpt-5.1-2025-11-13 |
| 70.23 | 7 months ago | genai | gemini-2.5-flash-preview-09-2025 |
| 69.89 | 1 month ago | anthropic | claude-opus-4-7 |

| Score | Date | Provider | Model |
|---|---|---|---|
| 71.55 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 0.00 | 4 weeks ago | openrouter | qwen/qwen3.5-9b |
| 41.71 | 1 month ago | openrouter | google/gemma-4-26b-a4b-it |
| 46.10 | 1 month ago | openrouter | qwen/qwen3.5-9b |
| 61.35 | 1 month ago | openrouter | google/gemma-4-31b-it |

Index cards of companies on a British 'black list' in the 1940s. Assesses models' capability to recognize typed and handwritten information from index cards.

| Score | Date | Provider | Model |
|---|---|---|---|
| 96.87 | 3 months ago | genai | gemini-2.5-pro |
| 96.76 | 2 months ago | anthropic | claude-sonnet-4-6 |
| 96.35 | 3 months ago | anthropic | claude-opus-4-5-20251101 |
| 95.80 | 2 months ago | anthropic | claude-opus-4-6 |
| 95.65 | 6 months ago | openai | gpt-4.1-mini |

| Score | Date | Provider | Model |
|---|---|---|---|
| 85.96 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 85.30 | 4 weeks ago | openrouter | qwen/qwen3.5-9b |
| 88.64 | 1 month ago | openrouter | qwen/qwen3.5-plus-02-15 |
| 91.75 | 1 month ago | openrouter | google/gemma-4-26b-a4b-it |
| 92.48 | 1 month ago | openrouter | qwen/qwen3.6-plus |

| Score | Date | Provider | Model |
|---|---|---|---|
| 98.61 | 1 month ago | x-ai | grok-4.20-0309-reasoning |
| 98.48 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 98.21 | 1 month ago | openrouter | google/gemma-4-26b-a4b-it |
| 97.62 | 1 month ago | anthropic | claude-opus-4-7 |
| 97.54 | 3 months ago | openrouter | x-ai/grok-4 |

| Score | Date | Provider | Model |
|---|---|---|---|
| 98.48 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 97.42 | 3 weeks ago | deepseek | deepseek-v4-pro |
| 97.01 | 3 weeks ago | deepseek | deepseek-v4-flash |
| 18.80 | 4 weeks ago | openrouter | qwen/qwen3.5-9b |
| 92.01 | 1 month ago | openrouter | qwen/qwen3.5-122b-a10b |

Tests models on extracting structured metadata from historical correspondence, including person names, organizations, dates, locations, and other contextual information from 20th-century Swiss letters.

| Score | Date | Provider | Model |
|---|---|---|---|
| 81.00 | 3 months ago | openai | gpt-5 |
| 77.00 | 9 months ago | openai | gpt-5 |
| 72.00 | 3 months ago | genai | gemini-3-flash-preview |
| 72.00 | 2 months ago | genai | gemini-3.1-pro-preview |
| 71.00 | 3 months ago | openai | gpt-5 |

| Score | Date | Provider | Model |
|---|---|---|---|
| 61.00 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 59.00 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 54.00 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 55.00 | 4 weeks ago | openrouter | qwen/qwen3.5-9b |
| 51.00 | 4 weeks ago | openrouter | qwen/qwen3.5-9b |

| Score | Date | Provider | Model |
|---|---|---|---|
| 59.80 | 3 months ago | openai | gpt-5 |
| 58.40 | 6 months ago | openai | gpt-5 |
| 56.93 | 1 month ago | openrouter | qwen/qwen3.5-9b |
| 55.80 | 1 month ago | openrouter | qwen/qwen3.5-plus-02-15 |
| 55.47 | 2 months ago | genai | gemini-3.1-pro-preview |

| Score | Date | Provider | Model |
|---|---|---|---|
| 46.67 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 52.00 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 45.80 | 4 weeks ago | openrouter | qwen/qwen3.5-9b |
| 28.73 | 4 weeks ago | openrouter | qwen/qwen3.5-9b |
| 44.93 | 1 month ago | openrouter | qwen/qwen3.5-397b-a17b |

Assesses models' capability to recognize and transcribe historical German Fraktur script, a Gothic typeface commonly used in German-language documents.

| Score | Date | Provider | Model |
|---|---|---|---|
| 97.90 | 2 months ago | genai | gemini-3.1-pro-preview |
| 97.30 | 2 months ago | anthropic | claude-sonnet-4-6 |
| 96.00 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 95.90 | 3 months ago | genai | gemini-3-flash-preview |
| 95.90 | 1 month ago | alibaba | qwen3.5-122b-a10b |

| Score | Date | Provider | Model |
|---|---|---|---|
| 96.00 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 48.60 | 4 weeks ago | openrouter | qwen/qwen3.5-9b |
| 28.70 | 1 month ago | openrouter | google/gemma-4-26b-a4b-it |
| 77.40 | 1 month ago | openrouter | qwen/qwen3.5-plus-02-15 |
| 51.20 | 1 month ago | openrouter | google/gemma-4-31b-it |

Extracts names, locations, and signatures from table-like meeting minutes of Mines de Costano S.A., covering the 1930s to the 1960s.

| Score | Date | Provider | Model |
|---|---|---|---|
| 88.64 | 2 months ago | openai | gpt-5.4-2026-03-05 |
| 86.20 | 2 months ago | openai | gpt-5.4-2026-03-05 |
| 85.59 | 2 months ago | openai | gpt-5.3-codex |
| 84.93 | 2 months ago | openai | gpt-5.2-2025-12-11 |
| 84.79 | 1 month ago | openai | gpt-5.3-codex |

| Score | Date | Provider | Model |
|---|---|---|---|
| 83.63 | 1 month ago | openai | gpt-5.3-codex |
| 83.31 | 1 month ago | openai | gpt-5.3-codex |
| 84.79 | 1 month ago | openai | gpt-5.3-codex |
| 83.29 | 1 month ago | openai | gpt-5.3-codex |
| 81.99 | 1 month ago | openai | gpt-5.3-codex |

A benchmark focused on catalog card analysis and information extraction from historical library catalog systems. It evaluates models on structured data extraction from digitized catalog cards, testing their ability to parse complex bibliographic information, author names, dates, and hierarchical catalog structures from historical Swiss library records.

| Score | Date | Provider | Model |
|---|---|---|---|
| 89.51 | 8 months ago | openai | gpt-5 |
| 89.39 | 8 months ago | openai | gpt-4.1 |
| 89.36 | 8 months ago | openai | gpt-4o |
| 89.17 | 1 month ago | anthropic | claude-opus-4-7 |
| 89.10 | 5 months ago | genai | gemini-3-pro-preview |

| Score | Date | Provider | Model |
|---|---|---|---|
| 88.59 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 63.98 | 4 weeks ago | openrouter | qwen/qwen3.5-9b |
| 86.88 | 1 month ago | openrouter | qwen/qwen3.5-122b-a10b |
| 86.65 | 1 month ago | openrouter | qwen/qwen3.6-plus |
| 63.72 | 1 month ago | openrouter | qwen/qwen3.5-9b |

Examines a model's ability to extract bounding boxes of advertisements from magazine pages.

| Score | Date | Provider | Model |
|---|---|---|---|
| 95.60 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 88.50 | 1 month ago | openai | gpt-5.2-2025-12-11 |
| 86.00 | 2 months ago | openai | gpt-5.3-codex |
| 84.80 | 1 month ago | genai | gemini-3-flash-preview |
| 80.20 | 2 months ago | contour_local | opencv-contour |

| Score | Date | Provider | Model |
|---|---|---|---|
| 95.60 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 40.50 | 1 month ago | anthropic | claude-opus-4-7 |
| 0.00 | 1 month ago | openrouter | qwen/qwen3.5-27b |
| 4.30 | 1 month ago | openrouter | qwen/qwen3.5-397b-a17b |
| 5.40 | 1 month ago | openrouter | qwen/qwen3.5-9b |

Evaluates models on page segmentation and handwritten text extraction from 15th-century manuscripts written in late medieval German. Tests the ability to transcribe historical handwriting, identify folio numbers, distinguish main text from marginal additions, and preserve historical spelling and formatting. Performance is measured using fuzzy string matching and Character Error Rate (CER).
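
The two metrics named above can be illustrated with a minimal sketch. The benchmark's own scoring code is not shown here, so everything below is an assumption: the function names `levenshtein`, `cer`, and `fuzzy_ratio` are hypothetical, CER is taken in its common form (edit distance divided by reference length), and Python's standard `difflib.SequenceMatcher` stands in for whatever fuzzy matcher the benchmark actually uses.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance divided by the reference length (assumed definition)."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return levenshtein(reference, hypothesis) / len(reference)


def fuzzy_ratio(reference: str, hypothesis: str) -> float:
    """Similarity in [0, 1] from difflib; a stand-in for the benchmark's fuzzy matcher."""
    return SequenceMatcher(None, reference, hypothesis).ratio()


# Illustrative strings only, not taken from the benchmark data.
reference = "vnd der herre sprach"
hypothesis = "und der herre sprach"  # historical "v" normalised to "u"
print(f"CER:   {cer(reference, hypothesis):.3f}")
print(f"Fuzzy: {fuzzy_ratio(reference, hypothesis):.3f}")
```

Under this definition a lower CER is better and a higher fuzzy ratio is better. The example also shows why preserving historical spelling matters: silently modernising a single character already counts as an error.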

| Score | Date | Provider | Model |
|---|---|---|---|
| 84.90 | 3 months ago | anthropic | claude-opus-4-5-20251101 |
| 84.60 | 1 month ago | openrouter | qwen/qwen3.5-9b |
| 80.70 | 5 months ago | genai | gemini-3-pro-preview |
| 79.80 | 2 months ago | anthropic | claude-opus-4-6 |
| 77.90 | 2 months ago | genai | gemini-3.1-flash-lite-preview |

| Score | Date | Provider | Model |
|---|---|---|---|
| 71.10 | 3 weeks ago | openai | gpt-5.5-2026-04-23 |
| 62.30 | 4 weeks ago | openrouter | qwen/qwen3.5-9b |
| 71.70 | 1 month ago | openrouter | qwen/qwen3.5-397b-a17b |
| 73.90 | 1 month ago | openrouter | google/gemma-4-31b-it |
| 75.40 | 1 month ago | openrouter | qwen/qwen3.5-122b-a10b |