A test run is a single execution of a benchmark test using a defined model configuration.
Each run records how a particular large language model (LLM), such as GPT-4, Claude 3, or Gemini, performed on a given task at a specific time and with specific settings.
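A test run includes:
- the tags describing the benchmark material (document type, language, script style, task)
- the provider and model
- the settings, such as temperature and dataclass
- the prompt sent to the model
- the scores (normalized score, fuzzy score and, where available, F1, precision, and recall)
- token usage and cost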
Together, test runs make it possible to compare models, providers, and configurations across benchmarks in a transparent and reproducible way.
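As a rough illustration, a test run can be pictured as a small record type. The sketch below is not the benchmark's actual data model: the class name `RunRecord`, its field names, and the `label` helper are assumptions made for this example, and the values are copied from the first run listed on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical container for one test run (all field names are assumptions)."""
    provider: str
    model: str
    temperature: float
    dataclass_name: str
    normalized_score: float  # percent, as displayed on the page
    fuzzy_score: float
    total_cost_usd: float

    def label(self) -> str:
        # Short identifier combining provider and model.
        return f"{self.provider}:{self.model}"


# Values taken from the first run shown on this page.
first_run = RunRecord(
    provider="openrouter",
    model="qwen/qwen3.5-27b-20260224",
    temperature=0.0,
    dataclass_name="Document",
    normalized_score=94.10,
    fuzzy_score=0.96,
    total_cost_usd=0.018,
)
```

Keeping the record immutable (`frozen=True`) fits the idea that a run is a fixed snapshot of one execution.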
{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}
| Setting | Value |
|---|---|
| Provider | openrouter |
| Model | qwen/qwen3.5-27b-20260224 |
| Temperature | 0.0 |
| Dataclass | Document |
| Normalized Score | 94.10 % |
| Test time | unknown |
## IDENTITY AND PURPOSE
You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.
## INSTRUCTIONS
- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules:
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8601 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text
## EXAMPLE OUTPUT
{
"advertisements": [
{
"date": "1731-01-02",
"tags_section": "Es werden zum Verkauff offeriert",
"text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
},
{
"date": "1731-01-02",
"tags_section": "Es werden zum Verkauff offeriert",
"text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
},
{
"date": "1731-01-02",
"tags_section": "Es werden zu Entleihen begehrt",
"text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
}
]
}
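The transcription rules and the output format described in the prompt above can be checked mechanically. The following is a minimal sketch, not the benchmark's own code: the function names `normalize_transcription` and `validate_output` and the constant `REQUIRED_KEYS` are hypothetical, and the date check simply relies on Python's `date.fromisoformat`.

```python
import json
from datetime import date

# Keys the prompt asks for in every advertisement object.
REQUIRED_KEYS = {"date", "tags_section", "text"}


def normalize_transcription(text: str) -> str:
    """Apply the two transcription rules: long s -> "s", "/" -> ","."""
    return text.replace("\u017f", "s").replace("/", ",")


def validate_output(raw: str) -> list[dict]:
    """Parse a model response and check it against the requested structure.

    Raises ValueError (or a subclass) if the JSON is malformed, a key is
    missing, or a date is not a valid ISO date.
    """
    data = json.loads(raw)
    if "advertisements" not in data:
        raise ValueError("missing top-level key 'advertisements'")
    ads = data["advertisements"]
    for index, ad in enumerate(ads):
        missing = REQUIRED_KEYS - ad.keys()
        if missing:
            raise ValueError(f"advertisement {index} lacks keys: {sorted(missing)}")
        date.fromisoformat(ad["date"])  # raises ValueError on an invalid date
    return ads
```

For example, `validate_output` accepts a response shaped like the example output above and rejects one whose `"date"` is `"02.01.1731"`.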
no valid result
| Fuzzy Score | F1 micro | F1 macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| 0.96 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |

| Pricing Date | Tokens (input + output = total) | Cost (input + output = total) |
|---|---|---|
| n/a, n/a | 12.3K + 10.3K = 22.5K | 0.002$ + 0.016$ = 0.018$ |
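The token totals above do not always add up exactly (12.3K + 10.3K is shown as 22.5K). This is consistent with each figure being rounded independently for display. The sketch below illustrates that check; the function name `sums_within_rounding` is hypothetical, and the assumption that the page rounds each value to the shown number of decimals is mine, not stated in the source.

```python
def sums_within_rounding(first: float, second: float, total: float, decimals: int) -> bool:
    """Return True if first + second matches total up to display rounding.

    Each displayed value may be off by half a unit in its last decimal
    place, so three rounded values can disagree by up to 1.5 units.
    """
    half_unit = 0.5 * 10 ** -decimals
    tolerance = 3 * half_unit + 1e-12  # small slack for float arithmetic
    return abs(first + second - total) <= tolerance


# Token counts in thousands, shown with one decimal place.
tokens_ok = sums_within_rounding(12.3, 10.3, 22.5, decimals=1)

# Costs in dollars, shown with three decimal places.
cost_ok = sums_within_rounding(0.002, 0.016, 0.018, decimals=3)
```

A discrepancy larger than the tolerance would point to a transcription or reporting error rather than rounding.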
{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}
| Setting | Value |
|---|---|
| Provider | openrouter |
| Model | qwen/qwen3.6-plus-04-02 |
| Temperature | 0.0 |
| Dataclass | Document |
| Normalized Score | 57.60 % |
| Test time | unknown |
no valid result
| Fuzzy Score | F1 micro | F1 macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| 0.58 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |

| Pricing Date | Tokens (input + output = total) | Cost (input + output = total) |
|---|---|---|
| n/a, n/a | 12.2K + 49.5K = 61.7K | 0.004$ + 0.097$ = 0.101$ |
{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}
| Setting | Value |
|---|---|
| Provider | openrouter |
| Model | qwen/qwen3.5-35b-a3b-20260224 |
| Temperature | 0.0 |
| Dataclass | Document |
| Normalized Score | 71.60 % |
| Test time | unknown |
no valid result
| Fuzzy Score | F1 micro | F1 macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| 0.73 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |

| Pricing Date | Tokens (input + output = total) | Cost (input + output = total) |
|---|---|---|
| n/a, n/a | 11.9K + 36.5K = 48.4K | 0.002$ + 0.047$ = 0.049$ |
{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}
| Setting | Value |
|---|---|
| Provider | openrouter |
| Model | qwen/qwen3.5-flash-20260224 |
| Temperature | 0.0 |
| Dataclass | Document |
| Normalized Score | 65.80 % |
| Test time | unknown |
no valid result
| Fuzzy Score | F1 micro | F1 macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| 0.71 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |

| Pricing Date | Tokens (input + output = total) | Cost (input + output = total) |
|---|---|---|
| n/a, n/a | 12.3K + 11.3K = 23.6K | 0.001$ + 0.003$ = 0.004$ |
{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}
| Setting | Value |
|---|---|
| Provider | openrouter |
| Model | qwen/qwen3.5-122b-a10b-20260224 |
| Temperature | 0.0 |
| Dataclass | Document |
| Normalized Score | 95.80 % |
| Test time | unknown |
no valid result
| Fuzzy Score | F1 micro | F1 macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| 0.97 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |

| Pricing Date | Tokens (input + output = total) | Cost (input + output = total) |
|---|---|---|
| n/a, n/a | 12.3K + 9.5K = 21.7K | 0.003$ + 0.020$ = 0.023$ |
{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}
| Setting | Value |
|---|---|
| Provider | openrouter |
| Model | google/gemma-4-31b-it-20260402 |
| Temperature | 0.0 |
| Dataclass | Document |
| Normalized Score | 51.20 % |
| Test time | unknown |
no valid result
| Fuzzy Score | F1 micro | F1 macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| 0.56 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |

| Pricing Date | Tokens (input + output = total) | Cost (input + output = total) |
|---|---|---|
| n/a, n/a | 4.1K + 7.5K = 11.6K | 0.001$ + 0.003$ = 0.003$ |
{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}
| Setting | Value |
|---|---|
| Provider | anthropic |
| Model | claude-opus-4-7 |
| Temperature | 0.0 |
| Dataclass | Document |
| Normalized Score | 78.30 % |
| Test time | unknown |
no valid result
| Fuzzy Score | F1 micro | F1 macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| 0.79 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |

| Pricing Date | Tokens (input + output = total) | Cost (input + output = total) |
|---|---|---|
| n/a, n/a | 21.0K + 14.2K = 35.2K | 0.105$ + 0.354$ = 0.459$ |
{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}
| Setting | Value |
|---|---|
| Provider | openrouter |
| Model | qwen/qwen3.5-plus-20260216 |
| Temperature | 0.0 |
| Dataclass | Document |
| Normalized Score | 77.40 % |
| Test time | unknown |
no valid result
| Fuzzy Score | F1 micro | F1 macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| 0.78 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |

| Pricing Date | Tokens (input + output = total) | Cost (input + output = total) |
|---|---|---|
| n/a, n/a | 12.3K + 24.1K = 36.3K | 0.003$ + 0.038$ = 0.041$ |
{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}
| Setting | Value |
|---|---|
| Provider | openrouter |
| Model | qwen/qwen3.5-9b-20260310 |
| Temperature | 0.0 |
| Dataclass | Document |
| Normalized Score | 38.70 % |
| Test time | unknown |
no valid result
| Fuzzy Score | F1 micro | F1 macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| 0.41 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |

| Pricing Date | Tokens (input + output = total) | Cost (input + output = total) |
|---|---|---|
| n/a, n/a | 10.5K + 39.9K = 50.4K | 0.001$ + 0.006$ = 0.007$ |
{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}
| Setting | Value |
|---|---|
| Provider | alibaba |
| Model | qwen3.5-35b-a3b |
| Temperature | 0.0 |
| Dataclass | Document |
| Normalized Score | 51.30 % |
| Test time | unknown |
no valid result
| Fuzzy Score | F1 micro | F1 macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
|---|---|---|---|---|---|---|---|---|
| 0.53 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |

| Pricing Date | Tokens (input + output = total) | Cost (input + output = total) |
|---|---|---|
| n/a, n/a | 12.1K + 8.9K = 21.0K | 0.003$ + 0.018$ = 0.021$ |
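The ten runs listed above can be compared side by side once their key figures are collected in one place. The sketch below ranks them by normalized score and picks the cheapest run above a score threshold. The tuple layout and the helper names `rank_by_score` and `cheapest_above` are assumptions made for this example; the numbers are copied from the run records on this page.

```python
# (provider, model, normalized score in percent, total cost in USD)
RUNS = [
    ("openrouter", "qwen/qwen3.5-27b-20260224", 94.10, 0.018),
    ("openrouter", "qwen/qwen3.6-plus-04-02", 57.60, 0.101),
    ("openrouter", "qwen/qwen3.5-35b-a3b-20260224", 71.60, 0.049),
    ("openrouter", "qwen/qwen3.5-flash-20260224", 65.80, 0.004),
    ("openrouter", "qwen/qwen3.5-122b-a10b-20260224", 95.80, 0.023),
    ("openrouter", "google/gemma-4-31b-it-20260402", 51.20, 0.003),
    ("anthropic", "claude-opus-4-7", 78.30, 0.459),
    ("openrouter", "qwen/qwen3.5-plus-20260216", 77.40, 0.041),
    ("openrouter", "qwen/qwen3.5-9b-20260310", 38.70, 0.007),
    ("alibaba", "qwen3.5-35b-a3b", 51.30, 0.021),
]


def rank_by_score(runs):
    """Return the runs sorted from highest to lowest normalized score."""
    return sorted(runs, key=lambda run: run[2], reverse=True)


def cheapest_above(runs, min_score):
    """Return the lowest-cost run with at least min_score, or None if there is none."""
    eligible = [run for run in runs if run[2] >= min_score]
    return min(eligible, key=lambda run: run[3]) if eligible else None


if __name__ == "__main__":
    for provider, model, score, cost in rank_by_score(RUNS):
        print(f"{score:6.2f} %  {cost:.3f} $  {provider}  {model}")
```

Each figure comes from a single run on a single benchmark, so the ranking describes these runs only and is not a general statement about the models.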