RISE Humanities Data Benchmark, 0.5.2-pre1

Search Test Runs

 

A test run is a single execution of a benchmark test using a defined model configuration.
Each run represents how a particular large language model (LLM) — such as GPT-4, Claude-3, or Gemini — performed on a given task at a specific time, with specific settings.

A test run includes:

  • Prompt and role definition – what the model was asked to do and from what perspective (e.g. “as a historian”).
  • Model configuration – provider, model version, temperature, and other generation parameters.
  • Results – the model’s actual response and its evaluation (scores such as F1 or accuracy).
  • Usage and cost data – token counts and calculated API costs.
  • Metadata – information like the test date, benchmark name, and person who executed it.

Together, test runs make it possible to compare models, providers, and configurations across benchmarks in a transparent and reproducible way.

Search Results

Your search for Benchmark 'fraktur_adverts__true' with Search Hidden 'False' returned 142 results, showing page 1 of 15.
Result 1 of 142

Test T1106 at 2026-06-08

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Providerx-ai
Modelgrok-4.3
  
Temperature0.0
DataclassDocument
  
Normalized Score84.50 %
Test timeunknown seconds
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8061 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score F1 micro / macro Micro precision/recall Tue/False Positives
0.87 n/a n/a n/a n/a n/a n/a n/a n/a
      Micro Precision Micro Recall Instances TP FP FN
Costs / Pricing
Pricing Date: n/an/aTokens: 12.2K IT + 7.5K OT = 19.7K TTCost: 0.015$0.019$0.034$
Result 2 of 142

Test T1119 at 2026-06-05

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provideropenrouter
Modelmeta-llama/llama-4-scout-17b-16e-instruct
  
Temperature0.0
DataclassDocument
  
Normalized Score53.00 %
Test timeunknown seconds
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8061 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score F1 micro / macro Micro precision/recall Tue/False Positives
0.57 n/a n/a n/a n/a n/a n/a n/a n/a
      Micro Precision Micro Recall Instances TP FP FN
Costs / Pricing
Pricing Date: n/an/aTokens: 13.0K IT + 5.7K OT = 18.7K TTCost: 0.001$0.002$0.003$
Result 3 of 142

Test T1132 at 2026-06-05

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provideropenrouter
Modelstepfun/step-3.7-flash-20260528
  
Temperature0.0
DataclassDocument
  
Normalized Score92.00 %
Test timeunknown seconds
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8061 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score F1 micro / macro Micro precision/recall Tue/False Positives
0.94 n/a n/a n/a n/a n/a n/a n/a n/a
      Micro Precision Micro Recall Instances TP FP FN
Costs / Pricing
Pricing Date: n/an/aTokens: 6.9K IT + 80.5K OT = 87.4K TTCost: 0.001$0.093$0.094$
Result 4 of 142

Test T0439 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Providermistral
Modelmistral-small-2506
  
Temperature0.0
DataclassDocument
  
Normalized Score30.20 %
Test timeunknown seconds
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8061 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score F1 micro / macro Micro precision/recall Tue/False Positives
0.31 n/a n/a n/a n/a n/a n/a n/a n/a
      Micro Precision Micro Recall Instances TP FP FN
Costs / Pricing
Pricing Date: n/an/aTokens: 12.1K IT + 9.1K OT = 21.2K TTCost: 0.001$0.003$0.004$
Result 5 of 142

Test T0095 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Providermistral
Modelpixtral-large-2411
  
Temperature0.0
DataclassDocument
  
Normalized Score78.80 %
Test timeunknown seconds
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8061 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score F1 micro / macro Micro precision/recall Tue/False Positives
0.81 n/a n/a n/a n/a n/a n/a n/a n/a
      Micro Precision Micro Recall Instances TP FP FN
Costs / Pricing
Pricing Date: n/an/aTokens: 12.1K IT + 9.5K OT = 21.6K TTCost: 0.024$0.057$0.081$
Result 6 of 142

Test T0569 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Providermistral
Modelmagistral-small-2509
  
Temperature0.0
DataclassDocument
  
Normalized Score11.80 %
Test timeunknown seconds
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8061 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score F1 micro / macro Micro precision/recall Tue/False Positives
0.12 n/a n/a n/a n/a n/a n/a n/a n/a
      Micro Precision Micro Recall Instances TP FP FN
Costs / Pricing
Pricing Date: n/an/aTokens: 2.7K IT + 2.1K OT = 4.8K TTCost: 0.001$0.003$0.004$
Result 7 of 142

Test T1067 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provideropenrouter
Modelqwen/qwen3.7-plus-20260602
  
Temperature0.0
DataclassDocument
  
Normalized Score78.10 %
Test timeunknown seconds
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8061 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score F1 micro / macro Micro precision/recall Tue/False Positives
0.83 n/a n/a n/a n/a n/a n/a n/a n/a
      Micro Precision Micro Recall Instances TP FP FN
Costs / Pricing
Pricing Date: n/an/aTokens: 12.2K IT + 20.7K OT = 32.9K TTCost: 0.005$0.033$0.038$
Result 8 of 142

Test T0536 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Providermistral
Modelmistral-large-2512
  
Temperature0.0
DataclassDocument
  
Normalized Score78.20 %
Test timeunknown seconds
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8061 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score F1 micro / macro Micro precision/recall Tue/False Positives
0.80 n/a n/a n/a n/a n/a n/a n/a n/a
      Micro Precision Micro Recall Instances TP FP FN
Costs / Pricing
Pricing Date: n/an/aTokens: 12.1K IT + 10.0K OT = 22.1K TTCost: 0.006$0.015$0.021$
Result 9 of 142

Test T0178 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Providermistral
Modelmistral-medium-2505
  
Temperature0.0
DataclassDocument
  
Normalized Score50.60 %
Test timeunknown seconds
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8061 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score F1 micro / macro Micro precision/recall Tue/False Positives
0.52 n/a n/a n/a n/a n/a n/a n/a n/a
      Micro Precision Micro Recall Instances TP FP FN
Costs / Pricing
Pricing Date: n/an/aTokens: 12.1K IT + 6.8K OT = 19.0K TTCost: 0.005$0.014$0.019$
Result 10 of 142

Test T1080 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provideranthropic
Modelclaude-opus-4-8
  
Temperature0.0
DataclassDocument
  
Normalized Score78.40 %
Test timeunknown seconds
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8061 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score F1 micro / macro Micro precision/recall Tue/False Positives
0.79 n/a n/a n/a n/a n/a n/a n/a n/a
      Micro Precision Micro Recall Instances TP FP FN
Costs / Pricing
Pricing Date: n/an/aTokens: 19.0K IT + 15.1K OT = 34.2K TTCost: 0.095$0.379$0.474$