RISE Humanities Data Benchmark, 0.5.2-pre1

Search Test Runs

 

A test run is a single execution of a benchmark test using a defined model configuration.
Each run represents how a particular large language model (LLM) — such as GPT-4, Claude-3, or Gemini — performed on a given task at a specific time, with specific settings.

A test run includes:

  • Prompt and role definition – what the model was asked to do and from what perspective (e.g. “as a historian”).
  • Model configuration – provider, model version, temperature, and other generation parameters.
  • Results – the model’s actual response and its evaluation (scores such as F1 or accuracy).
  • Usage and cost data – token counts and calculated API costs.
  • Metadata – information like the test date, benchmark name, and person who executed it.

Together, test runs make it possible to compare models, providers, and configurations across benchmarks in a transparent and reproducible way.
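The components listed above map naturally onto a small record type. The following is a minimal sketch, not the benchmark's actual schema: every class and field name here (`BenchmarkRun`, `ModelConfig`, `Usage`, and so on) is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ModelConfig:
    provider: str                               # e.g. "openai"
    model: str                                  # versioned model identifier
    temperature: float = 0.0
    extra: dict = field(default_factory=dict)   # other generation parameters


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int
    cost_usd: Optional[float] = None            # None when no pricing data exists

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


@dataclass
class BenchmarkRun:
    test_id: str                # hypothetical identifier such as "T0001"
    benchmark: str
    date: str                   # ISO 8601 date, YYYY-MM-DD
    prompt: str
    role: Optional[str]
    config: ModelConfig
    response: Optional[str]     # None when the run produced no valid result
    scores: dict                # metric name -> value
    usage: Usage
    executed_by: Optional[str] = None


# Made-up values, only to show how the pieces fit together.
run = BenchmarkRun(
    test_id="T0001", benchmark="example_benchmark", date="2026-01-01",
    prompt="Transcribe the page.", role="as a historian",
    config=ModelConfig(provider="example", model="example-model"),
    response=None, scores={"fuzzy": 0.5},
    usage=Usage(input_tokens=12_000, output_tokens=8_000),
)
print(run.usage.total_tokens)
```

Keeping configuration and usage in their own types makes it straightforward to group runs by provider or to aggregate token counts without touching prompts or responses.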

Search Results

Your search for Benchmark 'fraktur_adverts__true' with Search Hidden 'False' returned 142 results, showing page 2 of 15.
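The page arithmetic can be reproduced with a short sketch. The page size of 10 is an assumption: it is not stated on the page, but it is consistent with 142 results spread over 15 pages and with page 2 starting at result 11.

```python
import math

PAGE_SIZE = 10  # assumption, inferred from "142 results" and "page 2 of 15"


def page_count(total_results: int, page_size: int) -> int:
    """Number of pages needed to show all results."""
    return math.ceil(total_results / page_size)


def result_range(page: int, page_size: int, total_results: int) -> tuple:
    """1-based index of the first and last result on a given page."""
    first = (page - 1) * page_size + 1
    last = min(page * page_size, total_results)
    return first, last


print(page_count(142, PAGE_SIZE))        # 15
print(result_range(2, PAGE_SIZE, 142))   # (11, 20)
print(result_range(15, PAGE_SIZE, 142))  # (141, 142)
```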
Result 11 of 142

Test T0547 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provider: mistral
Model: ministral-14b-2512
Temperature: 0.0
Dataclass: Document
Normalized Score: 52.40 %
Test time: unknown
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8601 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}
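The output format requested by the prompt can be checked mechanically before any scoring takes place. The following is a minimal sketch of such a check; `validate_output` and `is_iso_date` are hypothetical helper names, not part of the benchmark, and the page does not say how the benchmark itself validates responses.

```python
import json
import re
from datetime import date

REQUIRED_KEYS = {"date", "tags_section", "text"}


def is_iso_date(value) -> bool:
    """True for a string of the form YYYY-MM-DD that names a real calendar day."""
    if not isinstance(value, str) or not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def validate_output(raw: str) -> list:
    """Return a list of problems; an empty list means the output is well-formed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    ads = data.get("advertisements")
    if not isinstance(ads, list):
        return ["missing 'advertisements' list"]
    problems = []
    for i, ad in enumerate(ads):
        if not isinstance(ad, dict):
            problems.append(f"ad {i}: not an object")
            continue
        missing = REQUIRED_KEYS - ad.keys()
        if missing:
            problems.append(f"ad {i}: missing {sorted(missing)}")
        if "date" in ad and not is_iso_date(ad["date"]):
            problems.append(f"ad {i}: date is not YYYY-MM-DD")
    return problems


good = '{"advertisements": [{"date": "1731-01-02", "tags_section": "A", "text": "5. ..."}]}'
bad = '{"advertisements": [{"date": "02.01.1731", "text": "5. ..."}]}'
print(validate_output(good))
print(validate_output(bad))
```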

Results

no valid result

Scoring
Fuzzy Score: 0.54
F1 micro / macro: n/a / n/a
Micro precision / recall: n/a / n/a
True/False Positives (Instances, TP, FP, FN): n/a, n/a, n/a, n/a

Costs / Pricing
Pricing Date: n/a n/a
Tokens: 12.1K IT + 14.1K OT = 26.2K TT
Cost: 0.002$ IT + 0.003$ OT = 0.005$ TT

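The page does not state how the fuzzy score is computed. Purely as an illustration of the general idea, the sketch below averages a character-level similarity ratio over position-aligned advertisement texts, using Python's standard `difflib.SequenceMatcher`. The choice of similarity measure, the alignment by position, and the function names are all assumptions; the benchmark's actual metric may differ.

```python
from difflib import SequenceMatcher


def fuzzy_ratio(expected: str, actual: str) -> float:
    """Similarity in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, expected, actual).ratio()


def mean_fuzzy(expected: list, actual: list) -> float:
    """Average similarity over aligned pairs; unmatched items count as 0."""
    n = max(len(expected), len(actual))
    if n == 0:
        return 1.0
    total = sum(fuzzy_ratio(e, a) for e, a in zip(expected, actual))
    return total / n


# Made-up example: the second transcription misreads one character.
truth = ["5. Ein kleines Lehrbuch.", "6. Ein rarer Kasten."]
model = ["5. Ein kleines Lehrbuch.", "6. Ein rarer Kaften."]
print(round(mean_fuzzy(truth, model), 3))
```
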
Result 12 of 142

Test T0536 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provider: mistral
Model: mistral-large-2512
Temperature: 0.0
Dataclass: Document
Normalized Score: 78.20 %
Test time: unknown
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8601 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}
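The two transcription rules in the prompt are simple character substitutions. A minimal sketch, assuming one wanted to apply the same rules to a raw transcription before comparing it with model output; the function name is hypothetical and the page does not say whether the benchmark performs such a step.

```python
LONG_S = "\u017f"  # the long s character


def apply_transcription_rules(text: str) -> str:
    """Apply the prompt's rules: long s becomes "s", "/" becomes ","."""
    return text.replace(LONG_S, "s").replace("/", ",")


sample = "Ein Hau\u017f / \u017famt Garten"
print(apply_transcription_rules(sample))  # Ein Haus , samt Garten
```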

Results

no valid result

Scoring
Fuzzy Score: 0.80
F1 micro / macro: n/a / n/a
Micro precision / recall: n/a / n/a
True/False Positives (Instances, TP, FP, FN): n/a, n/a, n/a, n/a

Costs / Pricing
Pricing Date: n/a n/a
Tokens: 12.1K IT + 10.0K OT = 22.1K TT
Cost: 0.006$ IT + 0.015$ OT = 0.021$ TT

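The scoring table has columns for micro and macro F1 alongside TP, FP and FN counts, although all of them are n/a for this benchmark. For reference, the standard definitions can be sketched as follows; the per-page counts in the example are made up, and the page does not describe how the benchmark groups its instances.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(counts: list) -> float:
    """Pool all counts first, then compute a single F1."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return f1(tp, fp, fn)


def macro_f1(counts: list) -> float:
    """Compute F1 per group, then take the unweighted mean."""
    return sum(f1(*c) for c in counts) / len(counts) if counts else 0.0


# Made-up (TP, FP, FN) counts for two groups of different size.
counts = [(8, 2, 0), (1, 0, 3)]
print(round(micro_f1(counts), 4))
print(round(macro_f1(counts), 4))
```

Micro F1 is dominated by the larger group, while macro F1 weights both groups equally, which is why the two values can differ noticeably.
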
Result 13 of 142

Test T1093 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provider: mistral
Model: mistral-medium-3.5
Temperature: 0.0
Dataclass: Document
Normalized Score: 57.10 %
Test time: unknown
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8601 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score: 0.61
F1 micro / macro: n/a / n/a
Micro precision / recall: n/a / n/a
True/False Positives (Instances, TP, FP, FN): n/a, n/a, n/a, n/a

Costs / Pricing
Pricing Date: n/a n/a
Tokens: 12.1K IT + 8.0K OT = 20.1K TT
Cost: 0.018$ IT + 0.060$ OT = 0.078$ TT

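The per-token prices behind these cost figures are not shown (the pricing date is n/a). As a rough sketch, an implied price per million tokens can be back-calculated from the displayed values. This assumes the first two cost figures are the input and output costs, which their sum suggests but the page does not state, and the results are only approximate because the displayed numbers are rounded.

```python
def implied_price_per_million(cost_usd: float, tokens: float) -> float:
    """Price in $ per 1M tokens implied by a cost and a token count."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return cost_usd / tokens * 1_000_000


# Rounded values as displayed for test T1093 (input/output split is assumed).
input_price = implied_price_per_million(0.018, 12_100)
output_price = implied_price_per_million(0.060, 8_000)
print(f"input:  ~{input_price:.2f} $/M tokens")
print(f"output: ~{output_price:.2f} $/M tokens")
```
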
Result 14 of 142

Test T0439 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provider: mistral
Model: mistral-small-2506
Temperature: 0.0
Dataclass: Document
Normalized Score: 30.20 %
Test time: unknown
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8601 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score: 0.31
F1 micro / macro: n/a / n/a
Micro precision / recall: n/a / n/a
True/False Positives (Instances, TP, FP, FN): n/a, n/a, n/a, n/a

Costs / Pricing
Pricing Date: n/a n/a
Tokens: 12.1K IT + 9.1K OT = 21.2K TT
Cost: 0.001$ IT + 0.003$ OT = 0.004$ TT

Result 15 of 142

Test T0178 at 2026-06-04

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provider: mistral
Model: mistral-medium-2505
Temperature: 0.0
Dataclass: Document
Normalized Score: 50.60 %
Test time: unknown
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8601 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score: 0.52
F1 micro / macro: n/a / n/a
Micro precision / recall: n/a / n/a
True/False Positives (Instances, TP, FP, FN): n/a, n/a, n/a, n/a

Costs / Pricing
Pricing Date: n/a n/a
Tokens: 12.1K IT + 6.8K OT = 19.0K TT
Cost: 0.005$ IT + 0.014$ OT = 0.019$ TT

Result 16 of 142

Test T1054 at 2026-05-22

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provider: genai
Model: gemini-3.5-flash
Temperature: 0.0
Dataclass: Document
Normalized Score: 76.50 %
Test time: unknown
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8601 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score: 0.77
F1 micro / macro: n/a / n/a
Micro precision / recall: n/a / n/a
True/False Positives (Instances, TP, FP, FN): n/a, n/a, n/a, n/a

Costs / Pricing
Pricing Date: n/a n/a
Tokens: 8.1K IT + 10.1K OT = 18.2K TT
Cost: 0.012$ IT + 0.091$ OT = 0.103$ TT

Result 17 of 142

Test T1041 at 2026-04-27

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provider: openai
Model: gpt-5.5-2026-04-23
Temperature: 0.0
Dataclass: Document
Normalized Score: 96.00 %
Test time: unknown
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8601 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score: 0.97
F1 micro / macro: n/a / n/a
Micro precision / recall: n/a / n/a
True/False Positives (Instances, TP, FP, FN): n/a, n/a, n/a, n/a

Costs / Pricing
Pricing Date: n/a n/a
Tokens: 12.9K IT + 32.3K OT = 45.2K TT
Cost: 0.064$ IT + 0.970$ OT = 1.034$ TT

Result 18 of 142

Test T0987 at 2026-04-22

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provider: openrouter
Model: qwen/qwen3.5-9b-20260310
Temperature: 0.0
Dataclass: Document
Normalized Score: 48.60 %
Test time: unknown
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8601 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score: 0.51
F1 micro / macro: n/a / n/a
Micro precision / recall: n/a / n/a
True/False Positives (Instances, TP, FP, FN): n/a, n/a, n/a, n/a

Costs / Pricing
Pricing Date: n/a n/a
Tokens: 11.7K IT + 62.6K OT = 74.3K TT
Cost: 0.001$ IT + 0.009$ OT = 0.011$ TT

Result 19 of 142

Test T0909 at 2026-04-21

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provider: openrouter
Model: qwen/qwen3.5-122b-a10b-20260224
Temperature: 0.0
Dataclass: Document
Normalized Score: 95.80 %
Test time: unknown
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8601 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score: 0.97
F1 micro / macro: n/a / n/a
Micro precision / recall: n/a / n/a
True/False Positives (Instances, TP, FP, FN): n/a, n/a, n/a, n/a

Costs / Pricing
Pricing Date: n/a n/a
Tokens: 12.3K IT + 9.5K OT = 21.7K TT
Cost: 0.003$ IT + 0.020$ OT = 0.023$ TT

Result 20 of 142

Test T0935 at 2026-04-21

{'document-type': ['book-page'], 'writing': ['printed'], 'century': [18, 19], 'language': ['de'], 'layout': ['prose'], 'script-style': ['fraktur'], 'task': ['transcription']}

Configuration
Provider: openrouter
Model: qwen/qwen3.5-35b-a3b-20260224
Temperature: 0.0
Dataclass: Document
Normalized Score: 71.60 %
Test time: unknown
Prompt

## IDENTITY AND PURPOSE

You are an OCR and information extraction system trained to process historical newspaper pages printed in 18th-century German using Fraktur type. The pages contain mostly classified advertisements. Your task is to identify and extract each advertisement *exactly as printed*, including historical spellings, typographic errors, punctuation, and formatting.


## INSTRUCTIONS

- Extract **all advertisements** from the input image, one after the other, following the sequence on the page.
- Maintain the **original spelling**, capitalization, and any **typos or non-standard forms**.
- Follow these transcription rules: 
  - the long s (ſ) is transcribed as "s"
  - "/" is transcribed as ","
- Use the masthead of the newspaper only to extract the date, ignore other content.
- The layout is typically **two-column**; extract ads from both columns, including the ad number.
- Return the result as a **JSON object** in the specified format and **nothing else** (no explanations, summaries, or additional text).
- For each advertisement, include:
  - `"date"`: the publication date of the page in ISO 8601 format (YYYY-MM-DD)
  - `"tags_section"`: the heading under which the advertisement appears
  - `"text"`: the full advertisement text

## EXAMPLE OUTPUT

{
  "advertisements": [
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "5. Ein kleines, jedoch listiges Lehrbuch der Zauberkunst, lange im Gebrauche des jungen Bartolomeus Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zum Verkauff offeriert",
      "text": "6. Ein rarer, mit Edelsteinen besetzter Saxophon-Kasten, aus dem Besitze der Jungfer Lisa Simpson."
    },
    {
      "date": "1731-01-02",
      "tags_section": "Es werden zu Entleihen begehrt",
      "text": "7. Ein gar prachtvoller, jedoch etwas zerlesener Band mit Rezepten von Margaretha Simpsonin."
    }
  ]
}

Results

no valid result

Scoring
Fuzzy Score: 0.73
F1 micro / macro: n/a / n/a
Micro precision / recall: n/a / n/a
True/False Positives (Instances, TP, FP, FN): n/a, n/a, n/a, n/a

Costs / Pricing
Pricing Date: n/a n/a
Tokens: 11.9K IT + 36.5K OT = 48.4K TT
Cost: 0.002$ IT + 0.047$ OT = 0.049$ TT
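
The ten runs on this page can be compared on score and cost together. The sketch below copies each run's normalized score and its last cost figure from the listings above, assuming that figure is the total cost of the run. It ranks the runs by score and lists those that score strictly higher than every cheaper run. The function names are illustrative and not part of the benchmark.

```python
# (model, normalized score in %, last cost figure in $) as listed on this page.
RUNS = [
    ("ministral-14b-2512", 52.40, 0.005),
    ("mistral-large-2512", 78.20, 0.021),
    ("mistral-medium-3.5", 57.10, 0.078),
    ("mistral-small-2506", 30.20, 0.004),
    ("mistral-medium-2505", 50.60, 0.019),
    ("gemini-3.5-flash", 76.50, 0.103),
    ("gpt-5.5-2026-04-23", 96.00, 1.034),
    ("qwen/qwen3.5-9b-20260310", 48.60, 0.011),
    ("qwen/qwen3.5-122b-a10b-20260224", 95.80, 0.023),
    ("qwen/qwen3.5-35b-a3b-20260224", 71.60, 0.049),
]


def rank_by_score(runs):
    """Runs sorted from highest to lowest normalized score."""
    return sorted(runs, key=lambda r: r[1], reverse=True)


def pareto_front(runs):
    """Models that score strictly higher than every cheaper run."""
    front, best = [], float("-inf")
    # Walk from cheapest to most expensive; on equal cost, higher score first.
    for model, score, cost in sorted(runs, key=lambda r: (r[2], -r[1])):
        if score > best:
            front.append(model)
            best = score
    return front


for model, score, cost in rank_by_score(RUNS)[:3]:
    print(f"{model}: {score:.2f} % at {cost:.3f}$")
print(pareto_front(RUNS))
```

A comparison like this covers only one page of the 142 results, so it says nothing about runs listed elsewhere.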