RISE Humanities Data Benchmark

A test run is a single execution of a benchmark test using a defined model configuration.
Each run records how a particular large language model (LLM), such as GPT-4, Claude-3, or Gemini, performed on a given task at a specific time and with specific settings.

A test run includes:

Prompt and role definition – what the model was asked to do and from what perspective (e.g. “as a historian”).
Model configuration – provider, model version, temperature, and other generation parameters.
Results – the model’s actual response and its evaluation (scores such as F1 or accuracy).
Usage and cost data – token counts and calculated API costs.
Metadata – information like the test date, benchmark name, and person who executed it.

Together, test runs make it possible to compare models, providers, and configurations across benchmarks in a transparent and reproducible way.
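The components listed above can be pictured as one record per run. The following is a minimal sketch only: the class and field names (RunRecord, ModelConfig, Usage) are hypothetical and are not taken from the benchmark's code base. The sample values come from the first run shown on this page, with the input token count taken as exactly 9,000 even though the page displays it rounded.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class ModelConfig:
    """Provider, model version and generation parameters (hypothetical names)."""
    provider: str
    model: str
    temperature: float = 0.0


@dataclass
class Usage:
    """Token counts; None when the run reported no usage data."""
    input_tokens: Optional[int] = None
    output_tokens: Optional[int] = None

    @property
    def total_tokens(self) -> Optional[int]:
        if self.input_tokens is None or self.output_tokens is None:
            return None
        return self.input_tokens + self.output_tokens


@dataclass
class RunRecord:
    """One test run: prompt, configuration, results, usage and metadata."""
    test_id: str
    date: str
    tags: dict
    prompt: str
    config: ModelConfig
    response: Optional[str] = None
    scores: dict = field(default_factory=dict)
    usage: Usage = field(default_factory=Usage)


# Illustrative instance; the input token count is an assumed exact value.
run = RunRecord(
    test_id="T0538",
    date="2026-02-18",
    tags={"century": [15], "language": ["de"], "task": ["transcription"]},
    prompt="IDENTITY and PURPOSE: ...",
    config=ModelConfig(provider="mistral", model="mistral-large-2512"),
    scores={"fuzzy": 0.0},
    usage=Usage(input_tokens=9000, output_tokens=552),
)

print(run.usage.total_tokens)
print(asdict(run)["config"])
```

Keeping configuration and usage in their own small classes makes it straightforward to group or filter runs by provider or model when comparing them.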

Result 51 of 93

Test T0538 at 2026-02-18

{'document-type': ['manuscript'], 'writing': ['handwritten'], 'century': [15], 'language': ['de'], 'layout': ['prose'], 'task': ['transcription']}

Configuration

Provider	mistral
Model	mistral-large-2512

Temperature	0.0
Dataclass	Document

Normalized Score	0.00 %
Test time	unknown

Prompt

IDENTITY and PURPOSE:

You are presented with an image from a 15th century medieval manuscript written in Basel in late medieval German. Your task is to extract the text from the manuscript in the specified JSON format. You must extract all text exactly as it appears in the manuscript, maintaining historical spellings, punctuation, and formatting. Do not resolve abbreviations.

The manuscript may contain:
1. A main text body (often in one column as continuous text)
2. A folio number (if present)
3. Additional notes or text written in the margins (labeled as addition1, addition2, etc.)

You must:
- Identify and transcribe the main text
- Extract the folio number if visible (use empty string "" if not visible)
- Identify and transcribe any marginal additions separately
- Preserve line breaks with \n
- Maintain all historical spellings and abbreviations exactly as written. If a letter is superscribed, normalize it by writing the superscribed letter after the main letter, e.g. "u with superscribed o" is spelled as "uo". If special characters are used for abbreviations, do not resolve them, but try to transcribe the special character according to the Medieval Unicode Font Initiative. ꝛ for "er" and ꝰ for "us" or "em" might be the most common special characters.
- Do not modernize or correct the text
- Do not use OCR or attempt to interpret unclear text - transcribe what you can see
- If a field has no content, use an empty string "" (not null)

Take a deep breath and think step by step about the layout of the page. First identify the folio number, then the main text area, then any marginal additions. Return only a JSON file with no additional commentary.

EXAMPLE OUTPUT:

{
  "folios": [
    {
      "folio": "15",
      "text": "In disem jare kam der kunig\n mit grossem here in daz lant\n vnd belagerte die stat Basel\n do wertend sich die burger\n mit grosser kraft vnd tugent\n vnd triben den kunig dannen\n mit schanden vnd verlust",
      "addition1": "Anno domini 1444",
      "addition2": "",
      "addition3": ""
    }
  ]
}
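A response in the format above can be checked structurally before it is scored. The sketch below is one possible check, not the benchmark's own validation code: the function name validate_transcription is hypothetical, and the rules that "folios" must be a non-empty list and that "folio" and "text" are required keys are assumptions drawn from the example output and from the prompt's requirement to use empty strings instead of null.

```python
import json

# Keys assumed to be mandatory in every folio object (an assumption).
REQUIRED_KEYS = ("folio", "text")


def validate_transcription(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the structure looks valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]

    if not isinstance(data, dict):
        return ["top level must be a JSON object"]

    folios = data.get("folios")
    if not isinstance(folios, list) or not folios:
        return ["'folios' must be a non-empty list"]

    problems = []
    for i, folio in enumerate(folios):
        if not isinstance(folio, dict):
            problems.append(f"folios[{i}] is not an object")
            continue
        for key in REQUIRED_KEYS:
            if key not in folio:
                problems.append(f"folios[{i}] is missing '{key}'")
        # The prompt asks for "" rather than null, so every value must be a string.
        for key, value in folio.items():
            if not isinstance(value, str):
                problems.append(
                    f"folios[{i}]['{key}'] must be a string, got {type(value).__name__}"
                )
    return problems


good = '{"folios": [{"folio": "15", "text": "In disem jare", "addition1": ""}]}'
bad = '{"folios": [{"folio": "15", "text": "In disem jare", "addition1": null}]}'

print(validate_transcription(good))
print(validate_transcription(bad))
```

Returning a list of problems instead of raising on the first one makes it easier to see every way in which a model's output deviates from the requested format.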

Results

no valid result

Scoring

Fuzzy Score	F1 Micro	F1 Macro	Micro Precision	Micro Recall	Instances	TP	FP	FN
0.00	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a
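This page does not state how the fuzzy score is computed. As a rough illustration of what a character-level similarity between a reference transcription and a model transcription can look like, the sketch below uses the ratio from Python's standard difflib module. This is a stand-in technique chosen for illustration, not the benchmark's scoring method, and the function name fuzzy_similarity is hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_similarity(reference: str, hypothesis: str) -> float:
    """Similarity in [0, 1]: 2 * matched characters / total characters."""
    # autojunk=False disables difflib's heuristic that ignores very frequent
    # characters in long sequences, which would distort scores for full pages.
    return SequenceMatcher(None, reference, hypothesis, autojunk=False).ratio()


# "kunig" vs "konig": 4 matching characters out of 10 in total.
print(fuzzy_similarity("kunig", "konig"))
print(fuzzy_similarity("kunig", "kunig"))
```

A score of this kind rewards partially correct transcriptions, which suits historical spellings where a single misread letter should not count as a complete miss.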

Costs / Pricing

Pricing Date: n/a, n/a.

Tokens: 9.0K IT + 552 OT = 9.6K TT

Cost: 0.005$ + 0.001$ = 0.005$
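The token and cost lines follow simple arithmetic: total tokens are input tokens (IT) plus output tokens (OT), and cost is each count multiplied by a per-token price. The sketch below assumes prices quoted per million tokens. The function names are hypothetical, and the unrounded cost values in the second part are made up purely to show how rounding each figure separately for display can yield a line such as 0.005$ + 0.001$ = 0.005$; the actual unrounded costs are not given on this page.

```python
def format_tokens(n: int) -> str:
    """Format a token count the way the page displays it (e.g. 9552 -> 9.6K)."""
    return f"{n / 1000:.1f}K" if n >= 1000 else str(n)


def cost_usd(input_tokens: int, output_tokens: int,
             in_price_per_m: float, out_price_per_m: float) -> tuple:
    """Return (input cost, output cost, total cost) in USD.

    Prices are assumed to be quoted per one million tokens.
    """
    input_cost = input_tokens / 1_000_000 * in_price_per_m
    output_cost = output_tokens / 1_000_000 * out_price_per_m
    return input_cost, output_cost, input_cost + output_cost


# Token totals: 9,000 input tokens (assumed exact) plus 552 output tokens.
print(f"{format_tokens(9000)} IT + {format_tokens(552)} OT = {format_tokens(9000 + 552)} TT")

# Made-up unrounded costs: each part is rounded on its own for display,
# so the displayed parts need not add up to the displayed total.
parts = (0.0046, 0.0008)
shown = [f"{value:.3f}$" for value in (*parts, sum(parts))]
print(f"Cost: {shown[0]} + {shown[1]} = {shown[2]}")
```

Summing the unrounded values and rounding only at display time keeps totals accurate when many runs are aggregated.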

Cite: Hindermann, Marti, Alberto, et al., (2025). RISE-UNIBAS/humanities_data_benchmark, 10.5281/zenodo.16941752

Result 52 of 93

Test T0276 at 2026-02-17

{'document-type': ['manuscript'], 'writing': ['handwritten'], 'century': [15], 'language': ['de'], 'layout': ['prose'], 'task': ['transcription']}

Configuration

Provider	openai
Model	gpt-4o-mini-2024-07-18

Temperature	0.0
Dataclass	Document

Normalized Score	51.70 %
Test time	unknown

Prompt

IDENTITY and PURPOSE:

You are presented with an image from a 15th century medieval manuscript written in Basel in late medieval German. Your task is to extract the text from the manuscript in the specified JSON format. You must extract all text exactly as it appears in the manuscript, maintaining historical spellings, punctuation, and formatting. Do not resolve abbreviations.

The manuscript may contain:
1. A main text body (often in one column as continuous text)
2. A folio number (if present)
3. Additional notes or text written in the margins (labeled as addition1, addition2, etc.)

You must:
- Identify and transcribe the main text
- Extract the folio number if visible (use empty string "" if not visible)
- Identify and transcribe any marginal additions separately
- Preserve line breaks with \n
- Maintain all historical spellings and abbreviations exactly as written. If a letter is superscribed, normalize it by writing the superscribed letter after the main letter, e.g. "u with superscribed o" is spelled as "uo". If special characters are used for abbreviations, do not resolve them, but try to transcribe the special character according to the Medieval Unicode Font Initiative. ꝛ for "er" and ꝰ for "us" or "em" might be the most common special characters.
- Do not modernize or correct the text
- Do not use OCR or attempt to interpret unclear text - transcribe what you can see
- If a field has no content, use an empty string "" (not null)

Take a deep breath and think step by step about the layout of the page. First identify the folio number, then the main text area, then any marginal additions. Return only a JSON file with no additional commentary.

EXAMPLE OUTPUT:

{
  "folios": [
    {
      "folio": "15",
      "text": "In disem jare kam der kunig\n mit grossem here in daz lant\n vnd belagerte die stat Basel\n do wertend sich die burger\n mit grosser kraft vnd tugent\n vnd triben den kunig dannen\n mit schanden vnd verlust",
      "addition1": "Anno domini 1444",
      "addition2": "",
      "addition3": ""
    }
  ]
}

Results

no valid result

Scoring

Fuzzy Score	F1 Micro	F1 Macro	Micro Precision	Micro Recall	Instances	TP	FP	FN
0.56	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a

Costs / Pricing

Pricing Date: n/a, n/a.

Tokens: 450.0K IT + 2.4K OT = 452.4K TT

Cost: 0.067$ + 0.001$ = 0.069$


Result 53 of 93

Test T0278 at 2026-02-17

{'document-type': ['manuscript'], 'writing': ['handwritten'], 'century': [15], 'language': ['de'], 'layout': ['prose'], 'task': ['transcription']}

Configuration

Provider	openai
Model	gpt-4.1-nano-2025-04-14

Temperature	0.0
Dataclass	Document

Normalized Score	48.40 %
Test time	unknown

Prompt

IDENTITY and PURPOSE:

You are presented with an image from a 15th century medieval manuscript written in Basel in late medieval German. Your task is to extract the text from the manuscript in the specified JSON format. You must extract all text exactly as it appears in the manuscript, maintaining historical spellings, punctuation, and formatting. Do not resolve abbreviations.

The manuscript may contain:
1. A main text body (often in one column as continuous text)
2. A folio number (if present)
3. Additional notes or text written in the margins (labeled as addition1, addition2, etc.)

You must:
- Identify and transcribe the main text
- Extract the folio number if visible (use empty string "" if not visible)
- Identify and transcribe any marginal additions separately
- Preserve line breaks with \n
- Maintain all historical spellings and abbreviations exactly as written. If a letter is superscribed, normalize it by writing the superscribed letter after the main letter, e.g. "u with superscribed o" is spelled as "uo". If special characters are used for abbreviations, do not resolve them, but try to transcribe the special character according to the Medieval Unicode Font Initiative. ꝛ for "er" and ꝰ for "us" or "em" might be the most common special characters.
- Do not modernize or correct the text
- Do not use OCR or attempt to interpret unclear text - transcribe what you can see
- If a field has no content, use an empty string "" (not null)

Take a deep breath and think step by step about the layout of the page. First identify the folio number, then the main text area, then any marginal additions. Return only a JSON file with no additional commentary.

EXAMPLE OUTPUT:

{
  "folios": [
    {
      "folio": "15",
      "text": "In disem jare kam der kunig\n mit grossem here in daz lant\n vnd belagerte die stat Basel\n do wertend sich die burger\n mit grosser kraft vnd tugent\n vnd triben den kunig dannen\n mit schanden vnd verlust",
      "addition1": "Anno domini 1444",
      "addition2": "",
      "addition3": ""
    }
  ]
}

Results

no valid result

Scoring

Fuzzy Score	F1 Micro	F1 Macro	Micro Precision	Micro Recall	Instances	TP	FP	FN
0.54	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a

Costs / Pricing

Pricing Date: n/a, n/a.

Tokens: 51.1K IT + 2.2K OT = 53.3K TT

Cost: 0.005$ + 0.001$ = 0.006$


Result 54 of 93

Test T0275 at 2026-01-26

{'document-type': ['manuscript'], 'writing': ['handwritten'], 'century': [15], 'language': ['de'], 'layout': ['prose'], 'task': ['transcription']}

Configuration

Provider	openai
Model	gpt-4o-2024-08-06

Temperature	0.0
Dataclass	Document

Normalized Score	61.70 %
Test time	unknown

Prompt

IDENTITY and PURPOSE:

You are presented with an image from a 15th century medieval manuscript written in Basel in late medieval German. Your task is to extract the text from the manuscript in the specified JSON format. You must extract all text exactly as it appears in the manuscript, maintaining historical spellings, punctuation, and formatting. Do not resolve abbreviations.

The manuscript may contain:
1. A main text body (often in one column as continuous text)
2. A folio number (if present)
3. Additional notes or text written in the margins (labeled as addition1, addition2, etc.)

You must:
- Identify and transcribe the main text
- Extract the folio number if visible (use empty string "" if not visible)
- Identify and transcribe any marginal additions separately
- Preserve line breaks with \n
- Maintain all historical spellings and abbreviations exactly as written. If a letter is superscribed, normalize it by writing the superscribed letter after the main letter, e.g. "u with superscribed o" is spelled as "uo". If special characters are used for abbreviations, do not resolve them, but try to transcribe the special character according to the Medieval Unicode Font Initiative. ꝛ for "er" and ꝰ for "us" or "em" might be the most common special characters.
- Do not modernize or correct the text
- Do not use OCR or attempt to interpret unclear text - transcribe what you can see
- If a field has no content, use an empty string "" (not null)

Take a deep breath and think step by step about the layout of the page. First identify the folio number, then the main text area, then any marginal additions. Return only a JSON file with no additional commentary.

EXAMPLE OUTPUT:

{
  "folios": [
    {
      "folio": "15",
      "text": "In disem jare kam der kunig\n mit grossem here in daz lant\n vnd belagerte die stat Basel\n do wertend sich die burger\n mit grosser kraft vnd tugent\n vnd triben den kunig dannen\n mit schanden vnd verlust",
      "addition1": "Anno domini 1444",
      "addition2": "",
      "addition3": ""
    }
  ]
}

Results

no valid result

Scoring

Fuzzy Score	F1 Micro	F1 Macro	Micro Precision	Micro Recall	Instances	TP	FP	FN
0.65	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a

Costs / Pricing

Pricing Date: n/a, n/a.

Tokens: 21.2K IT + 2.3K OT = 23.5K TT

Cost: 0.053$ + 0.023$ = 0.076$


Result 55 of 93

Test T0279 at 2026-01-26

{'document-type': ['manuscript'], 'writing': ['handwritten'], 'century': [15], 'language': ['de'], 'layout': ['prose'], 'task': ['transcription']}

Configuration

Provider	openai
Model	gpt-5-2025-08-07

Temperature	0.0
Dataclass	Document

Normalized Score	43.40 %
Test time	unknown

Prompt

IDENTITY and PURPOSE:

You are presented with an image from a 15th century medieval manuscript written in Basel in late medieval German. Your task is to extract the text from the manuscript in the specified JSON format. You must extract all text exactly as it appears in the manuscript, maintaining historical spellings, punctuation, and formatting. Do not resolve abbreviations.

The manuscript may contain:
1. A main text body (often in one column as continuous text)
2. A folio number (if present)
3. Additional notes or text written in the margins (labeled as addition1, addition2, etc.)

You must:
- Identify and transcribe the main text
- Extract the folio number if visible (use empty string "" if not visible)
- Identify and transcribe any marginal additions separately
- Preserve line breaks with \n
- Maintain all historical spellings and abbreviations exactly as written. If a letter is superscribed, normalize it by writing the superscribed letter after the main letter, e.g. "u with superscribed o" is spelled as "uo". If special characters are used for abbreviations, do not resolve them, but try to transcribe the special character according to the Medieval Unicode Font Initiative. ꝛ for "er" and ꝰ for "us" or "em" might be the most common special characters.
- Do not modernize or correct the text
- Do not use OCR or attempt to interpret unclear text - transcribe what you can see
- If a field has no content, use an empty string "" (not null)

Take a deep breath and think step by step about the layout of the page. First identify the folio number, then the main text area, then any marginal additions. Return only a JSON file with no additional commentary.

EXAMPLE OUTPUT:

{
  "folios": [
    {
      "folio": "15",
      "text": "In disem jare kam der kunig\n mit grossem here in daz lant\n vnd belagerte die stat Basel\n do wertend sich die burger\n mit grosser kraft vnd tugent\n vnd triben den kunig dannen\n mit schanden vnd verlust",
      "addition1": "Anno domini 1444",
      "addition2": "",
      "addition3": ""
    }
  ]
}

Results

no valid result

Scoring

Fuzzy Score	F1 Micro	F1 Macro	Micro Precision	Micro Recall	Instances	TP	FP	FN
0.46	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a

Costs / Pricing

Pricing Date: n/a, n/a.

Tokens: 18.8K IT + 62.9K OT = 81.8K TT

Cost: 0.024$ + 0.629$ = 0.653$


Result 56 of 93

Test T0409 at 2026-01-26

{'document-type': ['manuscript'], 'writing': ['handwritten'], 'century': [15], 'language': ['de'], 'layout': ['prose'], 'task': ['transcription']}

Configuration

Provider	openai
Model	gpt-5.1-2025-11-13

Temperature	0.0
Dataclass	Document

Normalized Score	62.90 %
Test time	unknown

Prompt

IDENTITY and PURPOSE:

You are presented with an image from a 15th century medieval manuscript written in Basel in late medieval German. Your task is to extract the text from the manuscript in the specified JSON format. You must extract all text exactly as it appears in the manuscript, maintaining historical spellings, punctuation, and formatting. Do not resolve abbreviations.

The manuscript may contain:
1. A main text body (often in one column as continuous text)
2. A folio number (if present)
3. Additional notes or text written in the margins (labeled as addition1, addition2, etc.)

You must:
- Identify and transcribe the main text
- Extract the folio number if visible (use empty string "" if not visible)
- Identify and transcribe any marginal additions separately
- Preserve line breaks with \n
- Maintain all historical spellings and abbreviations exactly as written. If a letter is superscribed, normalize it by writing the superscribed letter after the main letter, e.g. "u with superscribed o" is spelled as "uo". If special characters are used for abbreviations, do not resolve them, but try to transcribe the special character according to the Medieval Unicode Font Initiative. ꝛ for "er" and ꝰ for "us" or "em" might be the most common special characters.
- Do not modernize or correct the text
- Do not use OCR or attempt to interpret unclear text - transcribe what you can see
- If a field has no content, use an empty string "" (not null)

Take a deep breath and think step by step about the layout of the page. First identify the folio number, then the main text area, then any marginal additions. Return only a JSON file with no additional commentary.

EXAMPLE OUTPUT:

{
  "folios": [
    {
      "folio": "15",
      "text": "In disem jare kam der kunig\n mit grossem here in daz lant\n vnd belagerte die stat Basel\n do wertend sich die burger\n mit grosser kraft vnd tugent\n vnd triben den kunig dannen\n mit schanden vnd verlust",
      "addition1": "Anno domini 1444",
      "addition2": "",
      "addition3": ""
    }
  ]
}

Results

no valid result

Scoring

Fuzzy Score	F1 Micro	F1 Macro	Micro Precision	Micro Recall	Instances	TP	FP	FN
0.66	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a

Costs / Pricing

Pricing Date: n/a, n/a.

Tokens: 18.8K IT + 2.6K OT = 21.5K TT

Cost: 0.024$ + 0.026$ = 0.050$


Result 57 of 93

Test T0282 at 2026-01-26

{'document-type': ['manuscript'], 'writing': ['handwritten'], 'century': [15], 'language': ['de'], 'layout': ['prose'], 'task': ['transcription']}

Configuration

Provider	openai
Model	o3-2025-04-16

Temperature	0.0
Dataclass	Document

Normalized Score	59.70 %
Test time	unknown

Prompt

IDENTITY and PURPOSE:

You are presented with an image from a 15th century medieval manuscript written in Basel in late medieval German. Your task is to extract the text from the manuscript in the specified JSON format. You must extract all text exactly as it appears in the manuscript, maintaining historical spellings, punctuation, and formatting. Do not resolve abbreviations.

The manuscript may contain:
1. A main text body (often in one column as continuous text)
2. A folio number (if present)
3. Additional notes or text written in the margins (labeled as addition1, addition2, etc.)

You must:
- Identify and transcribe the main text
- Extract the folio number if visible (use empty string "" if not visible)
- Identify and transcribe any marginal additions separately
- Preserve line breaks with \n
- Maintain all historical spellings and abbreviations exactly as written. If a letter is superscribed, normalize it by writing the superscribed letter after the main letter, e.g. "u with superscribed o" is spelled as "uo". If special characters are used for abbreviations, do not resolve them, but try to transcribe the special character according to the Medieval Unicode Font Initiative. ꝛ for "er" and ꝰ for "us" or "em" might be the most common special characters.
- Do not modernize or correct the text
- Do not use OCR or attempt to interpret unclear text - transcribe what you can see
- If a field has no content, use an empty string "" (not null)

Take a deep breath and think step by step about the layout of the page. First identify the folio number, then the main text area, then any marginal additions. Return only a JSON file with no additional commentary.

EXAMPLE OUTPUT:

{
  "folios": [
    {
      "folio": "15",
      "text": "In disem jare kam der kunig\n mit grossem here in daz lant\n vnd belagerte die stat Basel\n do wertend sich die burger\n mit grosser kraft vnd tugent\n vnd triben den kunig dannen\n mit schanden vnd verlust",
      "addition1": "Anno domini 1444",
      "addition2": "",
      "addition3": ""
    }
  ]
}

Results

no valid result

Scoring

Fuzzy Score	F1 Micro	F1 Macro	Micro Precision	Micro Recall	Instances	TP	FP	FN
0.63	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a

Costs / Pricing

Pricing Date: n/a, n/a.

Tokens: 19.6K IT + 39.5K OT = 59.2K TT

Cost: 0.039$ + 0.316$ = 0.356$


Result 58 of 93

Test T0281 at 2026-01-26

{'document-type': ['manuscript'], 'writing': ['handwritten'], 'century': [15], 'language': ['de'], 'layout': ['prose'], 'task': ['transcription']}

Configuration

Provider	openai
Model	gpt-5-nano-2025-08-07

Temperature	0.0
Dataclass	Document

Normalized Score	21.60 %
Test time	unknown

Prompt

IDENTITY and PURPOSE:

You are presented with an image from a 15th century medieval manuscript written in Basel in late medieval German. Your task is to extract the text from the manuscript in the specified JSON format. You must extract all text exactly as it appears in the manuscript, maintaining historical spellings, punctuation, and formatting. Do not resolve abbreviations.

The manuscript may contain:
1. A main text body (often in one column as continuous text)
2. A folio number (if present)
3. Additional notes or text written in the margins (labeled as addition1, addition2, etc.)

You must:
- Identify and transcribe the main text
- Extract the folio number if visible (use empty string "" if not visible)
- Identify and transcribe any marginal additions separately
- Preserve line breaks with \n
- Maintain all historical spellings and abbreviations exactly as written. If a letter is superscribed, normalize it by writing the superscribed letter after the main letter, e.g. "u with superscribed o" is spelled as "uo". If special characters are used for abbreviations, do not resolve them, but try to transcribe the special character according to the Medieval Unicode Font Initiative. ꝛ for "er" and ꝰ for "us" or "em" might be the most common special characters.
- Do not modernize or correct the text
- Do not use OCR or attempt to interpret unclear text - transcribe what you can see
- If a field has no content, use an empty string "" (not null)

Take a deep breath and think step by step about the layout of the page. First identify the folio number, then the main text area, then any marginal additions. Return only a JSON file with no additional commentary.

EXAMPLE OUTPUT:

{
  "folios": [
    {
      "folio": "15",
      "text": "In disem jare kam der kunig\n mit grossem here in daz lant\n vnd belagerte die stat Basel\n do wertend sich die burger\n mit grosser kraft vnd tugent\n vnd triben den kunig dannen\n mit schanden vnd verlust",
      "addition1": "Anno domini 1444",
      "addition2": "",
      "addition3": ""
    }
  ]
}

Results

no valid result

Scoring

Fuzzy Score	F1 Micro	F1 Macro	Micro Precision	Micro Recall	Instances	TP	FP	FN
0.22	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a

Costs / Pricing

Pricing Date: n/a, n/a.

Tokens: 34.2K IT + 22.1K OT = 56.3K TT

Cost: 0.002$ + 0.009$ = 0.011$


Result 59 of 93

Test T0273 at 2026-01-26

{'document-type': ['manuscript'], 'writing': ['handwritten'], 'century': [15], 'language': ['de'], 'layout': ['prose'], 'task': ['transcription']}

Configuration

Provider	openai
Model	gpt-4.1-2025-04-14

Temperature	0.0
Dataclass	Document

Normalized Score	64.00 %
Test time	unknown

Prompt

IDENTITY and PURPOSE:

You are presented with an image from a 15th century medieval manuscript written in Basel in late medieval German. Your task is to extract the text from the manuscript in the specified JSON format. You must extract all text exactly as it appears in the manuscript, maintaining historical spellings, punctuation, and formatting. Do not resolve abbreviations.

The manuscript may contain:
1. A main text body (often in one column as continuous text)
2. A folio number (if present)
3. Additional notes or text written in the margins (labeled as addition1, addition2, etc.)

You must:
- Identify and transcribe the main text
- Extract the folio number if visible (use empty string "" if not visible)
- Identify and transcribe any marginal additions separately
- Preserve line breaks with \n
- Maintain all historical spellings and abbreviations exactly as written. If a letter is superscribed, normalize it by writing the superscribed letter after the main letter, e.g. "u with superscribed o" is spelled as "uo". If special characters are used for abbreviations, do not resolve them, but try to transcribe the special character according to the Medieval Unicode Font Initiative. ꝛ for "er" and ꝰ for "us" or "em" might be the most common special characters.
- Do not modernize or correct the text
- Do not use OCR or attempt to interpret unclear text - transcribe what you can see
- If a field has no content, use an empty string "" (not null)

Take a deep breath and think step by step about the layout of the page. First identify the folio number, then the main text area, then any marginal additions. Return only a JSON file with no additional commentary.

EXAMPLE OUTPUT:

{
  "folios": [
    {
      "folio": "15",
      "text": "In disem jare kam der kunig\n mit grossem here in daz lant\n vnd belagerte die stat Basel\n do wertend sich die burger\n mit grosser kraft vnd tugent\n vnd triben den kunig dannen\n mit schanden vnd verlust",
      "addition1": "Anno domini 1444",
      "addition2": "",
      "addition3": ""
    }
  ]
}

Results

no valid result

Scoring

Fuzzy Score	F1 Micro	F1 Macro	Micro Precision	Micro Recall	Instances	TP	FP	FN
0.68	n/a	n/a	n/a	n/a	n/a	n/a	n/a	n/a

Costs / Pricing

Pricing Date: n/a, n/a.

Tokens: 21.2K IT + 2.4K OT = 23.6K TT

Cost: 0.042$ + 0.019$ = 0.062$


Result 60 of 93

Test T0560 at 2026-01-26

{'document-type': ['manuscript'], 'writing': ['handwritten'], 'century': [15], 'language': ['de'], 'layout': ['prose'], 'task': ['transcription']}

Configuration

Provider	mistral
Model	ministral-8b-2512

Temperature	0.0
Dataclass	Document

Normalized Score	None %
Test time	unknown

Prompt

IDENTITY and PURPOSE:

You are presented with an image from a 15th century medieval manuscript written in Basel in late medieval German. Your task is to extract the text from the manuscript in the specified JSON format. You must extract all text exactly as it appears in the manuscript, maintaining historical spellings, punctuation, and formatting. Do not resolve abbreviations.

The manuscript may contain:
1. A main text body (often in one column as continuous text)
2. A folio number (if present)
3. Additional notes or text written in the margins (labeled as addition1, addition2, etc.)

You must:
- Identify and transcribe the main text
- Extract the folio number if visible (use empty string "" if not visible)
- Identify and transcribe any marginal additions separately
- Preserve line breaks with \n
- Maintain all historical spellings and abbreviations exactly as written. If a letter is superscribed, normalize it by writing the superscribed letter after the main letter, e.g. "u with superscribed o" is spelled as "uo". If special characters are used for abbreviations, do not resolve them, but try to transcribe the special character according to the Medieval Unicode Font Initiative. ꝛ for "er" and ꝰ for "us" or "em" might be the most common special characters.
- Do not modernize or correct the text
- Do not use OCR or attempt to interpret unclear text - transcribe what you can see
- If a field has no content, use an empty string "" (not null)

Take a deep breath and think step by step about the layout of the page. First identify the folio number, then the main text area, then any marginal additions. Return only a JSON file with no additional commentary.

EXAMPLE OUTPUT:

{
  "folios": [
    {
      "folio": "15",
      "text": "In disem jare kam der kunig\n mit grossem here in daz lant\n vnd belagerte die stat Basel\n do wertend sich die burger\n mit grosser kraft vnd tugent\n vnd triben den kunig dannen\n mit schanden vnd verlust",
      "addition1": "Anno domini 1444",
      "addition2": "",
      "addition3": ""
    }
  ]
}

Results

no valid result

Scoring

Fuzzy Score	F1 Micro	F1 Macro	Micro Precision	Micro Recall	Instances	TP	FP	FN
None	None	None	None	None	None	None	None	None

Costs / Pricing

Pricing Date: None, None.

Tokens: None IT + None OT = None TT

Cost: None$ + None$ = None$

