A test run is a single execution of a benchmark test using a defined model configuration.
Each run records how a particular large language model (LLM), such as GPT-4, Claude-3, or Gemini, performed on a given task at a specific time and with specific settings.
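A test run includes the benchmark tags, the provider and model, the temperature, the dataclass, the prompt, the resulting scores, and the token usage and cost.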
Together, test runs make it possible to compare models, providers, and configurations across benchmarks in a transparent and reproducible way.
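As a rough illustration, the fields that appear in each run record can be grouped into a small Python structure. This is only a sketch: the class name `TestRun` and all field names are hypothetical and are not taken from the benchmark's own code, and reading "IT" and "OT" as input and output tokens is an assumption. The values are copied from the first run listed on this page.

```python
from dataclasses import dataclass, field


@dataclass
class TestRun:
    """One benchmark execution. Field names are illustrative, not the project's schema."""
    provider: str
    model: str
    temperature: float
    dataclass_name: str        # the "Dataclass" row, e.g. "Card"
    normalized_score: float    # percent, e.g. 86.42
    input_tokens_k: float      # "IT", assumed to mean thousands of input tokens
    output_tokens_k: float     # "OT", assumed to mean thousands of output tokens
    cost_usd: float            # total cost in US dollars
    tags: dict = field(default_factory=dict)


# Values taken from the first run record on this page.
run = TestRun(
    provider="openai",
    model="gpt-5.4-2026-03-05",
    temperature=0.5,
    dataclass_name="Card",
    normalized_score=86.42,
    input_tokens_k=103.2,
    output_tokens_k=3.4,
    cost_usd=0.309,
    tags={"language": ["de"], "task": ["information-extraction"]},
)
print(f"{run.provider}/{run.model}: {run.normalized_score:.2f} %")
```

Keeping the configuration and the outcome in one record is what makes two runs directly comparable.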
{'document-type': ['index-card'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['de'], 'entry-type': ['company'], 'task': ['information-extraction']}
| Provider | openai |
| Model | gpt-5.4-2026-03-05 |
| Temperature | 0.5 |
| Dataclass | Card |
| Normalized Score | 86.42 % |
| Test time | unknown |
You are a meticulous archivist extracting data from an index card image. Analyze the provided image and extract the following information. Return the data ONLY as a valid JSON object.
- "company": The primary company name, usually in the top-left. Exclude the location.
- "location": The city or town, often following the company name.
- "b_id": The identifier code, usually in the top-right, starting with "B.".
- "date": Any stamped dates on the card in YYYY-MM-DD format. If no date is present, use an empty string.
- "information": A list of text blocks from the main body of the card. Each block should be a separate string in the list. Maintain line breaks with \\n.
Here is the required JSON format:
{
"company": {"transcription": ""},
"location": {"transcription": ""},
"b_id": {"transcription": ""},
"date": "",
"information": [
{"transcription": ""}
]
}
If you cannot find a value for a field, leave its transcription value as an empty string. Do not add any explanatory text outside of the JSON object.
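A model response can be checked against the format required by this prompt before it is scored. The following is a minimal sketch, not the benchmark's actual validation code: the function name `validate_card` is hypothetical, and the sample response uses made-up values.

```python
import json


def validate_card(raw: str) -> dict:
    """Parse a model response and check it against the prompt's JSON format."""
    card = json.loads(raw)  # raises json.JSONDecodeError if the output is not JSON
    if not isinstance(card, dict):
        raise ValueError("top-level value must be a JSON object")
    for key in ("company", "location", "b_id"):
        value = card.get(key)
        if not isinstance(value, dict) or not isinstance(value.get("transcription"), str):
            raise ValueError(f"{key!r} must be an object with a string 'transcription'")
    if not isinstance(card.get("date"), str):
        raise ValueError("'date' must be a string (empty if no date is present)")
    info = card.get("information")
    if not isinstance(info, list):
        raise ValueError("'information' must be a list")
    for block in info:
        if not isinstance(block, dict) or not isinstance(block.get("transcription"), str):
            raise ValueError("each 'information' entry needs a string 'transcription'")
    return card


# Made-up sample response, for illustration only.
example = json.dumps({
    "company": {"transcription": "Beispiel AG"},
    "location": {"transcription": "Basel"},
    "b_id": {"transcription": "B. 123"},
    "date": "",
    "information": [{"transcription": "Zeile 1\nZeile 2"}],
})
card = validate_card(example)
print(card["company"]["transcription"])
```

Rejecting malformed output early keeps format errors separate from transcription errors when the scores are interpreted.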
no valid result
| Fuzzy Score | F1 Micro | F1 Macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
| 0.86 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
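The precision, recall, and F1 columns are reported as n/a for this benchmark, but they relate to the TP, FP, and FN counts through the standard definitions. The sketch below uses those textbook formulas; the function name `micro_metrics` and the example counts are made up, and the benchmark's own scoring code may differ in detail.

```python
def micro_metrics(tp: int, fp: int, fn: int) -> dict:
    """Standard micro-averaged precision, recall, and F1 from pooled counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, not taken from any run on this page.
metrics = micro_metrics(tp=8, fp=2, fn=2)
print({name: round(value, 2) for name, value in metrics.items()})
```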
| Pricing Date: n/a, n/a. | Tokens: 103.2K IT + 3.4K OT = 106.7K TT | Cost: 0.258$ + 0.051$ = 0.309$ |
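The token and cost figures in this row can be cross-checked with simple arithmetic. The sketch below assumes that "IT" and "OT" stand for input and output tokens and that the two cost terms are the input and output costs in that order; the helper name is hypothetical. Because the reported figures are rounded, the derived prices are approximate.

```python
def implied_price_per_million(cost_usd: float, tokens_k: float) -> float:
    """US dollars per one million tokens, given a cost and a token count in thousands."""
    return cost_usd / tokens_k * 1_000


# Figures from the row above: 103.2K input tokens for 0.258$, 3.4K output tokens for 0.051$.
input_price = implied_price_per_million(0.258, 103.2)
output_price = implied_price_per_million(0.051, 3.4)
total_cost = 0.258 + 0.051

print(f"input:  {input_price:.2f} $ per 1M tokens")   # 2.50
print(f"output: {output_price:.2f} $ per 1M tokens")  # 15.00
print(f"total:  {total_cost:.3f} $")                  # 0.309
```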
{'document-type': ['index-card'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['de'], 'entry-type': ['company'], 'task': ['information-extraction']}
| Provider | anthropic |
| Model | claude-opus-4-6 |
| Temperature | 0.5 |
| Dataclass | Card |
| Normalized Score | 95.80 % |
| Test time | unknown |
You are a meticulous archivist extracting data from an index card image. Analyze the provided image and extract the following information. Return the data ONLY as a valid JSON object.
- "company": The primary company name, usually in the top-left. Exclude the location.
- "location": The city or town, often following the company name.
- "b_id": The identifier code, usually in the top-right, starting with "B.".
- "date": Any stamped dates on the card in YYYY-MM-DD format. If no date is present, use an empty string.
- "information": A list of text blocks from the main body of the card. Each block should be a separate string in the list. Maintain line breaks with \\n.
Here is the required JSON format:
{
"company": {"transcription": ""},
"location": {"transcription": ""},
"b_id": {"transcription": ""},
"date": "",
"information": [
{"transcription": ""}
]
}
If you cannot find a value for a field, leave its transcription value as an empty string. Do not add any explanatory text outside of the JSON object.
no valid result
| Fuzzy Score | F1 Micro | F1 Macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
| 0.96 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Pricing Date: n/a, n/a. | Tokens: 93.3K IT + 6.4K OT = 99.7K TT | Cost: 0.466$ + 0.160$ = 0.626$ |
{'document-type': ['index-card'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['de'], 'entry-type': ['company'], 'task': ['information-extraction']}
| Provider | genai |
| Model | gemini-3.1-flash-lite-preview |
| Temperature | 0.5 |
| Dataclass | Card |
| Normalized Score | 93.98 % |
| Test time | unknown |
You are a meticulous archivist extracting data from an index card image. Analyze the provided image and extract the following information. Return the data ONLY as a valid JSON object.
- "company": The primary company name, usually in the top-left. Exclude the location.
- "location": The city or town, often following the company name.
- "b_id": The identifier code, usually in the top-right, starting with "B.".
- "date": Any stamped dates on the card in YYYY-MM-DD format. If no date is present, use an empty string.
- "information": A list of text blocks from the main body of the card. Each block should be a separate string in the list. Maintain line breaks with \\n.
Here is the required JSON format:
{
"company": {"transcription": ""},
"location": {"transcription": ""},
"b_id": {"transcription": ""},
"date": "",
"information": [
{"transcription": ""}
]
}
If you cannot find a value for a field, leave its transcription value as an empty string. Do not add any explanatory text outside of the JSON object.
no valid result
| Fuzzy Score | F1 Micro | F1 Macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
| 0.94 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Pricing Date: n/a, n/a. | Tokens: 45.0K IT + 5.4K OT = 50.4K TT | Cost: 0.011$ + 0.008$ = 0.019$ |
{'document-type': ['index-card'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['de'], 'entry-type': ['company'], 'task': ['information-extraction']}
| Provider | genai |
| Model | gemini-3.1-pro-preview |
| Temperature | 0.5 |
| Dataclass | Card |
| Normalized Score | 91.91 % |
| Test time | unknown |
You are a meticulous archivist extracting data from an index card image. Analyze the provided image and extract the following information. Return the data ONLY as a valid JSON object.
- "company": The primary company name, usually in the top-left. Exclude the location.
- "location": The city or town, often following the company name.
- "b_id": The identifier code, usually in the top-right, starting with "B.".
- "date": Any stamped dates on the card in YYYY-MM-DD format. If no date is present, use an empty string.
- "information": A list of text blocks from the main body of the card. Each block should be a separate string in the list. Maintain line breaks with \\n.
Here is the required JSON format:
{
"company": {"transcription": ""},
"location": {"transcription": ""},
"b_id": {"transcription": ""},
"date": "",
"information": [
{"transcription": ""}
]
}
If you cannot find a value for a field, leave its transcription value as an empty string. Do not add any explanatory text outside of the JSON object.
no valid result
| Fuzzy Score | F1 Micro | F1 Macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
| 0.92 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Pricing Date: n/a, n/a. | Tokens: 45.0K IT + 70.5K OT = 115.4K TT | Cost: 0.090$ + 0.846$ = 0.936$ |
{'document-type': ['index-card'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['de'], 'entry-type': ['company'], 'task': ['information-extraction']}
| Provider | openai |
| Model | gpt-5.2-2025-12-11 |
| Temperature | 0.5 |
| Dataclass | Card |
| Normalized Score | 80.08 % |
| Test time | unknown |
You are a meticulous archivist extracting data from an index card image. Analyze the provided image and extract the following information. Return the data ONLY as a valid JSON object.
- "company": The primary company name, usually in the top-left. Exclude the location.
- "location": The city or town, often following the company name.
- "b_id": The identifier code, usually in the top-right, starting with "B.".
- "date": Any stamped dates on the card in YYYY-MM-DD format. If no date is present, use an empty string.
- "information": A list of text blocks from the main body of the card. Each block should be a separate string in the list. Maintain line breaks with \\n.
Here is the required JSON format:
{
"company": {"transcription": ""},
"location": {"transcription": ""},
"b_id": {"transcription": ""},
"date": "",
"information": [
{"transcription": ""}
]
}
If you cannot find a value for a field, leave its transcription value as an empty string. Do not add any explanatory text outside of the JSON object.
no valid result
| Fuzzy Score | F1 Micro | F1 Macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
| 0.80 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Pricing Date: n/a, n/a. | Tokens: 75.2K IT + 3.7K OT = 78.9K TT | Cost: 0.132$ + 0.052$ = 0.183$ |
{'document-type': ['index-card'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['de'], 'entry-type': ['company'], 'task': ['information-extraction']}
| Provider | openai |
| Model | gpt-4.1-nano-2025-04-14 |
| Temperature | 0.5 |
| Dataclass | Card |
| Normalized Score | 81.97 % |
| Test time | unknown |
You are a meticulous archivist extracting data from an index card image. Analyze the provided image and extract the following information. Return the data ONLY as a valid JSON object.
- "company": The primary company name, usually in the top-left. Exclude the location.
- "location": The city or town, often following the company name.
- "b_id": The identifier code, usually in the top-right, starting with "B.".
- "date": Any stamped dates on the card in YYYY-MM-DD format. If no date is present, use an empty string.
- "information": A list of text blocks from the main body of the card. Each block should be a separate string in the list. Maintain line breaks with \\n.
Here is the required JSON format:
{
"company": {"transcription": ""},
"location": {"transcription": ""},
"b_id": {"transcription": ""},
"date": "",
"information": [
{"transcription": ""}
]
}
If you cannot find a value for a field, leave its transcription value as an empty string. Do not add any explanatory text outside of the JSON object.
no valid result
| Fuzzy Score | F1 Micro | F1 Macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
| 0.82 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Pricing Date: n/a, n/a. | Tokens: 137.8K IT + 2.9K OT = 140.8K TT | Cost: 0.014$ + 0.001$ = 0.015$ |
{'document-type': ['index-card'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['de'], 'entry-type': ['company'], 'task': ['information-extraction']}
| Provider | openai |
| Model | gpt-4o-2024-08-06 |
| Temperature | 0.5 |
| Dataclass | Card |
| Normalized Score | 93.55 % |
| Test time | unknown |
You are a meticulous archivist extracting data from an index card image. Analyze the provided image and extract the following information. Return the data ONLY as a valid JSON object.
- "company": The primary company name, usually in the top-left. Exclude the location.
- "location": The city or town, often following the company name.
- "b_id": The identifier code, usually in the top-right, starting with "B.".
- "date": Any stamped dates on the card in YYYY-MM-DD format. If no date is present, use an empty string.
- "information": A list of text blocks from the main body of the card. Each block should be a separate string in the list. Maintain line breaks with \\n.
Here is the required JSON format:
{
"company": {"transcription": ""},
"location": {"transcription": ""},
"b_id": {"transcription": ""},
"date": "",
"information": [
{"transcription": ""}
]
}
If you cannot find a value for a field, leave its transcription value as an empty string. Do not add any explanatory text outside of the JSON object.
no valid result
| Fuzzy Score | F1 Micro | F1 Macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
| 0.94 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Pricing Date: n/a, n/a. | Tokens: 52.1K IT + 3.1K OT = 55.2K TT | Cost: 0.130$ + 0.031$ = 0.161$ |
{'document-type': ['index-card'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['de'], 'entry-type': ['company'], 'task': ['information-extraction']}
| Provider | openai |
| Model | gpt-5-nano-2025-08-07 |
| Temperature | 0.5 |
| Dataclass | Card |
| Normalized Score | 76.48 % |
| Test time | unknown |
You are a meticulous archivist extracting data from an index card image. Analyze the provided image and extract the following information. Return the data ONLY as a valid JSON object.
- "company": The primary company name, usually in the top-left. Exclude the location.
- "location": The city or town, often following the company name.
- "b_id": The identifier code, usually in the top-right, starting with "B.".
- "date": Any stamped dates on the card in YYYY-MM-DD format. If no date is present, use an empty string.
- "information": A list of text blocks from the main body of the card. Each block should be a separate string in the list. Maintain line breaks with \\n.
Here is the required JSON format:
{
"company": {"transcription": ""},
"location": {"transcription": ""},
"b_id": {"transcription": ""},
"date": "",
"information": [
{"transcription": ""}
]
}
If you cannot find a value for a field, leave its transcription value as an empty string. Do not add any explanatory text outside of the JSON object.
no valid result
| Fuzzy Score | F1 Micro | F1 Macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
| 0.76 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Pricing Date: n/a, n/a. | Tokens: 90.1K IT + 86.7K OT = 176.8K TT | Cost: 0.005$ + 0.035$ = 0.039$ |
{'document-type': ['index-card'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['de'], 'entry-type': ['company'], 'task': ['information-extraction']}
| Provider | openai |
| Model | gpt-4.1-mini-2025-04-14 |
| Temperature | 0.5 |
| Dataclass | Card |
| Normalized Score | 94.83 % |
| Test time | unknown |
You are a meticulous archivist extracting data from an index card image. Analyze the provided image and extract the following information. Return the data ONLY as a valid JSON object.
- "company": The primary company name, usually in the top-left. Exclude the location.
- "location": The city or town, often following the company name.
- "b_id": The identifier code, usually in the top-right, starting with "B.".
- "date": Any stamped dates on the card in YYYY-MM-DD format. If no date is present, use an empty string.
- "information": A list of text blocks from the main body of the card. Each block should be a separate string in the list. Maintain line breaks with \\n.
Here is the required JSON format:
{
"company": {"transcription": ""},
"location": {"transcription": ""},
"b_id": {"transcription": ""},
"date": "",
"information": [
{"transcription": ""}
]
}
If you cannot find a value for a field, leave its transcription value as an empty string. Do not add any explanatory text outside of the JSON object.
no valid result
| Fuzzy Score | F1 Micro | F1 Macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
| 0.95 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Pricing Date: n/a, n/a. | Tokens: 96.2K IT + 3.1K OT = 99.3K TT | Cost: 0.039$ + 0.005$ = 0.043$ |
{'document-type': ['index-card'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['de'], 'entry-type': ['company'], 'task': ['information-extraction']}
| Provider | openai |
| Model | gpt-5-2025-08-07 |
| Temperature | 0.5 |
| Dataclass | Card |
| Normalized Score | 91.96 % |
| Test time | unknown |
You are a meticulous archivist extracting data from an index card image. Analyze the provided image and extract the following information. Return the data ONLY as a valid JSON object.
- "company": The primary company name, usually in the top-left. Exclude the location.
- "location": The city or town, often following the company name.
- "b_id": The identifier code, usually in the top-right, starting with "B.".
- "date": Any stamped dates on the card in YYYY-MM-DD format. If no date is present, use an empty string.
- "information": A list of text blocks from the main body of the card. Each block should be a separate string in the list. Maintain line breaks with \\n.
Here is the required JSON format:
{
"company": {"transcription": ""},
"location": {"transcription": ""},
"b_id": {"transcription": ""},
"date": "",
"information": [
{"transcription": ""}
]
}
If you cannot find a value for a field, leave its transcription value as an empty string. Do not add any explanatory text outside of the JSON object.
no valid result
| Fuzzy Score | F1 Micro | F1 Macro | Micro Precision | Micro Recall | Instances | TP | FP | FN |
| 0.92 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
| Pricing Date: n/a, n/a. | Tokens: 45.6K IT + 39.0K OT = 84.6K TT | Cost: 0.057$ + 0.390$ = 0.447$ |
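Since every run above uses the same prompt and temperature, the normalized scores and total costs can be compared directly. The sketch below ranks the runs by score and lists those that no other run beats on both score and cost. The numbers are copied from the records on this page; the function name `pareto_front` is hypothetical, and with a single run per model the comparison is indicative only.

```python
# (model, normalized score in percent, total cost in US dollars), copied from the runs above.
RUNS = [
    ("gpt-5.4-2026-03-05", 86.42, 0.309),
    ("claude-opus-4-6", 95.80, 0.626),
    ("gemini-3.1-flash-lite-preview", 93.98, 0.019),
    ("gemini-3.1-pro-preview", 91.91, 0.936),
    ("gpt-5.2-2025-12-11", 80.08, 0.183),
    ("gpt-4.1-nano-2025-04-14", 81.97, 0.015),
    ("gpt-4o-2024-08-06", 93.55, 0.161),
    ("gpt-5-nano-2025-08-07", 76.48, 0.039),
    ("gpt-4.1-mini-2025-04-14", 94.83, 0.043),
    ("gpt-5-2025-08-07", 91.96, 0.447),
]


def pareto_front(runs):
    """Keep runs for which no other run is at least as good on both axes and better on one."""
    front = []
    for name, score, cost in runs:
        dominated = any(
            s >= score and c <= cost and (s > score or c < cost)
            for _, s, c in runs
        )
        if not dominated:
            front.append((name, score, cost))
    return sorted(front, key=lambda r: r[1], reverse=True)


ranked = sorted(RUNS, key=lambda r: r[1], reverse=True)
for name, score, cost in ranked:
    print(f"{score:6.2f} %  {cost:.3f} $  {name}")

print("Not beaten on both score and cost:")
for name, score, cost in pareto_front(RUNS):
    print(f"  {name} ({score:.2f} %, {cost:.3f} $)")
```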