RISE Humanities Data Benchmark, 0.5.0-pre1

Search Test Runs

 

A test run is a single execution of a benchmark test using a defined model configuration.
Each run records how a particular large language model (LLM), such as GPT-4, Claude-3, or Gemini, performed on a given task at a specific time and with specific settings.

A test run includes:

  • Prompt and role definition – what the model was asked to do and from what perspective (e.g. “as a historian”).
  • Model configuration – provider, model version, temperature, and other generation parameters.
  • Results – the model’s actual response and its evaluation (scores such as F1 or accuracy).
  • Usage and cost data – token counts and calculated API costs.
  • Metadata – information like the test date, benchmark name, and person who executed it.

Together, test runs make it possible to compare models, providers, and configurations across benchmarks in a transparent and reproducible way.
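
As a rough illustration, the components listed above could be held in a single record. This is a minimal sketch only: the class name RunRecord and all field names are assumptions made for this example and do not come from the benchmark's code base. The sample values are taken from test T0667 on this page, with token counts rounded as displayed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunRecord:
    """Hypothetical container for one test run (names are illustrative)."""
    test_id: str
    date: str
    benchmark: str
    provider: str
    model: str
    temperature: float
    prompt: str
    role: Optional[str] = None          # e.g. "as a historian"
    response: Optional[str] = None      # None when no valid result is stored
    scores: dict[str, float] = field(default_factory=dict)
    input_tokens: int = 0
    output_tokens: int = 0
    cost_usd: float = 0.0
    executed_by: Optional[str] = None

    @property
    def total_tokens(self) -> int:
        # Total tokens are the sum of input and output tokens.
        return self.input_tokens + self.output_tokens


run = RunRecord(
    test_id="T0667",
    date="2026-03-16",
    benchmark="general_meeting_minutes__true",  # as it appears in the search string
    provider="openai",
    model="gpt-5.4-2026-03-05",
    temperature=1.0,
    prompt="Extract the table content from the page image ...",
    scores={"fuzzy": 0.82},
    input_tokens=63_500,   # displayed as 63.5K
    output_tokens=5_000,   # displayed as 5.0K
    cost_usd=0.233,
)
print(run.total_tokens)  # 68500
```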

Search Results

Your search for Benchmark 'general_meeting_minutes__true' with Search Hidden 'False' returned 20 results. Showing page 2 of 2.
Result 11 of 20

Test T0667 at 2026-03-16

{'document-type': ['minutes'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['it', 'fr', 'de'], 'layout': ['table'], 'entry-type': ['person', 'location'], 'task': ['information-extraction']}

Configuration
Provider: openai
Model: gpt-5.4-2026-03-05
Temperature: 1.0
Dataclass: MinutesPage
Normalized Score: 82.48 %
Test time: unknown
Prompt

Extract the table content from the page image and return it in the required JSON structure.

The page contains a table of shareholders with the following columns:

1. N° D'ORDRE (row number)
2. NOM, PRÉNOMS ET DOMICILE DES ACTIONNAIRES (name and address)
3. ACTIONS O
4. ACTIONS P
5. NOMBRE DE VOIX
6. SIGNATURE DES ACTIONNAIRES OU DES MANDATAIRES

ROW STRUCTURE
-------------
Each entry begins with the row number in the first column.
All lines belonging to that number form one entry.

The second column contains BOTH name and address in the same field.
Split this content into:

- name
- address

The name is always the first line.
All following lines belong to the address.

Preserve the line breaks inside the address exactly as in the source.

Do NOT modify spelling or add accents. Transcribe the text exactly as shown.

Dashes inside addresses must be preserved.
Only remove a dash if it clearly separates the name from the address on the same line.

NUMERIC COLUMNS
---------------
Extract the values from the table columns:

- actions_o
- actions_p
- no_de_voix

Leave fields empty if no value is present.

SIGNATURE COLUMN
----------------
If the signature column contains handwriting or text:

signature_present = true
signature = transcription of the visible text

If the column is empty:

signature_present = false
signature = ""

TOTALS
------
Totals may appear below the table near the text "A REPORTER".
Only extract totals if explicit numbers are present.
Otherwise return empty strings for:

total_o
total_p
total_voix

DOCUMENT METADATA
-----------------
filename = {filename}
page_number = {page_number}

Results

no valid result

Scoring
Fuzzy Score: 0.82
F1 micro / macro: n/a / n/a
Micro Precision / Recall: n/a / n/a
Instances: n/a
TP / FP / FN: n/a / n/a / n/a
Costs / Pricing
Pricing Date: n/a n/a
Tokens: 63.5K IT + 5.0K OT = 68.5K TT
Cost: 0.159$ (input), 0.075$ (output), 0.233$ (total)
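
The output fields named in the T0667 prompt (name, address, actions_o, actions_p, no_de_voix, signature_present, signature, total_o, total_p, total_voix, filename, page_number) could be modelled roughly as follows. This is a sketch under assumptions: the benchmark's actual MinutesPage dataclass is not shown on this page, so the class names ShareholderRow and MinutesPageSketch, the row_number and rows fields, and the make_row helper are hypothetical, and the sample row is invented.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ShareholderRow:
    row_number: str
    name: str
    address: str                    # line breaks kept as "\n"
    actions_o: str = ""             # strings, so empty cells stay empty
    actions_p: str = ""
    no_de_voix: str = ""
    signature_present: bool = False
    signature: str = ""


@dataclass
class MinutesPageSketch:
    filename: str
    page_number: str
    rows: list[ShareholderRow] = field(default_factory=list)
    total_o: str = ""               # empty unless explicit totals are on the page
    total_p: str = ""
    total_voix: str = ""


def make_row(row_number, name, address, signature_text="", **numeric):
    """Apply the signature rule: present only if the cell holds visible text."""
    text = signature_text.strip()
    return ShareholderRow(
        row_number=row_number,
        name=name,
        address=address,
        signature_present=bool(text),
        signature=text,
        **numeric,
    )


# Invented sample data, for illustration only.
page = MinutesPageSketch(
    filename="example.jpg",
    page_number="1",
    rows=[make_row("1", "Exemple Jean", "Rue Exemple 1\nBale", actions_o="10")],
)
page_json = json.dumps(asdict(page), ensure_ascii=False, indent=2)
print(page_json)
```

Keeping the numeric columns and totals as strings means that a missing value stays an empty string instead of being coerced to zero, which matches the prompt's instruction to leave fields empty.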
Result 12 of 20

Test T0664 at 2026-03-16

{'document-type': ['minutes'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['it', 'fr', 'de'], 'layout': ['table'], 'entry-type': ['person', 'location'], 'task': ['information-extraction']}

Configuration
Provider: openai
Model: gpt-5.4-2026-03-05
Temperature: 1.0
Dataclass: MinutesPage
Normalized Score: 86.20 %
Test time: unknown
Prompt

Please extract the metadata according to the given output format.
Name and Address are in the same table field. Your task is to extract them into separate fields.
Lose any dashes between name and address but preserve linebreaks.
Be aware: Addresses may contain multiple dashes; preserve them, only remove "visual splitting characters" between name and address.
Filename: {filename}
Page: {page_number}

Results

no valid result

Scoring
Fuzzy Score: 0.86
F1 micro / macro: n/a / n/a
Micro Precision / Recall: n/a / n/a
Instances: n/a
TP / FP / FN: n/a / n/a / n/a
Costs / Pricing
Pricing Date: n/a n/a
Tokens: 50.7K IT + 5.0K OT = 55.7K TT
Cost: 0.127$ (input), 0.074$ (output), 0.201$ (total)
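
The name/address rule in the T0664 prompt (drop only the dash that separates name from address, keep line breaks and any dashes inside the address) can be illustrated on already transcribed text. This is a minimal sketch, not the benchmark's method: the function split_name_address is hypothetical, it assumes the name sits on the first line (as other prompts on this page state), and the sample strings are invented.

```python
import re

# Dash-like characters: hyphen, en dash (U+2013), em dash (U+2014).
_DASH = r"[-\u2013\u2014]"
_TRAILING_SEP = re.compile(rf"\s*{_DASH}+\s*$")   # "Name -" at end of first line
_LEADING_SEP = re.compile(rf"^\s*{_DASH}+\s*")    # "- Street" at start of address
_INLINE_SEP = re.compile(rf"\s+{_DASH}+\s+")      # "Name - address" on one line


def split_name_address(cell: str) -> tuple[str, str]:
    """Split a combined cell into (name, address), keeping address line breaks."""
    lines = cell.split("\n")
    first, rest = lines[0], lines[1:]
    if rest:
        # Multi-line cell: first line is the name, the remainder is the address.
        name = _TRAILING_SEP.sub("", first).strip()
        rest[0] = _LEADING_SEP.sub("", rest[0])
        return name, "\n".join(rest)
    # Single-line cell: split once at a dash surrounded by spaces, if any.
    parts = _INLINE_SEP.split(first, maxsplit=1)
    if len(parts) == 2:
        return parts[0].strip(), parts[1].strip()
    return first.strip(), ""


print(split_name_address("Dupont Jean -\nRue du Lac 3\nNeuchatel - Serrieres"))
print(split_name_address("Favre Jean-Pierre\nLa Chaux-de-Fonds"))
```

Only a dash at the boundary between name and address is removed; hyphens inside names or place names and dashes on later address lines are left untouched.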
Result 13 of 20

Test T0666 at 2026-03-16

{'document-type': ['minutes'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['it', 'fr', 'de'], 'layout': ['table'], 'entry-type': ['person', 'location'], 'task': ['information-extraction']}

Configuration
Provider: openai
Model: gpt-5.4-2026-03-05
Temperature: 1.0
Dataclass: MinutesPage
Normalized Score: 82.59 %
Test time: unknown
Prompt

Please extract the metadata according to the given output format.
Name and Address are in the same table field. Your task is to extract them into separate fields.
Lose any dashes between name and address but preserve linebreaks.
Be aware: Addresses may contain multiple dashes; preserve them, only remove "visual splitting characters" between name and address.
Each numbered entry may span multiple lines in the name/address field.
The presented image may contain rotated pages, extract everything, treat them as a MinutesPage.
Filename: {filename}
Page: {page_number}

Results

no valid result

Scoring
Fuzzy Score: 0.83
F1 micro / macro: n/a / n/a
Micro Precision / Recall: n/a / n/a
Instances: n/a
TP / FP / FN: n/a / n/a / n/a
Costs / Pricing
Pricing Date: n/a n/a
Tokens: 52.2K IT + 5.0K OT = 57.2K TT
Cost: 0.131$ (input), 0.074$ (output), 0.205$ (total)
Result 14 of 20

Test T0665 at 2026-03-16

{'document-type': ['minutes'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['it', 'fr', 'de'], 'layout': ['table'], 'entry-type': ['person', 'location'], 'task': ['information-extraction']}

Configuration
Provider: openai
Model: gpt-5.4-2026-03-05
Temperature: 1.0
Dataclass: MinutesPage
Normalized Score: 88.64 %
Test time: unknown
Prompt

Filename: {filename}
Page: {page_number}

Prompt 2

CRITICAL RULES
--------------

1) Name and Address are in the same table field.
   - Extract them into separate fields: "name" and "address".
   - Remove only the visual splitting characters (e.g., a dash used only to visually separate name and address).
   - Preserve line breaks exactly as in the source.
   - Addresses may contain multiple dashes; preserve those. Only remove the dash(es) that act purely as a separator between name and address.

2) Transcription fidelity (no normalization):
   - Many entries are in French. The original often omits accents (é, è) and shows only "e".
   - Do NOT correct spelling and do NOT add French diacritics unless they are explicitly present in the original image.
   - Transcribe exactly what is visible, including punctuation and spacing as much as possible.

3) totals_actions must NOT be computed:
   - Do NOT calculate totals.
   - Only fill "total_actions" if the totals are explicitly written on the page.
   - You can always find the information for the totals at the end of the page after “A Reporter”
   - If there is no totals row/section, still output the JSON but set total_actions fields to empty strings OR omit total_actions only if your parser requires it (prefer empty strings if the field is required).

4) actions_o and actions_p content:
   - These fields may contain digits as well as "/" and "-" characters. Preserve them exactly as seen.

5) Column order may vary:
   - The order of the "actions_o" and "actions_p" columns can change from page to page.
   - Do NOT assume a fixed column order. Always verify from the header / column labels / layout in the original image which column corresponds to "O" and which to "P", and map values accordingly.

Results

no valid result

Scoring
Fuzzy Score: 0.89
F1 micro / macro: n/a / n/a
Micro Precision / Recall: n/a / n/a
Instances: n/a
TP / FP / FN: n/a / n/a / n/a
Costs / Pricing
Pricing Date: n/a n/a
Tokens: 65.3K IT + 5.1K OT = 70.4K TT
Cost: 0.163$ (input), 0.076$ (output), 0.239$ (total)
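
Rule 5 of the T0665 prompt (do not assume a fixed order for the O and P columns) amounts to resolving column positions from the header before reading values. The sketch below shows that idea on an already transcribed header row. It is an illustration under assumptions: the function map_action_columns is hypothetical, and it expects header labels of the form "ACTIONS O" and "ACTIONS P", as listed in the T0667 prompt on this page.

```python
def map_action_columns(header_cells: list[str]) -> dict[str, int]:
    """Return the column index of actions_o and actions_p from header labels."""
    mapping: dict[str, int] = {}
    for idx, label in enumerate(header_cells):
        tokens = label.upper().split()
        if len(tokens) == 2 and tokens[0] == "ACTIONS":
            if tokens[1] == "O":
                mapping["actions_o"] = idx
            elif tokens[1] == "P":
                mapping["actions_p"] = idx
    missing = {"actions_o", "actions_p"} - mapping.keys()
    if missing:
        # Refuse to guess when the header does not identify both columns.
        raise ValueError(f"header does not identify: {sorted(missing)}")
    return mapping


# Header with P before O, to show that the order is read, not assumed.
header = ["N° D'ORDRE", "NOM", "ACTIONS P", "ACTIONS O", "NOMBRE DE VOIX"]
columns = map_action_columns(header)

# Invented row; values are kept verbatim, including "/" and "-".
row = ["1", "Exemple Jean", "-", "12/3", "12"]
print(row[columns["actions_o"]], row[columns["actions_p"]])
```

Raising an error when a label is missing, instead of falling back to a default order, mirrors the prompt's instruction to always verify the mapping from the page itself.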
Result 15 of 20

Test T0668 at 2026-03-16

{'document-type': ['minutes'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['it', 'fr', 'de'], 'layout': ['table'], 'entry-type': ['person', 'location'], 'task': ['information-extraction']}

Configuration
Provider: openai
Model: gpt-5.4-2026-03-05
Temperature: 1.0
Dataclass: MinutesPage
Normalized Score: 78.32 %
Test time: unknown
Prompt

Please extract the metadata according to the given output format.

Name and address are in the same table field. Split them into:

- name
- address

The name is always the first line.
All following lines belong to the address.

Preserve line breaks exactly as they appear.
Do not normalize spelling or accents.

Steps:
1. Identify each row of the table using the row number.
2. For each row extract the table columns.
3. Split the name/address field.
4. Return the final JSON.

Document Metadata:
filename = {filename}
page_number = {page_number}

Return only the JSON.

Results

no valid result

Scoring
Fuzzy Score: 0.78
F1 micro / macro: n/a / n/a
Micro Precision / Recall: n/a / n/a
Instances: n/a
TP / FP / FN: n/a / n/a / n/a
Costs / Pricing
Pricing Date: n/a n/a
Tokens: 51.8K IT + 4.4K OT = 56.2K TT
Cost: 0.129$ (input), 0.066$ (output), 0.196$ (total)
Result 16 of 20

Test T0624 at 2026-03-12

{'document-type': ['minutes'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['it', 'fr', 'de'], 'layout': ['table'], 'entry-type': ['person', 'location'], 'task': ['information-extraction']}

Configuration
Provider: openai
Model: gpt-5.2-2025-12-11
Temperature: 1.0
Dataclass: MinutesPage
Normalized Score: 73.13 %
Test time: unknown
Prompt

Filename: {filename}
Page: {page_number}

Prompt 2

CRITICAL RULES
--------------

1) Name and Address are in the same table field.
   - Extract them into separate fields: "name" and "address".
   - Remove only the visual splitting characters (e.g., a dash used only to visually separate name and address).
   - Preserve line breaks exactly as in the source.
   - Addresses may contain multiple dashes; preserve those. Only remove the dash(es) that act purely as a separator between name and address.

2) Transcription fidelity (no normalization):
   - Many entries are in French. The original often omits accents (é, è) and shows only "e".
   - Do NOT correct spelling and do NOT add French diacritics unless they are explicitly present in the original image.
   - Transcribe exactly what is visible, including punctuation and spacing as much as possible.

3) totals_actions must NOT be computed:
   - Do NOT calculate totals.
   - Only fill "total_actions" if the totals are explicitly written on the page.
   - You can always find the information for the totals at the end of the page after “A Reporter”
   - If there is no totals row/section, still output the JSON but set total_actions fields to empty strings OR omit total_actions only if your parser requires it (prefer empty strings if the field is required).

4) actions_o and actions_p content:
   - These fields may contain digits as well as "/" and "-" characters. Preserve them exactly as seen.

5) Column order may vary:
   - The order of the "actions_o" and "actions_p" columns can change from page to page.
   - Do NOT assume a fixed column order. Always verify from the header / column labels / layout in the original image which column corresponds to "O" and which to "P", and map values accordingly.

Results

no valid result

Scoring
Fuzzy Score: 0.73
F1 micro / macro: n/a / n/a
Micro Precision / Recall: n/a / n/a
Instances: n/a
TP / FP / FN: n/a / n/a / n/a
Costs / Pricing
Pricing Date: n/a n/a
Tokens: 56.1K IT + 4.5K OT = 60.6K TT
Cost: 0.098$ (input), 0.063$ (output), 0.161$ (total)
Result 17 of 20

Test T0625 at 2026-03-12

{'document-type': ['minutes'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['it', 'fr', 'de'], 'layout': ['table'], 'entry-type': ['person', 'location'], 'task': ['information-extraction']}

Configuration
Provider: openai
Model: gpt-5.2-2025-12-11
Temperature: 1.0
Dataclass: MinutesPage
Normalized Score: 84.93 %
Test time: unknown
Prompt

Please extract the metadata according to the given output format.
Name and Address are in the same table field. Your task is to extract them into separate fields.
Lose any dashes between name and address but preserve linebreaks.
Be aware: Addresses may contain multiple dashes; preserve them, only remove "visual splitting characters" between name and address.
Each numbered entry may span multiple lines in the name/address field.
The presented image may contain rotated pages, extract everything, treat them as a MinutesPage.
Filename: {filename}
Page: {page_number}

Results

no valid result

Scoring
Fuzzy Score: 0.85
F1 micro / macro: n/a / n/a
Micro Precision / Recall: n/a / n/a
Instances: n/a
TP / FP / FN: n/a / n/a / n/a
Costs / Pricing
Pricing Date: n/a n/a
Tokens: 45.1K IT + 5.0K OT = 50.1K TT
Cost: 0.079$ (input), 0.070$ (output), 0.149$ (total)
Result 18 of 20

Test T0627 at 2026-03-12

{'document-type': ['minutes'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['it', 'fr', 'de'], 'layout': ['table'], 'entry-type': ['person', 'location'], 'task': ['information-extraction']}

Configuration
Provider: openai
Model: gpt-5.2-2025-12-11
Temperature: 1.0
Dataclass: MinutesPage
Normalized Score: 81.88 %
Test time: unknown
Prompt

Please extract the metadata according to the given output format.

Name and address are in the same table field. Split them into:

- name
- address

The name is always the first line.
All following lines belong to the address.

Preserve line breaks exactly as they appear.
Do not normalize spelling or accents.

Steps:
1. Identify each row of the table using the row number.
2. For each row extract the table columns.
3. Split the name/address field.
4. Return the final JSON.

Document Metadata:
filename = {filename}
page_number = {page_number}

Return only the JSON.

Results

no valid result

Scoring
Fuzzy Score: 0.82
F1 micro / macro: n/a / n/a
Micro Precision / Recall: n/a / n/a
Instances: n/a
TP / FP / FN: n/a / n/a / n/a
Costs / Pricing
Pricing Date: n/a n/a
Tokens: 45.3K IT + 5.0K OT = 50.3K TT
Cost: 0.079$ (input), 0.070$ (output), 0.150$ (total)
Result 19 of 20

Test T0623 at 2026-03-12

{'document-type': ['minutes'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['it', 'fr', 'de'], 'layout': ['table'], 'entry-type': ['person', 'location'], 'task': ['information-extraction']}

Configuration
Provider: openai
Model: gpt-5.2-2025-12-11
Temperature: 1.0
Dataclass: MinutesPage
Normalized Score: 80.67 %
Test time: unknown
Prompt

Please extract the metadata according to the given output format.
Name and Address are in the same table field. Your task is to extract them into separate fields.
Lose any dashes between name and address but preserve linebreaks.
Be aware: Addresses may contain multiple dashes; preserve them, only remove "visual splitting characters" between name and address.
Filename: {filename}
Page: {page_number}

Results

no valid result

Scoring
Fuzzy Score: 0.81
F1 micro / macro: n/a / n/a
Micro Precision / Recall: n/a / n/a
Instances: n/a
TP / FP / FN: n/a / n/a / n/a
Costs / Pricing
Pricing Date: n/a n/a
Tokens: 43.7K IT + 5.0K OT = 48.7K TT
Cost: 0.076$ (input), 0.071$ (output), 0.147$ (total)
Result 20 of 20

Test T0626 at 2026-03-12

{'document-type': ['minutes'], 'writing': ['typed', 'handwritten'], 'century': [20], 'language': ['it', 'fr', 'de'], 'layout': ['table'], 'entry-type': ['person', 'location'], 'task': ['information-extraction']}

Configuration
Provider: openai
Model: gpt-5.2-2025-12-11
Temperature: 1.0
Dataclass: MinutesPage
Normalized Score: 81.00 %
Test time: unknown
Prompt

Extract the table content from the page image and return it in the required JSON structure.

The page contains a table of shareholders with the following columns:

1. N° D'ORDRE (row number)
2. NOM, PRÉNOMS ET DOMICILE DES ACTIONNAIRES (name and address)
3. ACTIONS O
4. ACTIONS P
5. NOMBRE DE VOIX
6. SIGNATURE DES ACTIONNAIRES OU DES MANDATAIRES

ROW STRUCTURE
-------------
Each entry begins with the row number in the first column.
All lines belonging to that number form one entry.

The second column contains BOTH name and address in the same field.
Split this content into:

- name
- address

The name is always the first line.
All following lines belong to the address.

Preserve the line breaks inside the address exactly as in the source.

Do NOT modify spelling or add accents. Transcribe the text exactly as shown.

Dashes inside addresses must be preserved.
Only remove a dash if it clearly separates the name from the address on the same line.

NUMERIC COLUMNS
---------------
Extract the values from the table columns:

- actions_o
- actions_p
- no_de_voix

Leave fields empty if no value is present.

SIGNATURE COLUMN
----------------
If the signature column contains handwriting or text:

signature_present = true
signature = transcription of the visible text

If the column is empty:

signature_present = false
signature = ""

TOTALS
------
Totals may appear below the table near the text "A REPORTER".
Only extract totals if explicit numbers are present.
Otherwise return empty strings for:

total_o
total_p
total_voix

DOCUMENT METADATA
-----------------
filename = {filename}
page_number = {page_number}

Results

no valid result

Scoring
Fuzzy Score: 0.81
F1 micro / macro: n/a / n/a
Micro Precision / Recall: n/a / n/a
Instances: n/a
TP / FP / FN: n/a / n/a / n/a
Costs / Pricing
Pricing Date: n/a n/a
Tokens: 56.2K IT + 4.9K OT = 61.2K TT
Cost: 0.098$ (input), 0.069$ (output), 0.168$ (total)
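
To compare the runs listed on this page, the normalized scores and total costs can be grouped by model. The sketch below copies the values from results 11 to 20 above; it covers only this page (page 2 of 2), so the averages say nothing about results 1 to 10. The function summarize and the tuple layout are choices made for this example.

```python
from collections import defaultdict
from statistics import mean

# (test id, model, normalized score in %, total cost in $), results 11-20
RUNS = [
    ("T0667", "gpt-5.4-2026-03-05", 82.48, 0.233),
    ("T0664", "gpt-5.4-2026-03-05", 86.20, 0.201),
    ("T0666", "gpt-5.4-2026-03-05", 82.59, 0.205),
    ("T0665", "gpt-5.4-2026-03-05", 88.64, 0.239),
    ("T0668", "gpt-5.4-2026-03-05", 78.32, 0.196),
    ("T0624", "gpt-5.2-2025-12-11", 73.13, 0.161),
    ("T0625", "gpt-5.2-2025-12-11", 84.93, 0.149),
    ("T0627", "gpt-5.2-2025-12-11", 81.88, 0.150),
    ("T0623", "gpt-5.2-2025-12-11", 80.67, 0.147),
    ("T0626", "gpt-5.2-2025-12-11", 81.00, 0.168),
]


def summarize(runs):
    """Group runs by model and report run count, mean score and mean cost."""
    by_model = defaultdict(list)
    for _, model, score, cost in runs:
        by_model[model].append((score, cost))
    return {
        model: {
            "runs": len(values),
            "mean_score": mean(score for score, _ in values),
            "mean_cost": mean(cost for _, cost in values),
        }
        for model, values in by_model.items()
    }


summary = summarize(RUNS)
for model, stats in summary.items():
    print(f"{model}: n={stats['runs']}, "
          f"mean score {stats['mean_score']:.2f} %, "
          f"mean cost {stats['mean_cost']:.3f} $")

best = max(RUNS, key=lambda run: run[2])
print("highest score on this page:", best[0])  # T0665
```

With five runs per model and a temperature of 1.0, differences of a few points between prompts may reflect sampling variation as much as prompt quality, so the averages are best read as a rough orientation.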