{"_id": {"$oid": "6a26c67c3572c06bc1455dc0"}, "test_id": "T0893", "benchmark": "business_letters", "date": "2026-04-21", "tags": {"document-type": ["letter"], "writing": ["typed", "handwritten"], "century": [20], "language": ["de"], "layout": ["prose"], "entry-type": ["person", "location"], "task": ["information-extraction"]}, "contributors": ["eric_decker", "maximilian_hindermann", "lea_kasper", "anthea_alberto", "pema_frick", "jose_luis_losada_palenzuela", "sorin_marti", "elena_spadini"], "hidden": false, "config": {"provider": "openrouter", "model": "qwen/qwen3.6-plus", "dataclass": "Document", "temperature": 0.0, "role_description": "You are a historian with keyword knowledge and an expert in the field of 20th century Swiss history", "prompt_file": "prompt.txt", "legacy_test": false}, "prompt": "            IDENTITY and PURPOSE:\n\n            You are presented with a series of images constituting a historical letter. Your task is to extract the\n            values of the following keys from the letter and return them in a JSON file where the values\n            corresponding to each key should be stored as a list, even if there is only a single value for a\n            key:\n\n            - letter_title: Title of the letter.\n            - sender_persons: Name(s) of the person(s) who wrote the letter.\n            - send_date: The exact or approximate date the letter was written.\n            - receiver_persons: Name(s) of the person(s) who received the letter.\n\n            Take a deep breath and think step by step about how to best accomplish this goal. Map out all the claims\n            and implications on a virtual whiteboard in your mind. Do not use OCR. Use the ISO format YYYY-MM-DD for\n            dates. If a piece of information is not included in the letter, set the value for the corresponding key\n            to \"null\". Do not return anything except the JSON file.\n\n            EXAMPLE:\n\n            {\n                \"letter_title\": [\"Petition for Environmental Protection\"],\n                \"send_date\": [\"1993-03-12\"],\n                \"sender_persons\": [\"Lisa Simpson\"],\n                \"receiver_persons\": [\"Mayor Joe Quimby\", \"Seymour Skinner\"],\n            }\n\n            OUTPUT:", "results": {"text": "{\"metadata\":{\"send_date\":[\"1928-04-23\"],\"letter_title\":[\"Entwurf des Kaufvertrages betr. das Boot \\\"Rheinfelden\\\"\"],\"sender_persons\":[\"Basler Personenschiffahrtsgesellschaft\"],\"receiver_persons\":[\"Basler Rheinschiffahrt-Aktiengesellschaft\"]}}", "model": "qwen/qwen3.6-plus-04-02", "provider": "openrouter", "finish_reason": "stop", "usage": {"input_tokens": 5646, "output_tokens": 2108, "total_tokens": 7754, "input_cost_usd": 0.0018349500000000001, "output_cost_usd": 0.0041106, "estimated_cost_usd": 0.00594555}, "duration": 32.4582633972168, "timestamp": "2026-04-21T13:10:33.517113", "parsed": {"metadata": {"send_date": ["1928-04-23"], "letter_title": ["Entwurf des Kaufvertrages betr. das Boot \"Rheinfelden\""], "sender_persons": ["Basler Personenschiffahrtsgesellschaft"], "receiver_persons": ["Basler Rheinschiffahrt-Aktiengesellschaft"], "document_number": "letter26"}}, "conversation_id": "a3816c24-1179-41a2-b847-4b7cc67658be", "score": {"send_date_tp": 1, "send_date_fp": 0, "send_date_fn": 0, "sender_persons_tp": 0, "sender_persons_fp": 1, "sender_persons_fn": 0, "receiver_persons_tp": 0, "receiver_persons_fp": 1, "receiver_persons_fn": 0}, "raw_response": "{\"id\":\"gen-1776769801-7EZwVC8lF9z0OXSdb0g0\",\"choices\":[{\"finish_reason\":\"stop\",\"index\":0,\"logprobs\":null,\"message\":{\"content\":\"{\\n    \\\"metadata\\\": {\\n        \\\"send_date\\\": [\\n            \\\"1928-04-23\\\"\\n        ],\\n        \\\"letter_title\\\": [\\n            \\\"Entwurf des Kaufvertrages betr. das Boot \\\\\\\"Rheinfelden\\\\\\\"\\\"\\n        ],\\n        \\\"sender_persons\\\": [\\n            \\\"Basler Personenschiffahrtsgesellschaft\\\"\\n        ],\\n        \\\"receiver_persons\\\": [\\n            \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\"\\n        ]\\n    }\\n}\",\"refusal\":null,\"role\":\"assistant\",\"annotations\":null,\"audio\":null,\"function_call\":null,\"tool_calls\":null,\"parsed\":{\"metadata\":{\"send_date\":[\"1928-04-23\"],\"letter_title\":[\"Entwurf des Kaufvertrages betr. das Boot \\\"Rheinfelden\\\"\"],\"sender_persons\":[\"Basler Personenschiffahrtsgesellschaft\"],\"receiver_persons\":[\"Basler Rheinschiffahrt-Aktiengesellschaft\"]}},\"reasoning\":\"user wants me to extract metadata from a historical letter provided as images.\\n\\n**1. Analyze the input:**\\n- Two images of a typewritten letter in German.\\n- The letter is dated \\\"Basel, den 23. April 1928.\\\"\\n- The recipient is \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\".\\n- The sender is \\\"Basler Personenschiffahrtsgesellschaft\\\" (found at the bottom of the second page).\\n- The content discusses a purchase contract for a boat named \\\"Rheinfelden\\\".\\n\\n**2. Extract specific fields:**\\n\\n*   **letter_title:** There isn't an explicit title like \\\"Subject: ...\\\". However, the first paragraph mentions \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". This acts as the subject. I can formulate a title based on this, or leave it null if strictly looking for a header. The prompt asks for \\\"Title of the letter\\\". Often in these tasks, if there's no explicit title, I might infer one or look for a subject line. Here, the text starts with \\\"Wir sind im Besitze Ihres Geehrten vom 28. M\u00e4rz...\\\". It's a response to a previous letter. The subject matter is the purchase contract for the boat \\\"Rheinfelden\\\". Let's look for a specific title line. There isn't one. I will set this to null or perhaps a descriptive title if appropriate, but the instructions say \\\"If a piece of information is not included... set to null\\\". Wait, looking closely at the first paragraph: \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". This is the subject matter. I will treat this as the implicit title or just set it to null if I must be strict. Let's look at the example. The example has \\\"Petition for Environmental Protection\\\". This suggests I can derive a title. Let's try to derive a title: \\\"Kaufvertrag Boot Rheinfelden\\\" or similar. Or perhaps just null. Let's look at the prompt again. \\\"Title of the letter.\\\" Usually, this refers to a subject line. There is no subject line. I will set it to null to be safe, or perhaps extract the subject matter. Let's look at the text again. \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". This is the closest thing to a title. I will use a descriptive title based on the content: \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". Actually, looking at the structure, it's a business letter. Often these don't have titles. I will set it to null to be precise, or maybe \\\"Letter regarding purchase contract for boat Rheinfelden\\\". Let's stick to null if there's no explicit title line. Wait, looking at the first page, top right, there is a red number \\\"26\\\". That's an archive number.\\n    *   Decision: The prompt asks for \\\"Title of the letter\\\". If there isn't one, null. There is no \\\"Betreff:\\\" line. I will set it to null.\\n\\n*   **sender_persons:**\\n    *   Bottom of page 2: \\\"Basler Personenschiffahrtsgesellschaft\\\".\\n    *   This is a company, not a person, but the field is \\\"sender_persons\\\". In historical document extraction, organizations are often treated as the \\\"person\\\" entity if no individual is named.\\n    *   So, sender: [\\\"Basler Personenschiffahrtsgesellschaft\\\"].\\n\\n*   **send_date:**\\n    *   Top right of page 1: \\\"Basel, den 23. April 1928.\\\"\\n    *   Format: YYYY-MM-DD.\\n    *   Date: \\\"1928-04-23\\\".\\n\\n*   **receiver_persons:**\\n    *   Top center of page 1: \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\".\\n    *   Address: \\\"Basel, Marktgasse 5.\\\"\\n    *   So, receiver: [\\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\"].\\n\\n**3. Refine the extraction:**\\n*   **letter_title:** I'll look again. Is there a title? No. I will set it to null.\\n*   **sender_persons:** \\\"Basler Personenschiffahrtsgesellschaft\\\".\\n*   **send_date:** \\\"1928-04-23\\\".\\n*   **receiver_persons:** \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\".\\n\\n**4. Final JSON construction:**\\n*   `metadata`:\\n    *   `send_date`: [\\\"1928-04-23\\\"]\\n    *   `letter_title`: [null] (or maybe I should extract the subject \\\"Entwurf des Kaufvertrages...\\\"? The prompt says \\\"Title of the letter\\\". If I look at the example \\\"Petition for...\\\", that's a descriptive title. I will try to be helpful. The text says \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". I will use that as the title if possible, but usually, titles are explicit. Let's assume null is safer if not explicit. However, often in these datasets, the \\\"title\\\" is the subject line. There is no subject line. I will use null.)\\n    *   Wait, let's look at the text \\\"Wir sind im Besitze Ihres Geehrten vom 28. M\u00e4rz, mit dem Sie uns den Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden' \u00fcbermitteln.\\\" This confirms the subject. I will leave it as null because there is no header title.\\n    *   `sender_persons`: [\\\"Basler Personenschiffahrtsgesellschaft\\\"]\\n    *   `receiver_persons`: [\\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\"]\\n\\nLet's double check the sender. The letter ends with \\\"In Hochachtung Basler Personenschiffahrtsgesellschaft\\\". Yes.\\nLet's double check the receiver. The letter is addressed to \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\". Yes.\\n\\nOne detail: The prompt asks for \\\"sender_persons\\\" and \\\"receiver_persons\\\". These are companies. I will include them as strings in the list.\\n\\nLet's re-evaluate \\\"letter_title\\\". If I look at the first paragraph, it mentions \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". This is the subject. I will set letter_title to null as there is no explicit title line like \\\"Subject: ...\\\".\\n\\nWait, looking at the very top right, there is a red \\\"26\\\". That's likely a folio number.\\n\\nOkay, let's assemble the JSON.\\n\\n```json\\n{\\n  \\\"metadata\\\": {\\n    \\\"send_date\\\": [\\\"1928-04-23\\\"],\\n    \\\"letter_title\\\": [null],\\n    \\\"sender_persons\\\": [\\\"Basler Personenschiffahrtsgesellschaft\\\"],\\n    \\\"receiver_persons\\\": [\\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\"]\\n  }\\n}\\n```\\nActually, looking at the prompt's example, \\\"Petition for Environmental Protection\\\" is a title. If I can infer a title, maybe I should. But usually, these tasks want explicit text. I will stick with null for title.\\n\\nWait, is there any other date? \\\"28. M\u00e4rz\\\" is mentioned as the date of the received letter. \\\"1. April 1928\\\" is mentioned in the text. The send date is definitely 23. April 1928.\\n\\nOne final check on the sender. The text says \\\"Die Personenschiffahrt verkauft an die Basler...\\\". And the signature is \\\"Basler Personenschiffahrtsgesellschaft\\\". This confirms the sender.\\n\\nOne final check on the receiver. \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\". This confirms the receiver.\\n\\nThe JSON schema requires lists for all values.\\n\\nFinal check on the schema:\\n`send_date`: list of strings or null.\\n`letter_title`: list of strings or null.\\n`sender_persons`: list of strings or null.\\n`receiver_persons`: list of strings or null.\\n\\nThe output must be just the JSON.\\n\\nOne small thing: \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\" is the receiver.\\n\\\"Basler Personenschiffahrtsgesellschaft\\\" is the sender.\\n\\nThe letter title is tricky. I will provide null.\\n\\nLet's verify the text one more time.\\nPage 1:\\n\\\"Basel, den 23. April 1928.\\\" -> Date.\\n\\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\" -> Receiver.\\n\\\"Basel Marktgasse 5.\\\" -> Address.\\n\\nPage 2:\\n\\\"Basler Personenschiffahrtsgesellschaft\\\" -> Sender (at the bottom).\\n\\nOkay, I'm confident.\\n\\nOne detail: \\\"letter_title\\\". Sometimes the first line is considered the title if it's distinct. Here the first line is the date/place. The next block is the receiver address. Then the body starts. No title.\\n\\nI will formulate the JSON now.\",\"reasoning_details\":[{\"type\":\"reasoning.text\",\"text\":\"user wants me to extract metadata from a historical letter provided as images.\\n\\n**1. Analyze the input:**\\n- Two images of a typewritten letter in German.\\n- The letter is dated \\\"Basel, den 23. April 1928.\\\"\\n- The recipient is \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\".\\n- The sender is \\\"Basler Personenschiffahrtsgesellschaft\\\" (found at the bottom of the second page).\\n- The content discusses a purchase contract for a boat named \\\"Rheinfelden\\\".\\n\\n**2. Extract specific fields:**\\n\\n*   **letter_title:** There isn't an explicit title like \\\"Subject: ...\\\". However, the first paragraph mentions \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". This acts as the subject. I can formulate a title based on this, or leave it null if strictly looking for a header. The prompt asks for \\\"Title of the letter\\\". Often in these tasks, if there's no explicit title, I might infer one or look for a subject line. Here, the text starts with \\\"Wir sind im Besitze Ihres Geehrten vom 28. M\u00e4rz...\\\". It's a response to a previous letter. The subject matter is the purchase contract for the boat \\\"Rheinfelden\\\". Let's look for a specific title line. There isn't one. I will set this to null or perhaps a descriptive title if appropriate, but the instructions say \\\"If a piece of information is not included... set to null\\\". Wait, looking closely at the first paragraph: \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". This is the subject matter. I will treat this as the implicit title or just set it to null if I must be strict. Let's look at the example. The example has \\\"Petition for Environmental Protection\\\". This suggests I can derive a title. Let's try to derive a title: \\\"Kaufvertrag Boot Rheinfelden\\\" or similar. Or perhaps just null. Let's look at the prompt again. \\\"Title of the letter.\\\" Usually, this refers to a subject line. There is no subject line. I will set it to null to be safe, or perhaps extract the subject matter. Let's look at the text again. \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". This is the closest thing to a title. I will use a descriptive title based on the content: \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". Actually, looking at the structure, it's a business letter. Often these don't have titles. I will set it to null to be precise, or maybe \\\"Letter regarding purchase contract for boat Rheinfelden\\\". Let's stick to null if there's no explicit title line. Wait, looking at the first page, top right, there is a red number \\\"26\\\". That's an archive number.\\n    *   Decision: The prompt asks for \\\"Title of the letter\\\". If there isn't one, null. There is no \\\"Betreff:\\\" line. I will set it to null.\\n\\n*   **sender_persons:**\\n    *   Bottom of page 2: \\\"Basler Personenschiffahrtsgesellschaft\\\".\\n    *   This is a company, not a person, but the field is \\\"sender_persons\\\". In historical document extraction, organizations are often treated as the \\\"person\\\" entity if no individual is named.\\n    *   So, sender: [\\\"Basler Personenschiffahrtsgesellschaft\\\"].\\n\\n*   **send_date:**\\n    *   Top right of page 1: \\\"Basel, den 23. April 1928.\\\"\\n    *   Format: YYYY-MM-DD.\\n    *   Date: \\\"1928-04-23\\\".\\n\\n*   **receiver_persons:**\\n    *   Top center of page 1: \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\".\\n    *   Address: \\\"Basel, Marktgasse 5.\\\"\\n    *   So, receiver: [\\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\"].\\n\\n**3. Refine the extraction:**\\n*   **letter_title:** I'll look again. Is there a title? No. I will set it to null.\\n*   **sender_persons:** \\\"Basler Personenschiffahrtsgesellschaft\\\".\\n*   **send_date:** \\\"1928-04-23\\\".\\n*   **receiver_persons:** \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\".\\n\\n**4. Final JSON construction:**\\n*   `metadata`:\\n    *   `send_date`: [\\\"1928-04-23\\\"]\\n    *   `letter_title`: [null] (or maybe I should extract the subject \\\"Entwurf des Kaufvertrages...\\\"? The prompt says \\\"Title of the letter\\\". If I look at the example \\\"Petition for...\\\", that's a descriptive title. I will try to be helpful. The text says \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". I will use that as the title if possible, but usually, titles are explicit. Let's assume null is safer if not explicit. However, often in these datasets, the \\\"title\\\" is the subject line. There is no subject line. I will use null.)\\n    *   Wait, let's look at the text \\\"Wir sind im Besitze Ihres Geehrten vom 28. M\u00e4rz, mit dem Sie uns den Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden' \u00fcbermitteln.\\\" This confirms the subject. I will leave it as null because there is no header title.\\n    *   `sender_persons`: [\\\"Basler Personenschiffahrtsgesellschaft\\\"]\\n    *   `receiver_persons`: [\\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\"]\\n\\nLet's double check the sender. The letter ends with \\\"In Hochachtung Basler Personenschiffahrtsgesellschaft\\\". Yes.\\nLet's double check the receiver. The letter is addressed to \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\". Yes.\\n\\nOne detail: The prompt asks for \\\"sender_persons\\\" and \\\"receiver_persons\\\". These are companies. I will include them as strings in the list.\\n\\nLet's re-evaluate \\\"letter_title\\\". If I look at the first paragraph, it mentions \\\"Entwurf des Kaufvertrages betr. das Boot 'Rheinfelden'\\\". This is the subject. I will set letter_title to null as there is no explicit title line like \\\"Subject: ...\\\".\\n\\nWait, looking at the very top right, there is a red \\\"26\\\". That's likely a folio number.\\n\\nOkay, let's assemble the JSON.\\n\\n```json\\n{\\n  \\\"metadata\\\": {\\n    \\\"send_date\\\": [\\\"1928-04-23\\\"],\\n    \\\"letter_title\\\": [null],\\n    \\\"sender_persons\\\": [\\\"Basler Personenschiffahrtsgesellschaft\\\"],\\n    \\\"receiver_persons\\\": [\\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\"]\\n  }\\n}\\n```\\nActually, looking at the prompt's example, \\\"Petition for Environmental Protection\\\" is a title. If I can infer a title, maybe I should. But usually, these tasks want explicit text. I will stick with null for title.\\n\\nWait, is there any other date? \\\"28. M\u00e4rz\\\" is mentioned as the date of the received letter. \\\"1. April 1928\\\" is mentioned in the text. The send date is definitely 23. April 1928.\\n\\nOne final check on the sender. The text says \\\"Die Personenschiffahrt verkauft an die Basler...\\\". And the signature is \\\"Basler Personenschiffahrtsgesellschaft\\\". This confirms the sender.\\n\\nOne final check on the receiver. \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\". This confirms the receiver.\\n\\nThe JSON schema requires lists for all values.\\n\\nFinal check on the schema:\\n`send_date`: list of strings or null.\\n`letter_title`: list of strings or null.\\n`sender_persons`: list of strings or null.\\n`receiver_persons`: list of strings or null.\\n\\nThe output must be just the JSON.\\n\\nOne small thing: \\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\" is the receiver.\\n\\\"Basler Personenschiffahrtsgesellschaft\\\" is the sender.\\n\\nThe letter title is tricky. I will provide null.\\n\\nLet's verify the text one more time.\\nPage 1:\\n\\\"Basel, den 23. April 1928.\\\" -> Date.\\n\\\"Basler Rheinschiffahrt-Aktiengesellschaft\\\" -> Receiver.\\n\\\"Basel Marktgasse 5.\\\" -> Address.\\n\\nPage 2:\\n\\\"Basler Personenschiffahrtsgesellschaft\\\" -> Sender (at the bottom).\\n\\nOkay, I'm confident.\\n\\nOne detail: \\\"letter_title\\\". Sometimes the first line is considered the title if it's distinct. Here the first line is the date/place. The next block is the receiver address. Then the body starts. No title.\\n\\nI will formulate the JSON now.\",\"format\":\"unknown\",\"index\":0}]},\"native_finish_reason\":\"stop\"}],\"created\":1776769801,\"model\":\"qwen/qwen3.6-plus-04-02\",\"object\":\"chat.completion\",\"service_tier\":null,\"system_fingerprint\":null,\"usage\":{\"completion_tokens\":2108,\"prompt_tokens\":5646,\"total_tokens\":7754,\"completion_tokens_details\":{\"accepted_prediction_tokens\":null,\"audio_tokens\":0,\"reasoning_tokens\":1985,\"rejected_prediction_tokens\":null,\"image_tokens\":0},\"prompt_tokens_details\":{\"audio_tokens\":0,\"cached_tokens\":0,\"cache_write_tokens\":0,\"video_tokens\":0},\"cost\":0.00594555,\"is_byok\":false,\"cost_details\":{\"upstream_inference_cost\":0.00594555,\"upstream_inference_prompt_cost\":0.00183495,\"upstream_inference_completions_cost\":0.0041106}},\"provider\":\"Alibaba\"}"}, "scoring": {"f1_macro": 0.56, "f1_micro": 0.56, "cost_summary": {"total_input_tokens": 258886, "total_output_tokens": 198565, "total_tokens": 457451, "input_cost_usd": 0.08413794999999996, "output_cost_usd": 0.38720175, "total_cost_usd": 0.4713396999999999}}, "normalized_score": 56.00000000000001}