{"_id": {"$oid": "69d4eb12b877594bc5718c36"}, "test_id": "T0620", "benchmark": "personnel_cards", "date": "2026-02-09", "tags": {"document-type": ["index-card"], "writing": ["handwritten", "typed", "printed"], "century": [20], "layout": ["table", "form"], "task": ["transcription", "document-understanding", "data-correction"], "language": ["de", "fr"]}, "contributors": ["tabea_wullschleger", "maximilian_hindermann"], "hidden": false, "config": {"provider": "openrouter", "model": "x-ai/grok-4", "dataclass": "Table", "temperature": 0.0, "role_description": "You are a historian specializing in 20th century personnel records and administrative documents", "prompt_file": "prompt.txt", "legacy_test": false}, "prompt": "Extract all information from this personnel card table and return it as a structured JSON object. The card contains rows documenting employment history with positions, locations, salary information, dates, and remarks.\n\nREQUIRED JSON STRUCTURE:\n```json\n{\n  \"rows\": [\n    {\n      \"row_number\": 1,\n      \"dienstliche_stellung\": {\n        \"diplomatic_transcript\": \"exact text from card\",\n        \"interpretation\": \"expanded/standardized form or null\",\n        \"is_crossed_out\": false\n      },\n      \"dienstort\": {\n        \"diplomatic_transcript\": \"exact text from card\",\n        \"interpretation\": \"expanded/standardized form or null\",\n        \"is_crossed_out\": false\n      },\n      \"gehaltsklasse\": {\n        \"diplomatic_transcript\": \"exact text from card\",\n        \"interpretation\": \"expanded/standardized form or null\",\n        \"is_crossed_out\": false\n      },\n      \"jahresgehalt_monatsgehalt_taglohn\": {\n        \"diplomatic_transcript\": \"exact text from card\",\n        \"interpretation\": \"standardized numeric form or null\",\n        \"is_crossed_out\": false\n      },\n      \"datum_gehalts\u00e4nderung\": {\n        \"diplomatic_transcript\": \"exact text from card\",\n        \"interpretation\": \"YYYY-MM-DD format or null\",\n        \"is_crossed_out\": false\n      },\n      \"bemerkungen\": {\n        \"diplomatic_transcript\": \"exact text from card\",\n        \"interpretation\": \"expanded/standardized form or null\",\n        \"is_crossed_out\": false\n      }\n    }\n  ]\n}\n```\n\nCOLUMN DEFINITIONS:\n1. **dienstliche_stellung**: Official position or job title (e.g., \"Assistent\", \"Professor\", \"Sekret\u00e4r\")\n2. **dienstort**: Place of service or work location (e.g., \"Basel\", \"Z\u00fcrich\")\n3. **gehaltsklasse**: Salary class or grade (e.g., \"III\", \"IV\", roman numerals or numbers)\n4. **jahresgehalt_monatsgehalt_taglohn**: Salary amount (annual/monthly/daily wage)\n5. **datum_gehalts\u00e4nderung**: Date of salary change or effective date\n6. **bemerkungen**: Remarks, notes, or additional comments\n\nFIELD EXTRACTION RULES:\n\n**diplomatic_transcript** (REQUIRED for all fields):\n- Transcribe EXACTLY as written on the card\n- Include all abbreviations, punctuation, and formatting as they appear\n- Preserve original capitalization and spacing\n- Include currency symbols and separators (e.g., \"Fr. 2'400.-\", \"3.700.-\")\n- For dates, copy the exact format (e.g., \"1. Jan. 1946\", \"1.April 1945\")\n- Use empty string \"\" for empty cells\n- Be sure to escape ditto marks (repetition marks such as `\"` indicating \"same as above\")\n- DO NOT expand abbreviations or standardize formats\n- Do not transcribe checkmarks\n\n**interpretation** (OPTIONAL - use null if not applicable):\n- For ditto marks (repetition marks such as `\"`): Replace with the actual repeated value from the previous row in the same column\n  - For example, if previous row's dienstliche_stellung is \"Hilfsleiterin\" and current row has `\"`, interpret as \"Hilfsleiterin\"\n  - Do NOT add explanatory text like \"wie oben\" or similar - just provide the repeated value\n- Never expand abbreviations\n- For dates: Convert to ISO format YYYY-MM-DD\n  - \"1. Jan. 1946\" \u2192 \"1946-01-01\"\n  - \"1.April 1945\" \u2192 \"1945-04-01\"\n- For salary amounts: Extract numeric value only (remove currency symbols, separators)\n  - \"Fr. 2'400.-\" \u2192 \"2400\"\n  - \"3.700.-\" \u2192 \"3700\"\n  - \"5.094.-\" \u2192 \"5094\"\n  \"6,30.-\" \u2192 \"6.3\"\n- For salary class: Convert roman numerals to arabic if clear\n  - \"III\" \u2192 \"3\"\n  - \"IV\" \u2192 \"4\"\n- Do not convert roman numerals to arabic if not salary, date, or salary class\n- Use null if no interpretation/expansion is needed or if the field is empty\n- \"+ {word}\" where {word} is a word does not need interpretation\n\n**is_crossed_out** (REQUIRED for all fields):\n- Set to true if text in this field is crossed out, struck through, or deleted\n- Set to false if text is normal (not crossed out)\n- Empty cells should have is_crossed_out: false\n\nROW HANDLING:\n- Number rows sequentially starting from 1 (top to bottom)\n- Include ALL rows that have ANY content in ANY column\n- Empty rows (all cells empty) should be omitted\n- A row with only one filled cell should still be included\n\nIMPORTANT NOTES:\n- Return ONLY the JSON object, no additional text or explanation\n- Every field must have all three sub-fields: diplomatic_transcript, interpretation, is_crossed_out\n- diplomatic_transcript must never be null (use empty string \"\" for empty cells)\n- interpretation can be null when no expansion/standardization applies\n- Process the entire table from top to bottom\n- Maintain consistent row numbering throughout\n", "results": {"text": "{\"rows\":[{\"row_number\":1,\"dienstliche_stellung\":{\"diplomatic_transcript\":\"Hilfsleiterin\",\"interpretation\":null,\"is_crossed_out\":false},\"dienstort\":{\"diplomatic_transcript\":\"\",\"interpretation\":null,\"is_crossed_out\":false},\"gehaltsklasse\":{\"diplomatic_transcript\":\"\",\"interpretation\":null,\"is_crossed_out\":false},\"jahresgehalt_monatsgehalt_taglohn\":{\"diplomatic_transcript\":\"4200.-\",\"interpretation\":\"4200\",\"is_crossed_out\":false},\"datum_gehalts\u00e4nderung\":{\"diplomatic_transcript\":\"1. Okt. 45\",\"interpretation\":\"1945-10-01\",\"is_crossed_out\":false},\"bemerkungen\":{\"diplomatic_transcript\":\"+ Zulagen\",\"interpretation\":null,\"is_crossed_out\":false}}]}", "model": "x-ai/grok-4", "provider": "openai", "finish_reason": "stop", "usage": {"input_tokens": 3213, "output_tokens": 916, "total_tokens": 4129, "cached_tokens": 679}, "duration": 19.61096453666687, "timestamp": "2026-02-09T19:10:34.431915", "parsed": {"rows": [{"row_number": 1, "dienstliche_stellung": {"diplomatic_transcript": "Hilfsleiterin", "interpretation": null, "is_crossed_out": false}, "dienstort": {"diplomatic_transcript": "", "interpretation": null, "is_crossed_out": false}, "gehaltsklasse": {"diplomatic_transcript": "", "interpretation": null, "is_crossed_out": false}, "jahresgehalt_monatsgehalt_taglohn": {"diplomatic_transcript": "4200.-", "interpretation": "4200", "is_crossed_out": false}, "datum_gehalts\u00e4nderung": {"diplomatic_transcript": "1. Okt. 45", "interpretation": "1945-10-01", "is_crossed_out": false}, "bemerkungen": {"diplomatic_transcript": "+ Zulagen", "interpretation": null, "is_crossed_out": false}}]}, "conversation_id": "990e17e9-f4a9-433d-9954-6189cb12f39b", "score": {"f1_score": 0.83, "precision": 0.8333333333333334, "recall": 0.8333333333333334, "true_positives": 10, "false_positives": 2, "false_negatives": 2, "field_scores": {"rows[0].dienstort.diplomatic_transcript": {"response": null, "ground_truth": null, "score": 1.0}, "rows[0].jahresgehalt_monatsgehalt_taglohn.diplomatic_transcript": {"response": "4200.-", "ground_truth": "4200.-", "score": 1.0}, "rows[0].dienstliche_stellung.diplomatic_transcript": {"response": "Hilfsleiterin", "ground_truth": "Hilfsleiterin", "score": 1.0}, "rows[0].bemerkungen.interpretation": {"response": null, "ground_truth": null, "score": 1.0}, "rows[0].jahresgehalt_monatsgehalt_taglohn.interpretation": {"response": "4200", "ground_truth": "4200", "score": 1.0}, "rows[0].dienstliche_stellung.interpretation": {"response": null, "ground_truth": null, "score": 1.0}, "rows[0].datum_gehalts\u00e4nderung.is_crossed_out": {"response": false, "ground_truth": false, "score": 1.0}, "rows[0].dienstort.interpretation": {"response": null, "ground_truth": null, "score": 1.0}, "rows[0].bemerkungen.diplomatic_transcript": {"response": "+ Zulagen", "ground_truth": "+ Zulagen", "score": 1.0}, "rows[0].bemerkungen.is_crossed_out": {"response": false, "ground_truth": false, "score": 1.0}, "rows[0].datum_gehalts\u00e4nderung.diplomatic_transcript": {"response": "1. Okt. 45", "ground_truth": "9. Okt. 45", "score": 0.9}, "rows[0].gehaltsklasse.is_crossed_out": {"response": false, "ground_truth": false, "score": 1.0}, "rows[0].dienstliche_stellung.is_crossed_out": {"response": false, "ground_truth": false, "score": 1.0}, "rows[0].dienstort.is_crossed_out": {"response": false, "ground_truth": false, "score": 1.0}, "rows[0].gehaltsklasse.diplomatic_transcript": {"response": null, "ground_truth": null, "score": 1.0}, "rows[0].jahresgehalt_monatsgehalt_taglohn.is_crossed_out": {"response": false, "ground_truth": false, "score": 1.0}, "rows[0].gehaltsklasse.interpretation": {"response": null, "ground_truth": null, "score": 1.0}, "rows[0].datum_gehalts\u00e4nderung.interpretation": {"response": "1945-10-01", "ground_truth": "1945-10-09", "score": 0.9}}, "total_fields": 19}, "raw_response": "{\"id\":\"gen-1770660609-QrUHgALAruryZcmvZ1hv\",\"choices\":[{\"finish_reason\":\"stop\",\"index\":0,\"logprobs\":null,\"message\":{\"content\":\" {\\\"rows\\\":[{\\\"row_number\\\":1,\\\"dienstliche_stellung\\\":{\\\"diplomatic_transcript\\\":\\\"Hilfsleiterin\\\",\\\"interpretation\\\":null,\\\"is_crossed_out\\\":false},\\\"dienstort\\\":{\\\"diplomatic_transcript\\\":\\\"\\\",\\\"interpretation\\\":null,\\\"is_crossed_out\\\":false},\\\"gehaltsklasse\\\":{\\\"diplomatic_transcript\\\":\\\"\\\",\\\"interpretation\\\":null,\\\"is_crossed_out\\\":false},\\\"jahresgehalt_monatsgehalt_taglohn\\\":{\\\"diplomatic_transcript\\\":\\\"4200.-\\\",\\\"interpretation\\\":\\\"4200\\\",\\\"is_crossed_out\\\":false},\\\"datum_gehalts\u00e4nderung\\\":{\\\"diplomatic_transcript\\\":\\\"1. Okt. 45\\\",\\\"interpretation\\\":\\\"1945-10-01\\\",\\\"is_crossed_out\\\":false},\\\"bemerkungen\\\":{\\\"diplomatic_transcript\\\":\\\"+ Zulagen\\\",\\\"interpretation\\\":null,\\\"is_crossed_out\\\":false}}]}\",\"refusal\":null,\"role\":\"assistant\",\"annotations\":null,\"audio\":null,\"function_call\":null,\"tool_calls\":null,\"parsed\":{\"rows\":[{\"row_number\":1,\"dienstliche_stellung\":{\"diplomatic_transcript\":\"Hilfsleiterin\",\"interpretation\":null,\"is_crossed_out\":false},\"dienstort\":{\"diplomatic_transcript\":\"\",\"interpretation\":null,\"is_crossed_out\":false},\"gehaltsklasse\":{\"diplomatic_transcript\":\"\",\"interpretation\":null,\"is_crossed_out\":false},\"jahresgehalt_monatsgehalt_taglohn\":{\"diplomatic_transcript\":\"4200.-\",\"interpretation\":\"4200\",\"is_crossed_out\":false},\"datum_gehalts\u00e4nderung\":{\"diplomatic_transcript\":\"1. Okt. 45\",\"interpretation\":\"1945-10-01\",\"is_crossed_out\":false},\"bemerkungen\":{\"diplomatic_transcript\":\"+ Zulagen\",\"interpretation\":null,\"is_crossed_out\":false}}]},\"reasoning\":null,\"reasoning_details\":[{\"id\":\"rs_d5c439b0-dcd5-e8d3-57b7-1b71c9539316\",\"format\":\"xai-responses-v1\",\"index\":0,\"type\":\"reasoning.encrypted\",\"data\":\"16lg9Ee9jBv2w+PMXn3MXdFMsHoy5gY7x4ylYtW/eomcpCT3zhba+6yey3qqSw1fNWLPPlFeF2DjS24B1Qf4NAj1iFSo2dQIJTw3lFsUgNYKLQjk2+g5ZIom6tTjQ1XoGnH1pHYlNQkUpI/XPnvU7c+idGb5Sxvl+me+5PkU3gGT6ZhElJCFycoWRFFYGpXdCc3XufhzJXSgC+Hemn9alz0yARM3f4TMWQR72G0bATQLKXo+WEqZYwb1eT9miwhH5XZBHfjVO5kbpYFQRtgmmBHscCgjPJ4DPD7cgrFG/GzctDg93Fw62UNiOYylQUHGlxRej4bH1IsSLBrfrG5kDvcp92xa8+Hv6CGMUI3o1QjvFHI4ldyYp8i3Kw1s6cALyItrdwfH5/0WMmCg0K3Tdmuh3tLDBF2VvmwWkXNG7CIR92nVkrp7Ur7lpCrpopQKX1Vtq/TkATs0YZDPXX5w/FxlW9pbxp3kXZybiBDhVS4ARvRlHAOFN5TSJR40liJyxLRl8RDQvr/1FEQ14VyvXOgnEtP7Q8bpbphG8VjgHG2f4O3+m6XdvYfK5onacSAXArP2xqaejbqHh3UrSIt49Y9hTOK8odcZqXRShDX4QBX4h3LztalEz9Ql+pUpvujuS8tikAjEKIBY+OyaUhIVccLoVuZ8KxsG+uyWaElMtsBppAwmUvrxC4ndUYwHY6RkUjiKTIijXODf6wmyJLKOaG0P+ryQBu+VWa3HlsZb5RCdMn8RzMPTkspjK3j3nTaUmPkQTh3aBgF98YeYf/qcZXZKxRl59/66OD3WHSvoD+shwqSGRLGQAeFhXJRgiLZHd5CxrcEK+epTEN5kupM6bpP2v9c+Qkexd7t0Lp9bNdaTUtAsVxe7QWz8vPaJoCo3GJ0n2dfWchTsuMZj9S7wKYGXPMxnNHcdZstMq6llbOAhxcw2+L4SxhhcwFUVgnR2HGZHkri0rgQt0K6aXsGAOCcmOgSDOMOOjuQzAMgaYHaHAg1WzHVN3PlWm4Kc0TrpyM4FdJg1h8NpEsSH5jAy0b1ADOfG0Le7GTkMGaljj0n8cR5j4bI3ve9wG6YJt3B5LMRlPMJlUDn1X5tHlQhu3BTbs3WV7LUlSSNwu0sKjVwYdckUZPE1UlYgQA+zlwzDimOcboE0S4pFjXNruDoaXdPBVB6ECvzMyG8RBvJrrLRcAKccKhHP/XFfiIsSpq2mo8vE+ho1MFhDS6nWW9zqRXj3EruI7y0FKXksfT3f/NSyBwIKgcSrX1B0OhgH1QiSQ6RSFwgxlJxr38b9ruh7hY/uGLSbKSk0Hj2nXbbr8val2qzPiQzYsSPH2+NSsXob0orK1R61N/H5HFU+hpkrRZRiikh19KcESdf05SbFrQf1wYnEdBCe6r2rMJExRXyyNg1z2ipXTot2Pj2gGCFdd4wrCSygTi6C5jPnQOfQ/vFhI2XILwmTAnMt8AHIyOQEB/OTrrFCeRcvXonjFQO2WFRakA2stBEHyEDSDeVfa6lfR542Wu1lcskcoab3OazZerJQa/P5ON6lcHnFO0rDitEzYtKvTmZQAUL4a12db9g/7PYxuTUMwxf+tiJ628X8a+w3X2FoGhg3w0YOposACWVoPm6DlrJsvgu3fgTtaW1caKrIShS0k2MCoOSAKQ5XLWmxeb8k1bOpqiI11uftSmCajmIP4X6D0K75/ta28LQ4aC4XXoW61qOWjf2xxKSpHetO+w0Q1oUh6d83HvK9Sfdff3VCk8MhL2j9o/KVce5XuyoO7qFfdE++a45G0X65LhgcZlHa63cjpPosj/S5KzqeEiYw2tjf+X6cVYZ7Z1Hax97hfji2Y4lzkmfDC0bzsMvjSV6xUxuSvDwZLELXyxMObILacDdB5Bnf1mKV6gOHkaup0pNjXsTFcfXvEX4tr6u4o3aKGx9VgD5pOnnF9pGt2t4d8ZjR37QLOXp+nVlQB6NoZ4DeM+hGp0xpxooRxvadC4LZSriEYWoz2oRR61hbKxddkLx6DYpXEfKLTPw0rf3S8DXJF1jsxOJF3e0uMlLt1y9yBwBbjpWhLZIdnpKFiBzZYKwDXdXgxUhDvk3CT+UaFMNvfJzqZ85BzH4AMFp1bhexJMCnR1jufZn1pyi+l0vcHGEUun/WCYYHJyoB9FlspamIPFyKPjZv05+KWPwG5BEuS7/Ajy/RUBjFsGWM88m2dLquFLLZCoGnRhexUo59Nbd2CZtldKVbH3oL92dOWk5DuwrE+jg1LjYJOmJYfHN/ld2o0+lJUAy4KH+4RXY4fIBd9KyHxutHge1cG8evP40lzHsmeiyw8QMGL885p9KLhQjPsNkC0Q4I5v+czUeD5w8/voRV/hPr/nABnOi8zkIOTRRWgWoiWMJqouagDLVNR49HZCk4DZXIdwT0XASsMp/hDnilCIXc2BXrBqdB2CN919vgg1bV7JB6sKMevDrZeEiL+ZQCOSn/6oVpjbf3SuQnQ3Krio4IMvagiVgsaiattQrHbxxjgUUF3X8m/CaQVcf5ZmxUZhLp+0DWU805zMDt8oF7Et7xFCRAXA7w1BbRARtdzIhJx9q6KlJ9GA3wtYLQ/fSL1p5udlnqe2K8UfUJFCcsOz8wfdj6mIjbrC3agwmo9j8oxcvlaMO5risRLQ2uS2g4uVtkT0LRiRZwhzrWz+sHw/v6ZobvfG+4XDUoSPTid/De4YiDhgf7FCUVbl4KpL1mdNNUauUAgJWDO1Ee5UUGHCKY7+6gdzGEr8jnhieGMNibObN4bdznpstmMbdeJjwBhuC9ADFML4IxvkZm95Y9orBJ+gt3iRcA2+x29NyOmpERzqx7IOjccQp7IOf7rdKHlQCxilE49CwUWK1dP/AWEShtJf4FAJBsn0ALqbwXKt12XQBWJyryvf/rLKn2BKYKA9MrlvdAa9+B8kQOUrI4nDr0Yl30uOfKuyXdmFfKASybqx+ZtG+sNLPX4DccoKlY7/qM/BT/DbU17n6mLpKIT965+JFcTd5Wd7f07Ods76b8I3ywUS0Ei4JmQHWoQRC5N+/KJlI0SutpXOD65Lr48VwLD/U0M1m+D+DJL3yz9lhuIt1EtPR/dA9NLAbBmcDMNgKVU81jr3pLetKHeW50eLdEevyUoKrp4cK2TxwkK1UFpzwoLO4JXq13eyI7A/r11X9zaGhseDkqkUINscGLtieNkIGYm4RPalc/oVG20dACY88HKSNDXvAUWlYeWOCiVMBYdFkskCzk2SGSgz1CIvt/7sITzS/gGzj/ApC9WxAPoD9YoqmYjmUhpibNrR6iMeyVTPpKnWzPaBALQoJJ6lFNPiX9+q1o5fX8GYRpyx3cwjN4TpkP+5/PhKmXgAgjxxjmKh6r1WzFWnEr1QbIHinyQUUh7tjzGtyy/DyXqd9BtMxKh84D5F1Yrv3yAeu1ARwI6h9mTMDhMxrs5fSWzzEMibJvqX0KNjeQYeahJOSb8/bjpWYi4IcakU33DPJX6yL6iSjn1y0p1/SGbb/3WmN0Ka64JRdf1EQhqzIli3SEr3yfLn1bQ75XWoaaFgT/l0TW6Yx9UzUvPR/0MxbXPzPIJiHHvmJub28V+emV/rnw7ycWEu2vUYAOp431TgBHLOPW8lLndmHLKJMLhe0Q\"}]},\"native_finish_reason\":\"completed\"}],\"created\":1770660609,\"model\":\"x-ai/grok-4\",\"object\":\"chat.completion\",\"service_tier\":null,\"system_fingerprint\":null,\"usage\":{\"completion_tokens\":916,\"prompt_tokens\":3213,\"total_tokens\":4129,\"completion_tokens_details\":{\"accepted_prediction_tokens\":null,\"audio_tokens\":null,\"reasoning_tokens\":741,\"rejected_prediction_tokens\":null},\"prompt_tokens_details\":{\"audio_tokens\":null,\"cached_tokens\":679},\"cost\":0.02185125,\"is_byok\":false,\"cost_details\":{\"upstream_inference_cost\":0.02185125,\"upstream_inference_prompt_cost\":0.00811125,\"upstream_inference_completions_cost\":0.01374}},\"provider\":\"xAI\"}"}, "scoring": {"f1_micro": 0.5597110072248195, "f1_macro": 0.5718032786885247, "micro_precision": 0.6019195612431444, "micro_recall": 0.5230341540905481, "total_instances": 61, "total_tp": 1317, "total_fp": 871, "total_fn": 1201, "cost_summary": {"total_input_tokens": 195993, "total_output_tokens": 265259, "total_tokens": 461252, "input_cost_usd": 0.0, "output_cost_usd": 0.0, "total_cost_usd": 0.0}}, "normalized_score": 55.97110072248195}