This benchmark focuses on the extraction of metadata from historic letters. It provides a ground truth for the metadata categories `send_date`, `sender_persons` and `receiver_persons` for a collection of letters from the Basler Rheinschifffahrt-Aktiengesellschaft between 1926 and 1932, and F1-micro and F1-macro scores across these categories.
| Data Type | Images (JPG, 2479x3508, ~600 KB each) |
|---|---|
| Amount | 57 letters, 98 images |
| Origin | http://dx.doi.org/10.7891/e-manuscripta-54917 |
| Signature | CH SWA HS 191 V 10 |
| Language | German |
| Content Description | Collection of letters from the Basler Rheinschifffahrt-Aktiengesellschaft |
| Time Period | 1926 - 1932 |
| License | Academic use |
| Tags | |
This benchmark uses as input the digital collection "Basler Rheinschifffahrt-Aktiengesellschaft, insbesondere über die Veräusserung des Dieselmotorbootes 'Rheinfelden' und die Gewährung eines Darlehens zur Finanzierung der Erstellung des Dieselmotorbootes 'Rhyblitz' an diese Firma" (shelf mark: `CH SWA HS 191 V 10`, persistent link: http://dx.doi.org/10.7891/e-manuscripta-54917, referred to as "the collection" in what follows) of the Schweizer Wirtschaftsarchiv.
The collection consists of 68 letters of varying length (mostly 1-3 pages). The letters are dated between 1926 and 1932 and are written in German. They are mostly typewritten, with some handwritten annotations. The letters reflect the correspondence of the Basler Rheinschifffahrt-Aktiengesellschaft and are mostly signed by individuals or companies. They cover a variety of topics, including business transactions, shipping schedules, and personnel matters. In this benchmark, a subset of 57 letters has been ground-truthed (see below).
The ground truth for the collection has been created in a Google Sheet and then imported and used to benchmark LLMs with respect to information extraction tasks. The following fields have been extracted:
| Field Name | Description | Data Type |
|---|---|---|
| `transkribus_doc_url` | URL link to the letter on Transkribus. | |
| `document_number` | The letter's number is between 1 and 68, inclusive (1 ≤ i ≤ 68). | |
| `done` | Indicates whether the creation of the ground truth is completed. | |
| `checked_by` | Identifier of the person who is responsible for creating the ground truth. | |
| `send_date` | Date when the letter was sent. | |
| `letter_title` | Title of the letter as diplomatically inscribed. | |
| `sender_persons_inscribed` | Sender person(s) as diplomatically inscribed. | |
| `sender_persons` | Individuals explicitly mentioned as senders in the document. | |
| `receiver_persons_inscribed` | Receiver person(s) as diplomatically inscribed. | |
| `receiver_persons` | Individuals associated with receiving the document, inferred or explicitly stated. | |
| `has_signatures` | Indicates whether the document contains signatures. | |
| `signatures_recognised` | Indicates whether all signatures have been mapped to persons as per `ground_truth/persons.json`. | |
| `comment` | Additional comments or annotations about the document. | |
| `action_required` | Indicates what action is required to get the document to done. | |
Persons
Persons are recorded in the persons tab of the Google Sheet. The metadata schema and workflow for persons are described in `ground_truth_persons_organizations.md`.
For `sender_persons_inscribed`, `sender_persons`, `receiver_persons_inscribed`, and `receiver_persons`:
- `|` is used to separate multiple values: `Mustermann, Hans | Musterfrau, Maria`.
- Angle-bracket notations also occur: `<Mustermann, Hans>`, `<<Musterfrau, Maria>>`.
- The `sender_persons` and `receiver_persons` fields use the names of persons as recorded in the normalized persons tab. Be sure to add inscribed variants to their respective `alternateName` fields.
Importing the ground truth from Google Sheets
Letters that are done are exported from the ground_truth_export tab of the Google Sheet as a CSV file and saved to ground_truth/letters.csv.
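As a rough illustration of how the exported file might be consumed, the following sketch reads letter rows with Python's standard `csv` module and splits the pipe-separated person fields into lists. It assumes that the CSV header uses the field names from the table above, that the file is comma-delimited, and that person names containing commas are quoted; the helper names `split_values` and `load_letters` are hypothetical and not part of the benchmark code. The sample row reuses the values of the first letter discussed in this document.

```python
import csv
import io

# Assumption: these column names match the header of ground_truth/letters.csv.
PERSON_FIELDS = ("sender_persons", "receiver_persons")


def split_values(cell):
    """Split a pipe-separated cell into a list of stripped, non-empty values."""
    return [part.strip() for part in (cell or "").split("|") if part.strip()]


def load_letters(handle):
    """Read letter rows from an open CSV handle and split the person fields."""
    letters = []
    for row in csv.DictReader(handle):
        for field in PERSON_FIELDS:
            row[field] = split_values(row.get(field, ""))
        letters.append(row)
    return letters


# Inline sample instead of the real file, so the sketch runs standalone.
sample = io.StringIO(
    "document_number,send_date,sender_persons,receiver_persons\n"
    '1,1926-02-16,"Groschupf-Jaeger, Louis | Ritter-Dreier, Fritz",'
    '"Christ-Wackernagel, Paul"\n'
)
letters = load_letters(sample)
print(letters[0]["sender_persons"])
# → ['Groschupf-Jaeger, Louis', 'Ritter-Dreier, Fritz']
```

To read the actual export, the same function could be called with `open("ground_truth/letters.csv", newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded.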
Consider the first letter as an example. The letter is composed of three pages.
| Metric | Ground Truth | Prediction | TP | FP | FN |
|---|---|---|---|---|---|
| send_date | 1926-02-16 | 1926-02-16 | 1 | 0 | 0 |
| sender_persons | Groschupf-Jaeger, Louis \| Ritter-Dreier, Fritz | Basler Rheinschiffahrt-Aktiengesellschaft | 0 | 1 | 2 |
| receiver_persons | Christ-Wackernagel, Paul | Herr Christ i/Fa. Paravicini, Christ & Co. | 1 | 1 | 0 |
- `send_date`: The prediction matches the ground truth (1 TP).
- `sender_persons`: The prediction is incorrect (1 FP), as "Basler Rheinschiffahrt-Aktiengesellschaft" is not a sender person, and the two actual sender persons ("Groschupf-Jaeger, Louis" and "Ritter-Dreier, Fritz") are missing (2 FN).
- `receiver_persons`: The prediction is partly correct, as "Herr Christ" is mentioned as a receiver person (1 TP), and partly incorrect, as "i/Fa. Paravicini, Christ & Co." is not a receiver person (1 FP).
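To make the counts in this example concrete, the sketch below turns each field's TP, FP, and FN into precision, recall, and F1 using the standard definitions. The function name `precision_recall_f1` is hypothetical, and returning 0.0 when a denominator is zero is an assumed convention rather than something specified by the benchmark.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall and F1 from raw counts.

    Assumption: a score with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return precision, recall, f1


# (TP, FP, FN) per field, copied from the example letter.
example_counts = {
    "send_date": (1, 0, 0),
    "sender_persons": (0, 1, 2),
    "receiver_persons": (1, 1, 0),
}

for field, (tp, fp, fn) in example_counts.items():
    p, r, f1 = precision_recall_f1(tp, fp, fn)
    print(f"{field}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# send_date: precision=1.00 recall=1.00 f1=1.00
# sender_persons: precision=0.00 recall=0.00 f1=0.00
# receiver_persons: precision=0.50 recall=1.00 f1=0.67
```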
Scoring the collection
With scores for each letter in place, we can calculate the overall performance of an LLM on the collection. We calculate F1-micro and F1-macro.
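The difference between the two aggregates can be sketched as follows, using the usual definitions: F1-micro pools all TP, FP, and FN counts before computing a single F1, while F1-macro averages the F1 scores of the individual units. Which unit the benchmark averages over (letters or metadata categories) is not spelled out here, so the input below is an assumption: it reuses the three (TP, FP, FN) triples of the example letter. The function names are hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 from raw counts; assumed to be 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_micro(counts):
    """Pool all counts first, then compute a single F1."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return f1_score(tp, fp, fn)


def f1_macro(counts):
    """Compute F1 per unit, then take the unweighted mean."""
    scores = [f1_score(*c) for c in counts]
    return sum(scores) / len(scores) if scores else 0.0


# (TP, FP, FN) triples for send_date, sender_persons, receiver_persons
# of the example letter.
counts = [(1, 0, 0), (0, 1, 2), (1, 1, 0)]

print(f"F1-micro: {f1_micro(counts):.3f}")  # F1-micro: 0.500
print(f"F1-macro: {f1_macro(counts):.3f}")  # F1-macro: 0.556
```

Micro-averaging weights every individual prediction equally, so units with many persons dominate the result; macro-averaging weights every unit equally regardless of how many values it contains.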
Rule parameters
- `inferred_from_function`: If true, the person is inferred from their function and the date (e.g., a letter from the Basler Personenschifffahrtsgesellschaft in 1925 signed by "der Präsident" was penned by Max Vischer-von Planta).
- `inferred_from_correspondence`: If true, the person is inferred from the correspondence history (e.g., "referring to your letter from last week").
- `skip_signatures`: If true, letters with signatures are not scored.
- `skip_non_signatures`: If true, letters without signatures are not scored.
TODOs
Add more fields to the metadata schema, namely `sender_organization` (inscribed and normalized), `receiver_organization` (inscribed and normalized), and fields for entities mentioned (persons, places, organizations, ships; inscribed and normalized).