Series | Studi di archivistica, bibliografia, paleografia
Edited volume | Models of Data Extraction and Architecture in Relational Databases of Early Modern Private Political Archives
Chapter | Cracking the Historical Code
Cracking the Historical Code
From Unstructured Correspondence Corpora to Computational Analysis
- Agata Bloch - Tadeusz Manteuffel Institute of History of Polish Academy of Sciences, Poland
- Michał Bojanowski - Kozminski University, Poland
- Clodomir Santana - Tadeusz Manteuffel Institute of History of Polish Academy of Sciences
- Demival Vasques Filho - Luxembourg Centre for Contemporary and Digital History (C2DH), University of Luxembourg
Abstract
The chapter addresses a methodological approach to unstructured data and discusses the potential that structured data offers in the field of historical research. The dataset, which initially consists of textual content sourced from digital collections at the Portuguese Overseas Archives in Lisbon, undergoes a preprocessing phase that forms the basis for the extraction of structured data. The authors combine history, social sciences, and computer science to convert the correspondence repository into a machine‑processable form. This transformation is supported by an interdisciplinary strategy in which they weave together elements of effective content management, topic modelling, and social network analysis.
Submitted: 03 October 2023 | Accepted: 18 January 2024 | Published: 22 May 2025 | Language: en
Keywords Digital infrastructure • Colonial Portuguese Empire • Public correspondence • Structured data • Historical dataset
Copyright © 2025 Agata Bloch, Michał Bojanowski, Clodomir Santana, Demival Vasques Filho. This is an open-access work distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction is permitted, provided that the original author(s) and the copyright owner(s) are credited and that the original publication is cited, in accordance with accepted academic practice. The license allows for commercial use. No use, distribution or reproduction is permitted which does not comply with these terms.
Permalink http://doi.org/10.30687/978-88-6969-919-1/006
- Introduction
- Dorit Raines
- 22 May 2025
Perspectives: Historical Archives and Digital Humanities
- The Digital Historiographic Turn and the Historian’s Changing Toolkit: From ‘Facts’ and ‘Events’ to ‘Datasets’
- Dorit Raines
- 22 May 2025
- Is There a Reception of Algorithm‑Based Research in Traditional Historical Scholarship? Three Case Studies from Academic “Trading Zones”
- Thomas Wallnig
- 22 May 2025
- The Representation of Historical Uncertainties as the Outcome of Competing and Incompatible Certainties
- Fabio Vitali, Valentina Pasqual
- 22 May 2025
- Metapolis: Spatializing Histories Through Archival Sources
- Lukas Klic
- 22 May 2025
Experiences: Historical Archives, Database and Online Publication
- Including the Archival Context in the Historian’s Materials: The Advantages of Archival Standard Databases in Historical Research. VINCULUM Project Database and Information System Guide
- Maria de Lurdes Rosa
- 22 May 2025
- Cracking the Historical Code. From Unstructured Correspondence Corpora to Computational Analysis
- Agata Bloch, Michał Bojanowski, Clodomir Santana, Demival Vasques Filho
- 22 May 2025
- Methods and Tools of Quantification in Historical Research. Napoleonic Employment Applications as a Case Study
- Valentina Dal Cin
- 22 May 2025
- Gendered Data in Medieval and Early Modern Sources. The Gendered Networks and Digital Edgeworth Network Projects
- Máirín MacCarron
- 22 May 2025
- Extraction, Architecture and Recovery of Family Correspondence Data. The Platform “EpiCAT. Family Letters from Catalonia (Sixteenth‑Nineteenth Centuries)”
- Javier Antón Pelayo
- 22 May 2025
Challenges: Graziani Archive and Omeka S
- Historical Research and Archival Sciences in a Digital Perspective. Relational Database, Data Architecture and Data Extraction in Graziani Archives Portal
- Dorit Raines
- 22 May 2025
- Reconciling Complex Historical Records with Omeka S Relational Database. The Case of the Graziani Archive
- Gabriella Desideri
- 22 May 2025
- A Puzzle with Missing Pieces. Extracting, Deciphering, and Digitally Rearranging Data in Antonio Maria Graziani Private Archives
- Carlo Baja Guarienti
- 22 May 2025
- How to Digitally Reconstruct the History of an Early Modern Private Library? Antonio Maria Graziani (1537‑1611) and the Vicissitudes of His Books
- Luca Iori
- 22 May 2025
| DC Field | Value |
|---|---|
| dc.identifier | ECF_chapter_18897 |
| dc.contributor.author | Bloch Agata |
| dc.contributor.author | Bojanowski Michał |
| dc.contributor.author | Santana Clodomir |
| dc.contributor.author | Vasques Filho Demival |
| dc.title | Cracking the Historical Code. From Unstructured Correspondence Corpora to Computational Analysis |
| dc.type | Chapter |
| dc.language.iso | en |
| dc.description.abstract | The chapter addresses a methodological approach to unstructured data and discusses the potential that structured data offers in the field of historical research. The dataset, which initially consists of textual content sourced from digital collections at the Portuguese Overseas Archives in Lisbon, undergoes a preprocessing phase that forms the basis for the extraction of structured data. The authors combine history, social sciences, and computer science to convert the correspondence repository into a machine‑processable form. This transformation is supported by an interdisciplinary strategy in which they weave together elements of effective content management, topic modelling, and social network analysis. |
| dc.relation.ispartof | Studi di archivistica, bibliografia, paleografia |
| dc.publisher | Edizioni Ca’ Foscari - Venice University Press, Fondazione Università Ca’ Foscari |
| dc.issued | 2025-05-22 |
| dc.dateAccepted | 2024-01-18 |
| dc.dateSubmitted | 2023-10-03 |
| dc.identifier.uri | http://edizionicafoscari.it/it/edizioni4/libri/978-88-6969-919-1/cracking-the-historical-code/ |
| dc.identifier.doi | 10.30687/978-88-6969-919-1/006 |
| dc.identifier.issn | 2610-9875 |
| dc.identifier.eissn | 2610-9093 |
| dc.identifier.isbn | 978-88-6969-920-7 |
| dc.identifier.eisbn | 978-88-6969-919-1 |
| dc.rights | Creative Commons Attribution 4.0 International Public License |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ |
| item.fulltext | with fulltext |
| item.grantfulltext | open |
| dc.peer-review | yes |
| dc.subject | Colonial Portuguese Empire |
| dc.subject | Digital infrastructure |
| dc.subject | Historical dataset |
| dc.subject | Public correspondence |
| dc.subject | Structured data |