Remarks on the Datafication of History, University of Nottingham, February 4th, 2026
- Joseph Nockels
Last week, I had the opportunity to present to a group of historians, mostly PhDs, interested in leveraging Large Language Models (LLMs) for Arts and Humanities research at scale. This formed part of a workshop, organised by Liudmila Lyagushkina and Finn Cadell, encouraging LLM experimentation, with suitable guardrails and encouragement in place, as a means to critically adopt such technologies and ascertain whether AI automation is truly fit for purpose in historical research. Of course, conversations turned to replacement narratives, handling errors and whether AI approaches flatten cultural remediation. It was also encouraging to see that students had already dabbled in Transkribus (https://www.transkribus.org), or were, at least, enthused by seeing its applications.
Below are my short remarks, which attempted to drive home the socio-technical aspects of using AI-enabled processing on heritage collections, as well as highlighting the affordances and limitations of automatic transcription technologies. There are also references made (on the fly) to the introductory talk on the history of LLMs given by Federico Nanni, Senior Research Data Scientist at the Alan Turing Institute.
The attached slides were made available through the event's project GitHub, but can also be shared by contacting me at j.nocklels@sheffield.ac.uk.
Introduction
Hi everyone,
Thank you for inviting me, and for organising a wonderful event.
We're going to be adding a layer of difficulty to Federico's talk on LLMs, using data that is itself not machine-readable without further processing, and is often degraded and inherently complex. This takes us to the aligned field of Computer Vision, which we use to uncover the cultural meaning imbued in material heritage through handwriting.
As part of this, I've been asked to speak briefly about Automatic Text Recognition (ATR) on library collections: its processes, and some of the technology's benefits and limitations, drawing on my own research.
What is Automatic Text Recognition?
You may be familiar with Handwritten Text Recognition (HTR) or Optical Character Recognition (OCR). I refer to both of these technologies under the wider banner of Automatic Text Recognition (ATR): the AI-enabled method for converting images of manuscripts into computer-readable text (Pinche & Stokes, 2024). Our discussion covers both handwritten manuscripts and complex print.
ATR encompasses a broad set of algorithms, from more established Neural Network approaches to Generative Large Language Models, trained on vast amounts of textual data through 'self-attention' (O'Sullivan, 2025). As Palkovic (2023: 6) states, the introduction of AI processes and multi-layered architectures has streamlined digital transcription into a process of pre-processing, text detection and recognition, with tools regularly exceeding 95% character accuracy on handwritten text and over 99% on printed material.
ATR is a complex landscape: models also map onto particular business structures, and some ATR platforms, such as eScriptorium and Transkribus, support a range of models. More recently, Vision Language Models have shown great potential in processing manuscript images and text simultaneously, recognising spatial properties through bounding boxes (Merve & Beeching, 2024). This moves us from talking about word embeddings alone to visual embeddings as well.
Nonetheless, whatever the model architecture, ATR functionality relies on the high-quality digital capture of material, and this archival work remains vital. We should therefore keep in mind that we are talking about a 'suite' of tools or technologies, as opposed to one type of AI model or system.
General ATR Process
To contradict my previous point, the next few slides show a streamlined ATR process. Of course, tailoring a model to your material is preferable if you want greater accuracy.
ATR relies on document understanding: recognising the page's initial structure, for instance by drawing regions for stanzas, articles or titles. These regions are often marked up as XML and used as AI training data, providing a limited contextual representation of the image. This slide shows one such instance, on The Spiritualist Newspaper (1869), available through the National Library of Scotland's Data Foundry, where open collections data can be downloaded, reused and replayed (Reaching People, 2020: 6). For newspapers, column delineation at this stage of the process is essential to preserve reading order.
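For those curious what that region markup looks like in practice, here is a minimal sketch that reads region types, coordinates and reading order from a PAGE XML layout file, the format exported by tools such as Transkribus and eScriptorium. The filename is hypothetical and the namespace version varies between exports.

```python
import xml.etree.ElementTree as ET

# Minimal sketch: list region types, coordinates and reading order from a
# PAGE XML layout file. The filename is illustrative, and the namespace
# version differs between tools and export dates.
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

root = ET.parse("spiritualist_page.xml").getroot()  # hypothetical export

for region in root.iter("{%s}TextRegion" % NS["pc"]):
    coords = region.find("pc:Coords", NS)
    print(region.get("id"), region.get("type"), coords.get("points"))

# The ReadingOrder block, where present, lists region ids in their intended
# sequence - crucial for multi-column newspapers.
for ref in root.iter("{%s}RegionRefIndexed" % NS["pc"]):
    print(ref.get("index"), ref.get("regionRef"))
```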
Line segmentation then occurs, through a baseline model. These lines are then linked to corresponding lines in a text editor. This is where we see the importance of preserving the reading order.
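To show how those segmented lines correspond to editable text, the sketch below reads each TextLine's baseline geometry and its transcribed Unicode content from the same hypothetical PAGE XML file; in a tool like Transkribus, this is effectively the pairing behind the side-by-side image and text editor view.

```python
import xml.etree.ElementTree as ET

# Sketch only: each TextLine carries a Baseline (geometry on the image) and,
# once transcribed, a TextEquiv/Unicode element (the text seen in the editor).
# Filename and namespace version are illustrative, as before.
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}
root = ET.parse("spiritualist_page.xml").getroot()

for line in root.iter("{%s}TextLine" % NS["pc"]):
    baseline = line.find("pc:Baseline", NS)
    text = line.find("pc:TextEquiv/pc:Unicode", NS)
    print(
        line.get("id"),
        baseline.get("points") if baseline is not None else "no baseline",
        text.text if text is not None else "[not yet transcribed]",
    )
```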
Lastly, a text extraction model performs the actual transcription. Ideally, an ATR model would be trained on c. 15,000 words of Ground Truth (GT) data, although far less is needed in the case of regimented print, such as newspapers. Layout, not text extraction, is the main issue in recognising newspapers!
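To make the accuracy figures mentioned earlier concrete, transcription quality is usually reported as Character Error Rate (CER) against Ground Truth, with character accuracy simply being 1 minus the CER. Below is a minimal sketch of that calculation in plain Python; the two sample strings are invented for illustration.

```python
def character_error_rate(ground_truth: str, prediction: str) -> float:
    """Levenshtein edit distance between prediction and Ground Truth,
    divided by the Ground Truth length: the standard CER measure."""
    m, n = len(ground_truth), len(prediction)
    previous = list(range(n + 1))
    for i in range(1, m + 1):
        current = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if ground_truth[i - 1] == prediction[j - 1] else 1
            current[j] = min(previous[j] + 1,         # deletion
                             current[j - 1] + 1,      # insertion
                             previous[j - 1] + cost)  # substitution
        previous = current
    return previous[n] / max(m, 1)

# Invented strings for illustration; two wrong characters out of 26.
gt = "The Spiritualist Newspaper"
hyp = "The Spirituallst Newspapor"
print(f"CER: {character_error_rate(gt, hyp):.2%}")  # ~7.7%, i.e. ~92% accuracy
```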
Through this process, ATR holds immense benefit: building accurate datasets of historical material; cleaning existing transcriptions created with older forms of OCR technology, which were never really suited to handwriting or complex newspapers; and providing access to material across a greater range of languages. ATR also supports the accessibility of personal archival histories, previously contained in diaries and letters, as well as supporting text-to-speech software for visually impaired researchers.
This all sounds straightforward and worthwhile, no? As long as the training data is available and you have good quality online images, scanned by a library’s professional digitisation unit, you can simply follow the workflow I’ve outlined?
ATR Limitations
As you may have gathered, adopting a critical ATR approach is more complex, AND something that historians are well placed to navigate. Sustainably using ATR is not only about recognising text; it is about Humanities inquiry and dissemination. Historians are trained to research, analyse and interpret the past, extracting meaning and establishing patterns from evidence (Anderson, 2004: 81-82). They are trained to eavesdrop on the past, weighing up the available evidence from scraps and interpretation, with an intimate knowledge of archival structures.
Using ATR, then, moves beyond the simple notion of programming, and instead relies on understanding AI's impact on archives. It is a socio-technical process, accounting for information accessibility, power dynamics in historical narratives, and archival labour.
Therefore, I would like to outline here some outstanding ATR challenges that historians can help solve or, at least, grapple with:
After several notable failures, archives and libraries are rightly prioritising inclusivity and sensitivity in reframing their collections, for instance updating catalogue descriptions and further prioritising the 'souls in their stacks' (Drake, 2021: 8). ATR - the ability for the past to be made readable - holds great potential in making collections more accessible, but may also promote archival exclusions, dispossession, disinheritance and disembodiment, if we wield such technology to reinforce historical canons. ATR, therefore, teems 'with diverse political, legal, and cultural investments and controversies' (Thylstrup, 2019: 3), in a way that undermines positivist ideas of archival practice being neutral (Tschan, 2002: 176).
As I've suggested, ATR is dependent on digitisation (Terras, 2022: 193) and relies on archives' patchwork funding, workflows and priorities (Zaagsma, 2019). Does the pressure to adopt AI systems, combined with sectoral staffing cuts, increase the fraying of this patchwork support?
Although ATR improves the readability of handwriting and complex print, it consolidates an increasing scholarly reliance on databases and keywords as the primary way to access archival content. Does it therefore become increasingly hard to understand the material archive and the texts on which these models are based?
A Case Example - Ghostwriting: Recognising Scottish Spiritualist Newspapers

Slide 7 from Datafication of History Presentation, showing the Spiritualist Andrew Jackson Davis and Transkribus 'Ghostwriter' model.
So, to finish up with a historical illustration -
In autumn last year, I began a research project supported by the National Library of Scotland. It aimed to bring together the AI transcription we have discussed with explainable development principles: the effort to help non-technical library users understand how AI reaches its results (Van Wessel, 2020), asking how transparent model processes should be for libraries to utilise ATR at scale. In following this, I trialled ten ATR tools, ranging from open-source community-based models to commercial LLMs (Claude Sonnet 4.5, Gemini and GPT-4o), on The Spiritualist Newspaper (1869) you saw earlier.
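As a flavour of what the commercial LLM side of such a trial can look like, here is a minimal sketch that sends a page image to GPT-4o through the OpenAI Python SDK and asks for a column-aware transcription. The filename and prompt are illustrative rather than the project's actual protocol; Claude and Gemini have analogous image-input APIs.

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY environment variable is set

# Illustrative filename - any digitised newspaper page image would do.
with open("spiritualist_page.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Transcribe this newspaper page, preserving column reading order."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```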
It was during this research that I found the following example of why ATR requires a Humanities perspective, so I hope you'll indulge this pivot:
From a young age, Andrew Jackson Davis (1826-1910) was seen as a spiritual prophet; he later became a core figure in the rise of 19th-century Spiritualism: a belief typified by communicating with the deceased, namely through séances (Delp, 1967). Throughout his career, invited observers would test Davis with 'impossible' questions. One day, he was asked:
‘Do you perceive any plan by which to expedite the art of writing?’
Davis responded -
‘Yes; I am almost moved to invent an automatic psychographer — that is, an artificial soul-writer. It may be constructed something like a piano, one brace or scale of keys to represent the elementary sounds ...’
We recognise this ‘automatic writing’ as a typewriter, an invention credited some 40 years later. How did Davis know? The only justification given is that the spirits told him.
This story appears full of eccentrics, word-of-mouth testimonies and an unwillingness among Davis's observers to rationalise what they were hearing. However, too often our treatment of AI, as modern automatic writing, holds similarities with those observing Davis. We rely on opaque guidance, become susceptible to overhyping model performance, and have limited information with which to fully evaluate the responses we receive from such automated processes.
In her 2022 survey of users of Transkribus, the largest consumer-level ATR software, Melissa Terras (2022: 195) states that 'only 10% of respondents said that they fully understood the technology behind [A]TR'. ATR users are not always privy to the mechanics and weights of a model, due to unintelligible 'black box' algorithms, closed business structures, or a lack of easily reproducible workflows (Confalonieri et al., 2021). Without a familiar pattern of use, certain types of automation can present risks of misuse and harm depending on their context, scale and application: data protection issues, copyright infringement, and digital exclusion among them. We need to evaluate ATR processes with such issues in mind, not treat the technology as a given, despite its usefulness in making collections readable.
We should inquire further, just as observers should have asked Davis some follow-up questions ...
—
Confalonieri, R., Coba, L., Wagner, B. and Besold, T.R. (2021). A Historical Perspective of Explainable Artificial Intelligence. WIREs Data Mining and Knowledge Discovery. 11: 1-21, doi: 10.1002/widm.1391
Delp, R.B. (1967). Andrew Jackson Davis: Prophet of American Spiritualism. Journal of American History. 54(1): 43-56, doi: 10.2307/1900318
Pinche, A. and Stokes, P. (2024). Historical documents and Automatic Text Recognition: Introduction. Journal of Data Mining and Digital Humanities. 1-11, doi: 10.46298/jdmdh.13247
Terras, M. (2022). Inviting AI into the archives: the reception of handwritten recognition technology into historical manuscript transcription. In: Jaillant, L. (Ed.), Archives, Access and Artificial Intelligence: Working with Born-Digital and Digitized Archival Collections, Bielefeld University Press, Bielefeld, pp. 179-204.
Thylstrup, N.B. (2019). The Politics of Mass Digitization. MIT Press, Cambridge.
Tschan, R. (2002). A comparison of Jenkinson and Schellenberg on appraisal. The American Archivist. 65(2): 176-195, doi: 10.17723/aarc.65.2.920w65g3217706l1
Van Wessel, J.W. (2020). AI in Libraries: Seven Principles. National Library of the Netherlands. Available at: https://zenodo.org/records/3865344
Zaagsma, G. (2019). Digital history and the politics of digitization. Digital Scholarship in the Humanities. 38(2): 830-851, doi: 10.1093/llc/fqac050

