Library Labs, Data Foundry and Handwritten Text Recognition
- Joseph Nockels
“The following will be very cheeky. If you’re available … how do you fancy doing an impromptu lecture?”
It’s nice to be trusted by your former academic supervisors, especially when they invite you to contribute to their newly minted course. In this case, the Library Studies MA delivered by the University of Glasgow’s School of Information Studies, with a guest lecture on how the National Library of Scotland is supporting digital scholarship through open and transparent collections data, namely through its Data Foundry (https://data.nls.uk).
As it happens, I’d just attended a talk by Sheffield’s Careers and Employability Team, where the speaker introduced the idea of job crafting. The premise being that if you’re dissatisfied in your role - not in my case - you can make small additions, subtractions, or substitutions for now to regain a sense of goodwill and enjoyment. It was not presented as a long-term fix, but a way of feeling more adequately compensated in the short term. I didn’t have much to offer, beyond the usual desire for better work/life balance - this blog being part of that boundary-setting. Nonetheless, I did make note that I miss teaching in the classroom, so the invitation was opportune.
Before completing my PhD, I spent the allotted nine hours my funder allowed delivering course materials as a Graduate Teaching Assistant - ranging from teaching basic HTML to discussing the different types of misinformation. Now, as a research-only member of staff, most of my teaching takes the form of invited talks and training sessions. These mostly focus on building AI confidence among library staff, particularly in their use of digital transcription tools. It’s rewarding work and a welcome break from a desk-based day, especially within an organisation you care about and that aligns with your workplace values, such as the National Library of Scotland. Still, in delivering this training, it can feel as though you’re adding to an already solid foundation of expertise. Library practitioners are already highly skilled in information technologies, just not in the unfamiliar tool you’re helping them trial. Their roles are often established, despite staff precarity in the sector, and they arrive well-prepared with thoughtful questions. All of this is welcome, of course, but it creates a different dynamic to university teaching. Students in library studies, while often possessing cultural heritage experience and strong digital skills, come with a different set of expectations and unknowns.
Yet these groups - practitioners and students - also hold similarities as audiences. From established librarians to MA candidates aspiring to similar careers, they are intrinsically motivated by a common interest in history, memory, and the role of digital technologies in promoting accessible learning. This was clear when listening to students reflect on their recent lab work, in which they co-trained a Handwritten Text Recognition (HTR) model to transcribe the diaries of Andrew McGeorge (1774–1857), an esteemed Glasgow-based lawyer and alderman whose papers are held in the University of Glasgow’s Special Collections. In encouraging students to develop both a technical understanding of HTR and a broader awareness of the ethical considerations surrounding its integration into library workflows, the course followed Beetham and Sharpe’s (2013: 52) definition of education as a set of personal and interpersonal activities rooted in social and cultural contexts. The context, here, was the situated library lab; the social process, the students’ co-created transcriptions.
What follows is my guest lecture on library labs, the National Library of Scotland’s Data Foundry and HTR outputs, delivered ahead of an afternoon lab exploring the affordances and limitations of presenting HTR transcriptions online.
Slides are available via Zenodo at: 10.5281/zenodo.17702003
Introduction (Slide 1)
Hi everyone,
Thanks for having me. I’ll do my best to offer a sense of the National Library of Scotland’s priorities in supporting digital scholarship, how it works in practice and its points of intersection with Handwritten Text Recognition (HTR), which I know you’ve been using and critiquing with library workflows in mind.
Background (Slide 2)
Just as a brief background to how I’ll approach these subjects, which will differ slightly from the approach of Sarah [Ames], the Digital Scholarship Librarian, due to our different roles - I currently work as a Research Associate at the Digital Humanities Institute, based at the University of Sheffield. I lead the research theme: Digital Representations of Cultural Artefacts, which sets out to advance the state-of-the-art in the digital capture, interpretation and representation of physical culture. A large part of this research uses AI-enabled Handwritten Text Recognition, or its catch-all term Automatic Text Recognition (ATR) (Pinche & Stokes, 2024), to convert manuscript images into machine-processable text for further analysis. Previous projects, using such HTR workflows, have focused on the abolitionist Frederick Douglass and his use of religious language in his personal papers, and on making Scottish children’s history more accessible, which we’ll briefly look at later on as part of the National Library of Scotland (NLS) Data Foundry.
Across this work, I aim to approach HTR - and broader AI systems - through a socio-technical lens. This means looking beyond whether such tools work as intended towards how they change our encounters with historical information, expectations of technology and, subsequently, how libraries are experimenting with, understanding and implementing them. *We’ll explore some of these threads today, focusing on the long tail of human operators behind technologies like HTR, resourcing considerations and how such technologies fit into wider library networks, priorities and workflows.
I first worked with the NLS as a PhD student, interviewing curators and digital staff about their response to HTR, after it became a more accessible, intuitive and accurate technology, as well as trialling approaches to embedding the technology within the library’s digital collection systems. In furthering this work as the 2025/2026 NLS Digital Fellow, I’m looking to interpret HTR beyond technical scores - namely Character Error Rates (CERs) - in order to analyse where there is room for providers to be more transparent about their processes. *This links to your lab later on, contextualising HTR-generated transcriptions through additional website information.
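For anyone who hasn’t met a CER before, here is a minimal sketch of how such a score is commonly calculated: the number of single-character edits needed to turn the machine transcription into the corrected one, divided by the length of the corrected text. This is an illustration only and not how Transkribus or any particular provider implements its scoring; the function names and the two sample strings are hypothetical.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Count the fewest single-character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # character missing from hypothesis
                current[j - 1] + 1,                   # extra character in hypothesis
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference transcription."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: a corrected line and an imagined HTR output for it.
corrected = "the diary"
machine_output = "tho diarv"
print(f"CER: {character_error_rate(corrected, machine_output):.1%}")
```

A single figure like this says nothing about which characters were misread, how the corrected text was produced or which pages were sampled, which is part of why the score on its own offers users limited transparency.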
ILOs (Slide 3)
So, having contextualised ‘Who am I?’ and ‘What am I doing here?’, here are this morning’s Intended Learning Outcomes (ILOs):
LO1 - Understand the role of GLAM labs in supporting digital scholarship, networks and technological development.
LO2 - Define predominant Handwritten Text Recognition (HTR) business models and the place of GLAM labs as users of technology.
LO3 - Discuss the National Library of Scotland’s priorities for digital scholarship and where this work sits within the organisation.
LO4 - Explore and critique digital repositories such as the NLS Data Foundry.
To make this all a bit clearer, we’re going to pin each of these ILOs to a different ‘... as infrastructure’ label.
The foundations needed to support innovative digital scholarship - ‘infrastructure as infrastructure’ (the nuts and bolts, resources and access to technology)
‘People as infrastructure’ (the people behind developing technology, networks supporting their use and those operating them)
‘Learning as infrastructure’ (how to better understand HTR and other AI-enabled tools, the digital skills necessary and efforts to broaden such technical fluency).
‘Datasets as infrastructure’ (which relates to the Data Foundry - researchers need data to work on after all, if they are to understand information derived from historical materials).
Contents (Slide 4)
Part One - We’ll first address what GLAM labs are in practice and the motivations behind their supporting digital scholarship. This will lead us to situating libraries’ place in broader critical AI advocacy and research networks, ending with the Recognition and Enrichment of Archival Documents cooperative (READ-COOP), which supports Transkribus.
Part Two - We’ll then narrow in on the NLS and where digital scholarship sits within the organisation, a library - like others - often associated with analogue collections, physical exhibitions, and bricks and mortar reading rooms, instead of experimental digital services. This section gives a sense of potential research opportunities open to those with a knowledge of libraries and their digital services, with the NLS very much open to collaboration. Lastly, we’ll narrow in on the Data Foundry - the NLS’s online environment for open and transparent datasets, supported by the NLS’s Digital Scholarship Service. We’ll explore its merits, potential drawbacks and some alternative approaches taken by other institutions.
Part One - GLAM Labs, What Are They? Why Do They Matter? How Does this Relate to HTR? (Slide 5)
Defining GLAM Labs (Slide 6)
So, first - what are GLAM labs?
Phetteplace, Brooks and Heller (2013: 1) describe such efforts, specifically in the context of libraries, as experimental services developed in collaboration with users. They also offer a more specific definition, seeing library labs as:
“Any library program, physical or digital (hybrid) in which innovative approaches to library services, tools, or materials are tested in a structured way before being made part of a regular workflow, program or mission.”
Others, such as Nowviskie (2013: 53), emphasise the physical aspect of such labs - as bricks and mortar “skunkworks”, “semi-independent, research-oriented software prototyping and makerspace labs”. The NLS - as we shall see - is a more digital operation, with colleagues working hybrid across locations.
In any case, GLAM labs rely on a culture of innovation that seeks to challenge long-held assumptions about library processes, priorities, remits and protocols, leading to - sometimes crude - comparisons with tech startups. However, a core component is that they are grounded in the values and priorities of the sector - findability, accessibility, interoperability, reusability (https://www.go-fair.org/fair-principles/), as well as priorities of transparency, trust, representation and community (Gooding, 2023). If we think of GLAM labs as reliant on such cultural values, we might say that “… even small libraries whose enthusiasm for new technology may outweigh their resources can adopt a library lab concept” (Phetteplace, Brooks and Heller, 2013: 1). However, we’ll see that there are obstacles to this, mainly resourcing and securing buy-in from library decision-makers and management.
Why Library Labs Matter (Slide 7)
Libraries face numerous socio-technical challenges in their daily operations, which makes library labs all the more important to our research environment, innovation and accountability:
Karen Coyle (2017) has convincingly argued that libraries have traditionally been innovators of organisational technologies - such as early Optical Character Recognition (OCR) and QR book scanning, both computer vision tools which act as precursors to modern HTR systems. But, in the timeline of information technology over the twentieth century and into the twenty-first, Coyle (2017) suggests library technology has fallen behind the general evolution of technology.
Why is this a problem? Can’t libraries simply use the technology that’s out there?
Sometimes. However, in many cases, what’s available on the market is a narrow band of tools and algorithms, especially if we focus on the AI sector - a problem that the Turing Institute (2025) (https://www.turing.ac.uk/news/publications/doing-ai-differently) calls AI ‘homogenisation’. We can see this with the ubiquity of chat-based AI tools, with similar interfaces and a general reliance on crawled web data to answer queries. This centralisation of technological innovation away from cultural heritage has a wider impact on the power dynamics of AI adoption, especially for libraries that have complex collections and limited resources. Library labs attempt to stay in touch with the cutting edge, tweaking such tools for their specific needs and advocating to providers.
Library labs also form a major part of building technical confidence within institutions, through regular training, projects and discussions. The need for greater confidence and technical fluency was noted as early as 2012 by Hadro, who travelled around American libraries to understand the state of the field regarding library labs. He noticed that none of the libraries he visited thought they were being innovative, because they could always point to someone who was doing something bigger and better (Hadro, in OCLC, 2012). Instead, their worth often comes in building local communities of practice willing to experiment with tools in smaller, controlled ways.
We’ll get onto how libraries are now communicating their lab work to each other, in the next slide.
As much as you can adopt a culture of innovation, libraries are under-resourced and often over-burdened, especially in our current environment - meaning they have to ‘Do More with Less’ or adopt ‘More Product, Less Process’ approaches to processing collections (Greene & Meissner, 2005; Hujda et al., 2018). However, in this quest for efficiency - blending with Problem 1 - libraries are particularly susceptible to adopting one-size-fits-all approaches that appear more efficient but can easily become unworkable (Shah, 2024). Library labs, instead, research critical alternatives that truly work for their collections, processes and users - here, we get to our ‘infrastructure as infrastructure’ label.
Library Labs and Global Networks (Slide 8)
In looking at library labs, we can see that the sector is not a passive consumer of technology. Instead, in meeting the challenges outlined, libraries and digital scholars come together, both nationally and internationally, to benefit from developing collaborative technology and critically discussing tools - like GenAI - that are causing mixed feelings across the sector (Gasparini & Kautonen, 2022: 3). Library labs are therefore a major part of this reassertion of intellectual leadership over technology development and research.
Here, we get to our ‘people as infrastructure’ label - thinking of the key individuals, networks and communities that underpin such discussions and development.
Some quick examples -
Collections as Data Initiative (2017 - ): Foundational project emphasising the need to convert collections into shareable information beyond physical holdings with limited access, with specific enough information to meet a variety of users’ research goals (Lincoln 2017: 30). With AI-enabled technologies presenting new benefits for such approaches, Collections as Data is even more core to the NLS Digital Scholarship Service.
AI4LAM (Artificial Intelligence for Libraries, Archives & Museums) (2019 - ): Represents a sector-led ‘international participatory community focused on advancing the use of AI in, for and by libraries, archives and museums’. In addition to an annual conference, the AI4LAM community organise a series of events, provide news on AI in libraries and maintain a registry of projects, activities, AI datasets and models.
AEOLIAN Network (2021 - 2023, outputs in 2025): designed to investigate the role of AI in making born-digital and digitised cultural records more accessible to users, through carefully-structured workshops with library professionals, among other GLAM institutions, and the creation of an international network of researchers and practitioners.
Digital Scholarship Guides, European Association of Research Libraries (LIBER) (2025 - ): This work offers open and collaboratively curated training resources for library professionals, including guides on AI / Machine Learning for Libraries (with HTR examples), led by Nora McGregor, a Digital Curator at the British Library, and a wider Digital Scholarship working group.
And, finally, our last example of how library labs relate to broader research and development networks, which brings us onto HTR and digital transcription - the READ-COOP.
The READ-COOP, Library Labs as HTR Users (Slide 9)
In turning to how library labs support Handwritten Text Recognition (HTR), it is important to look beyond the technology and whether it works as planned, toward the wider structures underpinning its use and where they come from (Robinson, 2022: 27). In the case of HTR and Transkribus, this wider structure is the READ-COOP, which develops and hosts its AI-enabled transcription models and tools. The READ-COOP is the first - and currently only - AI cooperative of its kind and highlights the value in encouraging stakeholder input - which library labs contribute to - when creating AI systems that are innovative and accountable (Terras et al., 2025). Mühlberger (2019), the founder of Transkribus, in describing the benefits of the READ-COOP and its European Cooperative Society Model (SCE) stated:
“It is democratically organised … open to new members, who in turn become co-owners through the acquisition of shares. At the same time, an SCE offers the opportunity to do business and thus secure the future and further development of the Transkribus platform.”
As of October 2024, READ-COOP had 227 Members from across 30 countries, approx. 65% being institutions (Nockels, 2025), with 235,000 registered user accounts and having processed over 90 million digital images of historical texts (Terras et al., 2025).
How does this differ to other business models in AI research and development?
Well, the European Cooperative Society (SCE) model means READ-COOP is an independent ‘not for profit’ legal entity with the objective of sharing common goals and earnings, with profits directly reinvested into its service. For library labs, this is important as it ‘socialises’ the AI product, pooling together institutions and technical expertise. It differs heavily from extractive commercial entities, driven by profit and growth via ‘shareholder-oriented capitalism’, and restores some intellectual leadership of AI development for libraries and their users (Cheffins, 2021: 1607). Again, we can attach this to the ‘people as infrastructure’ label we’ve been using.
(Slide 10)
Alongside providing access to decision-making with AI developers, access to a broader network of institutions as ‘people as infrastructure’ and sustainable business practice, the READ-COOP also supports the Transkribus Scholarship Programme (Nockels et al., 2024). This enables students and early-career researchers to access free processing (up to 3,000 credits - 3,000 pages of handwritten text), as well as an opportunity to present as part of the Transkribus User Conference in Innsbruck, Austria.
Part Two - Digital Scholarship at the National Library of Scotland (Slide 11)
So, we’ve established what library labs are, their motivations and associated networks - especially around HTR technology. This second portion focuses more locally on the NLS and its Digital Scholarship Service, connecting this with the broader - global - structures we’ve mentioned.
(Slide 12)
The NLS manages around 31 million items in its collection (excluding born-digital and web archives), with its wider strategies - such as the Reaching People Strategy (2020 - 2025) - emphasising the need to provide ‘outstanding digital engagement’ (pg. 13) and ‘to explore opportunities to reach people outside Scotland’s central belt’ (pg. 2). The publication and continued accessibility of digital collections, whether digitised (captured from analogue material) or digitalised (interacted with in innovative ways online), are seen as key ways to achieve these strategic aims. *We’ll get onto reasons why it often isn’t that simple.
First of all, however - what are the main features of the NLS’s role in supporting digital scholarship?
The Data Foundry - the library manages approximately a petabyte of data, across a range of material (one third map content, a third moving image and another third still image digitised material) (Hibberd, 2023). These materials are managed through an internal database and two primary Catalogue Management Systems (CMSs), whereby information can be published and made available to the public. The Data Foundry, however, goes beyond simply publishing catalogues, instead forming a destination where NLS data can be downloaded, reused and replayed (Reaching People, 2020: 6). The Data Foundry makes metadata, organisational information, as well as map and spatial data, accessible for digital scholarship in line with the ‘collections as data’ principles we mentioned. It is hoped that through this process users will be able to ‘melt down’ data from collections and ‘weld’ them back together to create new outputs (Ames, 2023: 8). *We’ve got some workers doing just that on the Forth Bridge in this slide, created using AI video enhancement software on the copyright-free Data Foundry header, just to emphasise that welding / reuse side to the service - though maybe leading to a level of potential misinformation?
Digital Fellowships, with work being conducted on image recognition for 19th century chapbooks, visualising web archive data and providing clear datasheets for digital collections to make users aware of exactly how collections have been processed, changed and their data quality.
PhD researchers - the NLS undertakes a lot of research partnerships, mainly through the Scottish Graduate School, to provide collection access and resources to PhDs and offer research-in-practice placements. Recent projects have included: looking at how AI can better regulate climate stores for photographs and older material, text and data mining materials related to histories of enslavement and attempting to chart historic deforestation through Ordnance Survey maps. Here, we see the ‘learning as infrastructure’ component to digital scholarship, whereby the NLS can benefit from innovation beyond its in-house capabilities and students gain practical experience within a cultural heritage institution. *Certainly, something to think about - post MA …
Community Needs Analysis, with Lucy Dalgleish (supported by the University of Edinburgh’s Centre for Data, Culture and Society) focusing on how NLS staff are responding to AI approaches, potential uncertainties and remaining skill-gaps. This work was presented as part of a wider AI Symposium in 2023, which gathered commercial / cultural heritage / academic partners.
Artists-in-Residence, with Marion Carré exploring how AI may open up new ways of interacting with library and archival collections, and what the risks are. Are future libraries going to archive fact or truth?
In terms of HTR datasets, the Data Foundry contains a manually corrected transcription of the diaries of Marjory Fleming (1803-1811), a Scottish child author who became posthumously famous for her free-thinking and precocious diaries - a lot of passages about how much Fleming reviled Queen Elizabeth! In linking HTR to digital scholarship and further engagement, these diaries are a key source for those seeking a situational history of early 19th century Scotland and have been used as teaching aids in Scottish public schools, in Transkribus ‘new-feature’ demonstrations and in scholarly publications.
Data Foundry Contentions (Slide 13)
The Digital Scholarship Service, through the Data Foundry, attempts to establish a library culture where digital publication is ‘business-as-usual’ practice, as well as an outlook that anticipates future research. This relies on clarifying user needs and how comfortable NLS staff and public stakeholders are with digital outputs and methods, like HTR. Not an easy thing - some researchers may expect a certain accuracy level from open datasets, while others may only want high-level information for experimentation or to respond to more generally. Some staff may be reticent about the tools trialled by library labs, tied to ‘replacement’ AI narratives, while others expect a great deal from these systems, veering near ‘technology as a solution’. As such, the NLS’s Digital Scholarship is conducted against the backdrop of the wider organisation, its workflows, policies, resourcing and expectations. This requires a general level of technical fluency and a clear way to articulate the benefits to senior management and non-technical colleagues (Cox, 2021) - again, both ‘people and learning as infrastructure’.
Publishing Data Foundry datasets is a manual process, but with a systems team emerging, there are opportunities to streamline this delivery (Ames, 2023: 4). Often, curators select a collection of interest and, if not already digitised, a stack visit occurs with rights/license and conservation colleagues -> digitisation, with derivatives and metadata files -> a DOI given, after file organisation and compilation -> finally published. A whole library effort and potentially more considered / deliberate?
This process is also dependent on the material itself - degraded manuscript images, for instance, will have limited applicability to some digital methods, such as HTR. Also, for spatial data from maps - additional processing is needed to establish coordinate data via geo-referencing. NLS scanners are set up to deal with certain bindings, dimensions and quality of documentation.
Where does all this work sit within the wider NLS? (Slide 14)
In 2023, Byrne - then Digital Transition Manager - described digital scholarship as occurring throughout the NLS, though it remained unclear where such work precisely sat in the library’s digital infrastructure: “Can digital scholarship serve our web archive? I do not know. But strictly speaking, they’re related, and they have an interdependency in both directions.” Byrne also suggested that - “where we have a weakness is that [digital scholarship] should sit more closely with outreach”. This brings us back to HTR, a technology that can produce easily accessible plain text for digital scholarship at scale, with clear outreach potential.
More recently, however, the NLS has pivoted - emphasising this need for coupling digital scholarship and outreach. Rather than building one-size-fits-all approaches, it is focusing on more datasets. Some, like Daniel van Strien (2025), Machine Learning Librarian at Hugging Face, call this approach ‘Datasets as Infrastructure’: first, get the data out there -> second, build the interfaces and wider structures. This is slowly emerging as a way to maximise the benefit of NLS data for digital scholarship, including data created through processes like HTR.
Finally, just to get you thinking about the Data Foundry more broadly, before the lab - this ‘Datasets as Infrastructure’ model is not the only one available to libraries in their supporting digital scholarship.
National libraries often disagree on how best to serve users of digital collections, with the NLS making raw / practical datasets open in a variety of formats - expecting researchers to do the researching part; while the National Library of Hungary not only provide data but also undertake some of the digital scholarship themselves, with visual graphs of key terms and far more contextualisation of collections (Szentkereszti, 2025, personal communication). The National Library of Norway have heavily resourced work into AI research and development - relying much less on external expertise (de la Rosa, 2025), while the National Library of Finland and British Library have specific curators for HTR processing and research: again we return to different levels of resources, motivations and priorities. Nonetheless, we only need to look at HTR workflows to see that the world of library labs is not consistent! The NLS adopts one approach, but in supporting digital scholarship - must be adaptable to change, new methods / technologies / expectations and institutional priorities.
References -
Ames, S. (2023). Digital Scholarship and Reaching People. National Library of Scotland. Internal Documentation.
Beetham, H., Sharpe, R. (2013). Introduction. In Helen Beetham, Rhona Sharpe (eds.) Rethinking Pedagogy for a Digital Age. 2nd Edition. London: Routledge, pp. 1-12.
Cheffins, B.R. (2021). Stop blaming Milton Friedman! Wash U L Rev. 98(6): 1607–1644.
Cox, A. (2021). The impact of AI, machine learning, automation and robotics on the information profession. CILIP. 1-56. https://www.cilip.org.uk/page/researchreport.
Coyle, K. (2017). Creating the Catalog, Before and After FRBR. Paper presented at the Encuentro de Catalogacion y Metadatos, Universidad Nacional Autonoma de Mexico, 12 September. http://kcoyle.net/mexico.html.
Dalgleish, L. (2022). Artificial Intelligence, cultural heritage and the National Library of Scotland. https://data.nls.uk/projects/artificial-intelligence-report/.
de la Rosa, J. (2025). Machine learning at the National Library of Norway. In: Lise Jaillant, Claire Warwick, Paul Gooding, Katherine Aske, Glen Layne-Worthey and J. Stephen Downie (eds.), Navigating Artificial Intelligence for Cultural Heritage Organisations. UCL Press, pp. 61-92. doi: 10.14324/111.9781800088375
Gasparini, A., Kautonen, H. (2022). Understanding Artificial Intelligence in Research Libraries – Extensive Literature Review. LIBER Quarterly. 32(1): 1-36. doi: 10.53377/lq.10934
Greene, M.A., Meissner, D. (2005). More Product, Less Process: Revamping Traditional Archival Processing. American Archivist. 68(2): 208–263. doi: 10.17723/aarc.68.2.c741823776k65863.
Gooding, P. (2023). Recording. Collaboration, Transparency, and Technology: AI as a community challenge for Libraries. In: NLS AI Symposium, Edinburgh, 25 April. https://www.youtube.com/watch?v=l98O8GXLa-Q.
Hibberd, L. (2023) Interview by Joseph Nockels [Microsoft Teams], 28 April.
Hujda, K., Marineau, C., Wick, A. (2016). Maximum Product, Even Less Process: Increasing Efficiencies in Archival Processing Using ArchivesSpace. Journal of Archival Organization, 13(3–4), 100–113. doi: 10.1080/15332748.2018.1443549
Lincoln, M. (2017). Ways of forgetting: the librarian, the historian, and the machine. In: Thomas Padilla, Laurie Allen, Hannah Frost, Sarah Potvin, Elizabeth Russey Roke, Stewart Varner (eds.). Always already computational: library collections as data. Institute of Memory and Library Services, National Forum Positional Statements, pp. 20–30. https://collectionsasdata.github.io/part2whole/resources/.
National Library of Scotland (2020) Reaching People: Library Strategy 2020-2025. https://www.nls.uk/media/43mla4h3/2020-2025-library-strategy.pdf.
Nockels, J., Gooding, P., Terras, M. (2025). Are Digital Humanities platforms facilitating sufficient diversity in research? A study of the Transkribus Scholarship Programme, Digital Scholarship in the Humanities, Volume 40, Issue Supplement_1, January 2025, 46–65, doi: 10.1093/llc/fqae018.
Nowviskie, B. (2013). Skunks in the Library: A Path to Production for Scholarly R&D. Journal of Library Administration. 53:53–66. doi: 10.1080/01930826.2013.756698.
Hadro in OCLC. (2012). Made in a Library: OCLC/LJ Online Symposium, WebEX recording, 1:58:00, May 15, 2012, www.oclc.org/innovation/archive/default.html.
Phetteplace, E., Brooks, M., Heller, M. (2013). Library labs. RUSQ: A Journal of Reference and User Experience. 52.3: 186-190.
Robinson, D. (2022). Voices in the Code: A Story about People, Their Values, and the Algorithm They Made. New York: Russell Sage Foundation.
Pinche, A., Stokes, P. (2024). Historical documents and automatic text recognition: Introduction. Journal of Data Mining and Digital Humanities, 1–11. doi: 10.46298/jdmdh.13247.
Shah, A. (2023). Interview by Joseph Nockels [Microsoft Teams]. 29 June.
Terras, M., et al. (2025). The artificial intelligence cooperative: READ-COOP, Transkribus, and the benefits of shared community infrastructure for automated text recognition [version 2; peer review: 1 approved, 1 not approved]. Open Res Europe, 5:16. doi: 10.12688/openreseurope.18747.2