The FLOW Project: A Modular Workflow for Automatic Text Recognition and Beyond

Historical research often involves working with highly diverse and complex source materials, ranging from handwritten manuscripts to large, heterogeneous document collections. Machine learning methods are increasingly shaping how historians work with digitised sources, particularly through Automatic Text Recognition (ATR). In this talk, Jonas Widmer and Dana Meyer will introduce The FLOW, a modular, microservice-based framework designed to support machine learning–driven data management and processing in the Digital Humanities.

The talk will outline how The FLOW breaks complex ATR workflows into independent, reusable components, such as pre-processing, model training, inference, and evaluation, that can be combined flexibly and used without programming experience. Using state-of-the-art transformer-based models, the project aims to make advanced text recognition workflows more transparent, reproducible, and scalable across diverse historical datasets.

Jonas and Dana will walk through a typical FLOW workflow, showing how datasets are managed on the Hugging Face platform and then processed step by step. The focus will be on how such workflows can support everyday research practices when working with large and heterogeneous historical corpora.

Speaker Biographies

Jonas Widmer is a Research Software Engineer specialising in Digital Humanities at the University of Bern. In this role, he assists in planning and developing projects focused on Natural Language Processing. His primary interest lies in Handwritten Text Recognition (HTR), where he engages with historical projects and their diverse sources.

Dana Meyer is a Master’s student in Intelligent Interactive Systems at Bielefeld University and works as a research assistant on The FLOW project in the university's Digital History group.

 

Bodleian Bytes

Bodleian Bytes is a series of online talks hosted by the Centre for Digital Scholarship at the Bodleian Libraries. The series engages with innovative national and international research in digital scholarship. It is a virtual space for discussions surrounding different tools and methodologies whilst also providing inspiration for future digital research.

 

Event Details and Registration

Registration is required for this free online event. Registration closes at 17:00 on Monday 16 February 2026.

Date and time: Wednesday 18 February 2026, 15:00-16:00 (UK time)

Location: Online via Zoom.

For further information, please email the Centre for Digital Scholarship: cds@bodleian.ox.ac.uk.

Register for the event 'The FLOW Project: A Modular Workflow for Automatic Text Recognition and Beyond'

Centre for Digital Scholarship

The Centre for Digital Scholarship (CDS) at the Bodleian Libraries is a space for engaging in, leading and shaping discussions around digital scholarship practice and research within and beyond the University of Oxford.