Victorian Prison Registers: HTR Pilot Project (c. 19th Century)
With Emma Kelso (Independent Researcher)

Challenge

Victorian prison registers are densely structured manuscript records with hand-drawn tables and irregular handwriting. Sparse content, skewed baselines, and complex formatting made both layout recognition and accurate transcription difficult.

Solution

Osiris-AI:

  • Prepared and processed 500 images of Victorian prison registers
  • Collaboratively built a training dataset
  • Developed a tailored pilot handwritten text recognition (HTR) model through iterative retraining, achieving around 85% accuracy
  • Cropped images to focus on table content, improving segmentation performance
  • Created a custom postprocessing pipeline to cluster rows and columns
  • Extracted structured fields (e.g., names, ages, birthplaces, offenses) using regex-driven parsing
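
The last two steps, row clustering and regex-driven field extraction, might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual pipeline: the `Word` record, the `y_tolerance` value, the column order (name, age, birthplace, offense), and the single-word birthplace are all hypothetical choices made to keep the example short.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    """One recognised word with its position on the page (hypothetical format)."""
    text: str
    x: float  # left edge, in pixels
    y: float  # baseline height, in pixels


def cluster_rows(words, y_tolerance=12.0):
    """Group words into table rows by vertical position.

    Words are visited top to bottom. A word joins the current row if its
    baseline lies within y_tolerance of that row's running mean baseline;
    otherwise it starts a new row. Each row is returned as one string with
    its words ordered left to right.
    """
    rows = []
    for word in sorted(words, key=lambda w: w.y):
        if rows and abs(word.y - rows[-1]["mean_y"]) <= y_tolerance:
            row = rows[-1]
            row["words"].append(word)
            row["mean_y"] = sum(w.y for w in row["words"]) / len(row["words"])
        else:
            rows.append({"mean_y": word.y, "words": [word]})
    return [
        " ".join(w.text for w in sorted(row["words"], key=lambda w: w.x))
        for row in rows
    ]


# Assumed row layout: name, then a one- or two-digit age, then a
# single-word birthplace, then the offense as the rest of the line.
ROW_PATTERN = re.compile(
    r"^(?P<name>\D+?)\s+(?P<age>\d{1,2})\s+(?P<birthplace>\S+)\s+(?P<offense>.+)$"
)


def parse_row(row_text):
    """Return a dict of fields, or None if the row does not fit the pattern."""
    match = ROW_PATTERN.match(row_text)
    if match is None:
        return None
    record = match.groupdict()
    record["age"] = int(record["age"])
    return record


def extract_records(words, y_tolerance=12.0):
    """Cluster words into rows, then keep the rows that parse cleanly."""
    parsed = (parse_row(text) for text in cluster_rows(words, y_tolerance))
    return [record for record in parsed if record is not None]
```

The resulting list of dictionaries can be written out with `csv.DictWriter` from the standard library. Rows that fail to match are dropped here for brevity; in practice it may be more useful to set them aside for manual review, since a skewed baseline or a misread digit is enough to break the pattern.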

Impact

The project produced a working HTR model for prison records and an adaptable pipeline for turning raw handwriting into structured CSV datasets. The tools and methods now form a robust foundation for scaling the transcription of 19th-century penal records, supporting both research and digital archive efforts.