Unlocking archives through intelligent automation.
From image to insight: Osiris-AI transforms complex historical documents into structured, searchable data.
Osiris-AI develops advanced tools that transform complex archival and historical materials into structured, searchable data. Our technology automates processes that once required extensive manual work, from page segmentation and handwriting recognition to data extraction from complex layouts, enabling faster, more accurate research and digital preservation.
Who we are.
Oliver Buxton Dunn
Head of Research and Development
Alexis Litvine
Head of company operations
Yiannos Stathopoulos
Co-founder and tech advisor
Ruth Murphy
Ruth holds a PhD in Italian from the University of Cambridge and is currently a postdoctoral researcher at the University of Sheffield. As a talented linguist, she oversees modern transcription work for Osiris.
Chloe Ashley
Chloe Ashley helps with company administration and client care. She also lends a hand with the more delicate transcription tasks and data QA. She holds a degree in English Literature and Hispanic Studies from Queen Mary, University of London.
Stan Hinton
Stan graduated in history from Cambridge after a successful early career in the tech industry. He specialises in UI components and annotation schemes for AI.
Charlie Cook
Charlie is an undergraduate mathematics student at the University of Aberdeen with an enthusiasm for coding and AI. Charlie is involved in model training and fine-tuning, as well as script construction and refinement.
Sam Hoyle
Sam is now an ESRC-funded postgraduate student at the University of Durham, specialising in maritime violence in the Indian Ocean. He has helped us with transcription, segmentation, and image processing since 2023.
Our services.
Osiris-AI turns complex historical records into trusted, structured data using tried-and-tested methods developed in collaboration with leading scholars since 2020.
Advanced image preparation
Image enhancement, deskewing, dewarping, page splitting, and CV/AI optimisations.
Handwriting & print text recognition
Large base models and bespoke HTR/OCR pipelines adapted to historical scripts and document types.
AI output validation & quality assurance
Detect hallucinations, missing content, misrecognition, and numeric errors with audit-ready logs.
Layout & structure analysis
Custom page segmentation and layout parsing for reliable structured data extraction.
Multi-model language processing
Carefully constrained multimodal LLM workflows for entity recognition and relationship extraction.
Crowdsourced ground truth platforms
Web tools for expert/public validation, correction, and training data creation.
Historical GIS & spatial analysis
Link records to place, movement, and spatial context for mapping and analytics.
Consultancy & deployment
Workflow design, grant support, on-site deployment, and ML infrastructure planning.
Research-ready exports
Outputs delivered in JSON/CSV/XML/PDF, with structure retained and QA trails included.
Ethical and transparent by design
- No subscriptions or hidden fees. Project-based pricing for the work you need.
- Your data stays private. Processing on secure, isolated servers (yours or ours), not third-party clouds.
- Guaranteed deletion on completion. Files and models can be permanently deleted after delivery and QA.
- Transparent error tracking. Outputs are auditable, correctable, and fit for integration.
Accuracy you can measure and trace.
General-purpose LLMs can transcribe reasonably well, but they also invent plausible, correct-looking text. Osiris-AI measures and controls this risk explicitly.
Through ongoing research and development, Osiris gets the best from fast-evolving AI to deliver trustworthy, useful information from complex records.
Why generic LLM transcription breaks in production
- Plausible hallucinations: extra words that look authentic but were never in the source.
- Loss of structure: word position, layout, and tables are not reliably preserved.
- Token limits: large documents must be chunked, increasing drift and QA burden.
How Osiris-AI manages this for you
- Two-pronged approach: deterministic feature extraction (segmentation + HTR) plus multimodal LLMs.
- Structure-first: layout, coordinates, and document structure are retained for downstream use.
- Token-level verification: every AI output is checked against the source.
- Batch-safe pipelines: designed for scale without manual chunking.
- Customisable models: configurations tuned to your material, with transparent error reporting.
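To make the idea of token-level verification concrete, here is a minimal sketch in Python. It is not the Osiris-AI pipeline, whose internals are not described on this page. It assumes a simple setup in which the tokens from a deterministic HTR pass act as the reference and the tokens from an LLM transcription are the candidate. The function name `flag_unsupported_tokens` and the sample strings are hypothetical, and the alignment uses the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def flag_unsupported_tokens(reference_tokens, candidate_tokens):
    """Return (index, token) pairs from the candidate that have no aligned
    match in the reference. Hypothetical helper, for illustration only."""
    matcher = SequenceMatcher(a=reference_tokens, b=candidate_tokens, autojunk=False)
    flagged = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" spans hold candidate tokens that the
        # reference does not support at that position.
        if tag in ("insert", "replace"):
            flagged.extend((j, candidate_tokens[j]) for j in range(j1, j2))
    return flagged


# Assumed inputs: whitespace-tokenised output of an HTR pass and of an LLM pass.
htr_tokens = "paid to John Smith 12 shillings".split()
llm_tokens = "paid in full to John Smith 12 shillings".split()

for index, token in flag_unsupported_tokens(htr_tokens, llm_tokens):
    print(f"token {index}: {token!r} is not supported by the reference")
```

In this example the words "in" and "full" are flagged, because they appear in the LLM output but not in the reference at that position. A production check would also need to handle spelling variants, abbreviations, and line breaks, which this sketch ignores.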
We regularly benchmark our models against competing systems (including industry leaders). The examples below report overall accuracy (F1) against ground truth and break errors down by type, isolating hallucination risk, numeric failures, and OCR noise.
18th-century English handwriting
HTR benchmark against ground truth.
Overall accuracy (F1), 18th-century English handwriting:
- Osiris: 90%
- ChatGPT: 79%
Error types (extra tokens vs ground truth):
- Plausible errors (easy to miss): Osiris 3.97%, ChatGPT 14.67%
- Numeric errors (high impact): Osiris 4.30%, ChatGPT 5.67%
- Minor OCR errors: Osiris 1.66%, ChatGPT 1.33%
- Other major errors: Osiris 0.66%, ChatGPT 0.00%
English language newspaper print (1925)
OCR benchmark against ground truth.
Overall accuracy (F1):
- Osiris: 92%
- ABBYY: 71%
- ChatGPT: 85%
Error types (extra tokens vs ground truth):
- Plausible errors (easy to miss): Osiris 4.66%, ABBYY 5.27%, ChatGPT 6.78%
- Numeric errors (high impact): Osiris 0.62%, ABBYY 1.86%, ChatGPT 0.58%
- Minor OCR errors: Osiris 0.52%, ABBYY 14.68%, ChatGPT 1.05%
- Other major errors: Osiris 1.35%, ABBYY 6.83%, ChatGPT 0.58%
Overall accuracy is reported as F1 against ground truth. Error categories isolate hallucination risk, numeric failures, and OCR noise, helping teams prioritise what to check and what to fix.
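For readers who want to run this kind of measurement on their own material, the sketch below shows one common way to compute a token-level F1 and a rough breakdown of extra tokens. It rests on assumptions, since the page does not define the exact scoring: tokens are compared as unordered multisets, an "extra" token is one that appears in the output more often than in the ground truth, and a token counts as numeric if it contains a digit. The function names and sample strings are hypothetical.

```python
from collections import Counter


def token_f1(ground_truth, predicted):
    """Bag-of-tokens F1: harmonic mean of token precision and recall."""
    gt, pred = Counter(ground_truth), Counter(predicted)
    overlap = sum((gt & pred).values())  # tokens present in both, with multiplicity
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(gt.values())
    return 2 * precision * recall / (precision + recall)


def extra_token_rates(ground_truth, predicted):
    """Share of predicted tokens that are extra, split into numeric and other.
    The digit test is an assumed, simplified stand-in for a real error taxonomy."""
    if not predicted:
        return {"numeric": 0.0, "other": 0.0}
    extra = Counter(predicted) - Counter(ground_truth)
    numeric = sum(n for tok, n in extra.items() if any(ch.isdigit() for ch in tok))
    other = sum(extra.values()) - numeric
    return {"numeric": numeric / len(predicted), "other": other / len(predicted)}


# Assumed sample data: one wrong number and one invented word.
truth = "received 12 barrels of salt".split()
output = "received 15 barrels of fine salt".split()

print(round(token_f1(truth, output), 3))
print(extra_token_rates(truth, output))
```

Separating numeric extras from other extras mirrors the reasoning above: a wrong figure and an invented word carry different costs, so it helps to count them separately when deciding what to check first.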
Past projects.
Get in touch with us ...
Whether you need to extract data at scale, want to integrate geospatial data with historical records, or need data consultancy, we'd love to hear from you.



