Proof of concept of an optical character recognition application to provide access to the most important Romanian Holocaust-related archival documents

Babes-Bolyai University - Institute for Holocaust and Genocide Studies
Project Partner
United States Holocaust Memorial Museum
Participating Organizations
Civitas Europica Centralis Foundation (Hungary)

Today, a relatively large number of digital archival resources are available. However, the great majority of the basic archival collections have been digitized in non-searchable formats, such as images of the documents. As a result, both students and researchers have to read thousands of documents to find the data they need for their projects. This enormous workload prevents researchers from conducting complex research, correlating their findings, and reaching pertinent conclusions.

One of the key aims of this project was to explore a technical solution for transforming images of digitized archival materials into searchable text documents, using applications based on artificial intelligence, and to create an indexed database of the most important key terms in support of both higher education teaching and research projects.

Subsequently, a proof of concept of a complex IT application was developed, and its capabilities and performance were tested on the very poorly researched Romanian- and Hungarian-language archival materials related to the Holocaust in Romania. These materials are held at the United States Holocaust Memorial Museum (USHMM) in Washington, DC, USA. The project achieved its goals of building technical solutions for intelligent document digitization, key term extraction, and indexing.
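To illustrate the indexing step in general terms, the following is a minimal sketch and not the project's actual implementation. It assumes the text of each page has already been produced by an OCR engine, and it builds a simple index that maps each key term to the pages containing it. The page identifiers, the sample text, the function names `tokenize` and `build_index`, and the `min_length` filter are all hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical OCR output: page identifier -> recognized text.
# The identifiers and text are invented placeholders, not real archival content.
pages = {
    "doc1_p1": "Registru de documente din Cluj",
    "doc1_p2": "Corespondență Cluj și Oradea",
    "doc2_p1": "Kolozsvár levéltári jelentés",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def build_index(pages, min_length=4):
    """Map each term of at least min_length characters to the sorted page ids containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for token in tokenize(text):
            # Assumption: very short tokens are treated as noise and skipped.
            if len(token) >= min_length:
                index[token].add(page_id)
    return {term: sorted(ids) for term, ids in index.items()}


index = build_index(pages)
print(index.get("cluj", []))  # pages on which the term "cluj" occurs
```

A length filter is only a crude stand-in for real key term extraction; a production system would more plausibly rely on language-specific stop word lists, normalization of diacritics, and named-entity recognition for Romanian and Hungarian.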

Beneficiary countries