Researchers are often faced with the challenge that historical sources are dispersed throughout various archives, collections, museums and memorials. Complementary data and information are grouped and catalogued following the logic of the individual institutions, and not necessarily with other institutions' collections or with potential users and researchers in mind.
At the IHRA Plenary Meetings, held in Bern from 27 to 30 November 2017, the founders of the Network War Collections (Netwerk Oorlogsbronnen) from the Netherlands were invited to talk to IHRA delegates about how they are seeking to address this challenge by bringing archival material together in one digital space, creating a basic thesaurus and providing context, an initiative relevant to IHRA's goal of ensuring access to documents bearing on the Holocaust.
Representatives of the Network, Puck Huitsing and Edwin Klijn, noted that “when you dream, you should dream big.” Their idea was not just to create an overarching database of the existing sources of historical data (as such initiatives already exist, even on the wider, pan-European level) but rather to present this data as information, primarily by creating a basic thesaurus and by providing context.
As Puck Huitsing, program director of the Network, explained to IHRA delegates, there are more than four hundred relevant archival sources regarding the history of the Second World War in the Netherlands alone. While some of them are key national institutions, like the National Archives of the Netherlands, the Central Archive for Special Justice or the War Graves Foundation, most of them are smaller entities, dealing with local history, individual actors or events. Yet, this latter group may also harbor valuable archival sources of information with a relevance far beyond their intended original scope.
The aim of the “data factory,” as its founders call the initiative, is to create searchable databases, to connect archival sources, to define a thesaurus for processing data and retrieving information, and finally to create context through which information on events, people or actions can be grouped together across collections, making it easier to interpret and understand. Once the aim was clear, the project founders had to tackle the “technical” obstacles, which ranged from legal complications (involving copyrights and personal rights) to IT difficulties.
A good deal of the IT challenges stemmed from the fact that, from the very beginning, the network aimed to be more than a mere inventory of documents. The founders envisaged compiling the fragmented data in an easily searchable system. To understand how complex this challenge is, one has to keep in mind that most of the archival source material, even if digitized, is preserved as scanned images, which do not allow for an automated search within the document. Applying optical character recognition brought mixed results because many documents are faded or handwritten. By applying cutting-edge technology and AI, the network managed to reach an average text recognition rate above eighty-two percent, and further slight improvements can be expected. The searchable texts are then scanned by programs which recognize personal and geographical names. The results enable researchers to answer the basic questions: who, where, what and when. Based on the results, context can be provided, events and developments can be interlinked within these contexts and, approaching from another angle, timelines for people can be created.
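The last step described above, turning recognized names, places and dates into timelines for people, can be illustrated with a minimal Python sketch. It is not the Network's actual software: the `Mention` record, the `build_timelines` function and the placeholder values are all assumptions made for illustration, and the sketch presumes that text recognition and name recognition have already produced structured records.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Mention:
    """One recognized reference to a person in a document (hypothetical record)."""
    person: str   # who
    place: str    # where
    when: date    # when
    source: str   # which collection the document came from


def build_timelines(mentions):
    """Group mentions by person and order each group chronologically."""
    grouped = defaultdict(list)
    for mention in mentions:
        grouped[mention.person].append(mention)
    return {
        person: sorted(items, key=lambda m: m.when)
        for person, items in grouped.items()
    }


# Placeholder data only; these are not real archival records.
mentions = [
    Mention("Person A", "Place X", date(1943, 5, 1), "collection-2"),
    Mention("Person B", "Place Y", date(1942, 1, 15), "collection-1"),
    Mention("Person A", "Place Y", date(1941, 3, 10), "collection-1"),
]

timelines = build_timelines(mentions)
for person, items in timelines.items():
    print(person, [(m.when.isoformat(), m.place, m.source) for m in items])
```

In this sketch, mentions of the same person drawn from different collections end up in a single chronological list, which is the kind of cross-collection grouping the paragraph describes.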
The next logical step, as Edwin Klijn, program manager of the Network, explained, is to reach out beyond the borders of the Kingdom of the Netherlands. The context of the wartime Netherlands cannot be separated from that of occupied Western Europe and, as such, is interlinked with the context of Nazi Germany; the project is therefore relevant for the whole of Europe. The first step on this road is to translate the website of the Network, www.oorlogsbronnen.nl, into English and German and to open the project to cooperation with relevant institutions all over the world.