The historical archives of the ‘Institut National pour l’Étude Agronomique du Congo Belge (INEAC)’ and ‘La régie des plantations de la colonie (REPCO)’ span approximately six decades (~1901–1960). Held at the State Archives, the Royal Museum for Central Africa and the herbarium collections of the Botanic Garden Meise, they contain vast amounts of historical forestry, climatological, ecological and biodiversity data, as well as aerial photographs, with great potential and relevance for basic and applied forestry research in the central Congo Basin.
The COBECORE project aims to establish baseline measurements by valorizing the eco-climatological legacy data available within the INEAC archives and in complementary historical archives and natural history collections. The project will make information stored in analog archives digitally accessible through computer vision, machine learning and citizen science approaches. In particular, we use (elastic) image registration and convolutional neural networks to facilitate the extraction and transcription of handwritten data entries, supported by citizen-science-based validation data. The project will result in a multi-faceted database and will link its (meta)data to existing data records (e.g. digitized herbarium specimens at the Botanic Garden Meise) for direct application in forestry research.
Here we report on the first six months of data recovery and discuss the progress made in automated data registration and transcription, as well as the use of a crowdsourcing / citizen science platform. The COBECORE project validates and underscores the importance of an interdisciplinary approach connecting the humanities and information technology (computer science) in unlocking archived (analog) data in support of the natural sciences.