Digitization of INEAC archival documents

completed one main objective ...

Earlier this year, the digitization of INEAC climate data from the State Archives was finished. We clocked in at a little over 71,856 pages from 873 archival folders, each page holding a month’s worth of daily data on temperature and precipitation. Sometimes these pages hold even more information: maximum and minimum temperature at different hours of the day, soil temperature at various depths, temperature in the shade, wind speed, humidity, etc. (Figure 1). At twelve monthly pages per year, this means COBECORE now has 5,988 years of climate data from hundreds of observation points in the Congo basin.

Fig 1. - Historical records at Eala (today Mbandaka) as digitized during the COBECORE project, with barometric pressure (a), relative humidity (b), wind and cloud direction (c), nebulosity (d), cloud type (e), and precipitation (f) marked with red arrows.

The earliest data was written down in 1905; the last documents were filled out in 1964. Most of it, however, was gathered in the 1940s and 1950s, INEAC's heyday. This data collection was very much a collaboration between the three powers of Belgian colonialism: missions, companies, and state scientific stations all gathered climate data, which was then sent to INEAC. INEAC handed out the necessary equipment and instructions, and of course also collected data itself at its own stations.

Now that we have all of these numbers, we can begin processing them. First, the data needs to be cleaned. Not all of it is trustworthy: sometimes equipment broke without anyone noticing, sometimes the observer was feeling lazy and wrote down the same values for every day of the month, and sometimes they simply made mistakes. The next step will be truly digitizing these 72,000 pages, that is, turning the scanned images into numbers a computer can work with. There are multiple ways to undertake this mammoth task, from OCR technology to citizen science. The end result will be a large database of historical data on the climate and weather of the Congo basin, which scientists can use to model climate and climate change in this area.
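To give an idea of what such cleaning checks might look like once the values are transcribed, here is a minimal sketch in Python. It is not the project's actual pipeline: the function names, the `min_days` threshold, and the temperature bounds are all assumptions made for illustration. It flags a month in which every recorded value is identical (the "lazy observer" case) and lists days whose values fall outside a plausible range (a possible sign of broken equipment or a writing mistake).

```python
def is_constant_month(daily_values, min_days=20):
    """Return True if every recorded value in the month is identical.

    daily_values: one entry per day, None where the page has no reading.
    min_days: hypothetical threshold; months with fewer readings are not judged.
    """
    observed = [v for v in daily_values if v is not None]
    if len(observed) < min_days:
        return False  # too few readings to call the month suspicious
    return len(set(observed)) == 1


def out_of_range_days(daily_values, low, high):
    """Return the 1-based day numbers whose value lies outside [low, high]."""
    return [
        day
        for day, value in enumerate(daily_values, start=1)
        if value is not None and not (low <= value <= high)
    ]


# Example with made-up numbers (not real INEAC data).
repeated = [25.0] * 30
varied = [24.5, 26.1, None, 99.0, 25.3]

print(is_constant_month(repeated))            # every day the same value
print(out_of_range_days(varied, 10.0, 45.0))  # bounds are assumed, in degrees Celsius
```

A flag is only a reason to look at the page again, not proof of an error: a month of zero precipitation in the dry season, for instance, can be perfectly genuine.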
