UNESCO’s Proceedings, 1945–2017: A Bilingual Digital Text Corpus

Published in Journal of Open Humanities Data, 2025

For this paper, I annotated data while working at Uppsala University.

The record of the meetings of UNESCO’s General Conference offers a valuable resource for research in the global humanities. We present a digital text corpus, including metadata and supplementary material, that makes the complete record of these meetings from 1946 to 2017 in English and/or French accessible in a machine-readable form that is suitable for digital text analysis. The corpus is stored on Zenodo; relevant code is available on GitHub. The corpus offers reuse potential for scholars interested in any of the countless issues that have been discussed and debated in UNESCO’s General Conference over more than seventy years, as well as for Natural Language Processing (NLP) developers interested in the challenges of language recognition and automated segmentation.

Recommended citation: Martin, B. G., Norén, F. M., Mähler, R., Marklund, A., & Martin, O. (2025). "UNESCO’s Proceedings, 1945–2017: A Bilingual Digital Text Corpus". Journal of Open Humanities Data, 11: 31, pp. 1–5. DOI: https://doi.org/10.5334/johd.314