Lest We Forget
The dataset provided relates to the ‘Lest We Forget’ archive collection, which was created and published by a World War 1 crowdsourcing project of the same name, funded by the National Lottery Heritage Fund.
The ‘Lest We Forget’ archive collection is hosted on the Sustainable Digital Scholarship (SDS) platform, a web-based, open-access digital repository based in the Bodleian Libraries (University of Oxford). The dataset provided was generated via bulk export of this archive collection, using the SDS platform (i.e. Figshare) API.
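As a rough illustration of what a bulk export through the Figshare API can look like, here is a minimal Python sketch. It is not the script used to generate this dataset. It assumes the collection's items can be listed from the public `GET /articles` endpoint filtered by a `group` ID, and that paging with `page` and `page_size` is sufficient. The `group_id` value and the helper names `fetch_json` and `iter_articles` are hypothetical.

```python
import json
from typing import Callable, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.figshare.com/v2"  # public Figshare API root


def fetch_json(url: str) -> list:
    """Fetch a URL and decode its JSON body (performs a network request)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def iter_articles(
    group_id: int,
    page_size: int = 100,
    fetch: Callable[[str], list] = fetch_json,
) -> Iterator[dict]:
    """Yield public article records for one group, one page at a time.

    `group_id` is a placeholder: the real ID for this collection is not
    stated in this document and would need to be looked up.
    """
    page = 1
    while True:
        query = urlencode({"group": group_id, "page": page, "page_size": page_size})
        batch = fetch(f"{API_BASE}/articles?{query}")
        yield from batch
        # A short (or empty) page means there is nothing left to request.
        if len(batch) < page_size:
            break
        page += 1
```

With a real group ID in place of the placeholder, `for article in iter_articles(group_id=12345): print(article["id"], article["title"])` would list the records one by one. The `fetch` parameter exists so the paging logic can be tried offline with a stand-in function before any requests are sent.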
Please be aware that the dataset relates to World War 1 (1914-1918) and that the materials were submitted by the public. While working with this dataset, or with any files you choose to scrape using the file URLs, you may therefore encounter content that you find upsetting or offensive.
- Lest We Forget Archive (SDS platform): https://portal.sds.ox.ac.uk/lest_we_forget
- Lest We Forget Project’s Own Webpage: https://lwf.web.ox.ac.uk/home
- SDS platform main/non-project specific landing page: https://portal.sds.ox.ac.uk/
- Figshare (aka SDS platform) API use guide: https://info.figshare.com/user-guide/how-to-use-the-figshare-api/
- SDS website: https://www.sds.ox.ac.uk/
- SDS Featured Project Pages: https://www.sds.ox.ac.uk/featured-projects
- CC’s Hugging Face Spaces (which can be used to try stuff out without coding):
  - NER (Named Entity Recognition) Explorer Tool: https://huggingface.co/spaces/SorrelC/NER-Explorer-Tool
  - Keyword Extraction (KE) Explorer Tool: https://huggingface.co/spaces/SorrelC/KeywordExtraction-Explorer-Tool
  - Controlled Vocabulary Keyword Tagging Tool: https://huggingface.co/spaces/SorrelC/ControlledVocab_KeywordExtraction


