On Saturday 29 June 2013, the British Library hosted a workshop to start making some of their material from the Europeana Collections 1914-1918 project publicly available. There was a public talk in the morning, to introduce the collections and the project, and then we had a chance to work with the material.
At the time of the event, the collections were (mostly) not yet published online, but a significant number of items were made available on the day, prior to release. The digitisation project aimed to cover about 10,000 items published during and after the war, including a large number of books now out of print and difficult to obtain, as well as wartime ephemera from the UK and elsewhere.
Attendees got a first chance to work with the digitised collections. Only a small fraction of these were online at the time of the event, and most would not be posted until some point in 2014. In the meantime, scans were available on-site to work with!
Things we have had interest in:
Article-writing
This is Wikipedia, after all, and these are digitised books! In particular, the collection includes a large number of unit histories; for an example of what can be done with them, have a look at 15th (Imperial Service) Cavalry Brigade, built around digitised material from the India Office.
Image selection
The books themselves represent a very large and diverse image collection; as well as the text, most have several maps or plates, scanned in high quality and usually well-labelled - and often not published elsewhere. We could look at extracting some of these images and uploading them individually to Commons.
OCR/Text analysis
While we don't expect to have a large amount of the OCRed text ready for the workshop, the scans themselves will be available for running through (e.g.) Tesseract, and we plan to OCR some in advance. There has been some interest in mining these for names or places - perhaps trying to build indexes, or get a sense of geographic coverage?
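As a rough illustration of the place-name idea, here is a minimal Python sketch. It assumes the OCR text is already available as a plain string (for instance, from Tesseract's text output). The gazetteer list, the function name count_places and the sample sentence are all hypothetical, made up for illustration; a real index would need a much larger list of names and some tolerance for OCR errors.

```python
import re
from collections import Counter

# Hypothetical gazetteer for illustration only; not taken from the collection.
GAZETTEER = ["Ypres", "Arras", "Cambrai", "Baghdad", "Gallipoli"]


def count_places(ocr_text, places=GAZETTEER):
    """Count whole-word, case-insensitive mentions of each listed place."""
    counts = Counter()
    for place in places:
        # \b anchors stop "Arras" matching inside a longer word.
        pattern = r"\b" + re.escape(place) + r"\b"
        hits = re.findall(pattern, ocr_text, flags=re.IGNORECASE)
        if hits:
            counts[place] = len(hits)
    return counts


# Made-up sample sentence standing in for a page of OCRed text.
sample = "The brigade moved from Arras to Cambrai; ARRAS was shelled again."
print(count_places(sample).most_common())
```

Running the same count over each book would give a crude sense of geographic coverage, and the per-book counts could feed a simple index.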
The overall planned collection consists of about 10,000 items, of which most are English-language books, pamphlets and journals. There are large amounts of sheet music, manuscript items, and around 1,000 non-English published books, with a small collection of photographs and maps. Not all of this will be available on the day - it has not all been digitised yet, and some material has a complex copyright situation that limits its availability.
A detailed list of the selection available on the day is now available here:
Books, journals and reports
~150 items from Printed Historical Sources; general contemporary historic, political and economic material
~120 items from Printed Literary Sources; predominantly contemporary literature
~130 items from International Official Publications; official publications and a mix of official and private unit (divisional, regimental) histories
~140 items from Trench Journals; a variety of official and unofficial unit publications; mostly UK with significant Canadian and ANZAC contributions, as well as a smattering of French and Eastern European material
~75 items from Maps; digitised map sheets
~65 items from India Office Records; official reports and publications dealing with Indian troops, the war in the Middle East, and Indian government policy. This post outlines the material digitised (now all available online through the BL's Digitised Manuscripts site)
Photographs
Girdwood Collection (Photo 24): Official India Office photographs of the war in France, 1915. Now all on Wikimedia Commons (see commons:Commons:British Library/Girdwood)
Canadian Copyright Collection (HS85/10): A small number of images relating to WWI selected from the overall Canadian Copyright Collection. Some material already on Wikimedia Commons (see commons:Commons:Picturing Canada)
Canadian Official Photographs (LR.233): No metadata available. 1772 unlabelled images from the Canadian Official Photographs of WWI (mostly in sheets of six); not yet uploaded but will be put on Commons as part of the Picturing Canada project above.
Some sample titles from the printed collections:
07942.a.2 - Charles Herman Senn (d. 1934) : Senn's War Time Cooking Guide; 94pp
08226.ee.47 - Sir William Bower Forwood (merchant, d. 1928) : The Economics of War Finance Explained in Simple Words [reprinted from the Liverpool Post]; Liverpool : H. Young & Sons; 50pp; 1918
09083.dd.17 - Sir Henry Mortimer Durand [d. 1924] : The Thirteenth Hussars in the Great War; W. Blackwood & Sons; 392pp; 1921
B.S.68/92. - The War in the Air: being the story of the part played in the Great War by the Royal Air Force. (vol. 1. By W. Raleigh; vol. 2-6 by H. A. Jones.) [With appendices.] 6 vol. Oxford, 1922-37.
09084.cc.38 - Everard Wyrall, etc : The History of the 62nd [amalgamated with the 49th] - West Riding - Division, 1914-1919 ; 2 vol. John Lane: London, [1924-25.]
P.P.4039.wba.(2.) - “Sub Rosa”: being the magazine of the West Lancashire Division, BEF. Boulogne-sur-Mer, 1917, etc. 1917
T 35177 - Frederick Arthur Hook (d. 1930) : Merchant Adventurers, 1914-1918; [war records of the P&O, British India, and Associated shipping lines]; A&C Black; 319pp; 1920