Topic data sets – Hackathon Coding Precarity

In September 2020 the hackathon „Coding Precarity – Social Issues in Cultural Data“, organized by the Berlin State Library and the ZBW – Leibniz Information Centre for Economics, took place in Berlin. During this three-day event, interdisciplinary teams of computer scientists, designers and humanities scholars developed innovative projects using digitised historical documents from the Berlin State Library and the ZBW, addressing questions such as precarious employment, economic uncertainty, and social marginalization.


Data sets


The Social Question

Description:

With industrialisation, economic and social structures change dramatically: inhabitants of rural spaces move to growing cities in search of work, and the resulting oversupply of workers leads to low wages and impoverishment. These problems and their possible solutions were addressed controversially in the 19th century as the Social Question.


Materials:

Historical prints from the period 1844-1919 in German with reference to all subjects


Scope:

32 works / METS files; a fraction of the relevant SBB and ZBW stock


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


Misery and Care

Description:

Due to the mass social misery in the cities, institutionalised and professionalised social work emerged alongside the tradition of upper class private charity, regarding poverty, crime and social problems as economically induced. Nevertheless, the view of self-inflicted poverty and misery also persisted and led to attempts to enforce moral conformity through prohibition and social control.


Materials:

Historical prints from the period 1845-1925 in German with reference to all subjects


Scope:

54 works / METS files; a fraction of the relevant SBB and ZBW stock


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


Textile Industry and Agriculture

Description:

Surveys and statistics documenting the economic and social situation of specific professional groups provide insights into the working and living conditions of the working class and also show the efforts of labour unions and movements to constantly improve these conditions.


Materials:

Historical prints from the period 1792-1920 in German with reference to all subjects


Scope:

38 works / METS files; a fraction of the relevant SBB and ZBW stock


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


Women as Social and Factory Workers

Description:

Women were generally more affected by social changes and their economic position was even more difficult than that of male workers. At the same time, new fields of professional activity were established for women, for example in social work, which led to help offered by women for women.


Materials:

Historical prints from the period 1792-1919 in German with reference to all subjects


Scope:

27 works / METS files; a fraction of the relevant SBB and ZBW stock


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


Commercial Profit and Work Reform

Description:

Under the pressure of the labour movement and increasingly strong social democratic parties, social insurance schemes were established in 1883-1891, and there was an ongoing fight for better conditions, such as higher wages and regulated working hours. However, these reforms were sometimes regarded as a danger for economic profit and therefore prevented by the employers.


Materials:

Historical prints from the period 1828-1924 in German and French with reference to all subjects


Scope:

42 works / METS files; a fraction of the relevant SBB and ZBW stock


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


Marginalisation and Criminalisation

Description:

Marginalised and criminalised groups, such as sick and unemployed persons, orphans, widows, unmarried mothers, alcohol addicts or prostitutes, were most dramatically affected by economic and social pressure in the 19th century. In the efforts to help these persons in need, both supportive and regulatory tendencies can be identified, offering help only in exchange for moral and social control.


Materials:

Historical prints from the period 1736-1925 in German with reference to all subjects


Scope:

54 works / METS files; a fraction of the relevant SBB and ZBW stock


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


Socialism and Liberalism

Description:

With the emergence of the labour movement and the rise of socialist parties, the question of how the economic and political systems interact became crucial. While labour unions advocated for the protection of workers, proponents of liberalism and nationalism saw this as a weakening of economic power and thus as a danger to the prosperity of the nation state.


Materials:

Historical prints from the period 1876-1924 in German with reference to all subjects


Scope:

37 works / METS files; a fraction of the relevant SBB and ZBW stock


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


Housing Shortage and Food Prices

Description:

The growth of cities led to housing shortages and overcrowded flats with poor hygienic conditions. Moreover, the income of many working-class people did not allow for sufficient nutrition. In rural areas, too, job scarcity often led to difficult living conditions. Proposed solutions to these problems were the establishment of workers' colonies, the construction of garden cities in the suburbs and the increasing industrialisation of smaller towns.


Materials:

Historical prints from the period 1870-1920 in German with reference to all subjects


Scope:

21 works / METS files; a fraction of the relevant SBB and ZBW stock


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


As of: 23.11.2020


 

Data sets for historical events

Listed here are sample data sets that were created in various contexts or projects and relate to individual historical events. They therefore show only a small, but very interesting, part of the collections. The starting point is always the entry page of the digitized collections, where you can navigate the large subject groups of the Old Subject Catalogue. For history, ethnography and geography, digitized material can be found in a five-digit range. This set is also available via the OAI interface.

 


20th Century


🔗 War 1914-1918

Materials:

Historical prints (including single-sheet prints), manuscripts, portraits, estate materials, maps and sheet music regardless of language and place of publication with reference to all subjects


Scope:

7,621 METS files of monographic works and journals + 521 manuscripts; approx. 20 % of the relevant SBB holdings


Specifics:

mostly no structural data; full texts of 6,658 titles; pages optional


Licenses:

mainly Public Domain Mark 1.0 – see license notice in METS file; exceptions with different licenses


Links:


Contact persons:


🔗 November Revolution

Materials:

Historical prints (broadsheets) in German language independent of the place of publication with reference to history / ethnography / geography and politics / state / society / economy


Scope:

128 METS files of monographic works; a fraction of the relevant SBB stock


Specifics:

no structural data; pages not exempted


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


16th Century


🔗 Luther and the Reformation

Materials:

Historical prints, portraits and manuscripts independent of language and place of publication with reference to theology, history / ethnography / geography; created in connection with the exhibition Bibel, Thesen, Propaganda in the Staatsbibliothek zu Berlin 2017


Scope:

73 METS files, mainly monographic works; a fraction of the relevant SBB stock


Specifics:

Structural data; pages not exempted


Licenses:

Public Domain Mark 1.0


Contact persons:


As of: 08.09.2020

Topic data sets – Hackathon Coding Gender

In the summer of 2019 the hackathon „Coding Gender – Women in Cultural Data“ took place in the Berlin State Library. During this three-day event, interdisciplinary teams of computer scientists, designers and humanities scholars developed innovative projects using digitized historical documents and addressing questions such as the visibility of women in cultural data, the construction and representation of gender roles and the relation between today’s socio-political debates and historical gender stereotypes.
The topic datasets created from the Digitized Collections of the SBB for this hackathon are accessible here.


Data sets


🔗 Imposed Identities and Moral Claims

Description:

Gender roles are constructed, discussed and set as normative concepts not only in scientific, but also in literary, educational and entertaining works by connecting specific qualities and modes of conduct with them. Furthermore, some works are specially dedicated to an audience defined by gender, e.g. almanacs for women.


Materials:

Historical prints (monographs, illustrations, leaflets) from the period 1613-1920 in German and English with reference to all subjects


Scope:

84 works / METS files; a fraction of the relevant SBB stock


Specifics:

full texts of 30 works


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


🔗 Empowerment and its Enemies

Description:

This data set contains early works concerning the social and legal position of women and documents from the women’s movement in the second half of the 19th century. During this time, women gained new freedom of action and challenged traditional gender roles, in political and socioeconomic contexts as well as in everyday life, e.g. through leisure activities like cycling or girl scout hiking.


Materials:

Historical prints (monographs, portraits, pamphlets, leaflets) from the period 1792-1920 in German and English with reference to all subjects


Scope:

29 works / METS files; a fraction of the relevant SBB stock


Specifics:

full texts of 14 works


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


🔗 A Life of One’s Own?

Description:

The construction of gender roles often implies specific ways of life. Documents in this data set contain ideas and rules on how to lead a “good” female life. Gradually, alternative lifestyles emerged alongside the traditional ideal of women as wives and mothers, including academic studies, wage work and artistic activities.


Materials:

Historical prints (monographs, leaflets, illustrations) and manuscripts (handwritten letters) from the period 1757-1920 in German, French and English with reference to all subjects


Scope:

61 works / METS files; a fraction of the relevant SBB stock


Specifics:

full texts of 29 works


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


🔗 Sex and Gender

Description:

Historical scientific works about sex and gender often connect medical and psychological observations with judgements and regulations, and thus pathologize or even criminalize deviations from the heteronormative standard. Furthermore, sex education books show that access to knowledge about sexuality was shaped according to gender stereotypes.


Materials:

Historical prints (monographs, leaflets) and manuscripts (handwritten letters) from the period 1518-1920 in German and Latin with reference to all subjects.


Scope:

56 works / METS files; a fraction of the relevant SBB stock


Specifics:

full texts of 34 works


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


🔗 Binary / Non-Binary

Description:

For centuries, the discourse about gender has been based on the concept of two clearly distinct and opposed categories, “woman” and “man”. This data set contains documents which set out and consolidate this view in medical, psychological, and social perspectives, but also some works that challenge gender binarity by discussing alternative concepts.


Materials:

Historical prints (monographs, illustrations) and manuscripts (handwritten letters) from the period 1531-1920 in German, English and Latin with reference to all subjects


Scope:

92 works / METS files; a fraction of the relevant SBB stock


Specifics:

full texts of 44 works


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


🔗 Gender – Race – Class

Description:

Discrimination against women is often combined with other forms of discrimination, e.g. for social or ethnic reasons, a phenomenon called intersectionality. This data set contains mainly documents from colonial contexts.


Materials:

Historical prints (monographs, picture books, sheets of pictures) from the period 1757-1917 in German with reference to all subjects


Scope:

17 works / METS files; a fraction of the relevant SBB stock


Specifics:

full texts of 10 works


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


🔗 Women in Wartime

Description:

The state of emergency during the First World War had a great impact on gender roles. On the one hand, an increasing number of women did wage work and assumed responsibilities formerly reserved for men; on the other hand, in patriotic war propaganda, women mainly appeared as loving mothers and weak beings in need of male protection.


Materials:

Historical prints (monographs, picture books, leaflets), photographs and handwritten material from the period 1870-1919 in German and English with reference to all subjects


Scope:

44 works / METS files; a fraction of the relevant SBB stock


Specifics:

full texts of 31 works


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


🔗 Individual Lives

Description:

This data set contains documents about the personal lives of artists, e.g. the composer Luise Adolpha Le Beau and the writers Madame de Staël and Lou Andreas-Salomé, as well as scientists, like Elsa Neumann and Magnus Hirschfeld.


Materials:

Historical prints (monographs, portraits), manuscripts (handwritten letters), newspaper excerpts from the period 1752-1920 in German and French with reference to all subjects


Scope:

111 works / METS files; a fraction of the relevant SBB stock


Specifics:

no full texts


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


🔗 Doll’s Kitchen and Aeroplane Kite

Description:

Even today, the classification pink and blue, princess and pirate, passive and active for girls and boys respectively is still ubiquitous and forces children to assume binary gender roles at a very early stage of life. This data set contains historical precursors of this phenomenon.


Materials:

Historical prints (monographs, portraits, picture books, sheets of pictures) from the period 1562-1920 in German, French, and Dutch with reference to all subjects


Scope:

71 works / METS files; a fraction of the relevant SBB stock


Specifics:

full texts of 40 works


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


🔗 Picturing Gender

Description:

Pictures often show implicit gender concepts by the way of representing persons or the choice of the scene in which they are represented.


Materials:

Historical prints (monographs, engravings, sheets of pictures), photographs and paintings from the period 1518-1920 in German, French and English with reference to all subjects


Scope:

153 works / METS files; a fraction of the relevant SBB stock


Specifics:

full texts of 25 works


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


As of: 23.06.2020


 

Special data sets – prints

The OAI interface of the digitised collections offers not only the broad subject-based and formal sets, which are usually very extensive, but also special compilations defined by period, genre or project. These additional sets provide more targeted access to individual (mostly incomplete) digitised groups of holdings. All the data contained here should also be represented in at least one of the large sets (formal or subject-based).

Listing by centuries, by genres, by projects
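The special sets described above are exposed through OAI-PMH, whose standard `ListSets` request returns each set's identifier (`setSpec`) and name. The following is a minimal sketch of how such a response could be read. The base URL and the set identifiers in the sample are placeholders chosen for illustration, not the real values of the digitised collections.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the actual OAI base URL of the collections.
BASE_URL = "https://example.org/oai"

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_sets_url(base_url):
    """Build the standard OAI-PMH request that lists all sets of a repository."""
    return f"{base_url}?{urlencode({'verb': 'ListSets'})}"


def parse_sets(xml_text):
    """Map each setSpec to its setName in a ListSets response."""
    root = ET.fromstring(xml_text)
    return {
        s.findtext("oai:setSpec", namespaces=OAI_NS): s.findtext("oai:setName", namespaces=OAI_NS)
        for s in root.iterfind(".//oai:set", OAI_NS)
    }


# Made-up sample response; the setSpec values are placeholders, not real set names.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>example.century.16</setSpec><setName>Prints of the 16th century</setName></set>
    <set><setSpec>example.project.vd18</setSpec><setName>VD18 digital</setName></set>
  </ListSets>
</OAI-PMH>"""

sets = parse_sets(SAMPLE)
for spec, name in sets.items():
    print(spec, "->", name)
```

A `setSpec` found this way can then be passed as the `set` argument of a `ListRecords` request to harvest only that compilation.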


By centuries


Incunabula – prints of the 15th century

Materials:

Incunabula (cradle prints) independent of language and place of publication with reference to all subjects, especially theology, languages / literatures


Scope:

1,192 METS files of monographic works; approx. 27 % of the relevant SBB stock


Specifics:

no structural data; covers scanned; GW numbers included; pages not exempted


Licenses:

Public Domain Mark 1.0


Links:


Contact persons:


Prints of the 16th century

Materials:

Historical prints of the 16th century independent of language and place of publication with reference to all subjects, especially theology, languages / literatures, music, history / ethnography / geography, law


Scope:

12,230 METS files of monographic works; approx. 22 % of the relevant SBB stock


Specifics:

in-depth manually entered structural data; covers scanned; pages not exempted


Licenses:

Public Domain Mark 1.0


Contact persons:


Prints of the 17th century

Materials:

Historical prints of the 17th century independent of language and place of publication with reference to all subjects, especially theology, languages / literatures, law, general / science / history of literature

Scope:

22,912 METS files of mainly monographic works; approx. 25 % of the relevant SBB stock

Specifics:

in-depth manually entered structural data; scanned bindings; full texts of 940 titles; subject indexing by AAD genre terms; pages not optional

Licenses:

Public Domain Mark 1.0

Contact persons:


Prints of the 18th century

Materials:

Historical prints of the 18th century independent of language and place of publication with reference to all subjects, especially theology, languages / literatures, law, general / science / history of literature, history / ethnography / geography

Scope:

30,666 METS files of monographic works and journals; approx. 28 % of the relevant SBB stock

Specifics:

manually entered structural data; original bindings scanned; full texts of 1,881 titles; subject indexing by AAD genre terms; pages not exempted

Licenses:

Public Domain Mark 1.0

Contact persons:


Prints of the 19th century

Materials:

Historical prints of the 19th century independent of language and place of publication with reference to all subjects, especially law, Ostasiatica, history / ethnography / geography, languages / literature

Scope:

22,316 METS files of monographic works and journals; approx. 5 % of the relevant SBB stock

Specifics:

manually entered structural data; full texts of 13,763 titles; pages optional

Licenses:

Public Domain Mark 1.0

Contact persons:


Prints of the 20th century

Materials:

Prints of the 20th century independent of language and place of publication with reference to all subjects, especially to the war 1914-1918, Ostasiatica

Scope:

11,594 METS files of monographic works and journals; a very small part of the relevant SBB stock

Specifics:

manually entered structural data; full texts of 8,880 titles; pages optional

Licenses:

Various licenses – see license note in METS file (mostly Public Domain Mark 1.0)

Contact persons:


By genres


Journals

Materials:

Historical journals independent of language and place of publication with reference to all subjects, especially law, war 1914-1918, history / ethnography / geography

Scope:

8,734 METS files of journals and their volumes; a very small part of the relevant SBB stock

Specifics:

mainly manually entered structural data; full texts of 5,714 volumes; pages mainly optional

Licenses:

Various licenses – see license note in METS file (mostly Public Domain Mark 1.0)

Links:

Contact persons:


VD18-Magazines

Materials:

Historical journals from the VD 18 project with reference to all subjects, especially general / science studies / literary history, history / ethnography / geography, theology, languages / literature

Scope:

1,485 METS files of journals and their volumes; a very small part of the relevant SBB stock

Specifics:

manually entered structural data; pages not exempted

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


Funerals

Materials:

Sermons, epicedia and occasional writings on death printed in the German-speaking world from the 16th-18th century

Scope:

10,052 METS files of monographic works; approx. 55 % of the relevant SBB stock

Specifics:

manually entered structural data; full texts of 1,278 works; subject indexing using AAD genre terms; pages not optional

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


Folkloristic German incunabula

Materials:

Incunabula (cradle prints) in German independent of the place of publication with reference to all subjects, especially languages / literature, politics / state / society / economy

Scope:

285 METS files of monographic works (many single-sheet prints)

Specifics:

no structural data; covers scanned; GW numbers included; pages not exempted

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


Illustrated song pamphlets

Materials:

Illustrated historical song pamphlets in German (including Low German) with reference to all subjects, especially languages / literatures, theology, independent of the place of publication

Scope:

1,598 METS files of monographic works with a small number of pages; relevant SBB stock almost complete

Specifics:

few structural data; full texts of 5 works; subject indexing by AAD genre terms; pages not optional

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


Poetry

Materials:

Historical prints with lyrical parts (songs, funeral poems, singspiels) independent of language and place of publication with reference to all subjects, especially languages / literatures, theology, music

Scope:

7,359 METS files of monographic works; a very small part of the relevant SBB stock – a varied mix

Specifics:

manually entered structural data; full texts of 57 works; pages mostly not exempted

Licenses:

Public Domain Mark 1.0

Contact persons:


Novels

Materials:

Historical prints (coded with AAD genre term Roman) mainly from the 18th century

Scope:

690 METS files of monographic works; a very small part of the relevant SBB stock – a varied mix

Specifics:

few structural data; full texts for 1 work; pages mostly not exempted

Licenses:

Public Domain Mark 1.0

Contact persons:


By projects


VD16 digital

Materials:

Historical prints of the 16th century from the German-speaking area, regardless of language, and German-language prints, regardless of place of publication, with reference to all subjects, especially theology, languages / literatures, music, history / ethnography / geography, law

Scope:

11,975 METS files of monographic works; 40 % of the relevant SBB stock

Specifics:

extensively manually entered structural data; covers scanned; pages not exempted

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


VD17 digital – Unika

Materials:

Historical prints of the 17th century from the German-speaking area, regardless of language, and German-language prints, regardless of place of publication, with reference to all subjects, especially theology, languages / literatures, general / science / history of literature, law

Scope:

9,651 METS files of monographic works; 13 % of the relevant SBB-VD17 stock – complement to Preußen 17 digital

Specifics:

650 titles from other libraries; extensively manually entered structural data; scanned bindings; full texts of 388 works; subject indexing by AAD genre terms; pages not optional

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


Preußen 17 digital

Materials:

Historical prints of the 17th century from the Prussian and Northern German area with reference to all subjects, especially theology, languages / literatures, jurisprudence, general / science studies / history of literature

Scope:

12,567 METS files of monographic works; 17 % of the relevant SBB-VD17 stock – complement to VD17 digital

Specifics:

contains 241 titles from the library of St. Nikolai Spandau; extensively manually entered structural data; scanned bindings; full texts of 298 works; subject indexing by AAD genre terms; pages not optional

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


VD song digital

Materials:

Historical song pamphlets in German (including Low German) independent of the place of publication with reference to all subjects, especially music, languages / literatures, theology

Scope:

2,917 METS files of monographic works with a small number of pages; relevant SBB stock almost complete

Specifics:

indexing on song level; full texts of 5 works; subject indexing by AAD genre terms; pages not optional

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


VD18 digital

Materials:

Historical prints of the 18th century from the German-speaking area, regardless of language, and German-language prints, regardless of place of publication, with reference to all subjects, especially theology, languages / literatures, law, general / science / history of literature

Scope:

29,909 METS files of monographic works and journals; approx. 25 % of the relevant SBB stock

Specifics:

55 titles from other libraries; manually entered structural data; full texts of 1,705 works; subject indexing by AAD genre terms; pages not optional

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


WegehauptDigital

Materials:

Non-fiction for children and young people, especially in the natural sciences and technology, regardless of language (86 % in German) and place of publication, published between 1633 and 1913

Scope:

2,009 METS files of monographic works and journals; approx. 29 % of the relevant SBB stock

Specifics:

in-depth manually entered structural data; full texts of 686 works; additional subject indexing by DDC notation; pages not optional

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


Digital picture sheet

Materials:

Bilderbogen (single-sheet prints not only for children and young people, also war picture sheets etc.), publication period 1730-1916

Scope:

574 METS files from a total of 1,432 picture sheets (including 890 from 41 anthologies of Munich picture sheets); approx. 33 % of the relevant SBB stock

Specifics:

individually indexed picture sheets (in anthologies by means of structural data); subject indexing by GND keywords; pages not exempted

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


German territorial law of the 19th century

Materials:

Printed works on the particular rights of German territories published between 1801 and 1900 (containing 150 foreign language titles)

Scope:

10,149 METS files of monographic works and journals; approx. 85-90 % of the relevant SBB stock

Specifics:

6 titles from other libraries; extensively manually entered structural data; full texts of 8,841 works; subject indexing in ARK notation; pages optional

Licenses:

Public Domain Mark 1.0

Links:

Contact persons:


As of: 15.05.2020


 

Zeitschriftendatenbank

The Union Catalogue of Serials (ZDB), which is jointly maintained by the Deutsche Nationalbibliothek (DNB) and the Staatsbibliothek zu Berlin (SBB), contains records of journals, newspapers, book series and other serial publications from all countries, in all languages, from all periods, in printed, electronic or digitized form. The bibliographic records are supplemented by the corresponding holding records of libraries in Germany and Austria.

The current search interface of the ZDB Catalogue provides various search functions, including the visualization of title relations, a timeline for title histories and changes, a map of collections and a graphic chart of collections.

The ZDB Catalogue at a glance:

The ZDB provides various interfaces and data services on the basis of the centrally recorded data.

Interfaces

OAI-PMH

The OAI Protocol for Metadata Harvesting (OAI-PMH) is an XML-based protocol for querying and transferring metadata between a data provider and a service provider who provides customised research services based on the queried data.

The ZDB provides an OAI interface for querying bibliographic and holding records. The scope of the delivered data can be determined by selecting specific time intervals. Data formats available include MARC21 and OAI DC.

Access to the OAI interface requires you to register. For more information on OAI and registration, please visit the DNB website at http://www.dnb.de/oai or contact the DNB Interface Service via schnittstellen-service@dnb.de.
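Selecting a time interval corresponds to the standard OAI-PMH arguments `from` and `until` on a `ListRecords` request, and large result lists are typically delivered in pages linked by a `resumptionToken`. The sketch below builds such requests and reads the token from a response page. The base URL is a placeholder (the real one is provided with registration), and the sample response is made up.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Placeholder: the actual base URL is provided by the DNB after registration.
BASE_URL = "https://example.org/oai/zdb"

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix, date_from=None, date_until=None, token=None):
    """Build a ListRecords request, optionally limited to a time interval."""
    if token:
        # In OAI-PMH a resumptionToken is an exclusive argument:
        # it replaces metadataPrefix, from and until.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if date_from:
            params["from"] = date_from
        if date_until:
            params["until"] = date_until
    return f"{base_url}?{urlencode(params)}"


def next_token(xml_text):
    """Return the resumptionToken of a response page, or None on the last page."""
    el = ET.fromstring(xml_text).find(".//oai:resumptionToken", OAI_NS)
    return el.text if el is not None and el.text else None


# Made-up response page used only to exercise next_token().
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords><resumptionToken>placeholder-token</resumptionToken></ListRecords>
</OAI-PMH>"""

first_url = list_records_url(BASE_URL, "oai_dc", "2020-01-01", "2020-01-31")
follow_up_url = list_records_url(BASE_URL, "oai_dc", token=next_token(SAMPLE_PAGE))
print(first_url)
print(follow_up_url)
```

Harvesting continues by requesting `follow_up_url` and repeating until `next_token` returns `None`.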

SRU

Search/Retrieve via URL (SRU) is a protocol for querying bibliographic databases over HTTP; it is maintained and further developed by the Library of Congress. The ZDB has replaced the Z39.50 protocol with SRU and thus meets the requirements of modern web development.

SRU queries are written in the Common / Contextual Query Language (CQL) and sent to the SRU server via GET/POST. The response is returned as XML text.

SRU interfaces

The ZDB offers two SRU interfaces:

DATABASE | CONTENT | BASE URL | EXPLAIN OPERATION
ZDB Catalogue | Bibliographic Records, Holding Records | http://services.dnb.de/sru/zdb | ZDB Explain Operation
ZDB Address Database / ISIL and Library Identifier Index | Address Data, Library Identifier, ISIL | http://services.dnb.de/sru/bib | Library ID Index Explain Operation

Both interfaces are currently still based on SRU Version 1.1 and CQL Version 1.1, Level 2.
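A `searchRetrieve` request for the ZDB Catalogue interface can be assembled from the standard SRU 1.1 parameters, as in the following sketch. The base URL comes from the table above. The CQL index name `title` and the search term are assumptions for illustration only; the indexes actually supported are listed by the Explain operation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base URL of the ZDB Catalogue SRU interface, as given in the table above.
ZDB_SRU = "http://services.dnb.de/sru/zdb"


def sru_search_url(base_url, cql_query, record_schema="MARC21-xml", start=1, maximum=10):
    """Build an SRU 1.1 searchRetrieve request for an HTTP GET."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,             # CQL expression, URL-encoded by urlencode
        "recordSchema": record_schema,  # response format requested from the server
        "startRecord": start,           # 1-based position of the first record
        "maximumRecords": maximum,      # page size
    }
    return f"{base_url}?{urlencode(params)}"


# "title" is a hypothetical index name; check the Explain response for real ones.
url = sru_search_url(ZDB_SRU, 'title = "Vossische Zeitung"')
print(url)

# Decode the query string again to confirm the CQL survived URL encoding.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

Paging through a larger result set would only require increasing `start` by `maximum` on each request.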

Formats

The following formats are available:

FORMAT | DESCRIPTION | EXAMPLE | SCOPE / CONTENTS | SCHEMA
MARC21-xml | Description of MARC21 Format | Example | XML variant of MARC21; title data or ISIL / address data | MARCXML Schema
MARC21plus-1-xml | Description of MARC21 Format for Local Data | Example | XML variant of MARC21 with title and local data | MARCXML Schema
oai_dc | Description of Dublin Core Format | Example | Selection of Dublin Core elements; title data and ISIL / address data | OAI Dublin Core Schema
PicaPlus-xml | Description of ISIL Format / Address Data | Example | XML version of Pica Plus; ISIL / address data | PicaPlus-XML Schema
RDF/XML | RDF Representation of Bibliographic Data | Example | RDF/XML serialization of title data | RDF/XML Syntax Specification
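Records delivered in the MARC21-xml format can be read with any namespace-aware XML parser. The sketch below extracts subfields from a single MARCXML record, using the standard MARC 21 fields 245 $a (title) and 022 $a (ISSN). The sample record is made up, and the fields actually populated in ZDB records may differ.

```python
import xml.etree.ElementTree as ET

# Namespace of the MARCXML ("MARC 21 slim") schema.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def subfields(record, tag, code):
    """Collect all values of one subfield (e.g. 245 $a) from a MARCXML record."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [sf.text for sf in record.iterfind(path, MARC_NS)]


# Made-up record with placeholder values, for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">placeholder-id</controlfield>
  <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">Example journal title</subfield>
  </datafield>
  <datafield tag="022" ind1=" " ind2=" ">
    <subfield code="a">0000-0000</subfield>
  </datafield>
</record>"""

record = ET.fromstring(SAMPLE)
titles = subfields(record, "245", "a")
issns = subfields(record, "022", "a")
print(titles, issns)
```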

Linked Open Data

The ZDB offers you access to its title data as Linked Open Data.

Modeling of the title data is based on the recommendations for the RDF representation of bibliographic data of the Title Data group of DINI-AG KIM (see Vocabularies Used).

Only the most important data of each title are displayed. However, the scope of the fields converted into RDF will be expanded over time. The service provided here should therefore be seen as an intermediate stage of the data modelling currently under development.

Service and data model

Bibliographic records are encoded in the Resource Description Framework (RDF). The records are available in the RDF serializations RDF/XML, Terse RDF Triple Language (Turtle) and JSON-LD.

The ZDB Linked Data Service has been developed with regard to the W3C Best Practices (Cool URIs for the Semantic Web) and is based on URIs with 303 Redirect and Content Negotiation.

Using content negotiation, the ZDB linked data service attempts to find the appropriate representation of the data for the respective client and returns a corresponding content type.
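From the client side, content negotiation means sending an `Accept` header that names the desired serialization. The following sketch prepares such a request without sending it. The media types are the registered ones for the three serializations; the resource URI is a placeholder, not a real ZDB identifier.

```python
from urllib.request import Request

# Registered media types for the three serializations named above.
MEDIA_TYPES = {
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "jsonld": "application/ld+json",
}

# Hypothetical resource URI; substitute the URI of an actual ZDB title.
RESOURCE_URI = "https://example.org/resource/placeholder-id"


def negotiated_request(uri, serialization):
    """Prepare a GET request asking the server for one RDF serialization."""
    return Request(uri, headers={"Accept": MEDIA_TYPES[serialization]})


req = negotiated_request(RESOURCE_URI, "turtle")
print(req.full_url, req.get_header("Accept"))

# To actually fetch the data, pass the request to urllib.request.urlopen(req);
# urlopen follows HTTP redirects such as the 303 response by default.
```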

Vocabularies Used

Bibliographic data in the ZDB are structured according to the recommendations for the RDF representation of bibliographic data of the Title Data group of DINI-AG KIM. The vocabularies and terms currently used are described using JSON-LD context objects:

JSON-LD context object for ZDB title data according to DINI-KIM recommendation
JSON-LD context object for ZDB title data according to DINI-KIM recommendation (with content type application/ld+json)

RDF dump

The ZDB data are provided as an RDF dump in the serializations RDF/XML, Turtle and JSON-LD for download on the download page for open data of the German National Library.

HDT

The ZDB data are also available as HDT files. HDT (Header, Dictionary, Triples) is a compact binary serialization format for RDF that compresses large datasets to save disk space. It is possible to search directly in a compressed dataset. This makes it an ideal format for storing and sharing RDF datasets on the web.

Changes and updates to the Linked Data Service will be announced via the DNB mailing list: http://lists.dnb.de/mailman/listinfo/lds.

Conditions for Use

All bibliographic data and a large part of the holding records are available under the Creative Commons Zero 1.0 license. Please refer to the data licensing information given by ZDB.

We would like to point out that our permission to use the interfaces is only valid provided that the hosting function of the German National Library is not impaired by problems caused by downloading data.

Contact

Hans-Jörg Lieder, Carsten Klee

Bibliographic Data from StaBiKat

Introduction

Our online catalogue StaBiKat includes the metadata of the complete printed and digital collections of Staatsbibliothek zu Berlin from the publication years 1500 to the present. Currently there are about 14 million searchable records.

StaBiKat data (excerpt) – https://zenodo.org/record/2590752

Are you interested in working with basic bibliographic data from our online catalogue StaBiKat? Sets of records organized along language families are already available. The data sets consist of the most important metadata including PPN (catalogue identifier), author, title, place/country of publication, publisher, year of publication and language code. The data sets do not include the full metadata, nor do they represent the content of the catalogue in full. Moreover, you should take note of the date of the last update. If you require more up-to-date data, you can use the following scripts to create your own data sets.

Which information is missing here?

• Records without a language code (slightly less than half of all SBB records)
• Shelf marks and location identifiers (e.g. to identify items lost in the war)
• Existence of digital versions (including their PURLs)
• Data sets selected on the basis of subject criteria or year of publication

These data sets are provided by the Gemeinsamer Bibliotheksverbund (GBV), the library network of which Stiftung Preussischer Kulturbesitz has been a member for over 20 years.

Interfaces

StaBiKat does not provide direct support for interfaces to export large data quantities. However, for specific queries you can use the SRU or UnAPI interfaces of the GBV network. You may have to get in touch with the network office of GBV.

SRU – http://sru.k10plus.de/opac-de-1

SRU, used here in version 1.1, is an HTTP-based protocol for machine-based bibliographic data queries. You can use this interface to import data for your catalogues or subject gateways, or for digitizing your objects.

The retrieval language used is Contextual Query Language. You can use the StaBiKat SRU interface to run concise queries yielding a limited set of results. Metadata provided here come in the Dublin Core (DC, v 1.1) and MODS (v 3.4) formats.

Search syntax and indexes: http://sru.k10plus.de/opac-de-1

Examples for queries:

• A maximum of 10 titles in StaBiKat that contain the words „pupillen“ and „edict“, in MODS format
http://sru.k10plus.de/opac-de-1?version=1.1&operation=searchRetrieve&query=pica.xtit=pupillen+edict&maximumRecords=10&recordSchema=mods
• Person search for Konrad Adenauer through the full StaBiKat database, output in Dublin Core, with a maximum of 300 results
http://sru.k10plus.de/opac-de-1?version=1.1&operation=searchRetrieve&query=pica.xprs=adenauer,konrad&maximumRecords=300&recordSchema=dc
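The example queries above can also be assembled programmatically. The following is a minimal sketch that only builds the URL; the function name and its defaults are illustrative choices, while the parameter names and the base URL are taken from the examples. urlencode percent-encodes the "=" inside the CQL query, which is equivalent to the hand-written URLs once the server decodes it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

SRU_BASE = "http://sru.k10plus.de/opac-de-1"  # StaBiKat SRU endpoint from the text


def build_sru_url(cql_query: str, record_schema: str = "mods",
                  maximum_records: int = 10, base_url: str = SRU_BASE) -> str:
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,
        "maximumRecords": maximum_records,
        "recordSchema": record_schema,
    }
    return f"{base_url}?{urlencode(params)}"


url = build_sru_url("pica.xtit=pupillen edict")
# Round trip: decoding the URL gives back the original CQL query.
print(parse_qs(urlsplit(url).query)["query"][0])  # pica.xtit=pupillen edict
```

For the network catalogue, the same function can be called with base_url="http://sru.k10plus.de/gvk7" and the pica.tit or pica.prs indexes shown in the text.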

SRU – K10-PLUS Network Catalogue

The corresponding queries on the level of the GBV network are:
http://sru.k10plus.de/gvk7?version=1.1&operation=searchRetrieve&query=pica.tit=pupillen+edict&maximumRecords=50&recordSchema=mods
and
http://sru.k10plus.de/gvk7?version=1.1&operation=searchRetrieve&query=pica.prs=adenauer,konrad&maximumRecords=300&recordSchema=dc

unAPI

UnAPI provides a straightforward web-based method to retrieve individual records in different formats. The unAPI interface does not enable searches through whole data collections but only provides individual records referenced with an identifier. Each query therefore has to include an unambiguous identifier for the respective record and the metadata format required (cf. https://wiki.k10plus.de/display/K10PLUS/UnAPI, first paragraph).

If you want to download individual records whose PPN you know, you can use the unAPI interface of StaBiKat and GBV network catalogue as follows:

StaBiKat – http://unapi.k10plus.de/?id=opac-de-1

Syntax:
http://unapi.k10plus.de/?id=opac-de-1:ppn:##########&format=dc

Example:
http://unapi.k10plus.de/?id=opac-de-1:ppn:1000127265&format=dc

Network catalogue unAPI – http://unapi.k10plus.de/

Syntax:
http://unapi.k10plus.de/?id=gvk:ppn:##########&format=mods

Example:
http://unapi.k10plus.de/?id=gvk:ppn:178293199&format=mods

Just add the PPN in place of ########## and select the output format (e.g. „MODS“). GBV alternatively provides the possibility to retrieve data in the Pica, Dublin Core and MARC formats. Please note that the unAPI interface of StaBiKat only gives results in the Dublin Core and MODS formats.
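The unAPI syntax can be wrapped in a small helper. This is a sketch under the assumptions stated in the text: the database keys opac-de-1 and gvk, and Dublin Core and MODS as the only formats served for StaBiKat. The function and constant names are illustrative.

```python
UNAPI_BASE = "http://unapi.k10plus.de/"

# According to the text, the StaBiKat unAPI only delivers these two formats.
STABIKAT_FORMATS = {"dc", "mods"}


def build_unapi_url(ppn: str, database: str = "opac-de-1",
                    record_format: str = "dc") -> str:
    """Build the unAPI URL for a single record identified by its PPN."""
    if database == "opac-de-1" and record_format not in STABIKAT_FORMATS:
        raise ValueError(f"StaBiKat does not offer the format {record_format!r}")
    return f"{UNAPI_BASE}?id={database}:ppn:{ppn}&format={record_format}"


print(build_unapi_url("1000127265"))
# http://unapi.k10plus.de/?id=opac-de-1:ppn:1000127265&format=dc
print(build_unapi_url("178293199", database="gvk", record_format="mods"))
# http://unapi.k10plus.de/?id=gvk:ppn:178293199&format=mods
```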

Conditions of Use

SBB pursues an Open Data Policy and provides its metadata for free under the CC0 licence. The conditions of use for the interface service are defined by GBV.

Contact

Andrea Jacobs

ZEFYS

Introduction

The ZEitungsinFormationssYStem ZEFYS offers access to the digitized historical newspapers of Staatsbibliothek zu Berlin.

Currently ZEFYS provides access to 276,015 issues of 193 historical newspapers from Germany, and of German-language newspapers in foreign countries.

Interfaces

For legal reasons, the APIs listed here are only available for the public domain titles in ZEFYS. For the contents of the “DDR Presse” portal we can unfortunately not provide direct access to the data.

Retrieval of content, images and full-text, for digitised newspapers is supported via the International Image Interoperability Framework (IIIF) protocol. An increasing number of free clients and libraries for IIIF in numerous programming languages are available on the web.

Currently, digitised newspaper images and metadata can be retrieved by requests following the schema:
http://content.staatsbibliothek-berlin.de/zefys/SNP{ZDB-ID}-{YYYYMMDD}-{Issue}-{Page}-{Article}-{Version}

The ZDB-ID is a unique identifier for every newspaper title and can be found either within the ZEFYS newspaper portal or directly from the ZDB.

Next, a date of issue needs to be specified in the YYYYMMDD format, e.g. 18900101 for the issue published on January 1st, 1890. If you want to see which date ranges of a specific title have already been digitised, please refer to the ZEFYS newspaper portal.

To retrieve the scanned images of a newspaper page, further IIIF parameters need to be appended to the URL: /full/{width in pixel},/0/default.jpg, where the width in pixels can be chosen freely and the height is calculated automatically, e.g.
https://content.staatsbibliothek-berlin.de/zefys/SNP27974534-19010712-0-1-0-0/full/1200,/0/default.jpg
https://content.staatsbibliothek-berlin.de/zefys/SNP27974534-19010712-0-1-0-0/full/250,/0/default.jpg

The IIIF format allows further image manipulations via the URL. Besides changing the size of the image, it is possible to request a section of the image or to rotate it. The following example delivers a 300 x 300 pixel section rotated by 90°.
https://content.staatsbibliothek-berlin.de/zefys/SNP27974534-19010712-0-1-0-0/1000,1000,300,300/full/90/default.png

It is also possible to retrieve the original TIFF images via IIIF by replacing the width in pixel with full and specifying default.tif instead of default.jpg in the URL as follows:
https://content.staatsbibliothek-berlin.de/zefys/SNP27974534-19010712-0-1-0-0/full/full/0/default.tif

By combining the page number 0 with the ending .xml in the URL, the metadata METS document for each newspaper title can be obtained, e.g.
https://content.staatsbibliothek-berlin.de/zefys/SNP27974534-19010712-0-0-0-0.xml

Further working examples:
https://content.staatsbibliothek-berlin.de/zefys/SNP27974534-19010712-0-1-0-0/full/full/0/default.tif -> TIF, page 1
https://content.staatsbibliothek-berlin.de/zefys/SNP27974534-19010712-0-1-0-0/full/1200,/0/default.jpg -> JPG, page 1
https://content.staatsbibliothek-berlin.de/zefys/SNP27974534-19010712-0-1-0-0.pdf -> PDF, page 1
https://content.staatsbibliothek-berlin.de/zefys/SNP27974534-19010712-0-0-0-0.pdf -> PDF, all pages
https://content.staatsbibliothek-berlin.de/zefys/SNP27974534-19010712-0-1-0-0.xml -> ALTO, page 1
https://content.staatsbibliothek-berlin.de/zefys/SNP27974534-19010712-0-0-0-0.xml -> METS
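The URL schema and the working examples above can be reproduced with a few helper functions. This is a sketch only: the function names are illustrative, and the ZDB-ID is passed in exactly as it appears in the example URLs (27974534).

```python
from datetime import date

ZEFYS_BASE = "https://content.staatsbibliothek-berlin.de/zefys"


def zefys_identifier(zdb_id: str, issue_date: date, issue: str = "0",
                     page: int = 1, article: int = 0, version: int = 0) -> str:
    """Assemble SNP{ZDB-ID}-{YYYYMMDD}-{Issue}-{Page}-{Article}-{Version}."""
    day = f"{issue_date.year:04d}{issue_date.month:02d}{issue_date.day:02d}"
    return f"SNP{zdb_id}-{day}-{issue}-{page}-{article}-{version}"


def zefys_document_url(identifier: str, extension: str) -> str:
    """URL of a document such as METS/ALTO ('xml') or 'pdf'."""
    return f"{ZEFYS_BASE}/{identifier}.{extension}"


def zefys_image_url(identifier: str, width: int) -> str:
    """URL of a JPG scaled to the given width; the height follows automatically."""
    return f"{ZEFYS_BASE}/{identifier}/full/{width},/0/default.jpg"


page_one = zefys_identifier("27974534", date(1901, 7, 12), page=1)
whole_issue = zefys_identifier("27974534", date(1901, 7, 12), page=0)
print(zefys_image_url(page_one, 1200))         # JPG, page 1
print(zefys_document_url(whole_issue, "xml"))  # METS
```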

Full texts

For the project Amtspresse Preußens full texts in different formats can be retrieved.

For the newspaper Teltower Kreisblatt the full texts are delivered in the ALTO format. Compared to the delivery of the METS file, it is necessary to set the letter A for the issue and to use the correct page number in the URL. Then, instead of the METS data, the OCR data in the ALTO format will be delivered for every page:
https://content.staatsbibliothek-berlin.de/zefys/SNP25128437-18580116-A-1-0-0.xml for page 1,
https://content.staatsbibliothek-berlin.de/zefys/SNP25128437-18580116-A-2-0-0.xml for page 2 etc.

For the other newspapers with full text, Provinzial-Correspondenz and Neueste Mittheilungen, the data are stored in a free XML format.
Here the issue has to be specified with the letter F and the page number with 0, because the whole full text is contained in one file.
An example for the delivery of the OCR data for the Provinzial-Correspondenz:
https://content.staatsbibliothek-berlin.de/zefys/SNP9838247-18770117-F-0-0-0.xml
and for the Neueste Mittheilungen:
https://content.staatsbibliothek-berlin.de/zefys/SNP11614109-18930721-F-0-0-0.xml

Conditions of Use

Contact

Kalliope Union Catalog

Kalliope is a Union Catalog for collections of personal papers, manuscripts, and publishers’ archives and the National Information System for these material types.

More than 19,300 holdings from more than 950 institutions with a total of more than three million individual items are currently indexed online. Kalliope contains metadata of correspondence archives, manuscripts, private and professional document files, diaries, family albums, lecture notes, photographs, posters, films, music, but also hair curls, … by and about 600,000 people and 100,000 organizations.

Interfaces

SRU – http://kalliope-verbund.info/sru?version=1.2

SRU, used here in version 1.2, is an HTTP-based protocol for the automatic retrieval of bibliographic data. By using this interface you can use the data for your catalogues, your subject portals, or for digitizing your objects.

The retrieval language used is the Contextual Query Language; it is also used for the expert search. Data from Kalliope are available in the formats Dublin Core (DC, v. 1.1) and Metadata Object Description Schema (MODS, v. 3.4).

Query the SRU interface

Indexes can be queried directly (see index list), e.g:

Documentation of data formats

The overview of the elements for MODS and Dublin Core can be found here  (PDF).

The URL in ./recordIdentifier/@url (MODS format) is a persistent URL. It consists of the domain name http://kalliope-verbund.info/ + record number: http://kalliope-verbund.info/{ID}, e.g. http://kalliope-verbund.info/DE-611-HS-2321418.

Further examples

Conditions for Use

The vast majority of the data are licensed under CC BY-SA. Only licenses that differ are specified in the individual data record.

Contact

Gerhard Müller

Digitised Collections

Introduction

You probably know our Digitised Collections, where currently (November 2020) roughly 175,000 digitized objects from the archives of the SBB are presented online. With a variety of features (for features currently under development, have a look at the beta version of the Digitised Collections), we hope to make it easy and efficient for you to search and browse our digitized objects.

But what if you want to process these data or integrate them into your application? For this purpose we provide different technical interfaces (APIs).

 

Interfaces

Currently the Digitised Collections provide two interfaces, OAI-PMH and IIIF.

1. OAI-PMH – https://oai.sbb.berlin

Retrieval of metadata for objects in the digitised collections is based on the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) standard. A wide range of client applications for OAI-PMH in numerous programming languages are freely available on the web.

The base URL for the OAI-PMH endpoint of the digitised collections of the SBB is
https://oai.sbb.berlin/

Using the 6 verbs provided by OAI-PMH, requests such as the following can be generated

The SBB implements Dublin Core (DC) for basic bibliographic metadata and METS for all metadata about the contents and structure of a digital object.

By combination of OAI-PMH verbs and the DC-Metadata, more specific requests can be formulated such as

The response contains a unique identifier for each digital object, the PPN, e.g. oai:digital.staatsbibliothek-berlin.de:PPN867445300. Using the PPN, additional information about a digital object can be retrieved:
https://oai.sbb.berlin/?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai%3Adigital.staatsbibliothek-berlin.de%3APPN867445300

By changing the metadata-prefix to mets, the complete METS metadata record containing all references to any related files (images, OCR) can be retrieved
https://oai.sbb.berlin/?verb=GetRecord&metadataPrefix=mets&identifier=oai%3Adigital.staatsbibliothek-berlin.de%3APPN867445300
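The GetRecord requests shown here can be generated with a small URL builder. The sketch below only constructs URLs; the function names are illustrative. The six verb names are those defined by the OAI-PMH 2.0 specification, and the identifier prefix is taken from the example above.

```python
from urllib.parse import urlencode

OAI_BASE = "https://oai.sbb.berlin/"

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_identifier(ppn: str) -> str:
    """Turn a PPN such as 'PPN867445300' into the OAI identifier used by the SBB."""
    return f"oai:digital.staatsbibliothek-berlin.de:{ppn}"


def build_oai_url(verb: str, **arguments: str) -> str:
    """Build a request URL for one OAI-PMH verb and its arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"{verb!r} is not an OAI-PMH verb")
    return f"{OAI_BASE}?{urlencode({'verb': verb, **arguments})}"


print(build_oai_url("GetRecord", metadataPrefix="mets",
                    identifier=oai_identifier("PPN867445300")))
```

urlencode percent-encodes the colons in the identifier, so the printed URL has the same form as the GetRecord examples in the text.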

The METS file contains a section <fileSec> which holds child elements of the type <fileGrp> which contain references to various files that belong to the digital object, typically images in either JPG or PNG format.
https://content.staatsbibliothek-berlin.de/dc/PPN867445300-00000001/full/full/0/default.jpg
https://content.staatsbibliothek-berlin.de/dc/PPN867445300-00000001/full/full/0/default.png
https://content.staatsbibliothek-berlin.de/dc/PPN867445300-00000001/full/full/0/default.tif
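To illustrate how the file references can be read from a METS record, here is a sketch that collects the link targets per file group. The sample document is a shortened, hand-written stand-in, and its USE value is an assumption. The namespaces are the standard METS and XLink namespaces.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hand-written sample; a real record holds many more files and groups.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                            xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="DEFAULT">
      <mets:file ID="FILE_0001" MIMETYPE="image/jpeg">
        <mets:FLocat LOCTYPE="URL" xlink:href="https://content.staatsbibliothek-berlin.de/dc/PPN867445300-00000001/full/full/0/default.jpg"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def file_references(mets_xml) -> dict:
    """Map the USE attribute of each fileGrp to the list of its file URLs."""
    root = ET.fromstring(mets_xml)
    references = {}
    # iter() also finds fileGrp elements nested inside an OAI-PMH envelope.
    for group in root.iter(f"{{{METS_NS}}}fileGrp"):
        references[group.get("USE", "")] = [
            locat.get(f"{{{XLINK_NS}}}href")
            for locat in group.iter(f"{{{METS_NS}}}FLocat")
        ]
    return references


print(file_references(SAMPLE_METS))
```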

2. IIIF

Retrieval of content (images and full text) for the digitised collections is supported via the International Image Interoperability Framework (IIIF) protocol. Here, too, a growing number of free clients and libraries for IIIF in numerous programming languages are available on the web.

Currently, digitised images, metadata and fulltext data can be retrieved by requests following this schema:
https://content.staatsbibliothek-berlin.de/dc/{PPN}-{Page}

The PPN is a unique ID for every work that can be found in the digitised collections.

To get scanned images for a specific object, further parameters following the IIIF protocol have to be provided in the URL:
/full/{width in pixel},/0/default.jpg, where the height is adjusted automatically to the given width in pixels, e.g.
https://content.staatsbibliothek-berlin.de/dc/PPN867445300-00000001/full/1200,/0/default.jpg
https://content.staatsbibliothek-berlin.de/dc/PPN867445300-00000001/full/800,/0/default.jpg
https://content.staatsbibliothek-berlin.de/dc/PPN867445300-00000001/full/250,/0/default.jpg

The IIIF protocol permits further image manipulations via the URL, e.g. cropping, resizing and rotating the image. The following example returns an image section of 300 x 300 px, rotated by 90°.
https://content.staatsbibliothek-berlin.de/dc/PPN867445300-00000001/100,100,300,300/full/90/default.png

Furthermore, it is possible to get the original TIFF image. Just change default.jpg to default.tif in the URL:
https://content.staatsbibliothek-berlin.de/dc/PPN867445300-00000001/full/full/0/default.tif

For even more ways of manipulating single images, have a closer look at the IIIF Image API 2.1.1, which is implemented by the content server.
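The image URLs in this section all follow the IIIF Image API pattern {region}/{size}/{rotation}/{quality}.{format}. A minimal sketch of a builder for them follows; the function name and defaults are illustrative, and the eight-digit zero padding of the page number is inferred from the example URLs.

```python
DC_BASE = "https://content.staatsbibliothek-berlin.de/dc"


def iiif_image_url(ppn: str, page: int, region: str = "full", size: str = "full",
                   rotation: int = 0, quality: str = "default",
                   image_format: str = "jpg") -> str:
    """Build an IIIF Image API URL for one page of a digitised work.

    region: 'full' or 'x,y,width,height' in pixels
    size:   'full' or '{width},' (the height is then calculated by the server)
    """
    return (f"{DC_BASE}/{ppn}-{page:08d}/"
            f"{region}/{size}/{rotation}/{quality}.{image_format}")


# Page 1 scaled to a width of 1200 pixels.
print(iiif_image_url("PPN867445300", 1, size="1200,"))
# A 300 x 300 pixel section, rotated by 90 degrees, delivered as PNG.
print(iiif_image_url("PPN867445300", 1, region="100,100,300,300",
                     rotation=90, image_format="png"))
```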

Additionally, the content server can deliver more data for specific works. An overview of the additional functions can be found in the NGCS routes documentation. This includes, for example, dynamic highlighting on the images. The highlighted areas are defined in the same way as sections of the image. As a further parameter, a colour can be specified as a hex code: https://content.staatsbibliothek-berlin.de/dc/PPN646236717-00000011/full/1200,/0/default.jpg?highlight=55,100,120,100|1150,460,110,80&highlightColor=ff0000

As specified in the IIIF Presentation API the IIIF manifest file of the object can be retrieved with the URL:
https://content.staatsbibliothek-berlin.de/dc/PPN867445300/manifest

This manifest can be loaded in every IIIF viewer, e.g. in the Mirador viewer, hosted by the SBB:
https://mirador.staatsbibliothek-berlin.de/?manifest=https://content.staatsbibliothek-berlin.de/dc/PPN897443810/manifest&manifest=https://content.staatsbibliothek-berlin.de/PPN876457189/manifest

In addition to the manifest, the metadata can also be retrieved in the METS/MODS format:
https://content.staatsbibliothek-berlin.de/dc/PPN867445300.mets.xml
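Once a manifest has been downloaded, the image of every page can be listed by walking its structure. The sketch below assumes the layout of the IIIF Presentation API 2.x (sequences, canvases, images, resource); the text does not state which version the server delivers, so treat this as an assumption to verify against a real manifest. The sample manifest is a heavily shortened, hand-written stand-in.

```python
import json

# Hand-written sample in the Presentation API 2.x layout (an assumption).
SAMPLE_MANIFEST = json.loads("""
{
  "@id": "https://content.staatsbibliothek-berlin.de/dc/PPN867445300/manifest",
  "sequences": [
    {"canvases": [
      {"label": "1",
       "images": [{"resource": {"@id": "https://content.staatsbibliothek-berlin.de/dc/PPN867445300-00000001/full/full/0/default.jpg"}}]}
    ]}
  ]
}
""")


def canvas_image_urls(manifest: dict) -> list:
    """Collect the image URL of every canvas, in reading order."""
    urls = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                urls.append(image["resource"]["@id"])
    return urls


print(canvas_image_urls(SAMPLE_MANIFEST))
```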

Fulltexts

By adding the page number and the suffix ocr.xml to the URL, the OCR file in ALTO format will be delivered for each page:
https://content.staatsbibliothek-berlin.de/dc/PPN867445300-0009.ocr.xml for page 9,
https://content.staatsbibliothek-berlin.de/dc/PPN867445300-0010.ocr.xml for page 10 etc.

The OCR files can also be downloaded as a complete set, packed in a ZIP file:
https://content.staatsbibliothek-berlin.de/dc/PPN867445300.ocr.zip
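An ALTO file stores the recognised text word by word in String elements grouped into TextLine elements. As a rough sketch, the plain text can be recovered as follows. The sample is hand-written, and because the ALTO namespace version of the delivered files is not stated in the text, the code compares local element names only.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hand-written sample; the namespace version is an assumption.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Staatsbibliothek"/><SP/><String CONTENT="zu"/>
    </TextLine>
    <TextLine>
      <String CONTENT="Berlin"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts before tag names."""
    return tag.rsplit("}", 1)[-1]


def alto_text_lines(alto_xml) -> List[str]:
    """Return one string per TextLine, joining the words with single spaces."""
    root = ET.fromstring(alto_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) == "TextLine":
            words = [child.get("CONTENT", "") for child in element
                     if local_name(child.tag) == "String"]
            lines.append(" ".join(words))
    return lines


print(alto_text_lines(SAMPLE_ALTO))  # ['Staatsbibliothek zu', 'Berlin']
```

Hyphenation elements (HYP) are ignored in this sketch, so words split across two lines stay split.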

Conditions of Use

SBB pursues an Open Data Policy and endeavours to make all digitised works published before 1920 available to the public under a Public Domain Mark 1.0 licence. In exceptional cases and for works published later than 1920, different licences may be used.

You can identify the licence that applies to an object in the Digitised Collections by displaying the complete bibliographic information about the object and scrolling down to the entry license / rights info.

Of course, you can also find this information in the metadata in METS format under <mods:accessCondition>
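Reading the licence from the METS metadata could look like the following sketch. The sample record is hand-written and shortened, and the type attribute value in it is an assumption; the namespace is the standard MODS version 3 namespace.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Hand-written, shortened sample record.
SAMPLE_RECORD = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                              xmlns:mods="http://www.loc.gov/mods/v3">
  <mets:dmdSec ID="DMDLOG_0001">
    <mets:mdWrap MDTYPE="MODS"><mets:xmlData><mods:mods>
      <mods:accessCondition type="use and reproduction">Public Domain Mark 1.0</mods:accessCondition>
    </mods:mods></mets:xmlData></mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""


def access_conditions(mets_xml) -> list:
    """Return (type, text) pairs for every mods:accessCondition element."""
    root = ET.fromstring(mets_xml)
    return [
        (element.get("type"), (element.text or "").strip())
        for element in root.iter(f"{{{MODS_NS}}}accessCondition")
    ]


print(access_conditions(SAMPLE_RECORD))
# [('use and reproduction', 'Public Domain Mark 1.0')]
```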

Special data sets

For the Hackathon Coding Gender: Women In Cultural Data, which took place at the end of August 2019, thematic datasets were provided, which are described and listed here.

Contact