Bibliographic Data from StaBiKat
Introduction
Our online catalogue StaBiKat contains the metadata for the complete printed and digital collections of the Staatsbibliothek zu Berlin, covering publication years from 1500 to the present. It currently holds about 14 million searchable records.
StaBiKat data (excerpt) – https://zenodo.org/record/2590752
Are you interested in working with basic bibliographic data from our online catalogue StaBiKat? Sets of records organized along language families are already available. The data sets consist of the most important metadata including PPN (catalogue identifier), author, title, place/country of publication, publisher, year of publication and language code. The data sets do not include the full metadata, nor do they represent the content of the catalogue in full. Moreover, you should take note of the date of the last update. If you require more up-to-date data, you can use the following scripts to create your own data sets.
What information is missing here?
• Records without a language code (slightly less than half of all SBB records)
• Shelf marks and location identifiers (e.g. to identify items lost in the war)
• Existence of digital versions (including their PURLs)
• Data sets selected on the basis of subject criteria or year of publication
These data sets are provided by the Gemeinsamer Bibliotheksverbund (GBV), the library network of which Stiftung Preussischer Kulturbesitz has been a member for over 20 years.
Interfaces
StaBiKat does not directly support interfaces for exporting large quantities of data. For specific queries, however, you can use the SRU or unAPI interfaces of the GBV network. You may have to get in touch with the network office of GBV.
SRU – http://sru.k10plus.de/opac-de-1
SRU, used here in version 1.1, is an HTTP-based protocol for machine queries of bibliographic data. You can use this interface to import data into your catalogues or subject gateways, or when digitizing your objects.
The query language is the Contextual Query Language (CQL). You can use the StaBiKat SRU interface to run concise queries that yield a limited set of results. Metadata are provided in the Dublin Core (DC, v 1.1) and MODS (v 3.4) formats.
Search syntax and indexes: http://sru.k10plus.de/opac-de-1
Example queries:
• A maximum of 10 titles in StaBiKat that contain the words "Pupillen" and "Edict", in MODS format
http://sru.k10plus.de/opac-de-1?version=1.1&operation=searchRetrieve&query=pica.xtit=pupillen+edict&maximumRecords=10&recordSchema=mods
• Person search for Konrad Adenauer through the full StaBiKat database, output in Dublin Core, with a maximum of 300 results
http://sru.k10plus.de/opac-de-1?version=1.1&operation=searchRetrieve&query=pica.xprs=adenauer,konrad&maximumRecords=300&recordSchema=dc
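The two example queries above can also be assembled programmatically. The following is a minimal sketch in Python using only the standard library. The function names `build_sru_url` and `count_hits` are illustrative and not part of any GBV tooling, and the sketch assumes that the server returns the standard SRU 1.1 response envelope (namespace `http://www.loc.gov/zing/srw/`).

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

SRU_BASE = "http://sru.k10plus.de/opac-de-1"  # StaBiKat endpoint named in the text
SRW_NS = {"srw": "http://www.loc.gov/zing/srw/"}  # assumption: standard SRU 1.1 namespace


def build_sru_url(query, schema="mods", maximum_records=10, base=SRU_BASE):
    """Return a searchRetrieve URL for a CQL query string (hypothetical helper)."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": query,
        "maximumRecords": maximum_records,
        "recordSchema": schema,
    }
    # urlencode percent-encodes "=" and "," inside the query and turns spaces into "+"
    return base + "?" + urlencode(params)


def count_hits(response_xml):
    """Read the total hit count (numberOfRecords) from an SRU response."""
    root = ET.fromstring(response_xml)
    node = root.find("srw:numberOfRecords", SRW_NS)
    return int(node.text) if node is not None else 0
```

For instance, `build_sru_url("pica.xprs=adenauer,konrad", schema="dc", maximum_records=300)` yields a percent-encoded form of the person search above. The response body could then be fetched with `urllib.request.urlopen(url).read()` and passed to `count_hits`.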
SRU – K10-PLUS Network Catalogue
The corresponding queries at the level of the GBV network are:
http://sru.k10plus.de/gvk7?version=1.1&operation=searchRetrieve&query=pica.tit=pupillen+edict&maximumRecords=50&recordSchema=mods
and
http://sru.k10plus.de/gvk7?version=1.1&operation=searchRetrieve&query=pica.prs=adenauer,konrad&maximumRecords=300&recordSchema=dc
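When a query matches more records than one request returns, SRU 1.1 defines a `startRecord` parameter (1-based) that can be combined with `maximumRecords` to walk through the result set page by page. The sketch below only generates the request URLs. The helper name `page_urls` is hypothetical, and whether the server enforces its own upper limit on `maximumRecords` is not stated here, so the page size of 50 is an assumption.

```python
from urllib.parse import urlencode

GVK_BASE = "http://sru.k10plus.de/gvk7"  # network catalogue endpoint named in the text


def page_urls(query, total, page_size=50, schema="mods", base=GVK_BASE):
    """Yield one searchRetrieve URL per page until `total` records are covered.

    `total` would typically come from numberOfRecords in a first response.
    """
    for start in range(1, total + 1, page_size):
        params = {
            "version": "1.1",
            "operation": "searchRetrieve",
            "query": query,
            "startRecord": start,  # 1-based offset defined by SRU 1.1
            "maximumRecords": min(page_size, total - start + 1),
            "recordSchema": schema,
        }
        yield base + "?" + urlencode(params)
```

This approach suits moderately sized result sets only. For bulk exports, contacting the GBV network office remains the recommended route.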
unAPI
unAPI provides a straightforward web-based method to retrieve individual records in different formats. The unAPI interface does not support searches across whole data collections; it only returns individual records referenced by an identifier. Each query therefore has to include an unambiguous identifier for the respective record and the required metadata format (cf. https://wiki.k10plus.de/display/K10PLUS/UnAPI, first paragraph).
If you want to download individual records whose PPN you know, you can use the unAPI interfaces of StaBiKat and the GBV network catalogue as follows:
StaBiKat – http://unapi.k10plus.de/?id=opac-de-1
Syntax:
http://unapi.k10plus.de/?id=opac-de-1:ppn:##########&format=dc
Example:
http://unapi.k10plus.de/?id=opac-de-1:ppn:1000127265&format=dc
Network catalogue unAPI – http://unapi.k10plus.de/
Syntax:
http://unapi.k10plus.de/?id=gvk:ppn:##########&format=mods
Example:
http://unapi.k10plus.de/?id=gvk:ppn:178293199&format=mods
Insert the PPN in place of ########## and select the output format (e.g. "mods"). The GBV network catalogue can alternatively deliver data in the Pica, Dublin Core and MARC formats. Please note that the unAPI interface of StaBiKat only returns results in the Dublin Core and MODS formats.
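The URL pattern above is easy to wrap in a small helper. The following Python sketch uses only the standard library. The names `unapi_url` and `fetch_record` are illustrative, and the database keys `opac-de-1` and `gvk` and the format keys `dc` and `mods` are taken from the examples in this section.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

UNAPI_BASE = "http://unapi.k10plus.de/"  # unAPI endpoint named in the text


def unapi_url(ppn, database="opac-de-1", fmt="dc"):
    """Build the unAPI URL for one record, identified by database key and PPN."""
    params = {"id": f"{database}:ppn:{ppn}", "format": fmt}
    # safe=":" keeps the colons in the identifier readable, as in the examples
    return UNAPI_BASE + "?" + urlencode(params, safe=":")


def fetch_record(ppn, database="opac-de-1", fmt="dc", timeout=30):
    """Download one record and return the raw response bytes (hypothetical helper)."""
    with urlopen(unapi_url(ppn, database, fmt), timeout=timeout) as response:
        return response.read()
```

Calling `unapi_url("1000127265")` reproduces the StaBiKat example URL, and `unapi_url("178293199", database="gvk", fmt="mods")` reproduces the network catalogue example.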
Conditions of Use
SBB pursues an Open Data policy and provides its metadata free of charge under the CC0 licence. The conditions of use for the interface service are defined by GBV.
Example Dataset: Metadata of the “Alter Realkatalog” (ARK) of Berlin State Library (SBB)
The dataset comprises descriptive metadata for 2,619,397 titles, which together form the “Alter Realkatalog” of Berlin State Library, a name that may be translated as “Old Subject Catalogue”. The data are stored in a columnar format with 375 columns. They were downloaded in December 2023 from the German central library system (CBS). Example tasks this dataset can support include studies on the history of books between 1501 and 1955, on the paratextual formatting of scientific books between 1800 and 1955, and on pattern recognition based on bibliographical metadata.
Zenodo DOI: https://zenodo.org/doi/10.5281/zenodo.12783813
Licence: Creative Commons Attribution 4.0 International