2.5 Linked Open Data and Data Publishing

This section is under development.

While many databases, services, or museums expose their data via a web API, that approach has limitations. Matthew Lincoln has an excellent tutorial at The Programming Historian that walks us through some of the differences, but the key one lies in the way the data is represented. When data is described using the Resource Description Framework (RDF), the resource (the ‘thing’) is described via a series of relationships, rather than as rows in a table or keys having values.

Information is in the relationships. It’s a network. It’s a graph. Thus, every ‘thing’ in this graph can have its own uniform resource identifier (URI) that lives as a location on the internet. Information can then be created by making statements that use these URIs, similarly to how English grammar creates meaning: subject verb object. Or, in RDF-speak, ‘subject predicate object’, also known as a triple. In this way, data in different places can be linked together by referencing the elements they have in common. This is Linked Open Data (LOD). The access point for interrogating LOD is called an ‘endpoint’.
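To make the ‘subject predicate object’ idea concrete, here is a minimal sketch in Python that treats each statement as a three-part tuple. Every URI in it is a hypothetical placeholder under example.org, and the two lists stand in for data published by two different institutions; real linked open data would be stored and queried with dedicated tools rather than plain lists.

```python
# Each statement is a (subject, predicate, object) triple.
# All URIs below are hypothetical placeholders, not real identifiers.
museum = [
    ("http://example.org/object/1",
     "http://example.org/vocab/madeOf",
     "http://example.org/material/bronze"),
    ("http://example.org/object/1",
     "http://example.org/vocab/foundAt",
     "http://example.org/place/athens"),
]
gazetteer = [
    ("http://example.org/place/athens",
     "http://example.org/vocab/label",
     "Athens"),
    ("http://example.org/place/athens",
     "http://example.org/vocab/locatedIn",
     "http://example.org/place/greece"),
]


def describe(graph, subject):
    """Return every (predicate, object) pair stated about a subject."""
    return [(p, o) for s, p, o in graph if s == subject]


def follow(graph, subject, predicate):
    """Return the objects reached from a subject along one predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


# Linking: because both sources use the same URI for Athens,
# their statements can simply be pooled into one graph.
merged = museum + gazetteer

places = follow(merged, "http://example.org/object/1",
                "http://example.org/vocab/foundAt")
labels = [label
          for place in places
          for label in follow(merged, place, "http://example.org/vocab/label")]
print(labels)
```

The museum's data never states the name of the findspot; that information only becomes reachable once the gazetteer's statements about the shared URI are added to the same graph.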

Finally, SPARQL is an acronym for SPARQL Protocol and RDF Query Language (yes, it’s one of those kinds of acronyms). It is the language in which queries to an endpoint are written.
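As a rough illustration of what a SPARQL query looks like and how it might travel to an endpoint, the sketch below builds the web address for a request without actually sending it. The endpoint address is a hypothetical placeholder, and `build_request_url` is a helper invented for this example; the one detail taken from the SPARQL protocol itself is that a query sent by HTTP GET goes in a parameter named `query`.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address; substitute a real one to try this out.
ENDPOINT = "https://example.org/sparql"

# A SPARQL query mirrors the shape of a triple. Names beginning with "?"
# are variables that the endpoint fills in with whatever matches.
QUERY = """
SELECT ?subject ?predicate ?object
WHERE {
  ?subject ?predicate ?object .
}
LIMIT 10
"""


def build_request_url(endpoint, query):
    """Encode a query as the 'query' parameter of an HTTP GET request."""
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

This particular query asks for any ten triples at all, which is a common first step when exploring an unfamiliar endpoint. Replacing a variable with a specific URI narrows the pattern to statements about that one thing.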

In the notebook for this section, we’re not using Python or R directly. Instead, we’ve set up a ‘kernel’ (think of that as the ‘engine’ for the notebook) that already includes everything necessary to set up and run SPARQL queries. (For reference, the kernel code is here.) Both R and Python can interact with and query endpoints, and manipulate linked open data, but for the sake of learning a bit of what one can do with SPARQL, this notebook keeps all of that ancillary code tucked away. The follow-up notebook shows you how to use R to do some basic manipulations of the query results.

The SPARQL endpoint for the British Museum has been more or less abandoned by that institution. While you will still learn much about how SPARQL and LOD work by studying our notebook, we regret that we can’t guarantee full functionality.