Linking data through semantic relationships is a critical part of building a useful knowledge graph. To do this there are a number of important principles to understand:
- How graphs join data entities using linked-data principles
- Creating relationships in your domain or ontology model
- How to load related Concepts into that model using the Data Graphs app
- How to load related Concepts into Data Graphs in bulk using a CSV file
- How to load related Concepts into Data Graphs in bulk via the API
Linked Data Principles
Linked-data patterns use the identity of entities to connect them; this ID usually takes the form of a URI or a URN.
In Data Graphs, each entity – known as a Concept – is identified by a URN, which is globally unique in your knowledge graph and takes the form urn:{project}:{type}:{id}. For example, urn:demo:Person:12345.
Thus two Concepts, a Person and an Organization, can be linked together via their IDs as follows:
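As a rough illustration of that linkage, the sketch below builds two Concepts as plain Python dictionaries and connects them by URN. The make_urn helper, the "id" field name, and the Organization ID 67890 are assumptions made for the example; they are not taken from the Data Graphs API.

```python
def make_urn(project: str, concept_type: str, concept_id: str) -> str:
    """Build a Concept URN of the form urn:{project}:{type}:{id}."""
    return f"urn:{project}:{concept_type}:{concept_id}"


# Hypothetical Organization; the ID 67890 is invented for illustration.
organization = {
    "id": make_urn("demo", "Organization", "67890"),
    "type": "Organization",
    "name": "Apple",
}

# The Person points at the Organization by holding its URN,
# not by embedding a copy of the Organization's data.
person = {
    "id": make_urn("demo", "Person", "12345"),
    "type": "Person",
    "name": "Jane Doe",
    "worksFor": [organization["id"]],
}
```

Because the link is only an identifier, the Organization can be edited independently and every Person that references it stays connected.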
Creating Relationships in the Domain Model
Creating relationships in your Data Graphs domain model is easy; it just requires defining your Concept types in the right order.
Relationships are defined between Concepts: a Concept owns a relationship when one of its properties is assigned another Concept as its datatype. In the above example – where a Person worksFor an Organization – the relationship is defined by adding the property worksFor to Person with the datatype Organization. (With this in mind, it is important to define Organization before creating this property on Person.)
Once you have added the property, Data Graphs knows that it belongs to Person and that only Organization Concepts – or Concepts whose type is a subclass of Organization – can be assigned to it.
In the RDF world, this principle is like assigning domains and ranges to properties in an ontology: the domain of the property worksFor is the class Person and the range is the class Organization.
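The domain and range idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the SUBCLASS_OF and PROPERTIES tables, the PoliticalParty class name, and the can_link function are hypothetical and do not reflect how Data Graphs stores or checks its model internally.

```python
# Hypothetical class hierarchy: each class maps to its parent (None for a root).
SUBCLASS_OF = {
    "Person": None,
    "Organization": None,
    "PoliticalParty": "Organization",
}

# Hypothetical property table: property name -> (domain class, range class).
PROPERTIES = {"worksFor": ("Person", "Organization")}


def is_a(concept_type, ancestor):
    """Return True if concept_type is ancestor or one of its subclasses."""
    while concept_type is not None:
        if concept_type == ancestor:
            return True
        concept_type = SUBCLASS_OF.get(concept_type)
    return False


def type_of(urn):
    """Extract the type segment from urn:{project}:{type}:{id}."""
    return urn.split(":", 3)[2]


def can_link(subject_type, prop, target_urn):
    """Check the subject against the domain and the target against the range."""
    domain, range_ = PROPERTIES[prop]
    return is_a(subject_type, domain) and is_a(type_of(target_urn), range_)
```

With these tables, a Person can be linked by worksFor to an Organization or a PoliticalParty, but not to another Person, because Person is not within the range of the property.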
Loading Linked Data via the Data Graphs app
Once a relationship is defined in your Data Graphs domain model, the Add New Concept (or Edit Concept) screen knows which types (classes) of Concept can be used for which property. In essence, Data Graphs is range-aware: it understands the semantics of the properties in your model.
For example, when creating a new Person instance, Data Graphs will only allow you to select an Organization to populate the worksFor field. Of course, if you have not already loaded any Organization instances, there will be none available.
Data Graphs is also aware of class inheritance, so if you have created subclasses of an Organization, such as Political Party, the editor will let you choose concepts that are instances of those subclasses as well.
Saving the above example creates a Person (Jane Doe) connected to an Organization (Apple) by the worksFor property, illustrated below in the Graph Explorer:
Loading Linked Data via CSV
Loading data in bulk with Data Graphs' CSV importer also uses the Concept identifiers to make the relationship joins. To illustrate this process, let's assume that we have three Organization Concepts as follows:
Organization name | Organization ID |
Apple | 59k0AZ8rcripmaLcL2BisK |
Amazon | 5ir3poyfNQLjjsra50BDRI |
Microsoft | 5qnrcsngabk85QSrrT4M1E |
Our CSV lists people that work for these organizations. It has two columns, which contain the person's name and the identifier of the company they work for:
When loading this CSV into our Dataset with the CSV Importer, we need to identify the data as Person Concepts and map the worksFor column of the CSV to the worksFor property of the Person type:
This way, Data Graphs can automatically establish the relationships between each person and the organization(s) they work for. (Note that Carol James and Billy Crane both work for two organizations, so their worksFor column has the identifiers of both companies in a quoted array.)
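To make the shape of such a file concrete, here is a small Python sketch that reads a two-column CSV and splits the worksFor cell into individual identifiers. The column header "name" and the exact cell syntax (a double-quoted, bracketed, comma-separated list) are assumptions made for this example; check the CSV Importer documentation for the format it actually accepts. The code only illustrates how one quoted cell can carry several IDs; it is not the importer's implementation.

```python
import csv
import io

# Assumed layout: one single-valued row and one multi-valued row.
# The multi-valued cell is quoted so its commas are not read as column breaks.
SAMPLE_CSV = """name,worksFor
Jane Doe,5ir3poyfNQLjjsra50BDRI
Carol James,"[5qnrcsngabk85QSrrT4M1E,5ir3poyfNQLjjsra50BDRI]"
"""


def parse_ids(cell):
    """Split a worksFor cell into a list of Organization identifiers."""
    cell = cell.strip()
    if cell.startswith("[") and cell.endswith("]"):
        cell = cell[1:-1]  # drop the surrounding brackets of the array
    return [part.strip() for part in cell.split(",") if part.strip()]


def rows_to_links(csv_text):
    """Map each person's name to the list of Organization IDs they work for."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["name"]: parse_ids(row["worksFor"]) for row in reader}


links = rows_to_links(SAMPLE_CSV)
```

Here links maps each person to one or more Organization identifiers, which is the information the importer needs in order to create the worksFor relationships.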
Once the import completes, our People graph is as follows:
Loading linked concepts using the API
When loading Concepts using the Data Graphs API, you need to send a POST request to the Create Concept(s) endpoint, specifying the key of the Dataset where the Concept type is defined.
The Concepts themselves should be defined as fully formed JSON in the request body. Relationship properties must identify the Concept being connected by its entire URN. To create Person Concepts in our worksFor example, the JSON would be as follows (assuming the Person Concept is defined with automatically assigned identifiers):
[
  {
    "type": "Person",
    "name": "Jane Doe",
    "worksFor": [ "urn:test:Organization:5ir3poyfNQLjjsra50BDRI" ]
  },
  {
    "type": "Person",
    "name": "Jim Smith",
    "worksFor": [ "urn:test:Organization:59k0AZ8rcripmaLcL2BisK" ]
  },
  {
    "type": "Person",
    "name": "John Jones",
    "worksFor": [ "urn:test:Organization:59k0AZ8rcripmaLcL2BisK" ]
  },
  {
    "type": "Person",
    "name": "Carol James",
    "worksFor": [
      "urn:test:Organization:5qnrcsngabk85QSrrT4M1E",
      "urn:test:Organization:5ir3poyfNQLjjsra50BDRI"
    ]
  }
]
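A request carrying this body could be assembled with Python's standard library roughly as follows. This is a minimal sketch, not the official client: build_create_request is a hypothetical helper, and the endpoint URL and any authentication headers are deliberately left as parameters because they must come from the Data Graphs API reference and your own account. The example URL below is a placeholder, and the request is only constructed, never sent.

```python
import json
import urllib.request


def build_create_request(endpoint_url, headers, concepts):
    """Build (but do not send) a POST request whose body is a JSON array of Concepts.

    endpoint_url and headers are supplied by the caller; this sketch does not
    assume any particular Data Graphs URL path or authentication scheme.
    """
    body = json.dumps(concepts).encode("utf-8")
    all_headers = {"Content-Type": "application/json", **headers}
    return urllib.request.Request(
        endpoint_url, data=body, headers=all_headers, method="POST"
    )


# Two of the Person Concepts from the example, linked to Organizations by full URN.
concepts = [
    {
        "type": "Person",
        "name": "Jane Doe",
        "worksFor": ["urn:test:Organization:5ir3poyfNQLjjsra50BDRI"],
    },
    {
        "type": "Person",
        "name": "Jim Smith",
        "worksFor": ["urn:test:Organization:59k0AZ8rcripmaLcL2BisK"],
    },
]

# Placeholder URL: replace with the real Create Concept(s) endpoint for your Dataset.
request = build_create_request("https://example.invalid/create-concepts", {}, concepts)

# To actually send it, pass the request to urllib.request.urlopen(request).
```

Keeping request construction separate from sending makes it easy to inspect the payload, in particular that every worksFor value is a complete URN, before anything reaches the API.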