
Creating a Dataset

To create a Dataset, click the create tile on the Data Graphs app homepage. You will be guided through a set of steps in which you define the types of Concepts you want your Dataset to contain.

Step 1: Dataset settings

Choose an appropriate, unambiguous name for the Dataset; for example, name it after the collection of Concepts that it will contain. The namespace for the Dataset is a URN and is set automatically (urn:test:countries in this example). The namespace is important because it is used as part of the API endpoint URL if you want to access this Dataset via the API.
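As a rough illustration of why the namespace matters for API access, the sketch below splits a namespace URN into its parts and percent-encodes it for use as a single URL path segment. The base URL and path layout are hypothetical placeholders and are not the actual Data Graphs API routes; only the namespace urn:test:countries comes from the example.

```python
from urllib.parse import quote

# Hypothetical base URL, used only for illustration.
BASE_URL = "https://api.example.com/datasets"


def split_namespace(namespace: str) -> list:
    """Split a URN such as 'urn:test:countries' into its colon-separated parts."""
    parts = namespace.split(":")
    if len(parts) < 3 or parts[0] != "urn" or not all(parts):
        raise ValueError(f"not a valid namespace URN: {namespace!r}")
    return parts


def dataset_url(namespace: str) -> str:
    """Build an assumed endpoint URL with the namespace as one encoded path segment."""
    split_namespace(namespace)  # validate before use
    encoded = quote(namespace, safe="")  # ':' becomes '%3A'
    return f"{BASE_URL}/{encoded}"


print(split_namespace("urn:test:countries"))
print(dataset_url("urn:test:countries"))
```

Whatever the real route looks like, the point is the same: a client needs the exact namespace string to address the Dataset.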

Choose whether the Dataset is public or private with respect to API access. A public Dataset can be accessed through the API from any client on the internet using an API key. A private Dataset can only be accessed through the API using OAuth / OpenID authentication.

When you are happy with the main settings, click Add Types to configure the Concept types you want the Dataset to contain.

Step 2: Add a Concept type

Next, add one or more types of Concepts that you want to include in the Dataset. We call this the Dataset Schema. You can add more Concept types to an existing Dataset Schema later.

Choose the first type you want to add: select an existing type you have already defined, select one of the types we have defined for you, or create a brand new type. In the example below we are naming a new type Country, as we want to include Country Concepts in the Dataset. When you are happy, click the Add Type button.

Step 3: Customise the type


Now choose whether this Concept type is a sub-class of an existing type. If it is, select the Parent Type from the drop-down list of existing types. If you choose a parent type, you can then choose to include the properties from the parent Concept type in this new (child) type. Typically you should choose to inherit the parent Concept properties.
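The effect of inheriting parent properties can be pictured with a small model. This is only a sketch: the ConceptType class, the Place parent type, and the rule that a child property overrides a same-named parent property are all assumptions made for illustration, not the Data Graphs data model.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ConceptType:
    """Illustrative Concept type: a name plus a map of property name to data type."""
    name: str
    properties: Dict[str, str] = field(default_factory=dict)
    parent: Optional["ConceptType"] = None
    inherit_parent_properties: bool = True

    def effective_properties(self) -> Dict[str, str]:
        """Return own properties plus inherited ones when inheritance is enabled."""
        inherited: Dict[str, str] = {}
        if self.parent is not None and self.inherit_parent_properties:
            inherited = self.parent.effective_properties()
        # Assumption: a property declared on the child wins over an inherited one.
        return {**inherited, **self.properties}


# "Place" is a hypothetical parent type used only for this example.
place = ConceptType("Place", {"latitude": "decimal", "longitude": "decimal"})
country = ConceptType("Country", {"longName": "text"}, parent=place)
print(country.effective_properties())
```

With inheritance switched off, the child type would expose only the properties declared on it directly.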

Automatic or manual identifiers

You must choose whether to use manual or automatic identifiers. If you choose manual, you will need to supply an identifier of your choosing for every Concept of this type that you create. The identifier must be unique with respect to the class / type of Concept. If you choose automatic, you will not need to supply identifiers for each Concept; instead, Data Graphs will generate a globally unique identifier for you.
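The difference between the two modes can be sketched with a toy in-memory store for a single Concept type. The ConceptStore class is hypothetical, and uuid4 merely stands in for whatever scheme Data Graphs actually uses to generate identifiers.

```python
import uuid
from typing import Dict, Optional


class ConceptStore:
    """Toy store for one Concept type; identifiers are unique within that type."""

    def __init__(self, manual_ids: bool) -> None:
        self.manual_ids = manual_ids
        self.concepts: Dict[str, str] = {}  # identifier -> label

    def add(self, label: str, identifier: Optional[str] = None) -> str:
        if self.manual_ids:
            # Manual mode: the caller must supply an identifier that is not yet used.
            if identifier is None:
                raise ValueError("manual identifiers: an identifier is required")
            if identifier in self.concepts:
                raise ValueError(f"identifier already in use: {identifier!r}")
        else:
            # Automatic mode: any supplied identifier is ignored and one is generated.
            identifier = str(uuid.uuid4())
        self.concepts[identifier] = label
        return identifier


manual = ConceptStore(manual_ids=True)
print(manual.add("France", identifier="FR"))

automatic = ConceptStore(manual_ids=False)
print(automatic.add("France"))
```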

Defining the Concept properties

Now add the properties to your Concept schema. By default each Concept has an id and a label. The id property name cannot be changed, but you can change the label property name to something else (e.g. name is a good alternative).

Now you can define the custom properties that will describe each instance of your Concept. In our Country example a Country may have a longName, a latitude and longitude, a capital city and a flagImage.

For each property you must choose its data type. Property data types can be primitive types such as text, decimal, integer or boolean, or specialised data types such as latitude / longitude. You can also make a relationship property that references a different Concept type, which may exist in this or any other Dataset.

For each property, you can choose modifiers that control whether the property:

  • is optional – you do not need to supply a value for an optional field when editing data
  • is identifier – if you have selected manual identifiers, you may select one custom property as the field that uniquely identifies Concepts of this type. You do not need to choose a field to be the identifier; you can use the default id field instead if this is preferable. If you do define an identifier property, its data type must be keyword or integer.
  • allow multiple – the property allows multiple values (an array).
  • preview image – if you have selected an imageURL data type, you can choose whether this property holds a canonical preview image of the Concept.
  • is nested – if you have defined a relationship property that references another Concept, you can optionally choose for it to be nested inside the Concept itself. This allows you to create complex data structures within the Concept. A more typical scenario is simply to link to Concepts elsewhere in your Datasets (the knowledge graph and linked-data pattern). In this case leave this option unchecked.

In our Country example we have chosen manual identifiers, which means we need to either use the standard id property or define a new property that will uniquely identify every Concept. In this case we have defined an isoCode field and marked it as the identifier. You can see the other custom properties we have chosen for our Country type below:
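The modifier rules described above can be expressed as a small consistency check over a type's properties. This is a minimal sketch, not the validation Data Graphs performs: the Property class and validate function are hypothetical, the data types assigned to each Country property are guesses, and the rule that only relationship properties can be nested is inferred from the description of the is nested modifier.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Property:
    name: str
    data_type: str  # e.g. "text", "keyword", "integer", "imageURL", "relationship"
    optional: bool = False
    identifier: bool = False
    allow_multiple: bool = False
    preview_image: bool = False
    nested: bool = False


def validate(properties: List[Property], manual_ids: bool) -> List[str]:
    """Return a list of rule violations; an empty list means the schema is consistent."""
    errors = []
    identifiers = [p for p in properties if p.identifier]
    if identifiers and not manual_ids:
        errors.append("identifier properties require manual identifiers")
    if len(identifiers) > 1:
        errors.append("only one property may be the identifier")
    for p in properties:
        if p.identifier and p.data_type not in ("keyword", "integer"):
            errors.append(f"{p.name}: identifier must be a keyword or an integer")
        if p.preview_image and p.data_type != "imageURL":
            errors.append(f"{p.name}: preview image requires the imageURL data type")
        if p.nested and p.data_type != "relationship":
            errors.append(f"{p.name}: only relationship properties can be nested")
    return errors


# Country properties from the example; the data types chosen here are assumptions.
country = [
    Property("isoCode", "keyword", identifier=True),
    Property("longName", "text"),
    Property("capital", "relationship"),
    Property("flagImage", "imageURL", preview_image=True),
]
print(validate(country, manual_ids=True))
```

Marking a text property such as longName as the identifier, or using an identifier property with automatic identifiers, would each add an entry to the returned list.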

When you are happy with this Concept schema definition, click Review Types.

Step 4: Review types and save the Dataset

Finally, you get to review the Concept types you have defined for this Dataset. You can edit, remove, or add more types from this final screen. If you choose to add another type, you will repeat Steps 2 and 3 for the additional Concept type.

When you are happy that you have defined all the types you need, click the Save Dataset button. Data Graphs will create the Dataset, empty and ready for you to add Concepts to.