Data Graphs lets you import tabular data from CSV files through a user interface in which you map CSV columns to your Dataset and Concept schemas.
The key steps in importing data are:
- Prepare a CSV file with the data you want to import, with each row representing one Concept or entity in your Data Graphs Dataset
- Create a Concept schema in a Dataset where Concept properties can be mapped to the data in your CSV columns
- Use the Import data wizard in a Dataset to guide you through mapping different CSV columns to the right Concept properties
- Your CSV does not need column headers, but they can help
- Column headers do not need to match the property names in the Concept schema. If they do, Data Graphs will match them automatically; if they do not, you can match them yourself.
- Each row in the CSV represents one Concept in Data Graphs, but you can import rows from one CSV file into multiple Concept types if you have a column in the CSV that identifies the Concept type of the row (e.g. Person, Organization)
- If the Concepts being imported are configured in Data Graphs to have automatic identifiers, you do not need an identifier column in the CSV file. If the Concepts are configured for manual identifiers, one of the columns in the CSV must hold the identifiers for the Concepts being ingested.
- You can match columns to relationship properties if the column contains identifiers of Concepts you have either already imported or are planning to import.
- Data Graphs allows for "eventual consistency" – you can import links to other Concepts either before or after the linked Concepts are imported.
- Importing handles subclassing: if you import linked data where the target (linked) Concepts are subclasses of the relationship's target type, you can either choose the type of linked Concept when you map your columns, or use a different column in the CSV as a type discriminator.
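Before uploading, it can help to sanity-check the CSV against the points above. The following is a minimal sketch, not part of Data Graphs itself: the column names `id` and `type`, the sample rows, and the `check_rows` helper are all assumptions made for illustration. It groups rows by a type-discriminator column and reports rows with missing or repeated identifiers, which matters when Concepts use manual identifiers.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample data: "id" and "type" are assumed column names,
# not names required by Data Graphs.
SAMPLE_CSV = """id,type,name
p1,Person,Ada Lovelace
o1,Organization,Royal Society
p2,Person,Alan Turing
,Person,Missing Identifier
p1,Person,Repeated Identifier
"""


def check_rows(csv_text, id_column="id", type_column="type"):
    """Group rows by Concept type and report identifier problems."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # One bucket of rows per value in the type-discriminator column.
    by_type = defaultdict(list)
    for row in rows:
        by_type[row[type_column]].append(row)

    # Count rows with an empty identifier and identifiers used more than once.
    ids = [row[id_column].strip() for row in rows]
    missing = sum(1 for value in ids if not value)
    counts = Counter(value for value in ids if value)
    duplicates = sorted(value for value, n in counts.items() if n > 1)
    return dict(by_type), missing, duplicates


by_type, missing, duplicates = check_rows(SAMPLE_CSV)
print({name: len(rows) for name, rows in by_type.items()})
print("rows without an identifier:", missing)
print("repeated identifiers:", duplicates)
```

Whether a repeated identifier is actually a problem depends on your data; the sketch only reports it so you can decide before importing.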
Importing a simple CSV
The following example demonstrates how to import a simple CSV where all the rows are mapped to a single Concept type. In this example we import a CSV of countries, where each row represents one Country, and the identifiers are the 2-character ISO codes.
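As a rough illustration of what such a file might look like, the sketch below writes a small countries CSV using Python's standard `csv` module. The column names `id` and `name` and the `countries_csv` helper are assumptions for this sketch; the CSV used in this walkthrough may have different or additional columns.

```python
import csv
import io

# Assumed column names; the walkthrough's actual CSV columns may differ.
FIELDNAMES = ["id", "name"]

# 2-character ISO country codes used as the identifiers.
COUNTRIES = [
    {"id": "FR", "name": "France"},
    {"id": "DE", "name": "Germany"},
    {"id": "JP", "name": "Japan"},
]


def countries_csv(rows):
    """Serialize the rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = countries_csv(COUNTRIES)
print(csv_text)
```

To produce a file for upload, the same text could be written to disk, for example with `open("countries.csv", "w", newline="")`.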
Step 1: Prepare the CSV and schema, and upload the file
Our CSV is structured like this:
and we have created a Countries Dataset with a Concept type Country with these properties:
Now upload your CSV: in the file uploader, select your CSV file and click the import/upload button:
Step 2: Check the CSV headers
Data Graphs asks you if the CSV has a header row – choose Yes or No:
Step 3: Map the columns to Concept properties
For each property of our Country Concept, choose the CSV column that matches it. Data Graphs will attempt to match columns that have the same name as a property; where it cannot, you can choose the correct column yourself. Data Graphs will tell you how many rows have valid data and flag any potential data issues. When you have selected the column, choose the Confirm option. You can also ignore a column in the CSV if you wish.
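If you want to anticipate this step before opening the wizard, a small script can preview which columns share a name with a property and how many rows carry a value in each column. This is only a sketch under stated assumptions: the property names, the sample CSV, and the `preview_mapping` helper are hypothetical, and exact-name comparison is an assumption about matching that may be stricter or looser than what Data Graphs actually does.

```python
import csv
import io

# Hypothetical property names and CSV content, for illustration only.
PROPERTY_NAMES = ["id", "name", "capital"]
SAMPLE_CSV = """id,name,capital_city
FR,France,Paris
DE,Germany,
JP,Japan,Tokyo
"""


def preview_mapping(csv_text, property_names):
    """Report name matches and per-column counts of non-empty values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    rows = list(reader)

    # Assumption: a column matches a property only when the names are identical.
    matched = [name for name in property_names if name in headers]
    unmatched = [name for name in property_names if name not in headers]

    # Number of rows with a non-blank value in each column.
    filled = {
        header: sum(1 for row in rows if (row[header] or "").strip())
        for header in headers
    }
    return matched, unmatched, filled


matched, unmatched, filled = preview_mapping(SAMPLE_CSV, PROPERTY_NAMES)
print("would match by name:", matched)
print("need manual mapping:", unmatched)
print("rows with a value per column:", filled)
```

In this sample, `capital` has no identically named column, so it would need to be mapped to `capital_city` by hand in the wizard.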
Step 4: Review your data and import
When you have mapped all your columns to properties, you can review the data you are importing. If you have made a mistake, you can go back to the previous steps and adjust. If you think you need to start again and adjust your CSV for data quality, you can still cancel, as nothing has been stored yet.
When you've finished reviewing the data, click the Import button. The data will be uploaded to Data Graphs, processed and loaded into your Dataset. This ingest is done asynchronously, so if it is a very large import, you can safely close your browser and come back later. Progress will be reported.
Once imported you can view your data, for example: