Knowledge Base
How to create a dataset
Overview
Any kind of data, in any schema, can be pushed into the Narrative Data Streaming Platform as a dataset--exactly as it is stored in your own system.
Creating a dataset defines a container that will hold your raw data. Afterwards, data can be added to your dataset container.
Follow these instructions to make your data available for use in My Data.
New Dataset
- Navigate to https://app.narrative.io/platform/my-data/new-dataset
- Click "New Dataset" to create a new dataset.
1. Dataset Upload Process
Set a name and upload a sample that represents the schema and values of the future (larger) file. The sample should be less 10MBs.
Supported Data Types
The Narrative platform supports all of the following file types:
- CSV and other common delimiters (such as pipes, tabs, etc)
- Parquet
- JSON
Note: When uploading JSON files to Narrative, please note that we accept either a single JSON object, or JSON in the popular JSON Lines format (one JSON object per line.) Read more about JSON Lines here: https://jsonlines.org/.
2. Dataset Details
Once your dataset is ingested, you should see the new object listed at the top of your datasets at https://app.narrative.io/platform/my-data/datasets
Below are the available metadata options for each dataset. Some metadata will have placeholders which you can update:
- Name: The name of the field. This should not contain spaces!
- Description: Provide information about the dataset.
- Schema:
- Whether the field is Required: If this field must contain a value for a record to be added to your dataset.
- Whether the field is Queryable: If this field can be made available to buyers. * Marking a field as not queryable will ensure that this field cannot be transmitted to other parties on the Data Collaboartion Platform.
- Whether the field is Sensitive: Should data in this field be redacted when the data is displayed in sample UIs and APIs.
* Marking a field sensitive will ensure that the data in this field is never shown to users within the Narrative platform unless purchased by that user and exported. To configure a field as sensitive, select the toggle on initial upload of the dataset. * Unlike not sellable data, data in "sensitive" fields will be delivered, in the original format, when purchased. - File Type: Inferred from the sample upload.
- Write Mode: This field informs how to treat new data uploaded to your dataset. Two options are supported:
- append: Use if new data should be added to your dataset.
- overwrite: Use if new data should replace your entire dataset.
Add Data
To add data to your dataset, you can either add data using the "Upload Files" button, or you can configure an ingestion connector such as the AWS S3 Connector and ingest data that way.
For more information on ingesting data via S3, see Setting up a managed S3 Bucket.