NQL Metadata Feature

This feature release includes a new way of representing row-level metadata in your NQL query outputs. The metadata provides essential information about each row's origin, licensing, and pricing. By default, this metadata is now captured in a structured format, ensuring that downstream processes can reliably interpret and manage the data's provenance.


1. Why Include Metadata?

By moving to a structured metadata representation, we can streamline how users access row-level metadata while optimizing internal storage. It also ensures consistent and automatic assignment of licensing details and source identifiers for data that crosses company account boundaries.


2. Metadata Structure

Each row's metadata is now stored as an array of structured objects, exposed in the output as a field named _nio_sources. A typical row might look like this:

{
  "id": "2", // column in output dataset
  "tag": "technology", // column in output dataset
  "score": "92", // column in output dataset
  "_nio_sources": [
    {
      "access_rule_id": null,
      "price_per_row": {
        "currency": "USD",
        "microcents": "0"
      },
      "company_id": "1",
      "dataset_ids": ["12249"],
      "licensing": {
        "period": "P30D",
        "expiration_date": 1746371498000,
        "license": "7fe746ee-6b71-4130-af40-41fd7e7ba7eb"
      }
    }
  ],
  "_nio_sample_128": true,
  "_nio_last_modified_at": 1743779507900.119
}

Fields in _nio_sources

  • access_rule_id: Identifies the specific access rule; can be null if querying your own data.
  • price_per_row: Contains the currency (e.g., USD) and microcents, the price per row expressed in microcents (a string value in the examples).
  • company_id: The owner or supplier of this particular data row.
  • dataset_ids: One or more dataset IDs that contributed data to this row.
  • licensing: Includes the period (an ISO 8601 duration, e.g., P30D), expiration_date (a timestamp), and license (a UUID).
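As a rough illustration of reading these fields downstream, the sketch below converts price_per_row and expiration_date into more convenient Python values. It rests on two assumptions that this article does not state outright: that one microcent is one millionth of a cent, and that expiration_date is a Unix timestamp in milliseconds (the sample values are consistent with that reading). The helper names are hypothetical.

```python
from datetime import datetime, timezone
from decimal import Decimal

# Assumption: 1 cent = 1,000,000 microcents, so 1 currency unit = 100,000,000 microcents.
MICROCENTS_PER_UNIT = Decimal(100_000_000)


def price_in_currency_units(price_per_row: dict) -> Decimal:
    """Convert the string-valued "microcents" field to whole currency units."""
    return Decimal(price_per_row["microcents"]) / MICROCENTS_PER_UNIT


def expiration_as_datetime(licensing: dict) -> datetime:
    """Interpret expiration_date as milliseconds since the Unix epoch (assumed)."""
    return datetime.fromtimestamp(licensing["expiration_date"] / 1000, tz=timezone.utc)


# One _nio_sources entry, copied from the sample row above.
source = {
    "access_rule_id": None,
    "price_per_row": {"currency": "USD", "microcents": "0"},
    "company_id": "1",
    "dataset_ids": ["12249"],
    "licensing": {
        "period": "P30D",
        "expiration_date": 1746371498000,
        "license": "7fe746ee-6b71-4130-af40-41fd7e7ba7eb",
    },
}

print(price_in_currency_units(source["price_per_row"]), source["price_per_row"]["currency"])
print(expiration_as_datetime(source["licensing"]).isoformat())
```

Decimal is used instead of float so that very small per-row prices do not pick up rounding error when they are later multiplied by large row counts.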

Example: External Data via Access Rule

When data comes from an external account (not your own) through an access rule, the metadata will include the specific access rule ID. Here's how that looks:

{
  "id": "4",
  "tag": "finance",
  "score": "123",
  "_nio_sources": [
    {
      "company_id": "123",
      "licensing": {
        "period": "P30D",
        "license": "7fe746ee-6b71-4130-af40-41fd7e7ba7eb",
        "expiration_date": 1234543360000
      },
      "access_rule_id": "1234",
      "price_per_row": {
        "currency": "USD",
        "microcents": "0"
      },
      "dataset_ids": ["12345"]
    }
  ],
  "_nio_sample_128": true,
  "_nio_last_modified_at": 1745242376019.317
}

In this example:

  • access_rule_id has the value "1234", indicating that this row came through a specific access rule.
  • company_id is "123", showing that this data belongs to a company other than your own.
  • The licensing object has the same structure, but its terms may differ depending on the access rule agreement.

This metadata helps you track which external sources contributed to your result set and understand the provenance of each data point.
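The check described above can be sketched in a few lines of Python. This is a minimal example, assuming the output rows have already been parsed into dictionaries; the function name external_sources and the trimmed-down sample rows are illustrative only.

```python
def external_sources(row: dict) -> list[dict]:
    """Return the _nio_sources entries that came through an access rule."""
    return [
        source
        for source in row.get("_nio_sources", [])
        if source.get("access_rule_id") is not None
    ]


# Trimmed versions of the two sample rows shown in this article.
own_row = {"id": "2", "_nio_sources": [{"company_id": "1", "access_rule_id": None}]}
external_row = {"id": "4", "_nio_sources": [{"company_id": "123", "access_rule_id": "1234"}]}

for row in (own_row, external_row):
    suppliers = [s["company_id"] for s in external_sources(row)]
    print(row["id"], "external suppliers:", suppliers)
```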


3. Always On by Default

  • You do not need to enable metadata manually; it is always on.
  • Every row returned by your NQL query will include the _nio_sources field when applicable, ensuring that the row's origin, pricing, and license details are always captured.

4. Usage Limitations

The _nio_sources metadata:

  • Works with common operations like SELECT, UNION, FILTER, and SORT.
  • Does not work with certain advanced operations like JOIN, AGGREGATION, or SUBQUERIES. If your query joins or aggregates data, _nio_sources will not be available. We plan to lift this limitation in the future.

If you need to combine or process metadata with other tables, consider selecting the metadata into a separate result set or using a downstream process that consumes the output JSON.
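One way such a downstream step might look is sketched below: it splits parsed output rows into a data table and a flattened metadata table (one record per source per row) that can then be joined or aggregated outside NQL. The sketch assumes the rows are already loaded as dictionaries and that the id column uniquely identifies a row; both the function name and that key choice are assumptions, not part of NQL.

```python
def split_rows(rows: list[dict], key: str = "id") -> tuple[list[dict], list[dict]]:
    """Separate data columns from _nio_sources, keeping `key` as the link between them."""
    data_rows, source_rows = [], []
    for row in rows:
        # Data table: every column except the metadata array.
        data_rows.append({k: v for k, v in row.items() if k != "_nio_sources"})
        # Metadata table: one flat record per contributing source.
        for source in row.get("_nio_sources", []):
            source_rows.append({
                key: row[key],
                "company_id": source["company_id"],
                "access_rule_id": source["access_rule_id"],
                "dataset_ids": source["dataset_ids"],
                "license": source["licensing"]["license"],
            })
    return data_rows, source_rows


rows = [{
    "id": "4",
    "tag": "finance",
    "_nio_sources": [{
        "company_id": "123",
        "access_rule_id": "1234",
        "dataset_ids": ["12345"],
        "licensing": {"period": "P30D", "license": "7fe746ee-6b71-4130-af40-41fd7e7ba7eb"},
    }],
}]

data_rows, source_rows = split_rows(rows)
print(data_rows)
print(source_rows)
```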


5. Storage & Noise Reduction

For internal datasets that do not require licensing details, the metadata may appear in a simplified form (e.g., zero-cost pricing, access_rule_id as null). This ensures:

  • Minimal overhead for rows that do not truly require external licensing details.
  • A consistent structure so any downstream system knows where to look for metadata when it does exist.

6. Backward Compatibility

  • Existing queries will not break due to this change.
  • The _nio_sources field is added to outputs but does not interfere with prior syntax or results.
  • No action is required to maintain compatibility: any supported query that returns rows includes the metadata automatically (see Usage Limitations for the exceptions).

7. Sample Rows

Below is a shortened sample of how your data might look when returned by NQL:

{
  "id": "3", // column in output dataset
  "tag": "cooking", // column in output dataset
  "score": "91", // column in output dataset
  "_nio_sources": [
    {
      "company_id": "1",
      "licensing": {
        "period": "P30D",
        "license": "7fe746ee-6b71-4130-af40-41fd7e7ba7eb",
        "expiration_date": 1746371498000
      },
      "access_rule_id": null,
      "price_per_row": {
        "currency": "USD",
        "microcents": "0"
      },
      "dataset_ids": ["12249"]
    }
  ],
  "_nio_sample_128": false,
  "_nio_last_modified_at": 1743779507900.119
}

8. Frequently Asked Questions

Q: How do I remove or hide metadata if I don't need it?

A: Currently, metadata is always included in the output. If you don't need it for downstream use, you can ignore or filter it out in your post-processing script.
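As a sketch of that post-processing step, the hypothetical helper below drops _nio_sources from a parsed row. It assumes rows are available as Python dictionaries; pass additional field names if you also want to remove other fields.

```python
def drop_fields(row: dict, fields: tuple[str, ...] = ("_nio_sources",)) -> dict:
    """Return a copy of the row without the named fields; the input row is not modified."""
    return {key: value for key, value in row.items() if key not in fields}


row = {
    "id": "3",
    "tag": "cooking",
    "score": "91",
    "_nio_sources": [{"company_id": "1", "access_rule_id": None}],
}

print(drop_fields(row))
```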

Q: Can I filter or aggregate based on _nio_sources?

A: You can filter or sort using _nio_sources. Joins and aggregations are not currently supported (see Usage Limitations), so handle the metadata in a subsequent operation rather than in the same NQL query.

Q: Will I be able to see multiple sources if the row was combined from different datasets?

A: Yes, _nio_sources is an array—there can be multiple entries if multiple companies or datasets contributed to that single row.
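To illustrate, the sketch below collects the distinct companies and datasets behind a row with more than one entry. The two-source row is invented for this example (the article does not show one), and summarize_sources is a hypothetical helper name.

```python
def summarize_sources(row: dict) -> dict:
    """List the distinct company and dataset IDs that contributed to a row."""
    sources = row.get("_nio_sources", [])
    return {
        "companies": sorted({s["company_id"] for s in sources}),
        "datasets": sorted({d for s in sources for d in s["dataset_ids"]}),
    }


# Hypothetical row combined from two sources; the values are made up for illustration.
combined_row = {
    "id": "7",
    "_nio_sources": [
        {"company_id": "1", "access_rule_id": None, "dataset_ids": ["12249"]},
        {"company_id": "123", "access_rule_id": "1234", "dataset_ids": ["12345", "12249"]},
    ],
}

print(summarize_sources(combined_row))
```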

Q: How can I distinguish between my own data and data from external sources?

A: Check the access_rule_id field—it will be null for your own data and contain a specific ID for data accessed through an access rule from another company.


Summary

The new metadata feature simplifies how you see and use origin, licensing, and pricing details:

  1. Unified _nio_sources Field: A structured, always-on source of row-level metadata.
  2. No Changes Needed: Existing queries keep working; the new field is simply added to the output.
  3. Clear Separation: Internal storage is optimized, while users receive a consistent external representation.
  4. Downstream Friendly: The metadata is ideal for compliance checks, cost calculations, or licensing validations in any post-processing pipeline.
  5. Cross-Company Data Tracking: Easily identify which data came from external companies via access rules.
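As one example of a licensing validation in a post-processing pipeline, the sketch below flags sources whose license has expired at a given moment. It assumes expiration_date is a Unix timestamp in milliseconds, which matches the sample values but is not confirmed by this article; expired_sources is a hypothetical helper.

```python
from datetime import datetime, timezone


def expired_sources(row: dict, now: datetime) -> list[dict]:
    """Return the sources whose expiration_date (assumed epoch milliseconds) is before `now`."""
    now_ms = now.timestamp() * 1000
    return [
        source
        for source in row.get("_nio_sources", [])
        if source["licensing"]["expiration_date"] < now_ms
    ]


row = {
    "id": "3",
    "_nio_sources": [{
        "company_id": "1",
        "licensing": {
            "period": "P30D",
            "license": "7fe746ee-6b71-4130-af40-41fd7e7ba7eb",
            "expiration_date": 1746371498000,
        },
    }],
}

# Pass an explicit, timezone-aware "now" so the check is reproducible.
check_time = datetime(2025, 6, 1, tzinfo=timezone.utc)
for source in expired_sources(row, check_time):
    print("expired license:", source["licensing"]["license"])
```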

With _nio_sources, you gain confidence in understanding exactly where a row came from and how it can be used—without extra overhead or complicated query syntax.
