Data Assets



Data assets are tables and views from the underlying data source (usually a data warehouse). Data assets can be related to entities, and they are used to create entity features.

It is recommended to use a transformation tool like dbt to transform raw data into a dimensional model with DIM and FACT tables, and to expose that dimensional model to the Lynk Semantic Layer.


Data assets YAML file

Data assets can be modified either via code or via the Lynk Studio UI. The following example shows a YAML file for the data asset db_prod.core.orders:

# db_prod.core.orders.yml

asset: db_prod.core.orders

key: order_id

business_key: []

defaults:
  time_field: order_date

measures:
- name: count_orders
  description: count of orders
  sql: count(1)
- name: total_order_amount
  description: sum of order amount
  sql: sum({total_amount})

asset

The data asset's name and location as it appears in your underlying data warehouse, in the form db.schema.name.

key

business_key [optional]

Sometimes the key is just "id". In such cases, a combination of other fields might better represent the level of granularity of the data asset and make more sense business-wise. We call this combination of fields the business_key of the asset.

Business keys help us be clear on what this data asset is about, and what each row represents.
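As an illustration, a data asset whose key is a generic id might declare its business key as shown here. This is only a sketch: the asset db_prod.core.subscriptions and the fields customer_id, plan_id and start_date are hypothetical, and the list form of business_key is assumed from the empty list (business_key: []) shown in the orders example at the top of this page.

```yaml
# db_prod.core.subscriptions.yml (hypothetical asset, for illustration only)

asset: db_prod.core.subscriptions

# The technical primary key says little about what a row represents.
key: id

# Assumed list syntax: together, these fields describe the grain of the asset,
# here one row per customer, plan and start date.
business_key: [customer_id, plan_id, start_date]
```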

defaults

time_field [optional]

It is recommended to choose a default time field for each data asset. Lynk will use the default time field to aggregate and filter time-related queries on features.

For example, if we have a data asset for orders, db_prod.core.orders, and we set its default time_field to order_date, Lynk will use this field for time-based aggregations by default.

If a data asset has no default time_field, and no other time field is chosen in the feature definition, Lynk will not aggregate that feature at any time level (the aggregation will be over "all" time).
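The two cases can be sketched as follows. The first document repeats the defaults block from the orders example. The second uses a hypothetical asset, db_prod.core.products, and assumes that leaving out the defaults block is how an asset ends up without a default time_field (the field is marked optional).

```yaml
# db_prod.core.orders.yml
# Time-based aggregations on features use order_date by default.
asset: db_prod.core.orders
key: order_id
defaults:
  time_field: order_date
---
# db_prod.core.products.yml (hypothetical asset, for illustration only)
# No default time_field is set. Unless a feature chooses its own time field,
# its aggregation is over "all" time.
asset: db_prod.core.products
key: product_id
```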

Measures

In practice, measures are definitions of how aggregate functions should be applied to fields:

# db_prod.core.orders.yml

measures:

- name: total_order_amount
  description: sum of order amount
  sql: sum({total_amount})

Name

Description [optional]

Describe the measure. It is recommended to give measures informative descriptions that indicate their purpose, so that other team members, as well as AI apps, can reuse the measure.

SQL

The measure definition. It should be composed of an aggregate function and a field. Where needed, you can combine multiple aggregate functions and/or multiple fields, just as you would in plain SQL.

Lynk is SQL-first, meaning anything that would work in plain SQL will work with Lynk as well. You can use any SQL aggregate function compatible with your query engine, and Lynk will apply it as the measure definition and pass it to the query engine.

Examples: SUM, COUNT, MIN, MAX, COUNT DISTINCT, APPROX_PERCENTILE, etc.

Some more examples:

# db_prod.core.orders.yml

measures:

- name: count_orders
  description: count of orders
  sql: count(1)

- name: total_order_amount
  description: sum of order amount
  sql: sum({total_amount})

- name: successful_order_amount
  description: sum of successful orders amount
  sql: sum(IFF({order_status} = 'success', {total_amount}, 0))
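Building on these examples, the sketch below shows measures that combine several aggregate functions or fields in one definition. The field customer_id and the measure names are assumptions made for illustration. The functions used are standard SQL, but check that your query engine supports them.

```yaml
# db_prod.core.orders.yml (illustrative measures; customer_id is a hypothetical field)

measures:

# Two aggregate functions combined in one expression.
# nullif avoids a division by zero when there are no rows.
- name: average_order_amount
  description: average amount per order
  sql: sum({total_amount}) / nullif(count(1), 0)

- name: count_customers
  description: number of distinct customers that placed an order
  sql: count(distinct {customer_id})

# Same logic as successful_order_amount, written with a standard CASE
# expression for query engines that do not provide IFF.
- name: successful_order_amount_case
  description: sum of successful orders amount
  sql: sum(case when {order_status} = 'success' then {total_amount} else 0 end)
```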

Data assets are stored in the Lynk Graph DB during the discovery process. If changes are made to a data asset within Lynk (e.g. adding fields or measures), a YAML file will be created, as shown in the example above.

Once a YAML file is created for a data asset, the asset is stored in Git as well as in the Graph DB. Lynk takes care of syncing the Graph DB and your Git repository. See the Lynk Graph DB documentation for more details.

The key field holds the asset key (primary key). It is important to state the correct data asset key in order to avoid duplications and errors. Lynk will automatically find and suggest data asset keys after the discovery process is completed.

The defaults field holds the defaults for the data asset. See the default time field (time_field) for an example.

Measures are reusable components that define how the data asset fields should be aggregated. Lynk applies the measure logic once a feature of type metric is created and consumed.

The name field gives the measure a name. This name is used when creating features and is also shown in the Studio UI. It is recommended to give measures informative names that indicate their purpose.
