Time aggregation
When querying entities and their features, we can tell Lynk which time frame to aggregate the features over by using the time_agg property in the USE config block. The time frame can be one of the following:
Calendric: day / month / year etc.
Rolling windows: "last 30 days" / "first 7 days after signup" etc.
No time frame (full date range)
For example, let's assume we have an entity customer with a simple metric feature count_orders that counts how many orders a customer has.
The above SQL API request will calculate, for each customer, the total count of orders over the entire time range available in the underlying data asset, in this case orders (where the measure count_orders comes from).
If we would like to get the daily count_orders per customer, we can change the query as follows:
In the above example, one record is returned per customer and day, counting the orders for that customer on that day.
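To make the difference between the two result shapes concrete, here is a minimal sketch in plain Python. It does not use Lynk or its SQL API; the order records and variable names are hypothetical and only illustrate the difference between "one row per customer" and "one row per customer and day".

```python
from collections import Counter
from datetime import datetime

# Hypothetical order records: (customer id, order timestamp).
orders = [
    ("c1", datetime(2024, 1, 1, 9, 30)),
    ("c1", datetime(2024, 1, 1, 17, 5)),
    ("c1", datetime(2024, 1, 2, 11, 0)),
    ("c2", datetime(2024, 1, 2, 8, 15)),
]

# No time frame: one row per customer, counted over the full date range.
total_per_customer = Counter(customer for customer, _ in orders)

# Daily time frame: one row per (customer, day).
daily_per_customer = Counter((customer, ts.date()) for customer, ts in orders)

for key, count in sorted(daily_per_customer.items()):
    print(key, count)
```

Here c1 has three orders in total, but two rows in the daily result: one for 2024-01-01 (2 orders) and one for 2024-01-02 (1 order).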
time_agg
time_agg is an object in the USE config block that holds all the options for applying time aggregation to a query.
The options for time_agg are:
time_grain
window_size
direction
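As a way to keep the three options and their defaults in view, the following Python sketch models them as a small validated object. The TimeAgg class is hypothetical and is not part of Lynk; it only encodes the option names, defaults, and supported values listed on this page. Requiring a positive integer for window_size is an assumption.

```python
from dataclasses import dataclass
from typing import Union

TIME_GRAINS = {"year", "quarter", "month", "week", "day", "hour", "minute"}
DIRECTIONS = {"backward", "forward"}


@dataclass(frozen=True)
class TimeAgg:
    """Hypothetical model of the time_agg options (not a Lynk class)."""

    time_grain: str = "day"              # documented default
    window_size: Union[int, str] = 1     # documented default; an integer or "unbounded"
    direction: str = "backward"          # documented default

    def __post_init__(self) -> None:
        if self.time_grain not in TIME_GRAINS:
            raise ValueError(f"unsupported time_grain: {self.time_grain!r}")
        is_int = isinstance(self.window_size, int) and not isinstance(self.window_size, bool)
        # Assumption: integer window sizes must be positive.
        if not (is_int and self.window_size >= 1) and self.window_size != "unbounded":
            raise ValueError(f"unsupported window_size: {self.window_size!r}")
        if self.direction not in DIRECTIONS:
            raise ValueError(f"unsupported direction: {self.direction!r}")


print(TimeAgg())
print(TimeAgg(time_grain="month", window_size=3))
```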
time_grain
Optional, defaults to day.
A time grain refers to the level of granularity at which time is divided in the result of the API query. It determines how entity-level aggregations or calculations (e.g., sums, averages, counts) are grouped.
For example, if the time grain is day and the main entity is customer, the returned result will be on a customer + day level.
The supported values for time_grain are:
year
quarter
month
week
day
hour
minute
For example:
The above example returns the result of the feature count_orders for each customer and week.
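Grouping by a time grain can be pictured as truncating each timestamp to the start of its bucket and then counting per (customer, bucket). The sketch below is plain Python, not Lynk's implementation. The truncate helper and the sample orders are hypothetical, and starting weeks on Monday is an assumption.

```python
from collections import Counter
from datetime import datetime, timedelta


def truncate(ts: datetime, grain: str) -> datetime:
    """Return the start of the time_grain bucket that contains ts."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if grain == "year":
        return midnight.replace(month=1, day=1)
    if grain == "quarter":
        return midnight.replace(month=3 * ((ts.month - 1) // 3) + 1, day=1)
    if grain == "month":
        return midnight.replace(day=1)
    if grain == "week":
        # Assumption: weeks start on Monday.
        return midnight - timedelta(days=ts.weekday())
    if grain == "day":
        return midnight
    if grain == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if grain == "minute":
        return ts.replace(second=0, microsecond=0)
    raise ValueError(f"unsupported time_grain: {grain!r}")


# Hypothetical orders: (customer id, order timestamp).
orders = [
    ("c1", datetime(2024, 1, 1, 10, 0)),   # Monday
    ("c1", datetime(2024, 1, 3, 15, 30)),  # Wednesday, same week
    ("c1", datetime(2024, 1, 8, 9, 0)),    # Monday of the following week
    ("c2", datetime(2024, 1, 7, 23, 59)),  # Sunday, still the first week
]

# count_orders per customer and week.
weekly = Counter((customer, truncate(ts, "week")) for customer, ts in orders)
for key, count in sorted(weekly.items()):
    print(key, count)
```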
window_size
Optional, defaults to 1.
window_size sets the size of a rolling window. The size is measured in units of the time grain: for example, with a time_grain of month, a window_size of 3 covers 3 months.
The supported values for window_size are:
integer values
unbounded
window_size example, using an integer value:
The above example returns the cumulative result of the feature count_orders over the last 3 months, for each customer and month.
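The arithmetic of a backward rolling window with an integer size can be sketched in plain Python as follows. This is an illustration rather than Lynk's implementation. The helper names and the monthly counts are hypothetical, and the sketch assumes that the window includes the current month and that months without orders count as zero.

```python
def month_index(year: int, month: int) -> int:
    """Map a calendar month to a consecutive integer so windows are simple ranges."""
    return year * 12 + (month - 1)


def rolling_backward(counts: dict, window_size: int) -> dict:
    """For each month, sum the counts of that month and the window_size - 1 months before it."""
    first, last = min(counts), max(counts)
    return {
        period: sum(counts.get(p, 0) for p in range(period - window_size + 1, period + 1))
        for period in range(first, last + 1)
    }


# Hypothetical monthly count_orders for one customer (no orders in March).
counts = {
    month_index(2024, 1): 2,
    month_index(2024, 2): 1,
    month_index(2024, 4): 4,
    month_index(2024, 5): 1,
}

rolling = rolling_backward(counts, window_size=3)
print(list(rolling.values()))
```

For April, the window covers February, March, and April, so the result is 1 + 0 + 4 = 5.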
window_size example, using unbounded:
The above example returns, for each customer and month, the cumulative result of count_orders up until that month. The window of time in this case starts from the first order of each customer that matches the business logic of the metric count_orders.
If we chose direction: forward instead, the returned result would accumulate the feature count_orders for each data point (customer and month) up until the last known order in the underlying data asset.
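An unbounded window behaves like a running total. The following plain-Python sketch shows the backward and forward variants for a single customer. The monthly counts are hypothetical, and the sketch assumes the current month is included in its own window.

```python
from itertools import accumulate

# Hypothetical count_orders per month for one customer, in chronological order.
monthly = [2, 1, 0, 4]

# direction: backward. Each month sums everything from the first order up to that month.
backward_unbounded = list(accumulate(monthly))

# direction: forward. Each month sums everything from that month up to the last known order.
forward_unbounded = list(accumulate(reversed(monthly)))[::-1]

print(backward_unbounded)  # [2, 3, 3, 7]
print(forward_unbounded)   # [7, 5, 4, 4]
```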
direction
Optional, defaults to backward.
Determines the direction of the rolling window.
The supported values for direction are:
backward
forward
For example:
The above example returns the result of the feature count_orders in the next 7 days, for each customer and day.
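A forward window looks ahead from each data point instead of back. The sketch below counts one customer's orders in the 7 days starting at a given day. It is plain Python, not Lynk's implementation. The function name and the dates are hypothetical, and whether the starting day itself belongs to the window is an assumption (here it does).

```python
from datetime import date, timedelta


def forward_window_count(order_dates: list, day: date, window_size: int) -> int:
    """Count orders in the window_size days starting at day (day itself included)."""
    end = day + timedelta(days=window_size)
    return sum(1 for d in order_dates if day <= d < end)


# Hypothetical order dates for one customer.
order_dates = [date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 8), date(2024, 1, 10)]

for day in (date(2024, 1, 1), date(2024, 1, 4), date(2024, 1, 9)):
    print(day, forward_window_count(order_dates, day, window_size=7))
```

For 2024-01-01 the window runs through 2024-01-07, so only the orders on January 1 and January 3 are counted.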
Once the time_agg option is passed to the query through the USE config, Lynk applies time aggregation to all the features in the query. USE is the object set on the SQL API / REST API query that defines all the query-level configurations, including time_agg.
time_field
The aggregation is applied according to the specified time_field, as follows:
If time_field is specified on the feature level, Lynk uses it.
If no time_field is specified on the feature level, Lynk uses the default on the data asset level.
For features with no time_field on either the feature level or the asset level, Lynk does not apply any time aggregation logic.
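The fallback order for time_field can be summarized as a small lookup. This Python sketch is only an illustration of the rules described here. The function and the field names (shipped_at, created_at) are hypothetical and not part of Lynk.

```python
from typing import Optional


def resolve_time_field(
    feature_time_field: Optional[str],
    asset_time_field: Optional[str],
) -> Optional[str]:
    """Pick the time field for one feature; None means no time aggregation is applied."""
    if feature_time_field is not None:
        return feature_time_field  # the feature-level setting wins
    return asset_time_field        # otherwise the data asset default, which may be None


print(resolve_time_field("shipped_at", "created_at"))  # shipped_at
print(resolve_time_field(None, "created_at"))          # created_at
print(resolve_time_field(None, None))                  # None
```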