First-Last
First-last features enrich an entity with fields from a one-to-many relation. They are commonly used to add fields from the first or last occurrence of a related asset or entity.
Simple first-last feature
This example defines simple first-last features for the entity customer. Each feature definition uses the following parameters:
type
The feature type. For first-last features, set it to first_last.
name
The name of the feature.
asset
The data asset that contains the field to add as a feature to the entity. asset should be the full path: "db.schema.name".
join_name [optional]
When multiple join patterns are defined between an entity and a data asset, join_name determines which join path to use for a specific feature.
data_type [optional]
The feature data type. If no data_type is specified, Lynk assumes the data type is string.
The options for data types are:
string: any string data type.
number: any numeric data type, for example integer, float, or decimal.
datetime: any time-based data type, for example date, timestamp, or datetime.
bool: the boolean data type.
time_field [optional]
options
The options of the first-last definition: which field to retrieve and how to sort the related data asset.
method
Determines which instance of the data asset to retrieve, the first or the last, based on the sort_by option.
sort_by
The data asset field to sort by.
field
The name of the data asset field to retrieve as the entity feature.
offset
Use offset to retrieve the second, third, or a later field value, counted from the start or from the end of the sorted list.
Offset - Example 1
Enriching the customer entity with the order_status of the third order, ordered by created_at.
Offset - Example 2
Enriching the customer entity with the order_status of the second-to-last order (the one before the last order), ordered by created_at.
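The two offset examples can be illustrated with a small Python sketch of the selection logic. This is not Lynk syntax or Lynk's implementation: the function first_last, its parameters, and the sample rows are hypothetical, and the sketch assumes a zero-based offset (0 means the first or last row itself), which may differ from how Lynk counts.

```python
from operator import itemgetter

def first_last(rows, sort_by, field, method="first", offset=0):
    """Hypothetical helper: sort rows by `sort_by`, then return `field` from the
    row that is `offset` positions away from the first (or last) row.
    Assumes a zero-based offset; returns None when no such row exists."""
    if method not in ("first", "last"):
        raise ValueError("method must be 'first' or 'last'")
    # For method="last", sort descending so that index 0 is the last row.
    ordered = sorted(rows, key=itemgetter(sort_by), reverse=(method == "last"))
    return ordered[offset][field] if 0 <= offset < len(ordered) else None

# Made-up orders of a single customer (ISO dates sort correctly as strings).
orders = [
    {"created_at": "2024-01-05", "order_status": "delivered"},
    {"created_at": "2024-02-11", "order_status": "returned"},
    {"created_at": "2024-03-02", "order_status": "shipped"},
    {"created_at": "2024-04-19", "order_status": "cancelled"},
    {"created_at": "2024-05-30", "order_status": "pending"},
]

# Example 1: status of the third order, ordered by created_at.
third_order_status = first_last(orders, "created_at", "order_status", "first", offset=2)

# Example 2: status of the second-to-last order, ordered by created_at.
second_to_last_status = first_last(orders, "created_at", "order_status", "last", offset=1)
```

With this sample data, third_order_status is "shipped" and second_to_last_status is "cancelled". A customer with fewer orders than the offset requires gets None in this sketch.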
filters
Understanding First-Last Features
When a data asset has a many-to-one relationship with an entity, and we need to enrich the entity with a field from the data asset without aggregating it, we can't simply take the field: we need to define which occurrence to take. This is when we should use first-last features.
To better understand how first-last features work, look at the following diagram:
The diagram shows an example in which a customer (entity) has many orders (data asset). That means each customer may be associated with more than one row in db.schema.orders.
In this case, we took the field total_price of the last order, sorted by order_date, and added it to the customer entity as a feature.
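The same idea can be sketched in plain Python. This only illustrates the semantics and is not how Lynk computes the feature: the customer_id join column, the helper last_value_per_entity, and the sample rows are assumptions made for this example.

```python
# Made-up rows standing in for db.schema.orders; customer_id is an assumed join key.
orders = [
    {"customer_id": 1, "order_date": "2024-01-10", "total_price": 40.0},
    {"customer_id": 1, "order_date": "2024-03-22", "total_price": 75.5},
    {"customer_id": 2, "order_date": "2024-02-01", "total_price": 20.0},
    {"customer_id": 1, "order_date": "2024-02-14", "total_price": 12.0},
]

def last_value_per_entity(rows, entity_key, sort_by, field):
    """Hypothetical helper: for each entity, keep the row with the greatest
    `sort_by` value and return its `field`, without aggregating anything."""
    latest = {}
    for row in rows:
        current = latest.get(row[entity_key])
        if current is None or row[sort_by] > current[sort_by]:
            latest[row[entity_key]] = row
    return {key: row[field] for key, row in latest.items()}

# One value per customer: the total_price of the last order by order_date.
last_order_total_price = last_value_per_entity(
    orders, "customer_id", "order_date", "total_price"
)
```

Here customer 1 has three orders, and the feature takes 75.5 from the order dated 2024-03-22; customer 2 has a single order, so the feature is 20.0.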