SQL API
Last updated
Last updated
Querying the SQL API is as simple as writing a simple SQL SELECT statement.
Connecting to Lynk SQL API is done via a regular Postgres SQL connector.
In order to successfully connect to Lynk SQL API follow the following steps
Create an via the Studio "my account" page and keep that token in a safe place.
From your BI tool or any other tool you're using, choose Postgres connection with the following credentials:
HOST: sqlapi.app.getlynk.ai
PORT: 5433
PASSWORD: paste the authentication token here
You are now ready to connect and start querying Lynk via SQL API
The SQL API supports standard SELECT
statements.
The query format is similar to a vanilla (regular) SQL query, using the SQL flavor of your query engine (Snowflake, BigQuery etc).
When querying Lynk via the SQL API, we are actually querying entities and their features and measures. As shown in the above example, the FROM
statement expects an entity using the entity()
function, and the fields in the SELECT
statement are actually features.
The main entity is the first entity passed to the FROM
statement, using the entity()
function.
The above example will return a record per customer
.
In this case, all customers will return, and for each customer Lynk will return exactly one row. The metric feature count_orders
will be calculated for each customer across all available time range
The above example will return a record per customer
and day
, for each day in 2024.
When aggregating an entity (rolling up), in order to apply a predefined measure logic, use the measure()
function as follows:
In the above example we are using the measure average_orders
defined on the entity customer
. It is used to calculate the average orders of all customers - in this case by nation_name
, which is a feature on a customer level.
Note that team_name
is a feature that was defined on the entity team
and joined to each customer on this SQL API query.
In some cases, there might be more than one way to join two entities. Lynk supports this scenario by supporting the definition of more than one join path between two entities.
In this example we join the entity agent
to the main entity customer
twice. Once via a join path that joins the sales_agent
to a customer and once via a join path that joins the last_support_agent
to a customer.
SELECT
FROM
JOIN
WHERE
GROUP BY
HAVING
ORDER BY
LIMIT
Any of the statements above are supported, using the SQL flavor of the underlying query engine in use.
The above query will return a histogram of the number of customers per number of orders, for customers which their team size is 100 or more. It will also order the results by the number of orders in an ascending order, and return only 20 rows.
USE
Passing query-level configurations is done using the USE
config block.
The supported query level configurations are:
branch
Specify which git branch
to use.
By default Lynk will retrieve semantic definitions from the default git branch, according to the project's repository. In case the branch
option is passed to the USE
block, Lynk will take the semantic definitions (entities, features, join paths etc) from the specified branch.
In the above example we are using the branch
"dev"
context
Specify which context
to use.
By default Lynk will retrieve semantic definitions from the default context ("shared"). In case the context
option is passed to the USE
block, Lynk will take the semantic definitions (entities, features, join paths etc) from the specified context.
The above query will return for each customer
, it's customer_id
and the value of the feature is_active_customer
. In case the context marketing
has a feature called is_active_customer
on a customer level, Lynk will use this feature definition to build the query. If the context marketing
does not have a feature called is_active_customer
, Lynk will use the definition from the shared
context for the feature is_active_customer
.
time_agg
Use time_agg
to specify how to aggregate the query features, in terms of time aggregation.
start_time
Use start_time
to specify a lower time barrier for the query.
The start_time
parameter will be applied to all of the underlying data assets that Lynk will query in order to build the requested features.
stop_time
Use stop_time
to specify an upper time barrier for the query.
The stop_time
option will be applied to all of the underlying data assets that Lynk will query in order to build the requested features.
In the above example, Lynk will return for each customer, it's customer_id
and the feature count_orders,
where count_orders
is calculated for each user on the time frame between 2024-01-01
and 2025-01-01
.
Under the hood, the process between sending a query to Lynk and receiving the results does the following:
Parsing the SQL API query to the relevant query engine
Scanning for errors
Translating the query, including:
Translating features to their business logic as defined in Lynk
Using the correct git branch
as stated in the USE
config
Composing the final parsed query including all SQL statements and feature definitions
Sending the parsed SQL query to the underlying query engine
Receiving the results (data) from the query engine.
Streaming the results back to the requesting client. In case of SQL error occurs on the query engine, the original error will return.
It is possible to query the query engine directly via Lynk. We call this operation a pushdown
.
In case the entity()
function is not passed to the FROM
statement, Lynk will assume you are not querying entities, and will pass the query as is to the underlying query engine - and return the results.
For more information about time aggregations, visit the page.
Joining entities is done using the JOIN
statement.
Under the hood, Lynk applies a LEFT JOIN operation between each two joined entities, according to the join path specified between the two entities. See for more information on how to specify join paths between two entities.
If passed to the SQL API query, Lynk will use the join path name
to join the two entities. If no join path name stated, the default join path will be applied in order to join the two entities. See section in the page for more information on this.
Note that in order to use a named join pattern, we use the ON
keyword and then the join path NAME
. In order to maintain the concept of a single source of truth, only a join path NAME
can be passed here (not the path itself). Join paths should be centrally defined as .
See for in depth information on this.
See for in depth information on this.
Reading the query configuration from the statement
Using the correct as stated in the USE
config
Using the correct as stated in the USE
config