Skip to content

Core Syntax

SemQL is syntactically compatible with PostgreSQL. If you know PostgreSQL, you already know most of SemQL. The differences are in what you reference — semantic model names instead of physical table names — not in how you write queries.

This page covers every clause you'll use.


SELECT

Select the fields you want in the output. These can be dimensions, measures, aliases, literals, or expressions.

-- Dimensions (attributes you slice by)
SELECT name, region, created_at FROM customers

-- Measures (pre-aggregated metrics)
SELECT total_revenue, order_count FROM sales

-- With model qualifier (use when field names are ambiguous)
SELECT organizations.name, users.name FROM organizations JOIN users ...

-- Aliases
SELECT name AS organization_name, total_revenue AS revenue FROM organizations

-- Literals
SELECT name, 'active' AS status FROM organizations[active_only]

When to qualify field names: You only need to qualify (e.g. organizations.name) when the same field name exists in multiple tables you're querying. When it's unambiguous, just use the bare field name.


FROM

Reference a semantic model by name — no schema prefix, no physical table path.

FROM organizations
FROM sales
FROM customers

Applying predefined filters

Add reusable filter conditions directly in the FROM clause using square brackets:

FROM organizations[active_only]
FROM sales[last_quarter]
FROM sales[last_quarter, high_value]   -- multiple filters, all applied

Filters can also appear in JOIN clauses:

FROM sales
JOIN customers[enterprise] ON sales.customer_id = customers.customer_id

Important

Filters go in FROM or JOIN — never in SELECT. See Dimensions vs. Measures for more on filters.


WHERE

Filter rows based on dimension values, before any aggregation happens.

SELECT name, total_revenue
FROM sales
WHERE region = 'Europe'

SELECT name, order_count
FROM orders
WHERE created_at >= CURRENT_DATE - INTERVAL '30 days'
  AND status = 'completed'

WHERE vs. HAVING: Use WHERE to filter dimensions (raw values). Use HAVING to filter measures (aggregated results). See Dimensions vs. Measures for the full explanation.


GROUP BY

Group results by one or more dimensions. Required whenever your query mixes dimensions with measures.

SELECT region, total_revenue
FROM sales
GROUP BY region

SELECT year, quarter, product_category, total_revenue, order_count
FROM sales
GROUP BY year, quarter, product_category
ORDER BY year, quarter

HAVING

Filter groups after aggregation. Use this to filter by measure values.

SELECT region, total_revenue
FROM sales
GROUP BY region
HAVING total_revenue > 100000

SELECT customer_id, order_count
FROM orders
GROUP BY customer_id
HAVING order_count >= 5
ORDER BY order_count DESC

ORDER BY / LIMIT

Standard SQL ordering and row limiting.

SELECT name, total_revenue
FROM organizations
GROUP BY name
ORDER BY total_revenue DESC
LIMIT 10

-- Ascending order (default)
SELECT name, created_at FROM customers ORDER BY created_at ASC

JOIN

Join multiple models together. Use the relationship fields defined in your semantic layer — check the model definitions to find the correct join keys.

-- Inner join
SELECT organizations.name, users.email
FROM organizations
JOIN users ON organizations.organization_id = users.organization_id

-- Left join (keep all rows from the left model)
SELECT organizations.name, COUNT(users.user_id) AS user_count
FROM organizations
LEFT JOIN users ON organizations.organization_id = users.organization_id
GROUP BY organizations.name

-- With filters on joined models
SELECT organizations.name, users.email
FROM organizations[active_only]
JOIN users[verified] ON organizations.organization_id = users.organization_id

Tip

Use the relationships documented in your semantic layer to find the correct join columns. Don't assume — look up the exact field names.


WITH (CTEs)

Common Table Expressions let you break complex queries into named, reusable steps. Standard SQL WITH syntax works as-is.

WITH revenue_by_region AS (
    SELECT region, total_revenue
    FROM sales
    WHERE year = 2024
    GROUP BY region
),
top_regions AS (
    SELECT region, total_revenue
    FROM revenue_by_region
    WHERE total_revenue > 500000
)
SELECT region, total_revenue
FROM top_regions
ORDER BY total_revenue DESC

Note

Inside a CTE body, you're writing against semantic models normally. But once a CTE is defined, references to it are plain SQL — region in top_regions is just a column alias, not a semantic reference.


UNION / UNION ALL

Combine results from multiple queries. Both queries must return the same number of columns with compatible types.

-- UNION removes duplicates; UNION ALL keeps them (faster)
SELECT 'Q1' AS quarter, total_revenue FROM sales WHERE quarter = 1
UNION ALL
SELECT 'Q2' AS quarter, total_revenue FROM sales WHERE quarter = 2

-- Comparing two segments side-by-side
SELECT 'Enterprise' AS segment, customer_count FROM customers[enterprise]
UNION ALL
SELECT 'SMB' AS segment, customer_count FROM customers[smb]

Window Functions

Window functions let you compute running totals, rankings, and period-over-period comparisons without collapsing rows into groups.

Ranking

SELECT
    region,
    total_revenue,
    ROW_NUMBER() OVER (ORDER BY total_revenue DESC) AS rank,
    RANK() OVER (ORDER BY total_revenue DESC) AS rank_with_ties,
    DENSE_RANK() OVER (ORDER BY total_revenue DESC) AS dense_rank
FROM sales
GROUP BY region

Running totals

SELECT
    order_date,
    total_revenue,
    SUM(total_revenue) OVER (
        ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM orders
GROUP BY order_date
ORDER BY order_date

Period-over-period comparisons

SELECT
    order_date,
    total_revenue,
    LAG(total_revenue) OVER (ORDER BY order_date) AS prev_period_revenue,
    total_revenue - LAG(total_revenue) OVER (ORDER BY order_date) AS change
FROM orders
GROUP BY order_date
ORDER BY order_date

Partitioned windows

SELECT
    region,
    product_category,
    total_revenue,
    ROW_NUMBER() OVER (
        PARTITION BY region
        ORDER BY total_revenue DESC
    ) AS rank_within_region
FROM sales
GROUP BY region, product_category

Supported window functions: ROW_NUMBER, RANK, DENSE_RANK, NTILE, PERCENT_RANK, CUME_DIST, LEAD, LAG, FIRST_VALUE, LAST_VALUE, NTH_VALUE


Subqueries

Standard SQL subqueries work in SemQL.

SELECT name, total_revenue
FROM organizations
WHERE organization_id IN (
    SELECT organization_id
    FROM orders
    WHERE created_at >= CURRENT_DATE - INTERVAL '90 days'
    GROUP BY organization_id
    HAVING COUNT(*) >= 3
)
GROUP BY name, total_revenue

What SemQL does NOT support

  • DELETE, UPDATE, INSERT, DROP, TRUNCATE — SemQL is read-only by design
  • Schema-qualified table names like my_schema.my_table — use model names instead
  • Direct INFORMATION_SCHEMA queries
  • Bypassing auto-applied model filters — they are always active

Supported Functions

Aggregate

FunctionDescription
COUNT(x), COUNT(DISTINCT x)Count rows or unique values
SUM(x)Sum of values
AVG(x)Average
MIN(x), MAX(x)Minimum / maximum
STDDEV(x), VARIANCE(x)Standard deviation / variance
STRING_AGG(x, sep)Concatenate strings with separator
ARRAY_AGG(x)Aggregate into an array
PERCENTILE_CONT(p), MEDIAN(x)Percentile calculations
BOOL_AND(x), BOOL_OR(x)Boolean aggregations
CORR(x, y)Correlation coefficient

Warning

Never wrap a predefined measure in an aggregate function. SUM(total_revenue) will double-aggregate. See Dimensions vs. Measures.

Date / Time

FunctionDescription
DATE_TRUNC('month', date)Truncate to time period
DATE_PART('year', date)Extract part of a date
EXTRACT(YEAR FROM date)Extract part of a date
NOW(), CURRENT_DATECurrent timestamp / date
CURRENT_TIMESTAMPCurrent timestamp with timezone

String

FunctionDescription
UPPER(x), LOWER(x)Change case
LENGTH(x)String length
TRIM(x)Remove whitespace
SUBSTRING(x, start, len)Extract substring
CONCAT(x, y, ...)Concatenate strings
REPLACE(x, from, to)Replace substring

Math

FunctionDescription
ABS(x)Absolute value
ROUND(x, n)Round to n decimal places
CEIL(x), FLOOR(x)Round up / down
MOD(x, y)Modulo
POWER(x, y), SQRT(x)Exponent / square root
LEAST(x, y), GREATEST(x, y)Minimum / maximum of arguments

Geo

FunctionDescription
GEO_DISTANCE(lat1, lng1, lat2, lng2)Distance in km between two coordinates
GEO_WITHIN_RADIUS(lat1, lng1, lat2, lng2, radius_km)True/false: is point within radius?

See Geo Functions for full documentation.