Use SQL with Table Sources

Use SQL expressions to manage your source data.

Limited Availability

Access requirements

Profiles: Data Engineer

Custom profile with these capabilities: Data (Write, Detailed)

Data required: Access to the sources you're managing

Reach out to your administrator for access.

Overview

Sources are the building blocks for data in Visier. They are the raw materials to map your data to Visier's analytic model.

Visier supports the following source types:

File sources: File-based containers that use regex and override behavior to manage data. For more information about acceptable files, see Data File Guidelines.
Table sources: Row-based containers that support incremental updates and SQL transformations during sync. For more information, see Use SQL with Table Sources.

The following table describes the differences between the behavior of each source type.

File sources	Table sources
Each data transfer or extraction outputs to a unique file.	Each data transfer or extraction directly modifies the same table.
Use Visier Extraction Language to query records in a source's Records tab or SQL to query sources in the Data > Query room. For more information, see Query Source Data.	Use Spark SQL to interact with the table data. For more information, see Use SQL with Table Sources.
To delete an individual's data, such as for GDPR compliance, open a Visier Technical Support ticket to remove the information one-by-one from every extracted file.	To delete an individual's data, such as for GDPR compliance, use a DELETE SQL query on the table without contacting Visier Technical Support. Note: The snapshot retention period specifies how long to keep previous snapshots of tables before permanently deleting them. You can set the retention period between 1 and 180 days. The default is 3 days.
The data is available in the Data Transfers room. For more information, see Navigate Studio.	The data is available in the associated table source in the Sources room. For more information, see Sources.

You can create table sources:

Using data connectors. For more information, see Data In Connectors.
With Visier APIs. For more information, see Table Source API.
Manually. For more information, see Manually create a table source.

Difference from file sources

You can use Spark SQL to query, upsert, replace, commit, rollback, and process data. SQL expressions tell Visier how to map and replace data. File sources use Visier Extraction Language (VEL) and Override behavior.

A table source temporarily stores data while you commit the data to Visier, then discards the files. Table sources are better for near real-time data processing because they do not track individual files indefinitely and can efficiently handle updates and deletions. As a result, the volume of data to process is much lower.

About transactions

To bring data into a table source, start a transaction. A transaction is a temporary container that stores data and SQL actions, such as inserting or replacing data in the table source. An open transaction is a write transaction or a read transaction.

Write transaction: The transaction makes changes to a table source. One write transaction can be open at a time. Close a transaction by committing the transaction and processing the data. In an API call, specify tablesToModify to lock tables for writing during the transaction.
Read transaction: The transaction can view data within a table source. While a SQL job processes changes to a table source, a read transaction will only see the data available prior to the job. In an API call, omit tablesToModify to make the transaction read-only.

About input files

A data file that you upload to a table source transaction is an input file. An input file can contain data to insert into your sources, IDs to match against to change or delete rows, or any other input for your SQL query.

SQL expressions reference input files in the transaction by the id field (set in the job's inputs array), not the filename.

Example: An input file is declared as "id": "new_employees" in the inputs field. SQL expressions reference this file as new_employees, regardless of the input file's filename.

If the id contains special characters such as dots or hyphens, wrap the ID in backticks; for example, `my-input.csv`.

Add data to a table source

Prerequisites: Connect your source data to objects in Visier. For more information, see Add a Mapping. If mapping by formula, use SQL expressions instead of VEL.

Use connectors to add data

If your table source is associated with a data connector, the table source receives new data after a data connector extraction job runs. For more information, see Run a Data Connector Extraction Job.

After a successful extraction job, the data is available in the associated table source in the Sources room. For more information, see Sources.

Use APIs to add data

Start a transaction: POST /v1alpha/data/table-sources/transactions. For more information and code samples, see Start a transaction.
Upload a file to the transaction: PUT /v1alpha/data/table-sources/transactions/{transactionId}/inputs/{filename}. For more information and code samples, see Upload an input file to a transaction.
Apply SQL expressions and process the data: POST /v1alpha/data/table-sources/transactions/{transactionId}/jobs. For more information and code samples, see Run a SQL job within a transaction.

Manually add data

On the global navigation bar, click Sources.
Select the table source and click Records.
In Temporary Data, click Add Files.
Select the files to upload.
In SQL Commands, type the SQL expressions to apply to the source. For an example of potential SQL expressions, see SQL examples.
Click Execute SQL. The expression moves to Current Transaction.
Optional: To remove SQL expressions, in Current Transaction, click Rollback.
When finished, click Commit.
To process the data, run a job. For more information, see Run a Job.

Define column settings

Assign each column a data type and optionally designate one column as the primary timestamp.

Use APIs to define columns

Define the columns array in any of the following API calls:

Create a table source: POST /v1alpha/data/table-sources/tables. For more information and code samples, see Create a table source.
Update a table source: PUT /v1alpha/data/table-sources/tables/{tableId}. For more information and code samples, see Update a table source.

Manually define columns

When you access a table source in the Sources room, you can create columns, assign column data types, and choose the primary timestamp. For more information, see Manually create a table source.

Edit key-value pairs

View and change the user-defined key-value pairs for a table source. In addition to key-value pairs, a table source's Table Properties tab displays the source's number of records, last updated date, and data size.

Use APIs to edit key-value pairs

Retrieve a table source's existing key-value pairs: GET /v1alpha/data/table-sources/tables/{tableId}/metadata. In the response, look for metadata. For more information and code samples, see Retrieve a table source's metadata.
Run a SQL job with the setMetadata action. POST /v1alpha/data/table-sources/transactions/{transactionId}/jobs. For more information and code samples, see setMetadata.

Manually edit key-value pairs

On the global navigation bar, click Sources.
Select the table source and click Table Properties.
In Key/Value Metadata, do any of the following:
1. To add a new key-value pair, click Add Pair. Type a key and value.
2. To change an existing key-value pair, type a new key or value.
3. To delete a key-value pair, click the Delete button .
To revert your changes without saving, click Reset.
To save your changes, click Commit. This executes the changes to the table source.

SQL examples

This section provides sample SQL expressions to use in table source queries and mapping formulas. For more information about SQL functions, see Apache Spark SQL Functions.

Table source mapping formulas have protective limitations. They cannot use functions that change row counts or depend on table order. For a list of SQL functions that are not supported, see Unsupported SQL functions.

Samples in this section use the following table.

Copy

-- employee_custom_data
-- Columns: employee_id (STRING), badge_number (STRING), office_floor (INTEGER),
--          is_remote (BOOLEAN), effective_date (DATE)

SELECT

Copy

Example: Basic select

1
2
SELECT employee_id, badge_number, office_floor, effective_date
FROM employee_custom_data

Copy

Example: Filter rows

1
2
3
4
SELECT employee_id, badge_number, office_floor, effective_date
FROM employee_custom_data
WHERE effective_date >= '2026-01-01'
  AND is_remote = false

Copy

Example: Aggregate by office floor

1
2
3
4
5
6
7
SELECT office_floor,
       COUNT(*)                                         AS headcount,
       SUM(CASE WHEN is_remote THEN 1 ELSE 0 END)      AS remote_count
FROM employee_custom_data
WHERE effective_date = '2026-01-01'
GROUP BY office_floor
ORDER BY office_floor

Copy

Example: Join against an input file with ID new_employees

1
2
3
4
5
6
7
8
9
SELECT e.employee_id,
       e.badge_number,
       n.office_floor,
       n.is_remote,
       n.effective_date
FROM new_employees n
LEFT JOIN employee_custom_data e
    ON e.employee_id = n.employee_id
   AND e.effective_date = n.effective_date

Copy

Example: Remove duplicate rows and keep the most recent record per employee

1
2
3
4
5
6
7
SELECT employee_id, badge_number, office_floor, is_remote, effective_date
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY effective_date DESC) AS rn
    FROM employee_custom_data
) ranked
WHERE rn = 1

INSERT

Copy

Example: Insert rows from an input view with ID new_employees

1
2
3
INSERT INTO employee_custom_data
SELECT employee_id, badge_number, office_floor, is_remote, effective_date
FROM new_employees

Copy

Example: Insert a single row

1
2
INSERT INTO employee_custom_data (employee_id, badge_number, office_floor, is_remote, effective_date)
VALUES ('E001', 'B-1042', 3, false, '2026-03-01')

Copy

Example: Insert only rows that don't already exist in the target file

1
2
3
4
5
6
7
8
9
INSERT INTO employee_custom_data
SELECT n.employee_id, n.badge_number, n.office_floor, n.is_remote, n.effective_date
FROM new_employees n
WHERE NOT EXISTS (
    SELECT 1
    FROM employee_custom_data e
    WHERE e.employee_id   = n.employee_id
      AND e.effective_date = n.effective_date
)

UPDATE

Copy

Example: Update a single column for all matching rows

1
2
3
UPDATE employee_custom_data
SET is_remote = true
WHERE office_floor = 0

Copy

Example: Update multiple columns using values from an input view

1
2
3
4
5
6
7
8
MERGE INTO employee_custom_data AS target
USING new_employees AS source
    ON target.employee_id    = source.employee_id
   AND target.effective_date = source.effective_date
WHEN MATCHED THEN
    UPDATE SET
        office_floor = source.office_floor,
        is_remote    = source.is_remote

Note: UPDATE only supports a WHERE clause. UPDATE does not support join-based updates. Use MERGE with a WHEN MATCHED clause to update existing rows from a source view without inserting new rows.

DELETE

Copy

Example: Delete rows for a specific date

1
2
DELETE FROM employee_custom_data
WHERE effective_date = '2025-12-31'

Copy

Example: Delete rows from the table that have no matching record in the input file. This acts as a hard delete for removed employees.

1
2
3
4
DELETE FROM employee_custom_data
WHERE employee_id NOT IN (
    SELECT employee_id FROM new_employees
)

Copy

Example: Change data capture (CDC) to delete rows flagged for deletion in the source

1
2
3
4
5
6
DELETE FROM employee_custom_data
WHERE (employee_id, effective_date) IN (
    SELECT employee_id, effective_date
    FROM new_employees
    WHERE is_deleted = true
)

The above example processes a CDC feed where the input file includes an is_deleted column. This removes target file rows that have a matching key in the source file and are flagged for deletion.

MERGE

Copy

Example: Upsert (insert new rows, update existing rows)

1
2
3
4
5
6
7
8
9
10
11
12
MERGE INTO employee_custom_data AS target
USING new_employees AS source
    ON target.employee_id   = source.employee_id
   AND target.effective_date = source.effective_date
WHEN MATCHED THEN
    UPDATE SET
        badge_number = source.badge_number,
        office_floor = source.office_floor,
        is_remote    = source.is_remote
WHEN NOT MATCHED THEN
    INSERT (employee_id, badge_number, office_floor, is_remote, effective_date)
    VALUES (source.employee_id, source.badge_number, source.office_floor, source.is_remote, source.effective_date)

A single MERGE statement can handle updates and inserts but cannot delete rows that are absent from the source. To implement a full sync, run a MERGE for the upsert followed by a separate DELETE for rows no longer present in the source. These can be submitted as two separate query actions in the same job.

Copy

Example: Full sync (insert new rows, update existing rows, delete removed rows)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// Step 1: Upsert from the source file
MERGE INTO employee_custom_data AS target
USING new_employees AS source
    ON target.employee_id   = source.employee_id
   AND target.effective_date = source.effective_date
WHEN MATCHED THEN
    UPDATE SET
        badge_number = source.badge_number,
        office_floor = source.office_floor,
        is_remote    = source.is_remote
WHEN NOT MATCHED THEN
    INSERT (employee_id, badge_number, office_floor, is_remote, effective_date)
    VALUES (source.employee_id, source.badge_number, source.office_floor, source.is_remote, source.effective_date)

// Step 2: Delete rows that are no longer in the source file
DELETE FROM employee_custom_data
WHERE employee_id NOT IN (SELECT employee_id FROM new_employees)

The following example processes a CDC feed where the input includes an op column indicating the operation type:

'I' for insert.
'U' for update.
'D' for delete.

This removes rows matched by key that are flagged for deletion, updates other matched rows are updated, and inserts unmatched rows that are not deletions.

Copy

Example: Change data capture (CDC) to apply a change feed with inserts, updates, and deletes

1
2
3
4
5
6
7
8
9
10
11
12
13
14
MERGE INTO employee_custom_data AS target
USING new_employees AS source
    ON target.employee_id   = source.employee_id
   AND target.effective_date = source.effective_date
WHEN MATCHED AND source.op = 'D' THEN
    DELETE
WHEN MATCHED AND source.op != 'D' THEN
    UPDATE SET
        badge_number = source.badge_number,
        office_floor = source.office_floor,
        is_remote    = source.is_remote
WHEN NOT MATCHED AND source.op != 'D' THEN
    INSERT (employee_id, badge_number, office_floor, is_remote, effective_date)
    VALUES (source.employee_id, source.badge_number, source.office_floor, source.is_remote, source.effective_date)

Copy

Example: Only update rows where specific columns are different

1
2
3
4
5
6
7
8
9
10
11
12
13
14
MERGE INTO employee_custom_data AS target
USING new_employees AS source
    ON target.employee_id   = source.employee_id
   AND target.effective_date = source.effective_date
WHEN MATCHED AND (
    target.office_floor <> source.office_floor OR
    target.is_remote    <> source.is_remote
) THEN
    UPDATE SET
        office_floor = source.office_floor,
        is_remote    = source.is_remote
WHEN NOT MATCHED THEN
    INSERT (employee_id, badge_number, office_floor, is_remote, effective_date)
    VALUES (source.employee_id, source.badge_number, source.office_floor, source.is_remote, source.effective_date)

Spark SQL for column mappings

Use these expressions in the SELECT list of a query to transform or derive column values before writing them to a table source.

String functions

Copy

Example: Normalize to uppercase

1
UPPER(employee_id)                                  AS employee_id

Copy

Example: Strip leading and trailing whitespace

1
TRIM(badge_number)                                  AS badge_number

Copy

Example: Concatenate columns

1
CONCAT(first_name, ' ', last_name)                  AS full_name

Copy

Example: Concatenate columns, taking the separator once

CONCAT_WS(' ', first_name, last_name)

Copy

Example: Extract a substring (1-indexed) for characters 1 to 3

1
SUBSTR(employee_id, 1, 3)                           AS id_prefix

Copy

Example: Remove non-alphanumeric characters

1
REGEXP_REPLACE(badge_number, '[^A-Za-z0-9]', '')    AS badge_clean

Copy

Example: Split a delimited string and take the first element

1
SPLIT(location_code, '-')[0]                        AS region_code

Copy

Example: Replace nulls with a default string

1
COALESCE(department_name, 'Unknown')                AS department_name

Copy

Example: Replace nulls with a default string

1
COALESCE(department_name, 'Unknown')                AS department_name

Date and time functions

Copy

Example: Parse a string into a date

1
TO_DATE(hire_date_str, 'yyyy-MM-dd')                AS hire_date

Copy

Example: Format a date as a string

1
DATE_FORMAT(effective_date, 'yyyy-MM')              AS year_month

Copy

Example: Add 90 days to a date

1
DATE_ADD(effective_date, 90)                        AS review_date

Copy

Example: Number of days between two dates

1
DATEDIFF(end_date, start_date)                      AS tenure_days

Copy

Example: Extract parts of a date

1
2
3
YEAR(effective_date)                                AS year
MONTH(effective_date)                               AS month
DAYOFWEEK(effective_date)                           AS day_of_week

Copy

Example: Current date at query time

1
CURRENT_DATE()                                      AS snapshot_date

Copy

Example: Truncate to the start of the month

1
DATE_TRUNC('month', effective_date)                 AS month_start

Conditional expressions

Copy

Example: Simple if/else

1
IF(is_remote, 'Remote', 'On-site')                  AS work_location

Copy

Example: Multi-branch conditional

1
2
3
4
5
CASE
    WHEN office_floor >= 10 THEN 'Executive'
    WHEN office_floor >= 5  THEN 'Senior'
    ELSE                         'Standard'
END                                                 AS floor_tier

Copy

Example: Return null if two values are equal (useful for clearing sentinel values)

1
NULLIF(badge_number, 'N/A')                         AS badge_number

Copy

Example: First non-null value in a list

1
COALESCE(preferred_name, legal_name, employee_id)   AS display_name

Numeric functions

Copy

Example: Round to 2 decimal places

1
ROUND(salary / 12.0, 2)                             AS monthly_salary

Copy

Example: Absolute value

1
ABS(variance)                                       AS abs_variance

Copy

Example: Integer division remainder

1
MOD(employee_seq, 10)                               AS shard_key

Copy

Example: Cast string to integer

1
CAST(headcount_str AS INTEGER)                      AS headcount

Copy

Example: Safe division (avoid divide-by-zero)

1
CASE WHEN total > 0 THEN ROUND(part / total * 100, 1) ELSE 0.0 END AS pct

Type casting

Copy

Example: String to date

1
CAST('2026-01-01' AS DATE)                          AS start_date

Copy

Example: Integer to string

1
CAST(office_floor AS STRING)                        AS floor_label

Copy

Example: String to Boolean

1
CAST(is_remote_flag AS BOOLEAN)                     AS is_remote

Copy

Example: Timestamp to date (drops the time component)

1
CAST(updated_at AS DATE)                            AS updated_date

Hashing and anonymization

Copy

Example: SHA-256 hash of a column (for pseudonymization)

1
SHA2(employee_id, 256)                              AS employee_id_hash

Copy

Example: MD5 hash

1
MD5(CONCAT(employee_id, effective_date))            AS row_fingerprint

Unsupported SQL functions

Table source mapping formulas have protective limitations. They cannot use functions that change row counts or depend on table order.

Unsupported in mapping formulas and source queries

Table sources do not support the following functions in any way.

Data definition language (DDL): CREATE, DROP, ALTER
Procedures: CALL
Table properties: TBLPROPERTIES
Insecure functions: These functions expose sensitive information or allow arbitrary code execution, and are not supported.
- current_catalog
- current_database
- current_schema
- current_user
- java_method
- raise_method
- raise_error
- reflect
- try_reflect
- user
- version

Unsupported in mapping formulas

You can use the following functions in source queries but not in mapping formulas.

Aggregating functions

array_agg
bitmap_or_agg
bitmap_construct_agg
count
count_if
count_min_sketch
cume_dist
histogram_numeric
hll_sketch_agg
hll_union_agg
kurtosis
grouping
grouping_id
sum
avg
min
max
collect_list
collect_set
approx_count_distinct
first
last

Set generating functions

explode
explode_outer
inline
inline_outer
json_tuple
posexplode
posexplode_outer
stack

Side effect functions

curdate
current_date
current_timestamp
localtimestamp
monotonically_increasing_id
now
rand
randn
random
session_window
shuffle

Window functions

row_number
rank
dense_rank
ntile
lag
lead
cume_dist
percent_rank