Adding a Matillion Connection
Prerequisites
Supported Versions
The Matillion module is compatible with the "Basic" edition of the online version of the product.
Installing the Plugin
Declaring the Connection
Creating and configuring connectors is done through a dedicated configuration file located in the /connections folder of the relevant scanner. The scanner frequently checks for any change and resynchronises automatically.
In order to establish a connection with a Matillion instance, the following parameters in the dedicated file are required:
Changes in version 4.1.0
Due to a name conflict between the new parameter (filter) and the old parameters, we were unable to ensure a smooth migration period during which both setting modes could be used.
The filter.includes and filter.excludes parameters are replaced by the single filter parameter.
The filter.versions parameter is replaced by the versions parameter, which works in exactly the same way.
Parameter | Expected value |
|---|
name | The name that will be displayed to catalog users for this connection. |
code | The unique identifier of the connection on the Zeenea platform. Once registered on the platform, this code must not be modified or the connection will be considered as new and the old one removed from the scanner. |
connector_id | The type of connector to be used for the connection. Here, the value must be matillion and this value must not be modified. |
connection.url | Matillion instance URL |
connection.username | Username |
connection.password | Password |
multi_catalog.enabled | Indicates whether the backing database is configured with the multi catalog option or not. The value should be the same as in the corresponding connection. |
tls.truststore.path | The Trust Store file path. This file must be provided in case TLS encryption is activated (protocol https) and when certificates of Matillion servers are delivered by a specific authority. It must contain the certification chain. |
tls.truststore.password | Password of the trust store file |
tls.truststore.type | Type of the trust store file (PKCS12 or JKS). Default value is discovered from the file extension. |
filter | Since version 4.1.0.
Filter the jobs (transformation only for now) that should be synchronized. See Filters below. |
filter.includes | Up to version 4.0.0 (replaced by filter). List of regular expression with comma separated representing elements to include. |
filter.excludes | Up to version 4.0.0 (replaced by filter). List of regular expression with comma separated representing elements to exclude. |
Since version 4.1.0: filter
Since version 4.0.0: filter.versions | - List of version rules separated by a comma. - Only one version of a project is synchronized. - The rules are used to define which version of a project should be synchronized. - Each rule consists of three segments separated by a slash /. - The first segment is a pattern that should match the group name. - The second segment is a pattern that should match the project name. - The last one is the version name to use. It can uses variable if capturing groups have been defined in the other regex. - If the project matches no rule, the synchronized version will be default. Example: versions="g1/p1/v1,group(\d+)/project(\d+)/version_$1_$2,.*/.*/other_default" - The synchronized version of project p1 from group g1 is v1. - The version of project p25 from group g42 is v_42_25. - The version of others projects is other_default. |
proxy.scheme | Depending on the proxy, http or https. |
proxy.hostname | Proxy address |
proxy.port | Proxy port |
proxy.username | Proxy username |
proxy.password | Proxy account password |
User Permissions
In order to collect metadata, the running user's permissions must allow them to access and read transformations that need cataloging.
Filters
Since version 4.1.0 the Matillion connector benefits from the feature of rich filters in the configuration of the connector.
Criteria | Description |
|---|
group | Group name |
project | Project name |
job | Transformation job name |
Data Extraction
The Matillion connector feeds Zeenea Database with Data Processes. Like other Data Process connectors, it has a single job "synchronize" that discovers all items of interest and creates or updates their documentation in the catalog. Matillion items of interest are Transformations. So, for each Transformation in Matillion, a Data Process should be created in Zeenea.
We can logically split the process into two steps, the discovery of the transformations and the extraction of the transformation's metadata.
To discover the transformations, the connector does a top-down crawl of the Matillion object.
It lists the groups with the /group endpoint.
List group from the endpoint /group.
For each group, list the projects with the /group/name/{group_name}/project endpoint.
For each project, list the versions with the /group/name/{group_name}/project/name/{project_name}/version endpoint.
If only a single version is returned, it is selected for synchronization.
If several versions are returned, the connector will select the version configured for the connection. If no version is configured, "default" version will be considered.
For the selected version, list the transformations with the /group/name/{group_name}/project/name/{project_name}/version/name/{version_name}/transformation endpoint.
For each transformation, the connector gathers metadata from two endpoints:
Transformation job export: group/name/%s/project/name/%s/version/name/%s/job/name/%s/export.
Project export: group/name/%s/project/name/%s/export.
In addition, the connector is able to detect whether tables are involved in transformations through SQL Components and thus includes the link to these tables in the lineage.
Synchronization
This connector will harvest all transformation processes identified in the Matillion instance, and automatically represent them in Zeenea.
Lineage
The Matillion connector is able to retrieve the lineage with:
Matillion for Redshift
Matillion for Snowflake
Redshift
To do so, an alias must be defined in the Redshift connection, as detailed below:
alias = ["<host>:<port>/<database>"]
In this instance, variables <host>, <port> and <database> need to be replaced with the actual Redshift values.
Snowflake
Since 4.0.0
The multi_catalog.enabled parameter should have the same value as the parameter of the same name in the Snowflake connection.
An alias should also be defined in the Snowflake connection. Its value depends on the value of multi_catalog.enabled.
If multi_catalog.enabled is true:
alias = ["[org_id]-[account_id].snowflakecomputing.com"]
If multi_catalog.enabled is false:
alias = ["[org_id]-[account_id].snowflakecomputing.com/[db_name]"]
Where [org_id] is the Snowflake organization identifier, [account_id] is the snowflake account identifier, and [db_name] is the database of the Snowflake connection.
Collected Metadata
Data Process
A data process is a Matillion transformation.
Name
Source Description
Input: Input datasets
Output: Output datasets
Technical Data:
Group
Project
Version
Id
Name
Description
Creation Date
Unique Identifier Keys
A key is associated with each item of the catalog. When the object comes from an external system, the key is built and provided by the connector.
Object | Identifier Key | Description |
|---|
Data process | code/group/project/transformation | - code: Unique identifier of the connection noted in the configuration file - group: group name - project: project name - transformation: transformation name |
Last modified date: 11/28/2025