Adding an Elasticsearch Connection
Prerequisites
Supported Versions
The Elasticsearch module has been tested with versions 5, 6 and 7.
Its behavior will differ based on the version, as to adapt itself to the changes in the Elasticsearch modeling (namely, the removal of mapping types). For instance, when cataloging on a version 5 instance, the dataset is the mapping type; whereas for versions 6 and 7, the dataset is the index. However, version 6 still displays the type, a unique value but whose name might have been customized.
Installing the Plugin
Declaring the Connection
Creating and configuring connectors is done through a dedicated configuration file located in the /connections folder of the relevant scanner.
In order to establish a connection with Elasticsearch, specifying the following parameters in the dedicated file is required:
Parameter | Expected value |
|---|
name | The name that will be displayed to catalog users for this connection. |
code | The unique identifier of the connection on the Zeenea platform. Once registered on the platform, this code must not be modified or the connection will be considered as new and the old one removed from the scanner. |
connector_id | The type of connector to be used for the connection. Here, the value must be Elasticsearch and this value must not be modified. |
connection.protocol | Connection protocol to the cluster. Possible values are https or http. |
connection.nodes | A list of cluster connection nodes separated with spaces.
Each node is shown using the host_name[:port]. The port value is optional, and 9200 will be used as a default.
Examples - es1.cogip.com:9200 - es1.cogip.com:9200
This list does not have to be complete, but it is recommended to have multiple nodes. |
tls.trust_store.type | Trust Store type. Possible values are pkcs12 or jks. |
tls.trust_store.path | This file must be provided in case TLS encryption is activated (protocol https) and when certificates of Elasticsearch servers are delivered by a specific authority. It must contain the certification chain.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/configuring-tls.html#node-certificates. |
tls.trust_store.password | Trust Store password |
connection.username | Name of the Elasticsearch user used to connect to Zeenea. The user must have sufficient permissions. |
connection.password | User password |
inventory.index_partition.pattern | The index partition pattern. See Partitioned Virtual Indexes. |
Partitioned Virtual Indexes
The connector is able to manage virtual indexes that have been partitioned on multiple real indexes.
A Partitioned Virtual Index is a collection of indexes containing the same dataset and whose use the following naming convention:
Shared first part, used to identify the Partitioned Index
Specific second part, showing the contents of the partition.
For instance, in the case of the Elasticsearch indexation of the Cogip application logs, best practice is to split the indexes by time frames (for example, on a daily basis). The indexes names will then be:
log-cogip-2020-06-02
log-cogip-2020-06-03
log-cogip-2020-06-04
The share part is log-cogip, whereas the part representing the partition can be described with the rational expression:
\d{4}-\d{2}-\d{2}$
Declaring the rational expression above will be enough for Zeenea to take into account the partitioned virtual indexes.
During inventory, Zeenea will replace the first occurrence of this model with a star. All indexes with the same name structure will be considered as being a part of the same partitioned virtual index.
In our example above, all three partitions will be read as a unique dataset named log-cogip-*.
If multiple models are used, all corresponding rational expressions must be declared and separated with spaces. The first of these expressions that returns a result will be used.
For instance, if the following indexes are used:
log-cogip-2020-05-30
log-cogip-2020-06-01
log-acme-2020-06-03
log-acme-2020-06-04
x-files-01
x-files-02
By defining the setting Index Partition Pattern with the value: \d{4}-\d{2}-\d{2}$ \d{2}$, Zeenea will display the following datasets:
log-cogip-*
log-acme-*
x-files-*
However, be mindful of the pattern order: if above expression is replaced with \d{2}$ \d{4}-\d{2}-\d{2}$, the first option, replacing the last two digits with a star, will be used, and the datasets displayed by Zeenea will be the following:
log-cogip-2020-05-*
log-cogip-2020-06-*
log-acme-2020-06-*
x-files-*
The pattern does not have to be at the end of a string; when using the rational expression \d{4}-\d{2}-\d{2} (note the absence of the dollar sign) on the following indexes:
cogip-2020-06-02-log
cogip-2020-06-03-log
will return the unique dataset copgip-$-log.
On the other hand, an index named log-cogip-2020-06-01_2020-06-07 will be identified as log-cogip-*_2020-06-07 when using an Index Partition Pattern value of \d{4}-\d{2}-\d{2}, and log-cogip-2020-06-01 with a value of \d{4}-\d{2}-\d{2}$.
It is then strongly recommended to use a rational expression as specific as possible to avoid unexpected results.
User Permissions
The Elasticsearch user being used for the connection must have sufficient permissions to extract metadata and collect data statistics.
The following requests are necessary:
GET /: Returns the search engine version
GET /_cat/indices: Returns the list of indexes
GET /{index}/_mappings or GET /{index}/_mappings/{esType} for versions before v7. Returns the index mapping.
GET /{index}/_stats: Returns index stats
GET /{index}/_search: Returns data profiles
Collected Metadata
Inventory
For Elasticsearch version 5, the inventory collects the names of indexes and their types. The dataset corresponds to the type.
For Elasticsearch version 6, the inventory collects the names of indexes and their unique type.
From version 7 onwards, the inventory collects the names of indexes.
Indexes are unpartitioned using the procedure described in Partitioned Virtual Indexes.
Dataset
Based on the version of Elasticsearch, it is one dataset per index (v6 onwards) or one dataset per type (v5).
Field
Object Identification Keys
An identification key is associated with each object in the catalog. In the case of the object being created by a connector, the connector builds it.
Object | Identification Key | Description |
|---|
Dataset | code/keyspace/dataset name | - code: Unique identifier of the connection noted in the configuration file - schema: Dataset keyspace - dataset name |
Field | code/keyspace/dataset name/field name | - code: Unique identifier of the connection noted in the configuration file - schema: Dataset keyspace - dataset name - field name |
Last modified date: 11/28/2025