Redshift Integration
What we need from you
Stemma needs certain information and credentials to extract Redshift metadata into the catalog. To provide these, proceed as follows.

- Navigate to the Admin tab of the Stemma UI and choose Connections and Add New Connection:
- In the pop-up window, choose Redshift as the Connection type and then provide the following information in the dialog box:
- A username and password for Stemma to use to access the
INFORMATION_SCHEMA
- The hostname of the Amazon Redshift cluster. You can find it in the Properties tab of your Redshift Cluster Details page. Under the Connection details section, copy the Endpoint up to but not including the colon.
- A port number. The default port number for Redshift is 5439. From your Redshift Cluster Details page, you can find the port number under the Database configurations section of the Properties tab.
- A list of databases: Stemma whitelists databases, so you will need to provide a list of the databases we will be importing.
- A username and password for Stemma to use to access the
Access
The Stemma Redshift integration requires access to two pieces of information within the Redshift install:
information_schema
metadata- Query logs
information_schema
The specific tables and views we access include:
PG_GET_LATE_BINDING_VIEW_COLS
svv_external_columns
COLUMNS
Because of the way permissions work in Redshift, these tables are only populated with data to which the user has SELECT
access. If you don’t want to use column stats, and therefore want to restrict access for this credential, Stemma will give you the SQL code required to create a materialized view. You create that materialized view as a superuser, and grant our credential the ability to read and refresh that materialized view.
Query logs
Stemma Redshift requires query logs. We can extract these in two ways: push or pull.
Push
The recommended integration involves configuring Redshift to dump user activity logs into S3.
There are multiple methods to achieve this, either using CloudTrail or directly from Redshift. the exact right way differs between organizations (many already have audit logging configured for other purposes).
Pull
Stemma can extract logs from STL_QUERY
. However, because this option requires the credentials be a superuser, it is not the recommended integration method.
Column stats – Optional
The Stemma integration user must have full SELECT access to any tables or views for which you want to see column stats.