Docly

Snowflake Integration

Estimated reading: 3 minutes

What we need from you

Stemma needs certain information and credentials to extract Snowflake metadata into the catalog. To provide these, proceed as follows.

  1. Navigate to the Admin tab of the Stemma UI and choose Connections and Add New Connection:
  1. In the pop-up window, choose Snowflake as the Connection type, provide a Connection Name, and then in the dialog box fill out the remaining fields as follows:
    • Credentials
      We support authentication with a username and either a password or a private key.
      • Username and Password: You need to provide a username and password for Stemma to use to access the INFORMATION_SCHEMA as described above.
      • Username and Private Key: If you set up authentication via key pairs you will need to provide Stemma with a private key to use when accessing INFORMATION_SCHEMA as described above. If your private key is encrypted, you will also need to provide Stemma with the passphrase.
    • Snowflake Account Name: If you use the snowflake web interface, the URL you use should look something like abc.snowflakecomputing.com. Everything that appears before Snowflake computing is your account name.
    • Warehouse: The name of the warehouse (also called Virtual Warehouse) Stemma will connect to.
    • Role: The role Stemma will use once connected. If Stemma is extracting query logs, this role will need to have MONITOR access on the warehouse.
  2. When you are finished, click Save Connection.

Access

The Stemma Snowflake integration requires read access to all tables that will be included in the catalog, plus query logs.

Metadata

The specific tables and views we access include:

  • TABLES
  • COLUMNS

Due to the permissions model of Snowflake, these tables are only populated with data to which the role has SELECT access. Metadata will be extracted one database at a time since each Snowflake database has its own INFORMATION_SCHEMA.

Query logs

Stemma extract logs from the QUERY_HISTORY table function in the INFORMATION_SCHEMA. Logs are used to inform lineage, usage, common usage patterns and more. Due to the permissions model of Snowflake, this table is only populated with queries executed in virtual warehouses that the role has MONITOR access on.

Access History

Stemma uses ACCESS_HISTORY to enrich lineage derived from query logs. To grant Stemma access, Stemma’s user role will need to have the IMPORTED PRIVILEGES:

grant imported privileges on database snowflake to role stemma_role;

If Stemma cannot read Access History, lineage will still work however accuracy will be reduced.

Note: If you try giving the Stemma role access only to the required view, without giving access to the SNOWFLAKE database and ACCOUNT_USAGE schema before that, Snowflake will probably reject your request with the “Grant not executed: Insufficient privileges.” error.