Skip to main content

Adding Amazon Redshift Data Warehouses

An Amazon Redshift data warehouse is an Amazon Redshift cluster and database that contains the tables and views that you want to access as model facts and dimensions. It also contains aggregate table instances that either you or the AtScale engine creates in a schema that you specify.

Prerequisites

Before adding a Redshift data warehouse, ensure you have the following:

  • The datawarehouses_admin role assigned in the Identity Broker.
  • The schema to use for building aggregate tables in the data warehouse. AtScale reads and writes aggregate data to this schema, so the AtScale service account user must have ALL privileges for it. BI tool user accounts should not have the select permission for this schema.
  • If you want to use trigger files to initiate aggregate builds, you must have an S3 bucket for AtScale to connect to. This does not need to be publicly visible, but the AtScale services will require access to it.

Also be aware of the following when using AtScale with a Redshift data warehouse:

  • AtScale routinely updates the CA certificates that it ships with, making it unlikely that you will have to update the AtScale JVM with the Amazon ACM certificates. However, if the certificate changes in the future before you can install another AtScale upgrade, you will need to import the new certificate. For detailed information on Redshift SSL configuration options and certificates, see Configuring security options for connections in the Amazon documentation.
  • If you want to apply at-rest encryption to your data, see the Amazon Redshift database encryption documentation.
  • Redshift doesn't support use of LN or LOG functions with NUMERIC or DECIMAL columns, as described in LN function in the Amazon documentation. If this limitation affects your AtScale models, open a ticket with AWS Redshift.

Add an IAM role for AtScale

Before you can connect to a Redshift data warehouse, you need to create an IAM role for AtScale. Use the GRANT command to define the following Redshift privileges:

  • For the aggregate schema: SELECT, UPDATE, CREATE, DROP
  • For all other tables/schemas AtScale will have access to: SELECT

If you plan on connecting an S3 bucket, grant the AtScale IAM role read/write access to it:

{
"Version": "<version>",
"Statement": [
{
"Sid": "ListObjectsInBucket",
"Effect": "Allow",
"Action": ["s3:ListBucket"],
"Resource": ["arn:aws:s3:::<bucket-name>"]
},
{
"Sid": "AllObjectActions",
"Effect": "Allow",
"Action": "s3:*Object",
"Resource": ["arn:aws:s3:::<bucket-name>/*"]
}
]
}

Add a Redshift data warehouse

To add a Redshift data warehouse:

  1. In Design Center, click the main menu and select Settings. The Settings page opens.

  2. In the Settings panel, click Data Warehouses.

  3. In the Add Data Warehouse section, click the Redshift icon. The Add Redshift Data Warehouse panel opens.

  4. Enter a Name for the data warehouse.

  5. (Optional) Enter the External connection ID. This field defaults to the Name value, but can be overridden by enabling the Override generated value? toggle.

  6. In the Database field, enter the name of the database to use when creating aggregate tables. This must already exist; AtScale does not create it automatically.

  7. Enter the name of the Aggregate Schema to use when creating aggregate tables. This must already exist; AtScale does not create it automatically.

  8. (Optional) Specify whether the data warehouse is a Read-only source.

    Select this option if AtScale will be configured to access the database with credentials that do not have permission to create new tables, or if you need to prevent AtScale from placing any aggregate tables in this data warehouse.

    note

    If you enable this option while editing an existing data warehouse, any aggregates the data warehouse currently contains will be deactivated.

  9. (Optional) If you want to connect an S3 bucket, do the following in the S3 Bucket Details section:

    1. Enable the Connect a S3 Bucket toggle.
    2. Select the region in which the bucket is located.
    3. Enter the Bucket Name. This must be in the format s3a://<name>.
    4. Enter the bucket's Access Key ID and Secret Access Key. These are required to read from and write to the bucket.
    5. Click Test S3 Connection to test your connection.
  10. In the Access Controls section, add the users and groups that you want to be able to access the data warehouse and its data. For more information on how data warehouse security works in AtScale, see Data Warehouse Security.

  11. Click Apply.

Next, you must configure a connection to your data warehouse.

Add a Redshift connection

Once you have added your Redshift data warehouse, you need to add a connection.

To add a Redshift connection:

  1. On the Data Warehouses page, click the expand icon for your Redshift data warehouse.

  2. In the Add Connection field, click the plus icon. The Create Redshift Connection panel opens.

  3. Complete the following fields:

    • Name: A unique name for the connection.
    • Host: The hostname for your data warehouse.
    • Port: The port of your data warehouse.
    • Database: The name of the default database to use after connecting. This defaults to the value set when adding the data warehouse and cannot be changed.
    • Username: The username AtScale uses when connecting to the data warehouse.
  4. Select the authentication method to use when connecting to the data warehouse, and enter the required credentials:

    • Use Access Keys: Enter the Access key id and Secret access key. These values are saved in the AtScale database. The access key is saved in plain text, while the secret access key is encrypted.

    • Use Password: Enter the password. This value is encrypted and saved in the AtScale database.

      note

      If you are connecting to an S3 bucket, you must select Use Access Keys.

  5. (Optional) Specify any extra JDBC flags needed to customize the Redshift driver's behavior. For example, see Configuring security options for connections in the Amazon documentation for optional settings you can enter to customize SSL behavior.

  6. Click Test connection to verify that the connection is working.

  7. Click Apply.

Next, you can configure query mapping for your data warehouse connections.