About Aggregates

📄️ Aggregates Overview

AtScale incorporates the data-warehousing concept of aggregate tables. Such tables most often contain measures from one or more fact datasets and include aggregated values for these measures. (There are dimension-only aggregate tables.) The aggregation of the data is at the level of one or more dimensional attributes or, if no dimensional attributes are included, the aggregated data is a total of the values for the included measures.

📄️ Types of Aggregate Tables in AtScale

There are two types of aggregate table that the AtScale engine can use: System-Defined aggregate tables and User-Defined aggregate tables. System-Defined tables are further divided into Demand-Defined and Prediction-Defined. User-Defined tables are further divided into manual and hinted.

📄️ How Aggregate Tables Are Populated With Data

After an aggregate table is defined, an instance of it needs to be created. An instance is an actual table that contains the aggregated data. The process of creating an instance of an aggregate table and populating it is referred to as a build of the table.

📄️ Life Cycle of Aggregate Tables

The following diagram summarizes the stages in the life cycle of an aggregate table.

📄️ Flowchart of the Actions Taken for Each Query

Each query that requests aggregated data causes the AtScale engine to find out whether an aggregate table exists that will satisfy the query. There are two possible outcomes of this search: an appropriate aggregate table does not exist and an appropriate aggregate table does exist. This flowchart shows the actions that the engine takes for both outcomes.

📄️ Aggregate Definitions and Aggregate Instances

AtScale separates the concept aggregate table into two separate concepts: the definition of an aggregate table and an actual instance (or materialization) of that definition. To summarize, each aggregate table in AtScale is an instance of a definition.

📄️ When the AtScale Engine Creates Demand-Defined Aggregates

Demand-defined aggregates are a type of system-defined aggregate that the AtScale engine creates according to various criteria, such as the history of queries against a model and statistics about the use of existing demand-defined aggregates.

📄️ When to Define Your Own Aggregate Tables

AtScale recommends that you rely on the AtScale engine to generate aggregate tables for you. There are use cases that fall outside of those covered by system-defined aggregates. For these cases you can create user-defined aggregates (UDAs).

📄️ About Hinted Aggregate Tables

Hinted aggregates are a type of user-defined aggregate table; you determine whether to allow the AtScale engine to create such aggregate tables on query datasets. You also control the definition of a hinted aggregate with the SQL statement that defines the corresponding query data set. The join keys, measures, and degenerate dimensions that the SQL statement references will be included in the hinted aggregate table for the query dataset. If you change the references in that SQL statement, the definition of the hinted aggregate changes and is put into effect when the instance of the aggregate table is rebuilt.

📄️ Partitioned System-Defined Aggregate Tables

AtScale Model Architects provide prioritized partitioning hints to the AtScale Engine to inform the creation of partitioned aggregate tables. The actual partitioning scheme used by the Engine depends on a number of factors, including:

📄️ Partitioned User-Defined Aggregates

When you create or edit a user-defined aggregate (UDA), you can specify a dimensional attribute to use as a partitioning key. If you do so, instances of the aggregate will be partitioned, with one partition created for each key value. Partitioning an aggregate can result in faster query performance.