📄️ Aggregates Overview
AtScale incorporates the data-warehousing concept of aggregate tables. Such tables most often contain measures from one or more fact datasets and include aggregated values for these measures. (There are dimension-only aggregate tables.) The aggregation of the data is at the level of one or more dimensional attributes or, if no dimensional attributes are included, the aggregated data is a total of the values for the included measures.
📄️ Types of Aggregate Tables in AtScale
There are two types of aggregate table that the AtScale engine can use: System-Defined aggregate tables and User-Defined aggregate tables. System-Defined tables are further divided into Demand-Defined and Prediction-Defined. User-Defined tables are further divided into manual and hinted.
📄️ How Aggregate Tables Are Populated With Data
After an aggregate table is defined, an instance of it needs to be created. An instance is an actual table that contains the aggregated data. The process of creating an instance of an aggregate table and populating it is referred to as a build of the table.
📄️ Life Cycle of Aggregate Tables
The following diagram summarizes the stages in the life cycle of an aggregate table.
📄️ Flowchart of the Actions Taken for Each Query
Each query that requests aggregated data causes the AtScale engine to find out whether an aggregate table exists that will satisfy the query. There are two possible outcomes of this search: an appropriate aggregate table does not exist and an appropriate aggregate table does exist. This flowchart shows the actions that the engine takes for both outcomes.
📄️ Aggregate Definitions and Aggregate Instances
AtScale separates the concept aggregate table into two separate concepts: the definition of an aggregate table and an actual instance (or materialization) of that definition. To summarize, each aggregate table in AtScale is an instance of a definition.
📄️ When the AtScale Engine Creates Demand-Defined Aggregates
Demand-defined aggregates are a type of system-defined aggregate that the AtScale engine creates according to various criteria, such as the history of queries against a model and statistics about the use of existing demand-defined aggregates.
📄️ When to Define Your Own Aggregate Tables
AtScale recommends that you rely on the AtScale engine to generate aggregate tables for you. There are use cases that fall outside of those covered by system-defined aggregates. For these cases you can create user-defined aggregates (UDAs).
📄️ About Hinted Aggregate Tables
Hinted aggregates are a type of user-defined aggregate table; you determine whether to allow the AtScale engine to create such aggregate tables on query datasets. You also control the definition of a hinted aggregate with the SQL statement that defines the corresponding query data set. The join keys, measures, and degenerate dimensions that the SQL statement references will be included in the hinted aggregate table for the query dataset. If you change the references in that SQL statement, the definition of the hinted aggregate changes and is put into effect when the instance of the aggregate table is rebuilt.
📄️ Partitioned System-Defined Aggregate Tables
AtScale Model Architects provide prioritized partitioning hints to the AtScale Engine to inform the creation of partitioned aggregate tables. The actual partitioning scheme used by the Engine depends on a number of factors, including:
📄️ Partitioned User-Defined Aggregates
When you create or edit a user-defined aggregate (UDA), you can specify a dimensional attribute to use as a partitioning key. If you do so, instances of the aggregate will be partitioned, with one partition created for each key value. Partitioning an aggregate can result in faster query performance.
📄️ About Incremental Rebuilds
This type of rebuild of instances of aggregate tables appends new rows and updates existing rows that fall within a period of time that you can specify.
📄️ Aggregates for Fact Datasets that Use Joins
If a fact dataset uses inner joins to join to one more dimensions, the AtScale engine can generate demand-defined aggregates that include data from those dimensions. You can also define your own aggregate-table definitions that use inner joins.
📄️ Deactivating an Active Aggregate Instance
Active aggregate instances can be manually deactivated from the Aggregates page, as well as the Aggregates panel of the Queries page. Hover over an aggregate in the table and click the Deactivate button, shown below.