Optimizing Queries for Calculations
AtScale automatically optimizes the planning stage for queries on complex calculation involving large CASE statements. The following sections describe how this optimization works.
Query Planning Optimization
Before beginning the physical query planning, AtScale simplifies the queried metrics by resolving any large CASE statements they contain. The engine does the following:
- Resolves the contextual properties (level number, level name, member name, etc.) that result in constants.
- Simplifies the expressions in which the constants are used.
The following sections describe these two processes in detail.
Resolving constraints
Although there are several possible types of constraints that can be used, the optimization primarily considers equality, inequality, disjunction (or), and conjunction (and). These are resolved as follows:
-
Equality and inequality: If the constant expression can be resolved as
true
, it is replaced bytrue
. -
Disjunctions: If one of the constant expressions can be resolved as
true
, the whole expression is replaced bytrue
. -
Conjunctions: The following logic is used:
- If any constant expression can be resolved as
false
, the whole constraint is replaced byfalse
. - If all constant expressions can be resolved as
true
, but there is at least one that can only be resolved in the database, then the conjunction cannot be resolved and is therefore not replaced.
- If any constant expression can be resolved as
For example, consider the following queries on a level called A:
level_name = "A" OR sales > 0
is replaced bytrue
.level_name = "B" OR sales > 0
is not replaced.level_name = "B" AND sales > 0
is replaced byfalse
.level_name = "A" AND sales > 0
is not replaced.
Resolving CASE statements
When simplifying a CASE statement, there are three possible outcomes:
- Return the value of the first CASE that is true.
- Return the default value (else).
- Return a simplified version of the CASE statement.
When processing the CASE statements (in order), the engine searches for one that is evaluated as true, keeping the ones that are not resolved and discarding the ones resolved as false.
When a true case cx is found, if there are no unresolved cases before it, then the cx returned is the correct one. However, if there are unresolved expressions before cx, they could return true when evaluated in the database, so cx is added to the list of unresolved statements.
To simplify:
- If a result is found, it is returned.
- If no result is found and there is no unresolved expression, the default value is returned.
- If no result is found but there are unresolved expressions, a new CASE statement using only the unresolved expressions is returned.
For example, consider the following queries on level A:
case when level_name = "A" then 1
results in1
.case when level_name = "B" then 1 else 0
results in0
.case when level_name = "A" OR level_name = "B" then 1
results in1
.case when level_name = "B" then 1 case level_name = "A" then 2 else 3
results in2
.case when sales > 10 then 0 when level_name = "B" then 1 case level_name = "A" then 2 else 3
results incase when sales > 10 then 0 case level_name = "A" then 2 else 3
.