In-Memory Dimensional Aggregates
In-memory dimensional aggregates are dimension-only aggregate tables that are loaded and queried in memory. These can be used to improve response time for filtered dimensional lookup queries.
To use in-memory dimensional aggregates, you must have a license that contains the aggregate_table_cache
feature attribute. For information on obtaining such a license, reach out to your Account Executive.
Enable in-memory dimensional aggregates
At a high level, the process for enabling in-memory dimensional aggregates is as follows:
- Obtain a license that allows for in-memory dimensional aggregates.
- Enable the functionality in the engine settings.
- (Optional) Update your UDAs to pin them to the local in-memory cache. This ensures that the system attempts to load dimension-only UDAs into memory.
- If you plan on using this functionality for secured dimensions, update your UDAs and security dimensions to enable them to join.
The following sections describe these steps in detail.
Enable in-memory dimensional aggregates in the engine
To enable in-memory dimensional aggregates, set the following custom engine settings to TRUE. For more information on changing engine settings, see Changing Engine Settings.
Setting | Value | Description |
---|---|---|
aggregates.tableCache.enabled | TRUE | Enables the in-memory dimension-only cache functionality. When enabled, the system automatically loads dimension-only aggregates if the in-memory aggregate pool size is small enough. You must restart the engine after changing this setting. |
aggregates.dimensional.allowJoinsToSecondaries.enabled | TRUE | Optional. Enables joining to secondary in-memory dimension aggregate tables. This must be set to TRUE to join to in-memory security dimension aggregates. You must restart the engine after changing this setting. |
(Optional) Pin UDAs to the local in-memory cache
If your AtScale license allows for in-memory caching, the Pin aggregate to local in-memory cache checkbox appears in the Create User-Defined Aggregate dialog. Checking this box tells the cache to include the aggregate, regardless of size and current cache contents. After the system builds the UDA, it is loaded into memory by each AtScale host. The system automatically reloads the aggregate table when it changes.
This step is optional, but ensures that the system attempts to load dimension-only UDAs into memory.
Configure joins with secured dimensions
To enable joining of an in-memory dimension-only aggregate table with a security dimension aggregate table:
- Set the
aggregates.dimensional.allowJoinsToSecondaries.enabled
engine setting to TRUE, then restart the engine. For more information, see Changing Engine Settings. - On the security dimension, set the Lookup Rules option to Use Filter Key. For more information, see Create a Security Dimension.
- Create a UDA that contains the secured dimension and the security dimension's filter attribute key. Be sure to select the Pin aggregate to local in-memory cache option for the UDA. For instructions on creating UDAs, see Defining Aggregates Yourself.
- Create another UDA that contains only the security dimension's attribute filter key. Be sure to select the Pin aggregate to local in-memory cache option for the UDA.
Tuning the in-memory dimension aggregate cache
You can use the following custom engine settings to configure the in-memory dimension aggregate cache. For more information on changing engine settings, see Changing Engine Settings.
Setting | Default Value | Description |
---|---|---|
aggregates.tableCache.nioMemFS.enabled | TRUE | Determines whether to use the “nioMemFS” storage system, which provides off-heap storage. If set to false, the on-heap storage is used instead. |
aggregates.tableCache.addTableToCache.concurrency | 1 | The maximum number of concurrent in-memory “add to cache” querying processes. If you observe high latency between engine startup and cache bootstrapping, try increasing this setting. |
Limitations and known issues
When using in-memory dimensional aggregates, be aware of the following limitations and issues:
- Attributes of type DECIMAL or NUMERIC cannot be used with in-memory dimensional aggregates. AtScale will not load a UDA into memory if it contains an attribute of type DECIMAL or NUMERIC.
- UDAs that contain measures will not be loaded into memory.
- This functionality only supports equal (=) operators. Queries that use the LIKE function are not routed to the in-memory datastore.
- UDAs that use hierarchy levels or secondary attributes that have custom empty members defined are not loaded into memory.
Troubleshooting
Memory issues
If you observe memory issues with the off-heap feature, it is likely related to the XX:MaxDirectMemorySize
JVM property: you must have access to the amount of physical memory defined by this property. If you don’t, and enough table rows are loaded into memory, you may run out and crash the machine. For more information, refer to the Oracle Java documentation.
Query failures
If the target aggregate was a UDA pinned to the cache, deselect the Pin aggregate to local in-memory cache checkbox and republish the cube. If the failure continues, disable the cache altogether.
Frequently Asked Questions
-
Will the system still interoperate with the query result cache?
Answer: Yes, queries that would be routed to the in-memory table cache can still be serviced by the in-memory query cache.
-
How do I know if a query hit the table cache?
Answer: The query screen will show “H2-Dialect” as the dialect used for an outgoing subquery.
-
Is there a way to control how long the cache stays?
Answer: There is no typical TTL for this cache - as long as an aggregate is still valid, it may be in the cache. When an aggregate is removed or updated, then the corresponding cached aggregate is also removed.
-
What if there are updates in the aggregates (like aggregate rebuild)?
Answer: The cache respects the presence of new aggregates in the same way that the planner does: invalidated/updated aggregates will never be hit if they have stale data.
-
How big can the cache be made?
Answer: If using off heap (the default), this is governed by the
XX:MaxDirectMemorySize
JVM property. If not using the off-heap setting, this value should be much smaller than the JVM heap allocation. -
Can the cache be set to persist in the disk?
Answer: Not supported. Persisting this data to disk on engine nodes violates AtScale’s current security policies.
-
Is the caching mechanism similar to the RDD concept in Spark?
Answer: It is similar to pinning Spark SQL tables in memory, but RDD is a different concept.
-
What determines when the cache will be flushed and when to reload?
Answer: When aggregates are removed, added, or updated, their corresponding cached entries are the same.