i2024.2.4 New Features and Improvements - Distinct Sum
Business Requirement
Want to compute deduped totals for repeated rows as part of an ad-hoc analysis of datasets without changing the model's 1-to-Many relationship as a Many-to-Many relationship. This is common if the dominant use-case for a model is ad-hoc analysis.
AtScale Feature Review
- Model the Inventory Quantity using the new Distinct Sum Aggregation function
- UDA Support - Requires exact hits without filters that cause Re-aggregation
- Experimental System Defined Aggregation Support - Operates the same ways as Distinct Count.
- aggregates.create.allowDistinctSumMeasures.enabled = True (Default = False)
- aggregates.withDistinctSums.widening.enabled = True (Default = False) (Optional)
- aggregates.systemGenerated.withDistinctSumMeasure.retentionPercentage (Default = 40
- Can be useful for ad-hoc Many-to-Many Totals of Integer / Long Types
- Example: Total Product SKU Inventory from Store Level to Warehouse level without inflating totals when grouping by related dimensions in ad-hoc queries.
- Stores source from many Warehouses
- Warehouse may supply many Stores
- Want to avoid overstating Totals, including rollup to Warehouse level.
- Example: Total Product SKU Inventory from Store Level to Warehouse level without inflating totals when grouping by related dimensions in ad-hoc queries.
- Use when Configuring Many-to-Many relationships in advance is not practical due to the ad-hoc nature of the analysis.
- Will use Aggregates!
- Finally, Not recommended for use with floating point columns with large numbers of distinct values.