How to Enable Statistics for Impala SQL Engine When Hive SQL Engine is Used to Create Aggregates

When the Hive SQL engine is configured to create aggregates, AtScale runs only the Hive analyze table command on the new aggregate tables. That command does not generate column statistics in Impala.

Enabling the computeTableStats.onAllSubgroups.enabled parameter makes AtScale run the compute stats commands on all connection types in the AtScale environment. As a result, the Impala compute stats command is also executed, and column statistics are generated in Impala.
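For reference, the two statements differ per engine. The sketch below is not AtScale code; it only illustrates the statement text for each engine, using the wording that appears in the engine.log excerpt later in this article. The function name stats_statement and the engine keys are hypothetical.

```python
def stats_statement(engine, table):
    """Return the stats statement text for an engine (illustration only).

    The statement wording is copied from the engine.log excerpt in this
    article; the function and the engine keys are hypothetical.
    """
    templates = {
        "impala": "compute stats {table}",
        "hive": "analyze table {table} compute statistics",
    }
    return templates[engine].format(table=table)


table = "as_adventure.as_agg_b22490e2_cstmrk"
print(stats_statement("impala", table))
print(stats_statement("hive", table))
```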

1) Log in to the Design Center UI as an admin user.
2) Navigate to Manage --> Engine Overview --> Settings --> Advanced settings.
3) Enable computeTableStats.onAllSubgroups.enabled.
4) After changing the setting, check engine.log to confirm that the engine executes both the Impala and the Hive stats commands, as in the following excerpt:

2018-05-10 21:34:21,154 DEBUG [query-executor-29] {service=AggregationService, envId=test, projectId=3c6e0e2f-3c86-4d1a-72c5-47d7543cb165, orgId=default, queryId=8836d315-3579-4835-b11f-e45813b6869c} com.atscale.engine.jdbc.DB               - Creating table with DDL (Hive-1.1): CREATE TABLE as_adventure.as_agg_b22490e2_cstmrk (key_c1 INT, customer_key_c2 INT)  STORED AS parquet ??
2018-05-10 21:35:00,642 DEBUG [query-executor-25] {} com.atscale.engine.jdbc.HiveDB           - Compute Table Stats: as_agg_b22490e2_cstmrk, SQL: compute stats as_adventure.as_agg_b22490e2_cstmrk 
2018-05-10 21:35:05,030 DEBUG [query-executor-25] {} com.atscale.engine.jdbc.HiveDB           - Compute Table Stats: as_agg_b22490e2_cstmrk completed after 4.39 s 
2018-05-10 21:35:05,031 DEBUG [atscale-akka.actor.aggregate-dispatcher-772] {envId=test, aggDefId=6082d4a1-7af0-47d9-b8e6-20696f3f821b, aggInstId=b22490e2-b522-4d03-ba5f-1bc4dd1139a9, projectId=3c6e0e2f-3c86-4d1a-72c5-47d7543cb165, orgId=default} com.atscale.engine.aggregation.materializer.AggregateInstanceMaterializer - Computing stats for table [SqlTable(Some(as_adventure),as_agg_b22490e2_cstmrk,None,None)] on subgroup [subgroup:a4496293-8d3b-485f-bc05-b0cf81bb3710] 
2018-05-10 21:35:05,416 DEBUG [query-executor-29] {} com.atscale.engine.jdbc.HiveDB           - Compute Table Stats: as_agg_b22490e2_cstmrk, SQL: analyze table as_adventure.as_agg_b22490e2_cstmrk  compute statistics 
2018-05-10 21:35:27,568 DEBUG [query-executor-29] {} com.atscale.engine.jdbc.HiveDB           - Compute Table Stats: as_agg_b22490e2_cstmrk completed after 22.2 s
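
The check in step 4 can also be scripted. The following is a minimal sketch, assuming the stats statements are logged with the "SQL: ..." wording shown in the excerpt above; the function names are hypothetical, and the patterns may need adjusting if your AtScale version logs these lines differently.

```python
import re

# Patterns derived from the "SQL: ..." text in the log excerpt above (assumed format).
IMPALA_STATS = re.compile(r"SQL:\s*compute\s+stats\s+(\S+)", re.IGNORECASE)
HIVE_STATS = re.compile(
    r"SQL:\s*analyze\s+table\s+(\S+)\s+compute\s+statistics", re.IGNORECASE
)


def find_stats_commands(lines):
    """Map each table name to the set of engines whose stats command was logged."""
    found = {}
    for line in lines:
        for engine, pattern in (("impala", IMPALA_STATS), ("hive", HIVE_STATS)):
            match = pattern.search(line)
            if match:
                found.setdefault(match.group(1), set()).add(engine)
    return found


def tables_missing_impala_stats(lines):
    """List tables that have a logged stats command but none for Impala."""
    found = find_stats_commands(lines)
    return sorted(t for t, engines in found.items() if "impala" not in engines)


# Two lines taken from the excerpt above, used here as sample input.
SAMPLE = [
    "2018-05-10 21:35:00,642 DEBUG [query-executor-25] {} com.atscale.engine.jdbc.HiveDB"
    "           - Compute Table Stats: as_agg_b22490e2_cstmrk, SQL: compute stats"
    " as_adventure.as_agg_b22490e2_cstmrk ",
    "2018-05-10 21:35:05,416 DEBUG [query-executor-29] {} com.atscale.engine.jdbc.HiveDB"
    "           - Compute Table Stats: as_agg_b22490e2_cstmrk, SQL: analyze table"
    " as_adventure.as_agg_b22490e2_cstmrk  compute statistics ",
]

result = find_stats_commands(SAMPLE)
for table, engines in result.items():
    print(table, sorted(engines))  # as_adventure.as_agg_b22490e2_cstmrk ['hive', 'impala']
print(tables_missing_impala_stats(SAMPLE))  # []
```

To scan a real log, pass an open file object instead of SAMPLE, for example find_stats_commands(open(path)), where path is the location of engine.log in your installation. An empty list from tables_missing_impala_stats suggests that every aggregate table with a logged stats command also had the Impala command run.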

