What Format are AtScale Aggregate Instances Stored as in HDFS?

By default, the AtScale engine creates all aggregates in Parquet format. Parquet uses columnar storage, so compression is applied per column, which keeps file sizes small.

This default can be overridden with the engine setting `aggregates.tableConfig.preferredStorageFormat`, which makes the engine create aggregates in the specified format. For example, AtScale can also store aggregate instances as RC, ORC, or sequence files, and can even use Hive's SerDe interface.

AtScale stores aggregate data in Parquet files by default because the supported SQL engines (Impala, Hive, Spark, etc.) can all work with that format.

Other formats are not usable with every engine. For example, Impala does not support the ORC file format, as documented in the Cloudera Impala Guide.
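The compatibility reasoning above can be sketched in a few lines of Python. This is only an illustration, not AtScale's implementation: the function `choose_aggregate_format`, the `ENGINE_FORMATS` table, and the format labels (`"parquet"`, `"orc"`, and so on) are hypothetical names made up for this example, and they are not confirmed values of `aggregates.tableConfig.preferredStorageFormat`. Only the Impala/ORC entry comes from this article; the other table entries are assumptions.

```python
# Hypothetical compatibility table, for illustration only.
# The article states that Impala does not support ORC; every other entry
# here is an assumption and should be checked against each engine's docs.
ENGINE_FORMATS = {
    "impala": {"parquet", "rc", "sequence"},
    "hive": {"parquet", "rc", "orc", "sequence"},
    "spark": {"parquet", "orc", "sequence"},
}

# Parquet is the default described in this article.
DEFAULT_FORMAT = "parquet"


def choose_aggregate_format(engine, preferred=None):
    """Return the storage format label to use for aggregates.

    Falls back to Parquet when no preference is given or when the
    preferred format is not in the engine's (assumed) supported set.
    """
    supported = ENGINE_FORMATS.get(engine.lower())
    if supported is None:
        raise ValueError(f"unknown engine: {engine!r}")
    if preferred is None:
        return DEFAULT_FORMAT
    fmt = preferred.lower()
    # An unsupported preference falls back to the default in this sketch.
    return fmt if fmt in supported else DEFAULT_FORMAT


if __name__ == "__main__":
    for engine, preferred in [("Impala", "orc"), ("Hive", "orc"), ("Spark", None)]:
        print(engine, preferred, "->", choose_aggregate_format(engine, preferred))
```

Falling back to Parquet rather than raising an error is a design choice made for this sketch. Before changing the preferred format on a real cluster, confirm that the SQL engine in use can read the target format.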
