I am trying to optimize a very long and complex Impala query that contains several CTEs, each of which is referenced several times. My expectation is that once a CTE has been evaluated, I should be able to tell Impala to reuse its results as-is in the main query, rather than repeating the SCAN HDFS operation on the CTE's underlying tables every time the CTE is referenced. Is this possible? If so, how?
I am using impalad version 2.1.1-cdh5 RELEASE (build 7901877736e29716147c4804b0841afc4ebc9037).