Your question is really about predicate pushdown in Hive.
In the case above, the execution will be exactly the same, because Hive will push the predicates A.ds='2014-01-01' AND B.ds='2014-01-01' down to the mappers before joining.
In the more general case, JOIN (inner join) is actually quite simple and can be summed up as:
If it can push, it will push.
Hive can push a predicate when only one table is involved (WHERE A.x > 1), and cannot push it when more than one table is involved (A.userid > B.userid), since each mapper reads data from only one of the tables.
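To make the single-table versus multi-table distinction concrete, here is a minimal sketch. The tables A and B and their columns (userid, x, ds) are assumed for illustration only; they are not taken from the question's actual schema.

```sql
-- Hypothetical tables: A(userid, x, ds) and B(userid, ds).

-- Predicate on one table only: it can be evaluated by the mappers
-- scanning A, so rows with x <= 1 are dropped before the join.
SELECT A.userid, A.x
FROM A JOIN B ON A.userid = B.userid
WHERE A.x > 1;

-- Predicate comparing columns of both tables: a mapper scanning A
-- never sees B's rows, so this can only be evaluated once rows from
-- both sides have been brought together by the join.
SELECT A.userid, B.userid
FROM A JOIN B ON A.ds = B.ds
WHERE A.userid > B.userid;
```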
The more complex part is OUTER JOIN, and fortunately it is very clearly explained here.
PS
Predicate pushdown is controlled by hive.optimize.ppd, which is true by default.
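If you want to check this on your own query, one option is to set the flag explicitly and look at the plan. This is only a sketch: the join key userid is an assumption, and the exact plan output depends on your Hive version.

```sql
-- Explicitly enable predicate pushdown (this is already the default).
SET hive.optimize.ppd=true;

-- Print the query plan. With pushdown active, the ds filters should
-- show up next to the table scans on the map side rather than after
-- the join. The join key (userid) is hypothetical.
EXPLAIN
SELECT A.userid
FROM A JOIN B ON A.userid = B.userid
WHERE A.ds = '2014-01-01' AND B.ds = '2014-01-01';
```

Running the same EXPLAIN after SET hive.optimize.ppd=false; and comparing the two plans is a simple way to see where the filter moves.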
dimamah