How to load all data into Hive from subdirectories

I have data arranged in directories in a specific format (shown below) and I want to add it to a Hive table. I want to add all the data under the 2012 directory. All the names below are directory names, and the innermost directories (3rd level) contain the actual data files. Is there a way to select the data directly without changing this structure? Any pointers are appreciated.

/2012/
|
|---------2012-01
|            |---------2012-01-01
|            |---------2012-01-02
|            |...
|            |...
|            |---------2012-01-31
|
|---------2012-02
|            |---------2012-02-01
|            |---------2012-02-02
|            |...
|            |...
|            |---------2012-02-28
|
|---------2012-03
|...
|...
|---------2012-12

Queries I have tried so far, with no luck:

 CREATE EXTERNAL TABLE sampledata (datestr string, id string, locations string)
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
 LOCATION '/path/to/data/2012/*/*';

 CREATE EXTERNAL TABLE sampledata (datestr string, id string, locations string)
 partitioned by (ystr string, ymstr string, ymdstr string)
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|';

 ALTER TABLE sampledata ADD PARTITION (ystr ='2012') LOCATION '/path/to/data/2012/';

SOLUTION: This one setting fixed my problem. I am adding it to the question in case it is useful to others:

 SET mapred.input.dir.recursive=true; 
+7
hive partition
4 answers

Answering my own question with the solution that worked in my case:

 SET mapred.input.dir.recursive=true;
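
For context, here is a minimal sketch of how this setting might be used with a single external table that points at the year directory. It assumes the pipe-delimited layout and column names from the question. The hive.mapred.supports.subdirectories setting comes from another answer and may or may not be needed, depending on the Hive version.

```sql
-- Ask the input format to descend into nested directories.
-- The first setting is taken from another answer; whether it is
-- required is an assumption that depends on the Hive version.
SET hive.mapred.supports.subdirectories=true;
SET mapred.input.dir.recursive=true;

-- Point the table at the top-level year directory, with no wildcards.
CREATE EXTERNAL TABLE sampledata (
  datestr   string,
  id        string,
  locations string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
LOCATION '/path/to/data/2012';

-- Sanity check: the count should cover files from all nested directories.
SELECT COUNT(*) FROM sampledata;
```

The table stays unpartitioned in this sketch, so every query scans the whole year.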

+9
 ALTER TABLE sampledata ADD PARTITION (ystr ='2012', ymstr='2012-01', ymdstr='2012-01-01') LOCATION '/path/to/data/2012/2012-01/2012-01-01'; 
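
A single ALTER TABLE statement can register several partitions at once. The sketch below assumes the partitioned table definition from the question (ystr, ymstr, ymdstr) and shows only two days. The remaining days would have to be listed the same way or generated by a script.

```sql
-- Register two day-level directories as partitions in one statement.
-- IF NOT EXISTS makes the statement safe to re-run.
ALTER TABLE sampledata ADD IF NOT EXISTS
  PARTITION (ystr='2012', ymstr='2012-01', ymdstr='2012-01-01')
    LOCATION '/path/to/data/2012/2012-01/2012-01-01'
  PARTITION (ystr='2012', ymstr='2012-01', ymdstr='2012-01-02')
    LOCATION '/path/to/data/2012/2012-01/2012-01-02';

-- A filter on a partition column lets Hive read only the matching directories.
SELECT COUNT(*) FROM sampledata WHERE ymstr = '2012-01';
```

Unlike the recursive-input setting, this approach needs one partition entry per day directory, but queries can then prune by year, month, or day.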
+1
 SET hive.mapred.supports.subdirectories=true;
 SET mapred.input.dir.recursive=true;
+1

The following worked on Hortonworks:

 alter table .... set tblproperties (
   "hive.input.dir.recursive" = "TRUE",
   "hive.mapred.supports.subdirectories" = "TRUE",
   "hive.supports.subdirectories" = "TRUE",
   "mapred.input.dir.recursive" = "TRUE");
0
