What is the best way to support array column types with external tables in Hive?

I have external tables over tab-delimited data. A simple table looks like this:

create external table if not exists categories (id string, tag string, legid string, image string, parent string, created_date string, time_stamp int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LOCATION 's3n://somewhere/'; 

Now I am adding another field at the end, which will be a comma-separated list of values.

Is there a way to specify this in the same way that I specify the field terminator, or do I need to rely on one of the SerDes?

e.g.:

 ...list_of_names ARRAY<String>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' ARRAY ELEMENTS SEPARATED BY ',' ... 

(I assume that I will need to use a SerDe for this, but I figured there was no harm in asking.)

1 answer

I do not know how to alter an existing table for this, but for creating a table, what you are looking for is covered in depth at https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL . An excerpt from there:

 row_format : DELIMITED [FIELDS TERMINATED BY char] [COLLECTION ITEMS TERMINATED BY char] [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char] 

An example of one of our table definitions:

 CREATE TABLE IF NOT EXISTS visits ( ... Columns Removed... ) PARTITIONED BY (userdate STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' COLLECTION ITEMS TERMINATED BY '\002' MAP KEYS TERMINATED BY '\003' STORED AS TEXTFILE ; 

The clause you are looking for is COLLECTION ITEMS TERMINATED BY char, which sets the delimiter between array elements.
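Applied to the table from the question, this might look like the sketch below. It is untested against the asker's data: the column name list_of_names is taken from the question's own example, and the follow-up query is only an illustration. Because the table is external, dropping and recreating it with the new definition should leave the files at the LOCATION untouched, which may be an acceptable workaround if altering it in place is not an option.

```sql
-- Sketch only: the question's table with a trailing array column added.
-- Fields are split on tabs; elements inside the array field are split on commas.
CREATE EXTERNAL TABLE IF NOT EXISTS categories (
  id STRING,
  tag STRING,
  legid STRING,
  image STRING,
  parent STRING,
  created_date STRING,
  time_stamp INT,
  list_of_names ARRAY<STRING>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY ','
LOCATION 's3n://somewhere/';

-- Illustrative query: first element and element count of the array column.
SELECT id,
       list_of_names[0]    AS first_name,
       size(list_of_names) AS name_count
FROM categories;
```

Note that the delimiter applies to every collection column in the table, so this only fits if a comma never appears inside an individual value.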

HTH
