Create a Hive index on a complex column

Is it possible to create an index on a complex column in Hive? By complex I mean columns of type map, struct, array, and so on.

Example:

CREATE TABLE employees (
  name         STRING,
  salary       FLOAT,
  subordinates ARRAY<STRING>,
  deductions   MAP<STRING, FLOAT>,
  address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
)
PARTITIONED BY (country STRING, state STRING);

The following does not work:

CREATE INDEX employees_index
ON TABLE employees (address.street)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD;

FAILED: ParseException line 2:28 mismatched input '.' expecting ) near 'address' in create index statement

2 answers

It is not possible to create an index on an element of a complex data type. Hive does not expose an element of a complex type as a separate column, and an index can only be built on a table column. The explanation below goes into more detail.

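Hive indexing aims to speed up query lookups on certain columns of a table. Without an index, queries with predicates like "WHERE tab1.col1 = 10" load the entire table or partition and process all the rows. If an index exists for col1, only a portion of the file needs to be loaded and processed. An index on the complex column as a whole would look like this: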

CREATE INDEX employees_index
ON TABLE employees (address)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD
IN TABLE employees_index_table
PARTITIONED BY (country,name)
COMMENT 'index based on complex column';

With this index in place, consider a query such as:

 select * from employees where address.street='baker';

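The index is built on the whole address value (of type STRUCT), for example
(street: …, city: …, state: XYZ, zip: 84902)

and not on the element address.street = baker.

A filter on address.street therefore does not correspond to an indexed column, so the index is unlikely to help with this query.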
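
One possible workaround, sketched here under assumptions rather than taken from the answer above: copy the struct element into an ordinary top-level column of a separate table and index that column instead. The table name employees_flat and the index name employees_street_index are hypothetical. CREATE INDEX is only available in Hive releases before 3.0, where indexing was removed.

```sql
-- Hypothetical helper table: address.street becomes a plain STRING column.
-- The partition columns of employees end up as regular columns here.
CREATE TABLE employees_flat AS
SELECT name, salary, address.street AS street, country, state
FROM employees;

-- An ordinary column can be indexed (Hive versions before 3.0 only).
CREATE INDEX employees_street_index
ON TABLE employees_flat (street)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD;

-- WITH DEFERRED REBUILD leaves the index empty until it is rebuilt.
ALTER INDEX employees_street_index ON employees_flat REBUILD;

-- Filter on the flattened column instead of address.street.
SELECT * FROM employees_flat WHERE street = 'baker';
```

The trade-off is a second copy of the data that has to be refreshed whenever employees changes, so this only pays off if lookups by street are frequent.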
