1. and 2. Use Hive and/or HCatalog to create, read, and update the structure of an ORC table in the Hive metastore (HCatalog is just a side door that lets Pig / Sqoop / Spark / etc. access the metastore directly, without going through Hive)
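As a minimal sketch, this is what creating an ORC-backed table in HiveQL can look like (the table name, columns, and compression choice below are hypothetical, for illustration only):

```sql
-- Hypothetical table for illustration; any column names would do.
CREATE TABLE web_events (
  event_ts TIMESTAMP,
  user_id  BIGINT,
  url      STRING
)
STORED AS ORC
TBLPROPERTIES ("orc.compress" = "ZLIB");  -- ORC table property selecting the codec
```

Once the table exists in the metastore, Pig, Sqoop, or Spark can read and write it through HCatalog using the same definition.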
2. The ALTER TABLE command lets you add / drop columns regardless of the storage format, ORC included. But beware of a nasty bug that can then break vectorized reads (at least in V0.13 and V0.14)
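A schema change on an ORC table is plain DDL; existing data files are not rewritten (table and column names here are hypothetical):

```sql
-- Add a column to an existing ORC table; old files do not contain it,
-- so Hive returns NULL for that column on rows written before the change.
ALTER TABLE web_events ADD COLUMNS (referrer STRING);
```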
3. and 4. The term "index" is somewhat misleading. Basically, min/max statistics are stored in the stripe footer at write time, then used at read time to skip any stripe that clearly cannot match the WHERE predicate, drastically reducing I/O in some cases (a trick that became popular in columnar stores, e.g. InfoBright on MySQL, and also in Oracle Exadata appliances [dubbed "Smart Scan" by Oracle marketing])
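To actually benefit from the stripe-level statistics, predicate push-down must be enabled; a sketch, assuming the hypothetical table above (the setting name is the one used around Hive 0.13/0.14):

```sql
-- Enable ORC predicate push-down so stripe min/max stats filter reads.
SET hive.optimize.index.filter = true;

-- Any stripe whose [min, max] range for user_id excludes 42 is skipped
-- entirely, without decompressing its data.
SELECT * FROM web_events WHERE user_id = 42;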
5. Hive works the same with "row storage" formats (Text, SequenceFile, AVRO) and "column storage" formats (ORC, Parquet). The optimizer merely uses some extra strategies and shortcuts in the initial Map phase with columnar formats - e.g. stripe pruning, vectorized operators - and of course the serialization / deserialization phases are a bit more elaborate with column stores.
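One of those columnar-only shortcuts, vectorized execution, is controlled by a session setting (shown here as a sketch; it only takes effect on formats that support it, such as ORC):

```sql
-- Process rows in batches instead of one at a time; a no-op for
-- row-oriented formats like Text or SequenceFile.
SET hive.vectorized.execution.enabled = true;
```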