What file formats can be read with PIG? How can I store them in different formats?
There are several built-in methods for loading and storing data, but they are limited:
- BinStorage - "binary" storage
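- PigStorage - loads and stores delimited data (for example, tab- or comma-separated)
- TextLoader - loads data line by line (i.e., delimited by the newline character)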
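piggybank is a library of community-contributed user-defined functions. It includes a number of loading and storing methods, among them an XML loader, but no XML storer.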
Say we have a CSV file and I want to store it as an MXL file. How can this be done?
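I assume you mean XML here... Storing XML is a bit awkward in Hadoop, because the output is split into separate part files, so where would the root tag go? This probably calls for some post-processing step to produce well-formed XML.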
One thing you can do is write a UDF that converts your columns into an XML string:
B = FOREACH A GENERATE customudfs.DataToXML(col1, col2, col3);
For example, if col1, col2, col3 are "foo", 37, "lemons", respectively, the UDF can output the string "<item><name>foo</name><num>37</num><fruit>lemons</fruit></item>".
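As a rough illustration of the string-building such a UDF might do, here is a minimal sketch in plain Python. The function names `data_to_xml` and `wrap_in_root` and the root element `items` are hypothetical, not part of Pig or piggybank. Pig can run Python UDFs through Jython, but the wiring (the REGISTER statement and the output schema) is left out here. The second function sketches one possible post-processing step, assuming each line of the part files holds exactly one fragment.

```python
from xml.sax.saxutils import escape


def data_to_xml(name, num, fruit):
    """Build one <item> fragment from three column values."""
    # escape() replaces &, < and > so the values cannot break the markup.
    return "<item><name>{}</name><num>{}</num><fruit>{}</fruit></item>".format(
        escape(str(name)), escape(str(num)), escape(str(fruit))
    )


def wrap_in_root(fragments, root="items"):
    """Yield the lines of one document: root open tag, fragments, root close tag."""
    yield "<{}>".format(root)
    for fragment in fragments:
        fragment = fragment.strip()
        if fragment:  # skip blank lines, e.g. between concatenated part files
            yield fragment
    yield "</{}>".format(root)
```

Feeding the concatenated lines of all part files through `wrap_in_root` gives a single root element around the fragments, which is the piece that the per-file output cannot provide on its own.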
Whenever we use the STORE command, the output file is written as part-m-00000. How can I change the name of the file?
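You can't change the name of the output file to something other than part-m-00000. That is just how Hadoop works. If you want a different name, rename the file after the fact with something like hadoop fs -mv output/part-m-00000 newoutput/myoutputfile. This could be done with a bash script that runs the Pig script and then executes this command.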
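The same run-then-rename wrapper can be sketched in Python instead of bash. This is only a sketch: the script name `myscript.pig` in the usage note and the helper names are hypothetical, and it assumes the job writes a single part file and that `pig` and `hadoop` are on the PATH.

```python
import subprocess


def build_commands(pig_script, part_file, final_path):
    """Return the two commands to run, in order, as argument lists."""
    return [
        ["pig", pig_script],                             # run the Pig job
        ["hadoop", "fs", "-mv", part_file, final_path],  # rename its output
    ]


def run_and_rename(pig_script, part_file, final_path, runner=subprocess.check_call):
    """Run the Pig script, then move the part file; stop at the first failure."""
    for command in build_commands(pig_script, part_file, final_path):
        # With the default runner, a non-zero exit status raises
        # CalledProcessError, so the move is skipped if the Pig job fails.
        runner(command)
```

A call such as run_and_rename("myscript.pig", "output/part-m-00000", "newoutput/myoutputfile") would then do both steps. The `runner` parameter exists only so the command sequence can be checked without a cluster.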