I am trying to get a working installation of Apache Pig on a PC running Vista to use as a training tool; I do not intend to do any serious data processing with Pig on this machine. A single-node, single-JVM installation run with -x local is all I want.
I come from a Windows background, so UNIX is a steep learning curve for me, but following the tips in the Apache Pig Getting Started online documentation, I installed cygwin and it seems to work fine. I included the Perl package when downloading and installing cygwin, as described in the Getting Started section, and that also seems fine: the /bin directory contains perl.exe, and I can access all the Perl documentation.
Then I downloaded pig-0.11.1, unpacked it using tar -xzvf pig-0.11.1.tar.gz and spent a few (mostly enjoyable) days using the errors I got while trying pig -x local as a reason to study the Bash reference manual and work through the pig shell script, which I think I now really understand. I put calls to the cygwin utility cygpath into this script so that pig.jar would be found and the arguments passed to java.exe would be converted by cygpath into a form that java.exe can understand, and I got a grunt> prompt. But my cries of joy were short-lived.
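To make the cygpath step concrete, here is a minimal sketch of the kind of conversion meant above. It is not the actual edit made to the pig script: the helper name to_windows_path_list, the variable names and the example paths are all hypothetical, and outside cygwin (where cygpath does not exist) the helper simply returns its input unchanged.

```shell
#!/usr/bin/env bash
# Sketch only: turn a POSIX-style, colon-separated classpath into the
# Windows form that java.exe expects. All names here are hypothetical.

to_windows_path_list() {
    if command -v cygpath >/dev/null 2>&1; then
        # -w: produce Windows-style paths; -p: treat the argument as a path list
        cygpath -w -p "$1"
    else
        # Not running under cygwin: leave the value untouched
        printf '%s\n' "$1"
    fi
}

# Assumed install location, for illustration only
PIG_HOME_EXAMPLE="/home/Richard/pig_installation/pig-0.11.1"
CLASSPATH_EXAMPLE="$PIG_HOME_EXAMPLE/pig-0.11.1.jar:$PIG_HOME_EXAMPLE/conf"

WIN_CLASSPATH=$(to_windows_path_list "$CLASSPATH_EXAMPLE")
echo "classpath for java.exe: $WIN_CLASSPATH"
```

Under cygwin the echoed value should come out in Windows form, with drive letters and semicolon separators; elsewhere it is printed as given.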
In fact, I get the same grunt> prompt after downloading, installing and running pig-0.7.0 with pig -x local, as described in RELEASE_NOTES.txt, without touching its pig shell script at all. Unfortunately, it is the same prompt as with pig-0.11.1: a curious pseudo-grunt> prompt in which the arrow keys move the cursor all over the screen, even up over previous commands and their dollar prompts, and the return key (preceded by ;) does nothing but move the cursor to a new line. Text can be typed but not entered, and only ^C and ^\ work, mercifully returning me to a Bash dollar prompt and a bit of sanity.
From my pig-0.7.0 directory, entering bin/pig -help gives the correct response:
    Apache Pig version 0.7.0 (r941408)
    compiled May 05 2010, 11:15:55
    USAGE: Pig [options] [-] : Run interactively in grunt shell.
    Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).
    Pig [options] [-f[ile]] file : Run cmds found in file.
    options include: ... *etc etc*
From the pig-0.7.0 directory, entering bin/pig -x local gives the following response:
    13/04/18 10:37:51 INFO pig.Main: Logging error messages to: C:\cygwin\home\Richard\pig_installation\pig-0.7.0\pig_1366277871311.log
    2013-04-18 10:37:51,540 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:
From any directory, since I added my pig-0.11.1/bin directory to PATH, entering pig -x local gives the following response:
    which: no hadoop in (usr/local/bin:/cygdrive/c/Program Files ... *etc etc* .. )
    2013-04-18 10:48:59,946 [main] INFO org.apache.pig.Main - Apache Pig version 0.11.1 (r1459641) compiled Mar 22 2013, 02:13:53
    2013-04-18 10:48:59,946 [main] INFO org.apache.pig.Main - Logging error messages to: C:\cygwin\home\Richard\pig_installation\pig-0.7.0\pig_1366278539943.log
    2013-04-18 10:48:59,965 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file C:\Users\Richard/.pigbootup not found
    2013-04-18 10:49:01,404 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
Is the "which: no hadoop" message a fatal error, or am I just missing a trick? The pig shell script in pig-0.11.1 seems to imply that if hadoop is not detected, pig-?.!(*withouthadoop).jar (e.g. pig-0.11.1.jar) will be used instead, and the documentation tells me that Pig on Windows with cygwin is supported (for -x local, but not for -x mapreduce). Is this pseudo-grunt> prompt a complete mirage, or does it indicate partial success?
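For reference, this is how I read that jar pattern. The sketch below is not the pig script itself; it assumes the pattern is a Bash extended glob of the form pig-?.!(*withouthadoop).jar and tries it against two empty files in a temporary directory. The variable names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch only: check which file names an extglob pattern of the assumed
# form pig-?.!(*withouthadoop).jar selects.
shopt -s extglob nullglob

demo_dir=$(mktemp -d)
touch "$demo_dir/pig-0.11.1.jar" "$demo_dir/pig-0.11.1-withouthadoop.jar"

# The pattern is held in a variable and expanded unquoted, so the glob is
# evaluated at run time, after extglob has been switched on.
pattern='pig-?.!(*withouthadoop).jar'
matches=( "$demo_dir"/$pattern )

selected=""
for jar in "${matches[@]}"; do
    selected="${jar##*/}"          # strip the directory part
    echo "selected jar: $selected"
done
match_count=${#matches[@]}

rm -rf "$demo_dir"
```

Only pig-0.11.1.jar is selected, because !(*withouthadoop) rejects any name whose middle part ends in "withouthadoop". If that reading is right, the missing hadoop executable should not by itself stop pig -x local from finding its jar.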
Postscript to the above: I completed the Pig Tutorial section of the Apache Pig Getting Started documentation. I set the environment variables, edited the pig-0.7.0/tutorial/build.xml file according to the instructions, ran ant, which created the pigtutorial.tar.gz file, moved it, unpacked it, found Pig Script 1 and ran pig -x local script1-local.pig, and IT WORKS! The output file, part-r-00000, does not contain any warnings at all, only five columns of records, as expected. However, a new attempt to get interactive mode using pig -x local leads to the same pseudo-grunt> prompt.