(1) Well, Hive and RHipe do not need a multi-node cluster; you can run them on a single-node setup. RHipe is basically an R package that bridges R and Hadoop, so you can use the power of R on data stored in Hadoop. To use RHipe you do not need a dedicated cluster: you can run it either in cluster mode or in pseudo-distributed mode. Even if you have a Hadoop cluster of more than two nodes, you can still use RHipe in local mode by setting the property mapred.job.tracker = 'local'.
You can search for the “Bangalore R User Group” website and see how I tried to solve problems with RHipe; I hope that gives you a fair idea.
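As a rough sketch of forcing a job into local mode, assuming RHipe's rhinit()/rhwatch() interface (function names and the HDFS paths below are illustrative and vary across RHipe versions):

```r
# Minimal RHipe sketch -- illustrative only, API names differ by version.
library(Rhipe)
rhinit()  # initialise the R/Hadoop bridge

# An identity MapReduce job forced to run locally on the NameNode,
# even though a multi-node cluster may be configured:
job <- rhwatch(
  map    = expression({
    lapply(seq_along(map.values), function(i)
      rhcollect(map.keys[[i]], map.values[[i]]))
  }),
  input  = "/user/me/input",    # hypothetical HDFS input path
  output = "/user/me/output",   # hypothetical HDFS output path
  mapred = list(mapred.job.tracker = "local")  # run in local mode
)
```

The mapred list is passed straight through to Hadoop's job configuration, which is why the same mapred.job.tracker = 'local' trick mentioned above works here.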
(2) Well, what do you mean by hive: do you mean the hive package in R? That package's name is somewhat misleading, since it suggests Hive (the Hadoop data warehouse).
The hive package in R is similar to RHipe, with only some additional functionality (I did not go through it completely). When I first saw the hive package I thought it integrated R with Hive, but after looking at the functionality it turned out not to be the case.
Well, the Hadoop data warehouse, which is Hive, is mainly useful if you are interested in a subset of results that comes from querying a subset of the data, much as you would with SQL. Queries in Hive are also very similar to SQL queries. To give you a very simple example: let's say you have 1 TB of stock data for different stocks over the past 10 years. The first thing you will do is store it in HDFS, and then you will create a Hive table on top of it. That's it... now fire any query you want. You may also need to perform a complex calculation, for example a simple moving average (SMA), in which case you can write your own UDF (user-defined function). Besides this, you can also use a UDTF (user-defined table-generating function).
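A sketch of that workflow in HiveQL (the table layout, column names, and paths below are made up for illustration, not taken from any real dataset):

```sql
-- Assume the raw files were first copied into HDFS, e.g.:
--   hadoop fs -put stocks.csv /user/me/stocks/
-- Define an external table over that directory:
CREATE EXTERNAL TABLE stocks (
  symbol      STRING,
  trade_date  STRING,
  close_price DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/me/stocks/';

-- Now fire any query you want, e.g. average closing price per stock:
SELECT symbol, AVG(close_price)
FROM stocks
WHERE trade_date >= '2010-01-01'
GROUP BY symbol;
```

Anything that AVG-style built-ins cannot express, such as a windowed SMA, is where a custom UDF would come in.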
(3) If you have a single machine, that means you are running Hadoop in pseudo-distributed mode. Moreover, you do not need to worry about whether Hadoop runs in pseudo mode or cluster mode, since Hive needs to be installed only on the NameNode and not on the data nodes. Once it is configured correctly, Hive takes care of submitting the job to the cluster. Unlike Hive, R and RHipe need to be installed on all data nodes, including the NameNode. But then, at any given time, if you want to run a job only on the NameNode, you can do so, as I mentioned above.
(4) Another thing: RHipe is only for batch jobs, which means the MapReduce job runs over the entire dataset, whereas Hive can work on a subset of the data.
(5) I would like to understand what exactly you are doing in text mining. Are you trying to do some kind of NLP, for example named entity recognition using HMMs (hidden Markov models), CRFs (conditional random fields), feature vectors or SVMs (support vector machines)? Or are you just trying to do document clustering, indexing, etc.? Well, there are R packages like tm, openNLP, HMM, SVM, etc.
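For the document-clustering case, here is a minimal sketch using the tm package mentioned above (the toy corpus and the choice of k-means with two clusters are my own illustration, not from the question):

```r
# Minimal document-clustering sketch with the tm package.
# Assumes install.packages("tm") has been run; corpus contents are made up.
library(tm)

docs <- c("stocks rallied on strong earnings",
          "the market fell after the earnings report",
          "new NLP models improve entity recognition")

corpus <- VCorpus(VectorSource(docs))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Term frequencies per document, then cluster the documents:
dtm <- DocumentTermMatrix(corpus)
km  <- kmeans(as.matrix(dtm), centers = 2)
km$cluster  # cluster assignment for each document
```

If you are instead after NER or sequence labelling, openNLP (or an HMM/CRF package) would be the starting point rather than tm.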