I have just started learning Big Data, and at the moment I am working with Flume. A common example I came across is handling tweets (an example from Cloudera) with some Java code.
Just for testing and simulation purposes, can I use the local file system as a Flume source, in particular some Excel or CSV files? Would I also need to write some Java code in addition to the Flume configuration file, as in the Twitter example?
Will this source be event-driven or pollable?
Thanks for your input.