I have the following error:
Error: undefined method `each' for "s3n://dico-count-words/Cache/dicoClazz.p#dicoClazzCache.p":String
It occurs when I run the following command to launch a MapReduce job on an Amazon EMR cluster via elastic-mapreduce, specifying a distributed cache file:
./elastic-mapreduce --create --stream \
> --input s3n://dico-count-words/Words \
> --output s3n://dico-count-words/Output \
> --mapper s3n://dico-count-words/Code/mapper.py \
> --reducer s3n://dico-count-words/Code/reducer.py \
> --log-uri s3n://dico-count-words/Logs \
> --cache s3n://dico-count-words/Cache/dicoClazz.p#dicoClazzCache.p
I followed the instructions that I found here. I had no problems with similar commands that create clusters without a distributed cache file, and I also managed to get this job running from the AWS console, but I would rather do it through the CLI. I think this may be a Ruby problem similar to this one, but I don't know anything about Ruby, so this is just a hunch. This is also my first time using AWS, and therefore elastic-mapreduce. For your information, this is my Ruby version:
ruby 2.0.0p451 (2014-02-24 revision 45167) [universal.x86_64-darwin13]
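To make the hunch concrete, here is a minimal sketch of how I would expect this kind of error to arise. As far as I can tell, Ruby 1.8 defined `each` on strings, while Ruby 1.9 and later do not. I have not read the elastic-mapreduce source, so the idea that it calls `each` on the `--cache` value is purely an assumption, and the variable names below are hypothetical.

```ruby
# Hypothetical stand-in for the value given to --cache (copied from the error message).
cache_arg = "s3n://dico-count-words/Cache/dicoClazz.p#dicoClazzCache.p"

# Ruby 1.8 had String#each (it iterated over lines); Ruby 1.9+ removed it.
string_has_each = cache_arg.respond_to?(:each)

# Calling each directly on the string raises NoMethodError on Ruby 1.9+.
error_message = begin
  cache_arg.each { |uri| uri }
  nil
rescue NoMethodError => e
  e.message
end

# Kernel#Array wraps a single string in a list, so each is then safe to call.
cache_uris = Array(cache_arg)

puts "String responds to each: #{string_has_each}"
puts "Error raised: #{error_message.inspect}"
cache_uris.each { |uri| puts "cache entry: #{uri}" }
```

If the CLI does something like the failing call above, that would explain why only the command with `--cache` breaks on my Ruby 2.0 installation.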
Do you have any idea where this error comes from? Any suggestions for fixing it?
Many thanks.