Cannot use Apache Flink on Amazon EMR

I cannot start an Apache Flink YARN session on Amazon EMR. The error message I get is:

$ tar xvfj flink-0.9.0-bin-hadoop26.tgz
$ cd flink-0.9.0
$ ./bin/yarn-session.sh -n 4 -jm 1024 -tm 4096
...
Diagnostics: File file:/home/hadoop/.flink/application_1439466798234_0008/flink-conf.yaml does not exist
java.io.FileNotFoundException: File file:/home/hadoop/.flink/application_1439466798234_0008/flink-conf.yaml does not exist
...

I am using Flink version 0.9 and Amazon's Hadoop version 4.0.0. Any ideas or tips?

The full log can be found here: https://gist.github.com/headmyshoulder/48279f06c1850c62c28c

emr yarn amazon-emr apache-flink
2 answers

From the log:

The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the system is using the default Hadoop configuration values. The Flink YARN client needs to store its files in a distributed file system.

Flink could not read the Hadoop configuration files. They are either picked up from environment variables, for example HADOOP_HOME, or you can set the configuration directory in flink-conf.yaml before executing the YARN command.

Flink needs to read the Hadoop configuration to find out how to upload the Flink jar to the cluster file system so that the newly created YARN cluster can access it. If Flink cannot resolve the Hadoop configuration, it uses the local file system to store the jar. This means that the jar ends up only on the machine from which you start your cluster, so the Flink YARN cluster cannot access it.
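To make the symptom concrete, here is a minimal sketch that pulls the scheme out of the path shown in the question's diagnostics. The helper name uri_scheme is made up for this illustration and is not part of Flink or Hadoop, and the interpretation printed for the file scheme is only the likely cause described above, not a definitive diagnosis.

```shell
#!/bin/sh
# Path copied from the diagnostics in the question.
path='file:/home/hadoop/.flink/application_1439466798234_0008/flink-conf.yaml'

# uri_scheme is a hypothetical helper: it prints everything before the first colon.
uri_scheme() {
  printf '%s\n' "${1%%:*}"
}

case "$(uri_scheme "$path")" in
  file) echo "local file system: the Hadoop configuration was probably not found" ;;
  *)    echo "non-local file system: $(uri_scheme "$path")" ;;
esac
```

A path that starts with a distributed scheme such as hdfs would take the second branch.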

See the Flink configuration page for more information.

Edit: On Amazon EMR, running export HADOOP_CONF_DIR=/etc/hadoop/conf lets Flink find the Hadoop configuration directory.
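Put together with the command from the question, the fix could look roughly like the sketch below. The helper has_hadoop_conf is hypothetical (it is not a Flink or Hadoop tool) and simply assumes that a usable Hadoop configuration directory contains core-site.xml. The script is meant to be run from the flink-0.9.0 directory.

```shell
#!/bin/sh
# has_hadoop_conf is a hypothetical helper, not part of Flink or Hadoop.
# It succeeds when the given directory contains core-site.xml.
has_hadoop_conf() {
  [ -n "$1" ] && [ -f "$1/core-site.xml" ]
}

# /etc/hadoop/conf is the location given in the answer for Amazon EMR;
# an already exported HADOOP_CONF_DIR takes precedence.
CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

if has_hadoop_conf "$CONF_DIR"; then
  export HADOOP_CONF_DIR="$CONF_DIR"
  echo "Using Hadoop configuration in $HADOOP_CONF_DIR"
  # Same command as in the question; started only if the script is present.
  if [ -x ./bin/yarn-session.sh ]; then
    ./bin/yarn-session.sh -n 4 -jm 1024 -tm 4096
  fi
else
  echo "No core-site.xml in $CONF_DIR; check the path before starting Flink" >&2
fi
```

Checking the directory first avoids starting a session that would fall back to the local file system again.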


If I were you, I would try this:

./bin/yarn-session.sh -n 1 -jm 768 -tm 768
