I am trying to run the Spark EC2 scripts to launch a cluster under an IAM role that my user can assume in our root account.
According to this JIRA ticket, we can specify --profile when running the Spark EC2 scripts, and comments on the associated pull request say that --profile refers to what I understand to be an AWS CLI profile.
When I run scripts like
ec2/spark-ec2 -k key-name -i key-name.pem -s 1 --profile myprofile --instance-type=t2.medium launch test-cluster
I get
Profile "myprofile" not found!
However, running
aws s3 ls s3:
works as intended, which leads me to think the IAM role is correctly specified in ~/.aws/config (I do not think IAM roles are specified in ~/.aws/credentials).
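One way I sanity-check that the profile actually resolves to the role (with the plain AWS CLI, not spark-ec2) is something like:
aws sts get-caller-identity --profile myprofile
which, if the role profile is set up correctly, should report the assumed-role ARN rather than my user ARN.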
However, when I add a test profile to ~/.aws/credentials as
[foobar]
aws_secret_access_key=xxxxxxx
aws_access_key_id=xxxxxxx
Spark finds the foobar profile. However, after adding
[foobar]
role_arn = arn:aws:iam::12345:role/MY_ROLE
aws_secret_access_key=xxxxxxx
aws_access_key_id=xxxxxxx
Spark finds the foobar profile, but it does not correctly assume the IAM role. I get
boto.exception.EC2ResponseError: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?> <Response><Errors><Error><Code>InvalidKeyPair.NotFound</Code><Message>The key pair 'key-name' does not exist</Message></Error></Errors><RequestID>fcebd475-a895-4a5b-9a29-9783fd6b7f3d</RequestID></Response>
This is because the key pair 'key-name' does not exist under my user, but it does exist under the IAM role that I need to assume. This tells me that Spark is not actually assuming the IAM role.
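This is how I would verify that asymmetry with the plain AWS CLI (standard commands; myprofile and default are the profiles shown below):
aws ec2 describe-key-pairs --key-names key-name --profile myprofile --region us-east-1
aws ec2 describe-key-pairs --key-names key-name --profile default --region us-east-1
The first call, under the role, should find the key pair, while the second, under my user, should fail with InvalidKeyPair.NotFound.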
My ~/.aws/config:
[default]
region = us-east-1
aws_secret_access_key = xxxxx
aws_access_key_id = xxxxx

[profile myprofile]
role_arn = arn:aws:iam::12345:role/MY_ROLE
source_profile = default
My ~/.aws/credentials:
[default]
aws_secret_access_key = xxxxx
aws_access_key_id = xxxxx
Side note - also tried:
Assuming a role manually using
aws sts assume-role --role-arn arn:aws:iam::12345:role/MY_ROLE --role-session-name temp-session
then exporting the AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, and AWS_ACCESS_KEY_ID environment variables. Then I ran the EC2 scripts without specifying a profile and got
boto.exception.EC2ResponseError: EC2ResponseError: 401 Unauthorized
<?xml version="1.0" encoding="UTF-8"?> <Response><Errors><Error><Code>AuthFailure</Code><Message>AWS was not able to validate the provided access credentials</Message></Error></Errors><RequestID>11402f6e-074c-478c-84c1-11fb92ad0bff</RequestID></Response>
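For completeness, the export step looked roughly like this (a sketch; I am using jq here just for illustration, the exact parsing is not the point):
creds=$(aws sts assume-role --role-arn arn:aws:iam::12345:role/MY_ROLE --role-session-name temp-session)
export AWS_ACCESS_KEY_ID=$(echo "$creds" | jq -r '.Credentials.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$creds" | jq -r '.Credentials.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$creds" | jq -r '.Credentials.SessionToken')
ec2/spark-ec2 -k key-name -i key-name.pem -s 1 --instance-type=t2.medium launch test-cluster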
Side note - also tried:
According to this JIRA on Spark scripts with IAM roles, we can specify --instance-profile-name (is an instance profile the only way to use the IAM role here, i.e. would I have to ask our administrator for IAM list/create permissions to start the cluster under the IAM role?). I tried passing both arn:aws:iam::12345:role/MY_ROLE and MY_ROLE, but I get
boto.exception.EC2ResponseError: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?> <Response><Errors><Error><Code>InvalidParameterValue</Code><Message>Value (arn:aws:iam::12345:role/MY_ROLE) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name</Message></Error></Errors><RequestID>ffeffef9-acad-4a34-a925-31f6b5bbbb3e</RequestID></Response>
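If I read that error correctly, iamInstanceProfile.name expects the name of an instance profile rather than a role name or ARN, so presumably the name would first have to be looked up with something like the following (which again needs IAM read permissions, hence my question above):
aws iam list-instance-profiles-for-role --role-name MY_ROLE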