8

I am using the BigQuery connector for AWS Glue in my Glue job. It was working fine a few days ago, but now it suddenly fails with the error below:

LAUNCH ERROR | Glue ETL Marketplace - failed to download connector.Please refer logs for details.

Below is the full error that I am getting in CloudWatch:


2021-11-08T11:33:02.045+05:00   Traceback (most recent call last): File "/usr/lib64/python3.7/runpy.py", line 193, in _run_module_as_main

2021-11-08T11:33:02.070+05:00   "__main__", mod_spec) File "/usr/lib64/python3.7/runpy.py", line 85, in _run_code exec(code, run_globals) File "/tmp/aws_glue_custom_connector_python/docker/unpack_docker_image.py", line 361, in <module>

2021-11-08T11:33:02.070+05:00   main() File "/tmp/aws_glue_custom_connector_python/docker/unpack_docker_image.py", line 351, in main

2021-11-08T11:33:02.070+05:00   res += download_jars_per_connection(conn, region, endpoint, proxy) File "/tmp/aws_glue_custom_connector_python/docker/unpack_docker_image.py", line 304, in download_jars_per_connection

2021-11-08T11:33:02.070+05:00   download_and_unpack_docker_layer(ecr_url, layer["digest"], dir_prefix, http_header) File "/tmp/aws_glue_custom_connector_python/docker/unpack_docker_image.py", line 168, in download_and_unpack_docker_layer

2021-11-08T11:33:02.070+05:00   layer = send_get_request(layer_url, header) File "/tmp/aws_glue_custom_connector_python/docker/unpack_docker_image.py", line 80, in send_get_request

2021-11-08T11:33:02.070+05:00   

2021-11-08T11:33:02.070+05:00   response.raise_for_status() File "/home/spark/.local/lib/python3.7/site-packages/requests/models.py", line 765, in raise_for_status

2021-11-08T11:33:02.071+05:00   raise HTTPError(http_error_msg, response=self) requests.exceptions.HTTPError: 400 Client Error: Bad Request

2021-11-08T11:33:02.119+05:00   Glue ETL Marketplace - failed to download connector, activation script exited with code 
7
  • AWS is not loading any connector in Glue jobs. Commented Nov 12, 2021 at 16:10
  • I have the same issue (with the Hudi connector). In my case, Glue 2.0 doesn't have this issue; Glue 3.0 does. Commented Nov 13, 2021 at 6:51
  • Yes, on Glue 2.0 it works fine. The problem only occurs on Glue 3.0. Commented Nov 13, 2021 at 8:41
  • I opened a support ticket with AWS. Commented Nov 13, 2021 at 22:37
  • Do we still need to add the connector if we can add the dependencies via the Jars path? This is something I am going to try out. Commented Nov 13, 2021 at 22:41

5 Answers

8

When a Glue job uses a connector, it has to download the connector in the form of a container image. The images for connectors are available in Amazon's public ECR repo. To pull the image from that repo, you have to add the "AmazonEC2ContainerRegistryFullAccess" policy to the job's IAM role. The access can also be limited to read-only.
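As a rough sketch of attaching the read-only variant of that managed policy programmatically: this assumes boto3 is installed and the caller is allowed to modify the role. The `iam_client` parameter is an illustrative hook for passing in a preconfigured client, not something the connector requires.

```python
# ARN of the AWS managed policy AmazonEC2ContainerRegistryReadOnly.
ECR_READ_ONLY_ARN = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"


def attach_ecr_read_only(role_name, iam_client=None):
    """Attach the read-only ECR managed policy to the given IAM role."""
    if iam_client is None:
        # Imported lazily so this module still loads where boto3 is absent.
        import boto3

        iam_client = boto3.client("iam")
    # attach_role_policy is the IAM API call for attaching a managed policy.
    iam_client.attach_role_policy(RoleName=role_name, PolicyArn=ECR_READ_ONLY_ARN)
    return ECR_READ_ONLY_ARN
```

Calling `attach_ecr_read_only("MyGlueJobRole")` would attach the policy to a role of that name; `MyGlueJobRole` is a hypothetical placeholder for the role your Glue job actually runs under.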

3

I ran into this in an organizational setting while trying to use the BigQuery Marketplace connector. I was explicitly denied GetAuthorizationToken in non-EU regions. Hence, the Glue job fails in a similar way to what the OP describes, because at run time it tries to download the Docker image from here: https://709825985650.dkr.ecr.us-east-1.amazonaws.com/amazon-web-services/glue/bigquery:0.22.0-glue3.0-2

A possible workaround is to push a copy of the image to your private ECR. Then, when creating the Glue connection, set CONNECTOR_URL in its connection_properties to your private ECR URL. This will solve similar issues. It also seems more reasonable than adding wide-reaching policies like AmazonEC2ContainerRegistryFullAccess (as suggested by Sparkian) to a role, because you can grant granular access permissions on this specific ECR repo instead.
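A minimal sketch of what that could look like, under stated assumptions: the copied image keeps the same repository name and tag in the private registry, and the account ID `123456789012` and region `eu-west-1` in the usage note are hypothetical placeholders. The helper names are made up for illustration. The policy lists the ECR actions typically involved in pulling an image; whether this set is sufficient for the connector download is an assumption to verify against your own job.

```python
import re

# <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>, with an optional scheme.
_ECR_IMAGE = re.compile(
    r"^(?:https://)?(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)"
    r"\.amazonaws\.com/(?P<repo>[^:]+):(?P<tag>.+)$"
)


def retarget_image(source_url, account_id, region):
    """Return the URL the same repo:tag would have in another account/region."""
    m = _ECR_IMAGE.match(source_url)
    if m is None:
        raise ValueError(f"not an ECR image URL: {source_url}")
    return f"https://{account_id}.dkr.ecr.{region}.amazonaws.com/{m['repo']}:{m['tag']}"


def scoped_pull_policy(image_url):
    """Build an IAM policy document limited to pulling from one ECR repository."""
    m = _ECR_IMAGE.match(image_url)
    if m is None:
        raise ValueError(f"not an ECR image URL: {image_url}")
    repo_arn = f"arn:aws:ecr:{m['region']}:{m['account']}:repository/{m['repo']}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # GetAuthorizationToken is not scoped to a repository, hence "*".
            {"Effect": "Allow", "Action": "ecr:GetAuthorizationToken", "Resource": "*"},
            {
                "Effect": "Allow",
                "Action": [
                    "ecr:BatchCheckLayerAvailability",
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:BatchGetImage",
                ],
                "Resource": repo_arn,
            },
        ],
    }
```

For example, `retarget_image("https://709825985650.dkr.ecr.us-east-1.amazonaws.com/amazon-web-services/glue/bigquery:0.22.0-glue3.0-2", "123456789012", "eu-west-1")` gives the value to put into CONNECTOR_URL, and `json.dumps(scoped_pull_policy(url), indent=4)` on that value prints a policy you can attach to the job role.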

Edit: If AmazonEC2ContainerRegistryFullAccess solves your issue, you can also use AmazonEC2ContainerRegistryReadOnly instead. It grants fewer permissions and achieves the same result. This is also described in the AWS Marketplace connector troubleshooting guide.

1

This is very likely a permissions issue. I ran into it and temporarily granted looser permissions, which seemed to resolve it.

2 Comments

This issue occurs only with Glue version 3.0; on Glue version 2.0 it works with the same role.
Maybe Glue 3.0 requires more permissions than Glue 2.0? Maybe that's what @StackPro_1111 meant?

1

I was also facing the same issue with Glue 3.0.

Error log was:

Glue ETL Marketplace - Requesting ECR authorization token for registryIds=709825985650 and region_name=us-east-1.
...
...
socket.timeout: timed out
...
...
(<botocore.awsrequest.AWSHTTPSConnection object at 0x7f1136778a90>, 'Connection to api.ecr.us-east-1.amazonaws.com timed out. (connect timeout=60)')

...

botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "https://api.ecr.us-east-1.amazonaws.com/"
Glue ETL Marketplace - failed to download connector, activation script exited with code 1
LAUNCH ERROR | Glue ETL Marketplace - failed to download connector.Please refer logs for details.

I was able to solve this by adding AmazonEC2ContainerRegistryFullAccess to the service role, along with a policy that allows generating a random password. My guess is that pulling the ECR image requires generating a temporary random password.

This is the additional IAM policy I added to the role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "secretsmanager:GetRandomPassword",
            "Resource": "*"
        }
    ]
}

And then it worked.

0

I have this issue with the GCP BigQuery connector. Some jobs are able to run the connector, some aren't. All have the same permissions and settings. The failure seems to occur right after the ECR authorization token is requested: the request times out.
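Since the symptom here is a timeout rather than an access error, one thing that may help narrow it down is checking whether the job's network can open a TCP connection to the ECR API endpoint at all. A minimal sketch, assuming it is run from inside the affected Glue job (running it elsewhere says nothing about the job's network path); `can_connect` is a hypothetical helper name:

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

For instance, logging `can_connect("api.ecr.us-east-1.amazonaws.com")` at the start of a failing job and a working job would show whether the two differ in network reachability.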
