Troubleshooting common Docker Pull errors on Linux Web App for Containers

11 minute read | By Anthony Salemo

This post covers troubleshooting common Docker Pull errors on Linux Web App for Containers. It won’t cover every error that can potentially manifest, but it is designed to point you in a general direction for troubleshooting and potential resolution.

Overview

Web App for Containers uses Docker. This is beneficial when looking at the errors the client throws back at us, since it means they are typically Docker specific rather than App Service specific.

The error itself may look something like this:

ERROR - DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get \"https://youracr.azurecr.io/v2/yourregistry/manifests/yourtag\": unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information."}

In general, the message is made up of the following:

ERROR - DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"A first part of the message: A second part of the message"}

The message property in the response object contains the general reason for the error.
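If you want to pull the status code and the message property out of a log line programmatically, a minimal sketch could look like the below. The regular expression is an assumption based only on the sample lines in this post, and parse_docker_error is a hypothetical helper name, not something provided by the platform.

```python
import json
import re

# Pattern inferred from the sample log lines in this post (an assumption,
# not an official log format specification).
_PATTERN = re.compile(r"status code=(?P<status>\w+), response=(?P<body>\{.*\})")


def parse_docker_error(line):
    """Return (status_code, message), or None if the line does not match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    try:
        body = json.loads(match.group("body"))
    except json.JSONDecodeError:
        # The response body was not valid JSON (for example, a truncated line).
        return match.group("status"), None
    return match.group("status"), body.get("message")


sample = (
    "ERROR - DockerApiException: Docker API responded with status "
    'code=InternalServerError, response={"message":"A first part of the '
    'message: A second part of the message"}'
)
status, message = parse_docker_error(sample)
print(status)
print(message)
```

The first part of the message is usually the request that failed and the second part is usually the reason, so splitting on the last ": " can be a reasonable next step, although that is not guaranteed for every message.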

FAQs

Where can I find these errors?

There are numerous places to view the stderr that contains these errors. They are also always written to the docker.log files that the platform generates.

In these cases, enabling application logs (App Service Logs) isn’t needed, since docker.log is generated by default. However, it’s always beneficial to keep these enabled.

Also note the naming scheme used for docker.log file names, which matters if you’re running on multiple instances. Each file is named as follows:

  • YYYY_MM_DD_machinename_docker.log
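When several instances are writing logs, it can help to split the file name back into its date and machine name. The below is a small sketch that assumes the naming scheme is exactly as shown above; parse_docker_log_name is a hypothetical helper and the machine name in the example is made up.

```python
import re
from datetime import date

# Assumes the YYYY_MM_DD_machinename_docker.log scheme described above.
_LOG_NAME = re.compile(r"^(\d{4})_(\d{2})_(\d{2})_(?P<machine>.+)_docker\.log$")


def parse_docker_log_name(filename):
    """Return (date, machine_name), or None if the name does not fit the scheme."""
    match = _LOG_NAME.match(filename)
    if match is None:
        return None
    year, month, day = (int(match.group(i)) for i in (1, 2, 3))
    try:
        return date(year, month, day), match.group("machine")
    except ValueError:
        # The digits matched the pattern but are not a real calendar date.
        return None


# "examplehost" is a placeholder machine name used only for illustration.
print(parse_docker_log_name("2024_01_15_examplehost_docker.log"))
```

Grouping files by the returned machine name makes it easier to see whether a pull failure is happening on one instance or on all of them.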

You can view this output in a few places:

  • Diagnose and Solve Problems -> Application Logs (detector)
  • Logstream
  • Directly via the Kudu site and browsing Log Files (/home/LogFiles)
  • Through an FTP client to view Log Files (/home/LogFiles)
  • etc.

In what scenarios would these occur?

This will always and only occur when docker pull is run, which could be:

  • During instance movement, such as scaling events, platform upgrades, or general platform events that require instances to be changed.
    • When new instances come into rotation, they don’t have the downloaded image layers on them yet - so a “fresh” pull with all layers will occur.
  • Application restarts. In some scenarios a restart can trigger these issues. Deployment events are included here, since a deployment also triggers a restart:
    • Layers are cached on the current instance(s) the application is running on. The platform does run docker pull after each restart, but layers will ONLY be pulled if the image/tag in question actually has changed layers. Otherwise nothing happens.
    • It is when a layer has been changed, updated, deleted, etc. on the repository side, and a pull is needed to account for it, that these Docker Pull errors can manifest.
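To picture why a restart on an existing instance often pulls nothing while a new instance pulls everything, the below is a purely conceptual sketch. It is not how Docker or the platform implements pulls; layers_to_pull is a hypothetical function and the digests are placeholders.

```python
def layers_to_pull(cached_digests, manifest_digests):
    """Return the layers in the image manifest that are not cached yet, in manifest order."""
    cached = set(cached_digests)
    return [digest for digest in manifest_digests if digest not in cached]


# Placeholder digests, for illustration only.
manifest = ["sha256:aaa", "sha256:bbb", "sha256:ccc"]

# Existing instance, image unchanged: nothing needs to be downloaded.
print(layers_to_pull(["sha256:aaa", "sha256:bbb", "sha256:ccc"], manifest))

# Existing instance, one layer changed on the repository side.
print(layers_to_pull(["sha256:aaa", "sha256:bbb"], manifest))

# New instance with an empty cache: a "fresh" pull of every layer.
print(layers_to_pull([], manifest))
```

In this simplified model, any run that returns a non-empty list has to reach the registry for layer data. The pull errors in this post can still surface earlier than that, for example on the manifest request itself.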

Errors

There are many errors that can occur, and many of these can be grouped into certain categories.
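As a rough sketch of that grouping, the below maps substrings taken from the sample messages in this post to the categories covered in the rest of this section. The category names, the rule order and classify_pull_error itself are assumptions made for illustration. This is not an official or exhaustive list.

```python
# Ordered rules: the first category with a matching substring wins.
# Substrings are copied from the sample messages in this post, not from
# any official list, and are compared case-insensitively.
RULES = [
    ("bad_syntax", ["invalid reference format"]),
    ("registry_firewall", ["is not allowed access"]),
    ("network", [
        "no such host",
        "i/o timeout",
        "temporary failure in name resolution",
        "client.timeout exceeded",
        "503 service unavailable",
    ]),
    ("missing_image_or_tag", ["manifest unknown"]),
    # "pull access denied" can also mean the repository does not exist.
    ("auth", ["unauthorized", "pull access denied", "access forbidden"]),
]


def classify_pull_error(message):
    """Return a coarse category name for a Docker pull error message."""
    lowered = message.lower()
    for category, needles in RULES:
        if any(needle in lowered for needle in needles):
            return category
    return "unknown"


print(classify_pull_error("unauthorized: authentication required"))
```

The firewall rule is checked before the auth rule on purpose, since the "not allowed access" message also contains the word "denied" and would otherwise be easy to misread as a credentials problem.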

One of the most common errors is the one already covered, which is:

ERROR - DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get \"https://youracr.azurecr.io/v2/yourregistry/manifests/yourtag\": unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information."}

This can occur in some of the following scenarios:

  • If you’re using Admin Credential (username/password) based authentication with your Azure Container Registry
    • If DOCKER_REGISTRY_SERVER_URL, DOCKER_REGISTRY_SERVER_USERNAME and/or DOCKER_REGISTRY_SERVER_PASSWORD are incorrect or missing, or if the credentials were rotated on the registry side, this will surface. If credentials are rotated, the consumer (App Service) is not aware that they were changed.
    • If Admin User is disabled but you’re expecting to use username/password credentials
  • If you’re using Service Principal Authentication and DOCKER_REGISTRY_SERVER_URL, DOCKER_REGISTRY_SERVER_USERNAME and/or DOCKER_REGISTRY_SERVER_PASSWORD are missing or incorrect.
  • If you’re using Managed Identity Authentication for the image pull and the identity does not have the AcrPull role assigned.
  • If the registry in question does not grant the client proper access for pull privileges
  • This can sometimes show alongside another error message that is actually the real issue.

The general premise is that this is due to authentication or authorization related issues. It’s important to review stderr (and docker.log) to check whether the real reason for the pull failure was logged before this message.

Below are some other errors that can surface and that can be resolved with the same troubleshooting approaches:

Docker API responded with status code=NotFound, response={"message":"pull access denied for repository/site, repository does not exist or may require 'docker login': denied: requested access to the resource is denied"}
DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get https://registry-1.docker.io/v2/library/someimage/manifests/12345: unauthorized: incorrect username or password"}
DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get \"https://someregistry.azurecr.io/v2/image/manifests/tag\": unauthorized: aad access token with sp failed client id must be guid"}
DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get https://someregistry.azurecr.io/v2/someimage/manifests/latest: unauthorized: Invalid clientid or client secret."}
Docker API responded with status code=InternalServerError, response={"message":"Get https://someregistry.com/v2/someimage/manifests/latest: denied: access forbidden"}
DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"unauthorized: The client does not have permission for manifest"}

Missing, incorrect tag/image name or invalid syntax

If the image and/or tag that you’re targeting does not exist in the registry you’re trying to pull from, this may manifest as the errors below.

If this is the case, ensure that:

  • The image exists and is correctly spelled
  • The tag exists and is correctly spelled
  • The syntax of the Docker Registry URL, image and tag is correct. Sometimes the error message calls out the “bad” character, other times it may not.

Sometimes it may be good to test if these images are able to be pulled to your local machine.

DockerApiException: Docker API responded with status code=NotFound, response={"message":"manifest for someregistry.azurecr.io/image:sometag not found: manifest unknown: manifest tagged by \"sometag\" is not found"}
DockerApiException: Docker API responded with status code=NotFound, response={"message":"pull access denied for someregistry, repository does not exist or may require 'docker login': denied: requested access to the resource is denied"}
DockerApiException: Docker API responded with status code=NotFound, response={"message":"manifest for some.registry.com/someimage:sometag not found: manifest unknown: manifest unknown"}

This is an example of potential bad syntax:

Ex : DockerApiException: Docker API responded with status code=BadRequest, response={"message":"invalid reference format"}

Network blocks or access issues

In a networked environment (such as an ASE, an App Service that is locked down, or one using custom DNS and routing logic, amongst others), pulls may fail if traffic is not properly resolving or being routed to the registry in question, or if it is being blocked by the registry itself.

In these cases, it is good to review the following:

  • Whether you can resolve the hostname of the container registry you’re trying to pull from. This validates that DNS lookups work. If you need to use a jumpbox to validate this, make sure resolution is tested from there.
  • The kind of access configured on the registry side. Is it blocking access when docker pull is executed? docker.log on the App Service side is good to review in these situations.
  • If you’re pulling images through a VNET - and assuming that all traffic will be routed through the VNET to the registry - ensure that vnetImagePullEnabled is being used as called out here.
  • In some cases, it is a good idea to validate the image pull is successful through public network access - which can scope this down to the networked environment.
  • Take into consideration any routing (UDRs), VNETs (peered or not), and restrictions like “select networks”/Firewalls and Private Endpoints on the Azure Container Registry side.
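The DNS and connectivity checks above can be sketched with nothing but the Python standard library, assuming Python is available on the machine you’re testing from (for example, a jumpbox in the same network). resolve_host and can_connect are hypothetical helper names, and port 443 is assumed since the registry URLs in this post are all HTTPS.

```python
import socket


def resolve_host(hostname, port=443):
    """Return the sorted unique IP addresses the hostname resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Name resolution failed (comparable to "no such host" in docker.log).
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(hostname, port=443, timeout=5.0):
    """Return True if a TCP connection to hostname:port succeeds within the timeout."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

Calling resolve_host("someregistry.azurecr.io") with your own registry hostname shows which IPs are returned, which also tells you whether a private or public address is being resolved. If resolution works but can_connect returns False, the problem is more likely routing or a firewall than DNS. A successful TCP connection only proves reachability. It does not prove that the registry will authorize the pull.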

The error produced can vary, but it normally reports a timeout, a failed hostname resolution (usually DNS related), or explicitly mentions that a certain client IP is blocked.

The 5 errors below can occur if:

  • DNS resolution is failing due to environment misconfiguration
  • A firewall or other device is blocking the request

DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get \"https://someregistry.azurecr.io/v2/\": dial tcp: lookup someregistry.azurecr.io: no such host"}
DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get \"https://someregistry.azurecr.io/v2/\": dial tcp: lookup 
DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get https://someregistry.com/v2/someimage/manifests/sometag: dial tcp 00.00.000.000:443: i/o timeout"}
DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get \"https://someregistry.com/v2/\": dial tcp: lookup someregistry.com: Temporary failure in name resolution"}
DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"received unexpected HTTP status: 503 Service Unavailable"}

Although the error below can happen if there is a network issue or network misconfiguration, it can also happen if the registry being pulled from is not properly set up, for example when self-hosting a Nexus registry behind a proxy.

DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get https://someregistry.com/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"}

The error below can happen for a few reasons:

  • If Azure Container Registry is set to only allow certain IPs but the pull is done over one that is not on the allow list
  • If the App Service is VNET integrated (and the ACR has a Private Endpoint) but the App Service is not explicitly set to pull images through the VNET. In this case, the pull may happen over a public IP.
  • A misconfigured VNET setup (including on the ACR side)

DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get https://someregistry.azurecr.io/v2/someimage/manifests/latest: denied: client with IP '00.00.000.000' is not allowed access. Refer https://aka.ms/acr/firewall to grant access."}

Note about transient events

If the App Service is in a networked environment (or even not, in some specific cases) that was proven to work for some time, and a message like the above points to a networking issue, scale up and/or down to see if this mitigates the problem.

This initiates a new docker pull on different instances.

IMPORTANT: If your environment was never working because of misconfiguration, do not do this, as it will not resolve your issue.

Misc. errors

In one particular case, reported in this blog - Docker User Namespace remapping issues - you may see either of the below in your docker.log on a failed pull:

  • failed to register layer: Error processing tar file(exit status 1): Container ID 1000000 cannot be mapped to a host IDErr: 0, Message: failed to register layer: Error processing tar file(exit status 1): Container ID 1000000 cannot be mapped to a host ID
  • OCI runtime create failed: container_linux.go:380: starting container process caused: setup user: cannot set uid to unmapped user in user namespace: unknown

Followed by:

ERROR - DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get \"https://youracr.azurecr.io/v2/yourregistry/manifests/yourtag\": unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information."}

The real reason for the failure is the first message about UIDs, which can be investigated in the linked blog post. The authentication error is a generic error thrown afterwards and can be ruled out as the root cause.
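Since the generic authentication error can hide the real failure, it can be handy to scan docker.log for earlier lines that hint at the root cause. The below is a minimal sketch that only knows about the two messages shown in this section; find_masked_causes and the hint list are assumptions, and other root causes would need their own entries.

```python
GENERIC_AUTH = "unauthorized: authentication required"

# Substrings from the two sample messages in this section (not exhaustive).
ROOT_CAUSE_HINTS = ("failed to register layer", "OCI runtime create failed")


def find_masked_causes(lines):
    """Return hint lines that were logged before the first generic auth error that follows them."""
    candidates = []
    for line in lines:
        if GENERIC_AUTH in line:
            if candidates:
                # An auth error preceded by hint lines: report those hints.
                return candidates
        elif any(hint in line for hint in ROOT_CAUSE_HINTS):
            candidates.append(line.strip())
    # Either no auth error followed the hints, or there were no hints at all.
    return []


log_lines = [
    "failed to register layer: Error processing tar file(exit status 1): "
    "Container ID 1000000 cannot be mapped to a host ID",
    "ERROR - DockerApiException: Docker API responded with status "
    "code=InternalServerError, response={\"message\":\"unauthorized: "
    "authentication required\"}",
]
print(find_masked_causes(log_lines))
```

An empty result does not prove the authentication error is genuine. It only means none of the known hints were found ahead of it, so the credential checks earlier in this post still apply.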