How does Docker calculate the hash of each layer? Is it deterministic?

I tried to find this information in the official Docker docs, but was not successful.

Which parts of the Docker data are taken into account when calculating the hash of each commit / layer?

It seems fairly obvious that the line in the Dockerfile is part of the hash and that the parent hash is taken into account as well. But what else goes into calculating this hash?

Specific use case: suppose I have two developers, on different machines, at different points in time (and therefore with different Docker daemons and different caches), running $ docker build ... against the same Dockerfile. The FROM ... directive will give them the same starting point, but will each subsequent operation produce the same hash? Is it deterministic?

1 answer

Thanks @thaJeztah. The answer is at https://gist.github.com/aaronlehmann/b42a2eaf633fc949f93b#id-definitions-and-calculations

  • layer.DiffID : identifier for a single layer

    Calculation: DiffID = SHA256hex(uncompressed layer tar data)

  • layer.ChainID : identifier for the layer and its parents. This identifier uniquely identifies a file system consisting of a set of layers.

    Calculation:

    • For the bottom layer: ChainID(layer0) = DiffID(layer0)
    • For other layers: ChainID(layerN) = SHA256hex(ChainID(layerN-1) + " " + DiffID(layerN))
  • image.ID : identifier for the image. Because the image configuration references the layers the image uses, this identifier incorporates the file system data as well as the rest of the image configuration.

    Calculation: SHA256hex(imageConfigJSON)
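
The three calculations above can be sketched in a few lines of Python. This is only an illustration of the formulas as written in the gist, not Docker's actual code: the function names are made up for this example, the layer bytes and the config JSON are placeholders, and the sketch hashes bare hex strings. Whether the real identifiers carry an algorithm prefix such as "sha256:" when they are concatenated is not stated here, so the values printed below should not be expected to match what Docker reports.

```python
import hashlib
import json
from typing import List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def diff_id(uncompressed_tar: bytes) -> str:
    """DiffID = SHA256hex(uncompressed layer tar data)."""
    return sha256_hex(uncompressed_tar)


def chain_ids(diff_ids: List[str]) -> List[str]:
    """Compute the ChainID of every layer, bottom layer first.

    ChainID(layer0) = DiffID(layer0)
    ChainID(layerN) = SHA256hex(ChainID(layerN-1) + " " + DiffID(layerN))
    """
    chain: List[str] = []
    for index, current in enumerate(diff_ids):
        if index == 0:
            chain.append(current)
        else:
            joined = chain[-1] + " " + current
            chain.append(sha256_hex(joined.encode("ascii")))
    return chain


def image_id(image_config_json: bytes) -> str:
    """image.ID = SHA256hex(imageConfigJSON), hashed over the exact bytes."""
    return sha256_hex(image_config_json)


# Placeholder data: these byte strings stand in for real uncompressed layer tars.
layers = [b"fake tar bytes for layer 0", b"fake tar bytes for layer 1"]
diffs = [diff_id(tar) for tar in layers]
chains = chain_ids(diffs)

# Hypothetical, heavily simplified stand-in for an image configuration.
config = json.dumps({"diff_ids": diffs}, sort_keys=True).encode("utf-8")

print("DiffIDs: ", diffs)
print("ChainIDs:", chains)
print("image ID:", image_id(config))
```

Each function depends only on its input bytes, so identical inputs always give identical identifiers. Whether two independent builds actually produce byte-identical layer tars and configuration is a separate question that these formulas do not answer.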
