Docker layer cache is invalidated when updating gems

I use Docker both in development and in production, and one thing that really scares me is how simplistic the Docker cache is. I have a Ruby application that needs bundle install to install its dependencies, so I start with the following Dockerfile:

 ADD Gemfile Gemfile
 ADD Gemfile.lock Gemfile.lock
 RUN bundle install --path /root/bundle

All dependencies are cached and everything works fine until I add a new gem. Even if the gem I added is only 0.5 MB, it still takes 10-15 minutes to install all of the application's gems from scratch, and another 10 minutes to deploy because of the size of the dependency folder (about 300 MB).

I ran into the same problem with node_modules and npm. Has anyone found a solution to this problem?

The results of my research:

  • Image source - caches arbitrary files in incremental layers. Unfortunately, because of how it works, it requires all 300 MB to be pushed to the registry even when the gems have not changed. Fast build → slow deploy, even if the gems are not updated.

  • Gemfile.tip - split the Gemfile into two files and add new gems to only one of them. A very bundler-specific solution, and I'm not sure it scales beyond adding 1-2 gems.

  • Harpoon - it would be nice if it didn't force you to ditch the Dockerfile and switch to their own format. That means extra pain for every new developer on the team, since this toolset has to be learned separately from Docker.

  • A temporary package cache. This is just an idea, and I'm not sure it is even possible: somehow mount the package manager's cache (not the dependency folder) before installing packages, and remove it afterwards. From my own hacking around, such a cache greatly speeds up package installation for both bundler and npm, without bloating the machine with unnecessary cache files.
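For what it's worth, the "temporary package cache" idea can be sketched with BuildKit cache mounts, a Docker feature that postdates this question. The paths and the copy step below are my own assumptions, not a tested setup:

```dockerfile
# syntax=docker/dockerfile:1
FROM ruby:2.6
WORKDIR /app
COPY Gemfile Gemfile.lock ./
# /gem-cache persists across builds on the build host but is never
# committed to an image layer, so the image stays small while repeated
# builds reuse previously installed gems.
RUN --mount=type=cache,target=/gem-cache \
    bundle install --path /gem-cache && \
    cp -a /gem-cache /root/bundle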

+6
3 answers

I cache the gems in a tar file in the application's tmp directory, then copy them into the layer with the ADD command before running bundle install. From my Dockerfile.yml:

 WORKDIR /home/app

 # Restore the gem cache. This only runs when
 # gemcache.tar.bz2 changes, so usually it takes
 # no time.
 ADD tmp/gemcache.tar.bz2 /var/lib/gems/

 COPY Gemfile /home/app/Gemfile
 COPY Gemfile.lock /home/app/Gemfile.lock

 RUN gem update --system && \
     gem update bundler && \
     bundle install --jobs 4 --retry 5

Make sure the gem cache is sent to your Docker build context. My gem cache is 118 MB, but since I build locally it is copied quickly. My .dockerignore:

 tmp
 !tmp/gemcache.tar.bz2

The gems are cached from a built image, but initially you may not have an image yet. Create an empty cache like this (I have this in a rake task):

 task :clear_cache do
   sh "tar -jcf tmp/gemcache.tar.bz2 -T /dev/null"
 end
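As a quick sanity check (my aside, not part of the original answer), tar's -T /dev/null trick really does produce a valid, empty bzip2 archive:

```shell
#!/bin/sh
# Create an "empty" gem cache the same way the rake task does,
# then list its contents to confirm the archive has zero entries.
workdir=$(mktemp -d)
tar -jcf "$workdir/gemcache.tar.bz2" -T /dev/null
tar -jtf "$workdir/gemcache.tar.bz2" | wc -l   # prints 0
rm -r "$workdir"
```

Because the archive is valid, the ADD step above still works on a fresh checkout; it simply extracts nothing.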

After building the image, copy the gems into the cache. My image is tagged app. I create a Docker container from the image, copy /var/lib/gems/2.2.0 into my gem cache with docker cp, and then delete the container. Here is my task:

 task :cache_gems do
   id = `docker create app`.strip
   begin
     sh "docker cp #{id}:/var/lib/gems/2.2.0/ - | bzip2 > tmp/gemcache.tar.bz2"
   ensure
     sh "docker rm -v #{id}"
   end
 end

The next time you build the image, the gem cache is copied into the layer before bundle install is called. That takes some time, but it is still faster than a bundle install from scratch.

Builds after that are even faster, because Docker has cached the ADD tmp/gemcache.tar.bz2 /var/lib/gems/ layer. If Gemfile.lock changes, only the changed gems are installed.

There is no reason to rebuild the gem cache every time Gemfile.lock changes. Once there are enough differences between the cache and Gemfile.lock that bundle install becomes slow, you can rebuild the cache. When I want to rebuild it, it's a simple rake cache_gems.
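If you want to put a number on "enough differences", a rough way is to diff the gem spec lines of the cached lockfile snapshot against the current one. This helper is purely illustrative; the stale_gem_count name and the four-space-indent heuristic for spec lines are my assumptions about Gemfile.lock's layout, not anything from the answer above:

```ruby
require 'set'

# Count gem spec lines that differ between a cached Gemfile.lock
# snapshot and the current one. In a lockfile's GEM section, each
# top-level spec line ("    rake (13.0.6)") is indented by exactly
# four spaces; transitive dependencies are indented further and
# therefore ignored here.
def stale_gem_count(cached_lock, current_lock)
  extract = lambda do |text|
    text.lines.map(&:rstrip)
        .select { |line| line =~ /\A {4}\S/ }
        .to_set
  end
  cached  = extract.call(cached_lock)
  current = extract.call(current_lock)
  # Symmetric difference: entries added, removed, or version-bumped.
  ((current - cached) | (cached - current)).size
end
```

When the count crosses whatever threshold makes bundle install painful for you, that is the moment to run rake cache_gems again.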

+2

I found two possible solutions that use an external data volume to store the gems: one and two.

In short

  • you define an image that is used only to store the gems
  • in your application's docker-compose.yml, you mount that image's BUNDLE_PATH via volumes_from
  • when the application container starts, it executes bundle check || bundle install and everything works.
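A minimal sketch of that layout might look like the following; the service and volume names are mine, and the two linked answers differ in details:

```yaml
# Hypothetical docker-compose.yml for the gems-on-a-volume approach.
# Compose file format version 2 is required for volumes_from.
version: '2'
services:
  gems:
    image: busybox
    volumes:
      - /bundle          # anonymous volume holding the installed gems
  app:
    build: .
    environment:
      BUNDLE_PATH: /bundle
    command: bash -c "bundle check || bundle install && bin/rails server"
    volumes_from:
      - gems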

This is one possible solution, but to me it seems to go slightly against the grain of Docker. In particular, bundle install sounds like it should be part of the build process, not part of the runtime. Other things that depend on bundle install, such as assets:precompile, now become runtime tasks as well.

This solution is viable, but I'm hoping for something more robust.

+3

The "copy local dependencies" approach (accepted answer) is a bad IMO idea. The whole point of dockertizing your environment is to have an isolated reproducible environment.

This is how we do it.

 # .docker/docker-compose.dev.yml
 version: '3.7'
 services:
   web:
     build: .
     command: 'bash -c "wait-for-it cache:1337 && bin/rails server"'
     depends_on:
       - cache
     volumes:
       - cache:/bundle
     environment:
       BUNDLE_PATH: '/bundle'
   cache:
     build:
       context: ../
       dockerfile: .docker/cache.Dockerfile
     volumes:
       - cache:/bundle
     environment:
       BUNDLE_PATH: '/bundle'
     ports:
       - "1337:1337"
 volumes:
   cache:
 # .docker/cache.Dockerfile
 FROM ruby:2.6.3
 RUN apt-get update -qq && apt-get install -y netcat-openbsd
 COPY Gemfile* ./
 COPY .docker/cache-entrypoint.sh ./
 RUN chmod +x cache-entrypoint.sh
 ENTRYPOINT ./cache-entrypoint.sh
 # .docker/cache-entrypoint.sh
 #!/bin/bash
 bundle check || bundle install
 nc -l -k -p 1337
 # web.dev.Dockerfile
 FROM ruby:2.6.3
 RUN apt-get update -qq && apt-get install -y nodejs wait-for-it
 WORKDIR ${GITHUB_WORKSPACE:-/app}
 # Note: bundle install step removed
 COPY . ./

This is similar to the concept explained by @EightyEight, but it does not run bundle install at startup of the main service; instead, the update is handled by a separate service. Either way, do not use this approach in production: starting services without their dependencies installed during the build phase will, at minimum, cause more downtime than necessary.

0
