Purging images from the cache on Capistrano deployment
For some projects we use Imgix to optimise and cache images. That means we also need to purge them every time we deploy with Capistrano. Here’s how I automated this process with Capistrano and bash.
I created a bash script that will list all the images which were changed, and then purge them one by one. I then call this script from Capistrano during deployment.
Step 1: retrieving the commit hashes
Capistrano 3 keeps track of all deployments in a file called revisions.log in its target directory (defined by the :deploy_to variable). Its contents look like this:
…
Branch master (at b153b01) deployed as release 20180819205048 by stevenrombauts
Branch master (at bd32775) deployed as release 20190215074916 by stevenrombauts
We can use this file to find the git commit hashes for the current and previous deployments, like so:
SHA=($(tail -2 /var/www/capistrano/revisions.log | sed 's/)//' | awk ' { print $4 }'))
Here’s how this line works:
- We print the last two lines of revisions.log with tail.
- The output is piped to sed, which simply removes the ) character. Each commit hash in revisions.log is followed by a closing parenthesis, so this is a quick and easy way to get rid of it.
- Finally, we grab the fourth field ($4) of each line with awk.
- Because we enclosed the command in parentheses (SHA=(..)), we get back an array. The first element is the previous commit hash, the second is the one from this deployment.
Based on the example revisions.log above, the $SHA array will now look like this:
$ echo "${SHA[@]}"
b153b01 bd32775
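To try the pipeline without touching a real deployment, here is a minimal sketch that runs it against a temporary copy of the two example lines. The temporary file and the LOG variable are assumptions made for illustration; on a server you would point at the real revisions.log instead.

```shell
#!/bin/bash
# Hypothetical sample log, written to a temporary file for illustration only.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
Branch master (at b153b01) deployed as release 20180819205048 by stevenrombauts
Branch master (at bd32775) deployed as release 20190215074916 by stevenrombauts
EOF

# Same pipeline as above: last two lines, strip ")", keep the fourth field.
SHA=($(tail -2 "$LOG" | sed 's/)//' | awk '{ print $4 }'))

# Index 0 is the previous deployment, index 1 the current one.
echo "previous: ${SHA[0]}"
echo "current:  ${SHA[1]}"
echo "count:    ${#SHA[@]}"

rm -f "$LOG"
```

With the sample lines this prints b153b01 as the previous hash, bd32775 as the current one, and a count of 2. Note that a plain echo $SHA would only show the first element of the array.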
Step 2: list the changed files
Now that we know which commits to compare, we can grab a list of changed files from the Git repository with the git diff command:
FILES=($(git --git-dir="/var/www/capistrano/repo" diff --name-only ${SHA[0]} ${SHA[1]}))
Again, $FILES is an array and will look something like this if you print it out line by line:
$ printf '%s\n' "${FILES[@]}"
index.php
images/foobar.png
images/foobar.jpg
This list will contain all changed files, not just images, so we’ll have to pick out the images next.
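One thing to keep in mind: building the array with FILES=($(...)) splits on any whitespace, so a file name containing a space would end up as two elements. A possible alternative, sketched below, reads the list line by line instead. The list_changed_files function is a hypothetical stand-in for the git diff --name-only call, and its file names are made up.

```shell
#!/bin/bash
# Hypothetical stand-in for `git diff --name-only`; the names are made up.
list_changed_files() {
  printf '%s\n' "index.php" "images/foo bar.png" "images/foobar.jpg"
}

# Read one file name per line so names with spaces stay intact.
FILES=()
while IFS= read -r LINE; do
  FILES+=("$LINE")
done < <(list_changed_files)

echo "${#FILES[@]} files"
printf '[%s]\n' "${FILES[@]}"
```

When looping over such an array, quote the expansion ("${FILES[@]}") so each name is passed through as a single word.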
Step 3: purge the images
Now that we know which files have been changed, we can loop over them and send a purge request to Imgix for every image.
An easy way to check if a file is an image is by using the file command: if the output contains the string “image data” it’s an image. We can then send the purge request using cURL:
# Loop over the changed files:
for FILE in "${FILES[@]}"; do
  # Grab the output from the `file` command
  RESULT=$(file "/var/www/capistrano/current/${FILE}")
  # If the output contains the string "image data", send the purge request
  if echo "${RESULT}" | grep -q "image data"; then
    curl -s -o /dev/null "https://api.imgix.com/v2/image/purger" -u "YOUR_API_KEY:" -d "url=http://my-site.imgix.net/${FILE}"
  fi
done
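The “image data” check can also be wrapped in a small function, which makes it easy to try out on its own. This is only a sketch: is_image is a hypothetical helper, and the sample strings are hard-coded in the style of file output rather than produced by running file.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when the given `file` output describes an image.
is_image() {
  [[ "$1" == *"image data"* ]]
}

# Sample strings in the style of `file` output, hard-coded for illustration.
SAMPLES=(
  "images/foobar.png: PNG image data, 16 x 16, 8-bit/color RGBA, non-interlaced"
  "index.php: PHP script, ASCII text"
)

IMAGES=0
for SAMPLE in "${SAMPLES[@]}"; do
  if is_image "$SAMPLE"; then
    IMAGES=$((IMAGES + 1))
    echo "image:     ${SAMPLE%%:*}"
  else
    echo "not image: ${SAMPLE%%:*}"
  fi
done
```

In the loop above you would call it as is_image "$(file "$path")". Depending on the version of file, some formats (SVG, for instance) may be described without the words “image data”, so it is worth checking the output for the formats you actually use.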
Step 4: bringing it all together
Here’s the full script:
#!/bin/bash
# Capistrano target directory (:deploy_to) and Imgix settings, without trailing slashes
DIRECTORY="/var/www/capistrano"
IMGIX_API_KEY=abc123
IMGIX_URL=https://my-site.imgix.net

if [ ! -f "${DIRECTORY}/revisions.log" ]; then
  echo "revisions.log could not be found, exiting."
  exit 1
fi

# Previous and current commit hashes
SHA=($(tail -2 "${DIRECTORY}/revisions.log" | sed 's/)//' | awk ' { print $4 }'))
if [ "${#SHA[@]}" -ne 2 ]; then
  echo "Unable to read last two commit hashes, exiting."
  exit 1
fi

# Files changed between the two deployments
FILES=($(git --git-dir="${DIRECTORY}/repo" diff --name-only ${SHA[0]} ${SHA[1]}))

for FILE in "${FILES[@]}"; do
  # Inspect the file in the current release
  RESULT=$(file "${DIRECTORY}/current/${FILE}")
  if echo "${RESULT}" | grep -q "image data"; then
    URL="${IMGIX_URL}/${FILE}"
    echo "** purging $URL"
    curl -s -o /dev/null "https://api.imgix.com/v2/image/purger" -u "${IMGIX_API_KEY}:" -d "url=$URL" > /dev/null 2>&1
  fi
done
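When building the purge URL, it can help to normalise slashes so the result is the same whether or not the base URL ends in one. Here is a minimal sketch using bash parameter expansion; join_url is a hypothetical helper, not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper: join a base URL and a path with exactly one slash.
join_url() {
  local base="${1%/}"   # drop one trailing slash, if present
  local path="${2#/}"   # drop one leading slash, if present
  printf '%s/%s\n' "$base" "$path"
}

join_url "https://my-site.imgix.net/" "images/foobar.png"
join_url "https://my-site.imgix.net" "/images/foobar.png"
```

Both calls print https://my-site.imgix.net/images/foobar.png, so the URL sent in the purge request matches the one the image is served from.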
I’ve stored this file as purge-imgix.sh in a scripts directory in my repository.
Step 5: trigger the script from Capistrano
Now we just need to call the script on every deploy. I’ll create an imgix.rake file in my lib/capistrano/tasks directory, which will only execute the script in the production environment:
namespace :imgix do
  desc 'Clear the Imgix cache'
  task :purge do
    on roles(:app) do
      if fetch(:stage) == :production
        execute "/bin/bash #{ release_path }/scripts/purge-imgix.sh > /dev/null 2>&1 &", pty: false
      end
    end
  end
end
Note the pty: false argument: it is needed to send the task to the background. Since purging is a long-running task, we don’t want it to block Capistrano execution.
We can then trigger this task after every deployment by adding the following to the capistrano/deploy.rb file:
namespace :deploy do
  after :deploy, 'imgix:purge'
end
All done! Just deploy to production with cap production deploy and let the purge script handle everything for you.