Professional deployment of websites using Capistrano - Part 1
Please note that since Capistrano 3's release in October 2013 this Capistrano 2 based tutorial series has been superseded by an updated Capistrano 3 tutorial series.
John Lennon once said that 'Life is what happens to you when you're busy making other plans' and this week's blog post is good evidence of that.
I had originally planned this blog post to be about Redmine, a project management web application, and the benefits it can bring to a business. However, recently I have been spending a lot of time working with the deployment tool Capistrano and whilst everything is fresh in my mind I wanted to get it all down for posterity and the benefit of others. In fact, I have so much to write about Capistrano that it makes sense to start a series of posts rather than creating one overwhelmingly long post. The topics of this series will be as follows:
- Part 1: What is Capistrano and why is it so good?
- Part 2: Secure SSH key based Capistrano website deployment from Subversion for multi-developer teams
- Part 3: Using Capistrano for deploying PHP and other non-Rails based websites
- Part 4: Combining Capistrano and Drush for deploying Drupal powered websites
This will be an unashamedly 'techie' series of posts so if you're not a developer consider yourself forewarned if you read on!
What is Capistrano and why is it so good?
A bit of background
I have a lot of time for the wisdom of Joel Spolsky and in particular 'The Joel Test'. This is excellent reading for any software developer and it will definitely resonate with you. Item number 2 on his list is 'Can you make a build in one step?'. Within the context of website development this means taking a cut of the codebase under version control and publishing it as a production ready release in one single step. The emphasis here is on 'one single step' as this ensures potential errors and stress are minimised.
I'm sure every web developer at some point in their career has released a site by manually copying files to a remote server followed by applying in place 'fixes' so that the site works in 'live'. This is typically accompanied by a client or senior manager phoning you every few minutes to ask if the site is live yet and why they are seeing a 'white screen' when they refresh the site in their browser. If you've ever done this then you'll hopefully also recall how much of a PITA it was and agree that deployment is an important aspect of the development process that needs to be thought about up front.
Releasing a website to live isn't the only time deployment comes to the fore within the website development process. It's fairly standard practice for code changes to be published on a 'staging' site for thorough testing prior to release to live. Whilst you may release to live only sporadically, the code may be released to the staging site dozens of times a day depending on the size of the development team. If there's even a hint of manual labour needed to deploy to staging then it will cost you a lot of time over a website's lifetime. It will also get very boring very quickly!
So what is Capistrano?
'Capistrano is a utility and framework for executing commands in parallel on multiple remote machines, via SSH.' (https://github.com/capistrano/capistrano)
This doesn't convey much by way of practical meaning but the most common real world usage of Capistrano is as a deploy script framework for websites. Capistrano is written in Ruby but you don't have to know Ruby to work with it as it uses its own domain specific language and is very easy to learn by example.
Capistrano's features include:
Out of the box Capistrano is atomic.
This means that if Capistrano encounters an error whilst working through any of its deployment tasks then it will automatically roll back to the last successful release. When you write your custom deployment tasks you can add your own 'roll back' functionality to each task to ensure failed deploys don't leave your website in a 'dirty' state.
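To make the roll back idea concrete, here is a minimal sketch in plain Python rather than Capistrano's own DSL. It assumes each step registers an 'undo' handler before it runs and that a failure triggers the registered handlers in reverse order. The names run_transaction and fail_migration are hypothetical and purely illustrative.

```python
from typing import Callable, List, Tuple

# Each step is (name, action, undo). All names here are illustrative.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_transaction(steps: List[Step]) -> List[str]:
    """Run each step in order; on failure, undo registered steps in reverse."""
    log: List[str] = []
    undo_stack: List[Tuple[str, Callable[[], None]]] = []
    try:
        for name, action, undo in steps:
            # Register the undo handler first, so a step that fails part-way
            # still gets the chance to clean up after itself.
            undo_stack.append((name, undo))
            action()
            log.append(f"done: {name}")
    except Exception as error:
        log.append(f"failed: {error}")
        for name, undo in reversed(undo_stack):
            undo()
            log.append(f"rolled back: {name}")
    return log


def fail_migration() -> None:
    """Stand-in for a deployment step that goes wrong."""
    raise RuntimeError("migration failed")


if __name__ == "__main__":
    steps = [
        ("update_code", lambda: None, lambda: None),
        ("migrate", fail_migration, lambda: None),
        ("symlink", lambda: None, lambda: None),
    ]
    for line in run_transaction(steps):
        print(line)
```

Note that the 'symlink' step never runs in this example, so the live site is never pointed at the broken release.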
Easy tracking of releases.
As you will learn, Capistrano functions by using symlinks. By default each candidate release directory is named using a UTC timestamp taken at the point of release (in the form YYYYMMDDHHMMSS). You can customise this further by including useful information such as the version control revision of the release.
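The directory layout can be sketched in a few lines of Python. This is only an illustration of the idea, assuming timestamp-style release names in the YYYYMMDDHHMMSS form, a 'releases' directory and a 'current' symlink. The functions release_name and publish are hypothetical, and swapping the link via a rename is this sketch's own choice rather than a description of the commands Capistrano runs.

```python
import os
import tempfile
from datetime import datetime, timezone


def release_name(now: datetime) -> str:
    """Timestamp-style release name, e.g. 20130102030405 (UTC)."""
    return now.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")


def publish(deploy_to: str, name: str) -> str:
    """Create releases/<name> and repoint the 'current' symlink at it."""
    release_path = os.path.join(deploy_to, "releases", name)
    os.makedirs(release_path)
    current = os.path.join(deploy_to, "current")
    temp_link = current + ".tmp"
    os.symlink(release_path, temp_link)
    # Renaming the new link over the old one swaps the live release in one step.
    os.replace(temp_link, current)
    return release_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        path = publish(root, release_name(datetime.now(timezone.utc)))
        print(os.path.basename(path), "is now live")
```

The web server's document root would point at 'current', so it never needs to know which release directory is actually being served.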
Capistrano gives you security and control.
Capistrano tasks can be run as a dedicated 'special' Linux user e.g. 'deploy'. You can give this user permission to access relevant servers via its SSH key. If you set file permissions correctly then developers can run your Capistrano deploy script but won't be able to edit its contents or access remote servers unless you want them to.
Parallel execution of tasks on multiple remote servers.
Capistrano allows you to execute functionality on many different remote servers at once. For example you can run tasks on your database server, your application server or your web server. You can even have multiple entries for each server type if you work with clustered architectures.
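As a rough illustration of the idea (not of Capistrano's internals), the Python sketch below runs one command against every host in a set of roles at the same time. The ROLES mapping, the host names and the fake_remote_run function are all made up for the example. A real tool would open an SSH connection where the stand-in simply returns a string.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical role-to-host mapping; the host names are placeholders.
ROLES: Dict[str, List[str]] = {
    "web": ["web1.example.com", "web2.example.com"],
    "app": ["app1.example.com"],
    "db": ["db1.example.com"],
}


def fake_remote_run(host: str, command: str) -> str:
    """Stand-in for an SSH call; a real tool would connect to the host here."""
    return f"[{host}] ran: {command}"


def run_on_roles(
    command: str,
    roles: List[str],
    runner: Callable[[str, str], str] = fake_remote_run,
) -> List[str]:
    """Run one command on every host in the given roles, in parallel."""
    hosts = [host for role in roles for host in ROLES[role]]
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        # map() returns results in host order even though the calls overlap.
        return list(pool.map(lambda host: runner(host, command), hosts))


if __name__ == "__main__":
    for line in run_on_roles("uptime", ["web", "db"]):
        print(line)
```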
Multi-stage environment support.
More recent versions of Capistrano natively support configuring tasks or variables differently for specific target environments e.g. staging and live.
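Conceptually, multi-stage support boils down to shared defaults plus per-stage overrides. The Python sketch below shows that idea only. The setting names, paths and values are invented for illustration and are not taken from a real Capistrano configuration.

```python
from typing import Dict

# Settings shared by every stage; all names and values are illustrative.
DEFAULTS: Dict[str, object] = {
    "application": "example_site",
    "keep_releases": 5,
}

# Per-stage overrides layered on top of the defaults.
STAGES: Dict[str, Dict[str, object]] = {
    "staging": {"deploy_to": "/var/www/staging"},
    "live": {"deploy_to": "/var/www/live", "keep_releases": 10},
}


def settings_for(stage: str) -> Dict[str, object]:
    """Merge the defaults with the overrides for one stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    return {**DEFAULTS, **STAGES[stage]}


if __name__ == "__main__":
    for stage in ("staging", "live"):
        print(stage, settings_for(stage))
```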
Capistrano cleans up after itself.
Old releases can take up a lot of disk space needlessly. If you've got your deploy process fully automated then everything will be in version control anyway, so why keep old releases on disk? Capistrano allows you to set the number of old releases to keep. Once the number of releases exceeds this config setting then rm -rf takes effect.
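The pruning logic is simple enough to sketch in Python. This assumes release directories whose names sort oldest-first (as timestamp-style names do) and a hypothetical cleanup function. It illustrates the behaviour described above rather than Capistrano's actual implementation.

```python
import os
import shutil
import tempfile
from typing import List


def cleanup(releases_dir: str, keep: int) -> List[str]:
    """Delete all but the newest `keep` releases; return the names removed."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    ordered = sorted(os.listdir(releases_dir))  # oldest first
    doomed = ordered[:-keep]
    for name in doomed:
        # The Python equivalent of rm -rf on one release directory.
        shutil.rmtree(os.path.join(releases_dir, name))
    return doomed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as releases_dir:
        for day in range(1, 5):
            os.mkdir(os.path.join(releases_dir, f"2013010{day}000000"))
        print("removed:", cleanup(releases_dir, keep=2))
        print("kept:", sorted(os.listdir(releases_dir)))
```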
Rolling back is quick and easy.
As Capistrano controls which release candidate is live via symlinks it's a very quick task to revert back to a previous release.
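Here is a hedged Python sketch of why this is so quick: rolling back only means repointing one symlink. It assumes a 'releases' directory of sortable release names and a 'current' symlink, and the rollback function is hypothetical. This sketch leaves the abandoned release on disk, which may differ from what a real deploy tool does.

```python
import os
import tempfile


def rollback(deploy_to: str) -> str:
    """Point 'current' at the release before the one that is live now."""
    releases_dir = os.path.join(deploy_to, "releases")
    current = os.path.join(deploy_to, "current")
    releases = sorted(os.listdir(releases_dir))  # oldest first
    live = os.path.basename(os.path.realpath(current))
    index = releases.index(live)
    if index == 0:
        raise RuntimeError("no earlier release to roll back to")
    previous = os.path.join(releases_dir, releases[index - 1])
    temp_link = current + ".tmp"
    os.symlink(previous, temp_link)
    # Renaming the new link over the old one swaps releases in one step.
    os.replace(temp_link, current)
    return previous


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name in ("20130101000000", "20130102000000"):
            os.makedirs(os.path.join(root, "releases", name))
        os.symlink(
            os.path.join(root, "releases", "20130102000000"),
            os.path.join(root, "current"),
        )
        print("now live:", os.path.basename(rollback(root)))
```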
Why use Capistrano instead of custom deploy scripts?
You could of course roll your own deploy script rather than use Capistrano as a framework. We used to use custom shell scripts at World First before Capistrano came along. These shell scripts were quite complex but nowhere near as sophisticated or powerful as what the Capistrano framework offers. This question is really very similar to the 'why use a framework' question. Some immediate answers that spring to mind include:
- You can focus on your 'real' work rather than investing lots of time and effort into tooling.
- Standardisation of approach to deployment offers instant familiarity to new developers.
- Capistrano is reasonably well documented: https://github.com/capistrano/capistrano/wiki
- There is a community to rely on for support: http://groups.google.com/group/capistrano
- Capistrano is actively maintained so it will continue to improve and support your applications as time passes: https://github.com/capistrano/capistrano/graphs
The ultimate decision on whether to use Capistrano or not will depend on your immediate circumstances. Hopefully your interest is piqued enough to read the next post in this series though!