jtimberman's Code Blog

Topics include: Opscode Chef, configuration management, Ruby, Linux/Unix administration. Opinions expressed here are my own and do not reflect those of my employer (Opscode, Inc.).

Starting ChefSpec Example

This is a quick post to introduce the testing I’m starting to do with ChefSpec. The example is from Opscode’s Java cookbook. While the recipe tested is really trivial, it actually has some nuances that require detailed testing.

First off, the whole thing is in this gist. I’m going to break it down into sections below. The file is spec/default_spec.rb in the java cookbook (not committed/pushed yet).

The chefspec gem is where all the magic comes from. You can read about ChefSpec on its home page. You’ll need to install the gem, and from there, run rspec to run the tests.

require 'chefspec'

Next, we’re going to describe the default recipe. We’re using the regular rspec “let” block to set up the runner to converge the recipe. Then, because we know/assume that the openjdk recipe is the default, we can say that this chef run should include the java::openjdk recipe.

describe 'java::default' do
  let(:chef_run) { ChefSpec::ChefRunner.new.converge('java::default') }
  it 'should include the openjdk recipe by default' do
    chef_run.should include_recipe 'java::openjdk'
  end

Next, this cookbook supports Windows. However, we have to set up the runner with the correct platform and version (this comes from fauxhai), and then set attributes that are required for it to work.

  context 'windows' do
    let(:chef_run) do
      runner = ChefSpec::ChefRunner.new(
        'platform' => 'windows',
        'version' => '2008R2'
        )
      runner.node.set['java']['install_flavor'] = 'windows'
      runner.node.set['java']['windows']['url'] = 'http://example.com/windows-java.msi'
      runner.converge('java::default')
    end
    it 'should include the windows recipe' do
      chef_run.should include_recipe 'java::windows'
    end
  end

Next are the contexts for other install flavors. The default recipe will include the right recipe based on the flavor, which is set by an attribute. So we set up an rspec context for each recipe, then set the install flavor attribute, and test that the right recipe was included.

  context 'oracle' do
    let(:chef_run) do
      runner = ChefSpec::ChefRunner.new
      runner.node.set['java']['install_flavor'] = 'oracle'
      runner.converge('java::default')
    end
    it 'should include the oracle recipe' do
      chef_run.should include_recipe 'java::oracle'
    end
  end
  context 'oracle_i386' do
    let(:chef_run) do
      runner = ChefSpec::ChefRunner.new
      runner.node.set['java']['install_flavor'] = 'oracle_i386'
      runner.converge('java::default')
    end
    it 'should include the oracle_i386 recipe' do
      chef_run.should include_recipe 'java::oracle_i386'
    end
  end
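
The behavior these contexts exercise can be sketched in plain Ruby, outside of Chef. This is a simplification for illustration, not the cookbook's actual code: it assumes the default recipe derives the recipe name directly from the install flavor attribute, and `recipe_for_flavor` is a hypothetical helper.

```ruby
# Hypothetical helper, not part of the java cookbook: map an install
# flavor to the recipe the default recipe is expected to include.
# The default of 'openjdk' mirrors the assumption made in the first spec.
def recipe_for_flavor(install_flavor = 'openjdk')
  "java::#{install_flavor}"
end

# Each flavor tested in this post, paired with the recipe we expect.
%w[openjdk oracle oracle_i386 windows ibm].each do |flavor|
  puts "#{flavor} => #{recipe_for_flavor(flavor)}"
end
```

Each rspec context above is effectively one row of that mapping, plus whatever extra attributes the flavor needs to converge.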

Finally, a recent addition to this cookbook is support for IBM’s Java. In addition to setting the install flavor, we must set the URL where the IBM Java package is (see the README in the commit linked in that ticket for detail), and we can see that the ibm recipe is in fact included.

  context 'ibm' do
    let(:chef_run) do
      runner = ChefSpec::ChefRunner.new
      runner.node.set['java']['install_flavor'] = 'ibm'
      runner.node.set['java']['ibm']['url'] = 'http://example.com/ibm-java.bin'
      runner.converge('java::default')
    end
    it 'should include the ibm recipe' do
      chef_run.should include_recipe 'java::ibm'
    end
  end
end

This is just the start of the testing for this cookbook. We’ll need to test each individual recipe. However as I’ve not written that code yet, I don’t have examples. Stay tuned!

Test Kitchen and Jenkins

I’ve been working more with test-kitchen 1.0 alpha lately. The most recent thing I’ve done is set up a Jenkins build server to run test-kitchen on cookbooks. This post will describe how I did this for my own environment, and how you can use my new test-kitchen cookbook in yours… if you’re using Jenkins, anyway.

This is all powered by a relatively simple cookbook, and some click-click-clicking in the Jenkins UI. I’ll walk through what I did to set up my Jenkins system.

First, I started with Debian 7.0 (stable, released this past weekend). I installed the OS on it, and then bootstrapped with Chef. The initial test was to make sure everything installed correctly, and the commands were functioning. This was done in a VM, and is now handled by test-kitchen itself (how meta!) in the cookbook, kitchen-jenkins.

The cookbook, kitchen-jenkins, is available on the Chef Community site. I started with a recipe, but extracted it to a cookbook to make it easier to share with you all. This is essentially a site cookbook that I use to customize my Jenkins installation so I can run test-kitchen builds.

I apply the recipe with a role, because I love the roles primitive in Chef :-). Here is the role I’m using:

{
  "name": "jenkins",
  "description": "Jenkins Build Server",
  "run_list": [
    "recipe[kitchen-jenkins]"
  ],
  "default_attributes": {
    "jenkins": {
      "server": {
        "home": "/var/lib/jenkins",
        "plugins": ["git-client", "git"],
        "version": "1.511",
        "war_checksum": "7e676062231f6b80b60e53dc982eb89c36759bdd2da7f82ad8b35a002a36da9a"
      }
    }
  },
  "json_class": "Chef::Role",
  "chef_type": "role"
}

The run list here is only slightly different from my actual role, which has a few other site-specific recipes in it. Don’t worry about those now. The jenkins attributes are set to ensure the plugins I need are available, and the right version of Jenkins is installed.

(I’m going to leave out the details such as uploading cookbooks and roles. If you’re interested in test-kitchen, I’ll assume you’ve got that covered :-).)

Once Chef completes on the Jenkins node, I can reach the Jenkins UI, conveniently enough, via “http://jenkins:8080” (because I’ve made a DNS entry, of course). The next release of the Jenkins cookbook will have a resource for managing jobs, but for now I’m just going to create them in the webui.

For this example, I want to have two kinds of cookbook testing jobs. The first is to simply run foodcritic and fail on any correctness matches. The second is to actually run test-kitchen.

A foodcritic job is simple:

  1. New job -> Build a free-style software project “foodcritic-COOKBOOK”.
  2. Source Code Management -> Git, supply the repository and the master branch.
  3. Set a build trigger to Poll SCM every 5 minutes, once an hour, or whenever you like.
  4. Add a build step to execute a shell, “foodcritic . -f correctness”

I created a view for foodcritic jobs, and added them all to the view for easy organizing.

Next, I create a test-kitchen job:

  1. New job -> Copy existing job “foodcritic-COOKBOOK”, name the new job “test-COOKBOOK”.
  2. Uncheck Poll SCM, check “Build after other projects are built” and enter “foodcritic-COOKBOOK”.
  3. Replace the foodcritic command in the build shell command with “kitchen test”.

Now, the test kitchen test will only run if the foodcritic build succeeds. If the cookbook has any correctness lint errors, then the foodcritic build fails, and the kitchen build won’t run. This will help conserve resources.
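
The same gating can be sketched as a small Ruby script for anyone who isn't using Jenkins. This is a minimal sketch, not something from the kitchen-jenkins cookbook; `run_gated` is a hypothetical helper, and the commands in the demo are stand-ins so the script runs on any Unix-like system.

```ruby
# Hypothetical helper: run each command in order and stop at the first
# one that fails. Returns the list of commands that were actually run.
def run_gated(commands)
  ran = []
  commands.each do |cmd|
    ran << cmd
    # Kernel#system returns true only when the command exits with status 0.
    break unless system(cmd)
  end
  ran
end

# Stand-in commands; in the real setup these would be
# 'foodcritic . -f correctness' followed by 'kitchen test'.
ran = run_gated(['false', 'echo kitchen test would run here'])
puts ran.inspect
```

Because the first stand-in command fails, the second one never runs, which is the same resource-saving behavior as the chained Jenkins jobs.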

Hopefully the kitchen-jenkins cookbook is helpful and this blog post will give you some ideas how to go about adding cookbook tests to your CI system, even if it’s not Jenkins.

TDD Cookbook Ticket

This post will briefly describe how I did a TDD update to Opscode’s runit to resolve an issue reported last night.

First, the issue manifests itself only on Debian systems. The runit cookbook’s runit_service provider will write an LSB init.d script on Debian, rather than symlinking to /usr/bin/sv. The problem raised in the new ticket is that the template will follow the link and write to /usr/bin/sv. This is bad, as it will end up in a forkbomb as runsvdir attempts to restart sv on all the things. Oops! Sorry about that. Let’s get it fixed, and practice some TDD.

The runit cookbook includes support for test-kitchen, though I did need to update it for this effort. Part of this change was adding a box for Debian in the .kitchen.yml. I set about resolving this with TDD in mind.

First, the runit cookbook includes a couple “test” cookbooks to facilitate setting up the system with the runit_service resource so the outcome can be tested to ensure the behavior is correct. I started by adding a “failing test” in the runit_test::service recipe, meaning a link resource, and a runit_service resource that would overwrite /usr/bin/sv.

link "/etc/init.d/cook-2867" do
  to "/usr/bin/sv"
end

runit_service "cook-2867" do
  default_logger true
end

Then I ran kitchen test on the Debian box. As expected, the link was created, and then the runit service was configured. The service’s provider will wait until the service is up. Since we’ve destroyed the sv binary, that will never happen, so I destroyed the instance. I manually confirmed the behavior too, to make sure I wasn’t seeing something weird. Due to its very nature, this is really hard to test for automatically, but it will happen consistently.

Next, I had to write the code to implement the fix for this bug. Essentially, this means checking if the /etc/init.d/cook-2867 file is a symbolic link, and removing it.

initfile = ::File.join( '/etc', 'init.d', new_resource.service_name)
::File.unlink(initfile) if ::File.symlink?(initfile)

Simple enough. Next I tested again by destroying the existing environment and rerunning it from scratch. This takes some time, but it verifies that everything is working properly. Here’s the output on Debian:

INFO: Processing link[/etc/init.d/cook-2867] action create (runit_test::service line 147)
INFO: link[/etc/init.d/cook-2867] created
INFO: Processing service[cook-2867] action nothing (dynamically defined)
INFO: Processing runit_service[cook-2867] action enable (runit_test::service line 151)
INFO: Processing directory[/etc/sv/cook-2867] action create (dynamically defined)
INFO: Processing template[/etc/sv/cook-2867/run] action create (dynamically defined)
INFO: Processing directory[/etc/sv/cook-2867/log] action create (dynamically defined)
INFO: Processing directory[/etc/sv/cook-2867/log/main] action create (dynamically defined)
INFO: Processing directory[/var/log/cook-2867] action create (dynamically defined)
INFO: Processing file[/etc/sv/cook-2867/log/run] action create (dynamically defined)
INFO: Processing template[/etc/init.d/cook-2867] action create (dynamically defined)
INFO: template[/etc/init.d/cook-2867] updated content
INFO: template[/etc/init.d/cook-2867] owner changed to 0
INFO: template[/etc/init.d/cook-2867] group changed to 0
INFO: template[/etc/init.d/cook-2867] mode changed to 755
INFO: runit_service[cook-2867] configured
INFO: Chef Run complete in 7.267132764 seconds
INFO: Running report handlers

I didn’t feel I needed a specific test for this in minitest-chef, because it wouldn’t have finished converging (earlier behavior I saw in the “failing” test).
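
To see why the symlink check matters, here is a minimal sketch in plain Ruby that reproduces the situation in a temporary directory instead of /usr/bin and /etc/init.d. The `write_init_script` helper and the file names are hypothetical stand-ins, not the provider's actual code.

```ruby
require 'tmpdir'

# Hypothetical helper: write content to path, first removing the path if
# it is a symlink so that the link's target is left untouched.
def write_init_script(path, content)
  ::File.unlink(path) if ::File.symlink?(path)
  ::File.write(path, content)
end

Dir.mktmpdir do |dir|
  target   = ::File.join(dir, 'sv')         # stands in for /usr/bin/sv
  initfile = ::File.join(dir, 'cook-2867')  # stands in for /etc/init.d/cook-2867
  ::File.write(target, 'sv binary')
  ::File.symlink(target, initfile)

  write_init_script(initfile, "#!/bin/sh\n")

  puts ::File.read(target)        # the stand-in binary is intact
  puts ::File.symlink?(initfile)  # the init script is now a regular file
end
```

Without the unlink, the write would follow the link and replace the contents of the stand-in sv file, which is the bug described in the ticket.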

If you’re contributing to cookbooks, and they have support for test-kitchen, it’s awesome if you can open a bug report with a failing test. In this case, it was fairly easy to reproduce the bug.

Anatomy of a Test Kitchen 1.0 Cookbook (Part 2)

DISCLAIMER Test Kitchen 1.0 is still in alpha at the time of this post.

Update We’re no longer required to use bundler, and in fact recommend installing the required RubyGems in your global Ruby environment (#3 below).

Update The log output from the various kitchen commands is not updated with the latest and greatest. Play along at home, it’ll be okay :-).

This is a continuation from part 1

In order to run the tests then, we need a few things on our machine:

  1. VirtualBox and Vagrant (1.1+)
  2. A compiler toolchain with XML/XSLT development headers (for building Gem dependencies)
  3. A sane, working Ruby environment (Ruby 1.9.3 or greater)
  4. Git

It is outside the scope of this post to cover how to get all those installed.

Once those are installed:

% vagrant plugin install vagrant-berkshelf
% gem install berkshelf
% gem install test-kitchen --pre
% gem install kitchen-vagrant

Test Kitchen combines the suite (default) with the platform names (e.g., ubuntu-12.04). To run all the suites on all platforms, simply do:

% kitchen test

This will take a while, especially if you don’t already have the Vagrant boxes on your system, as it will download each one. To make this faster, we’ll just run Ubuntu 12.04:

% kitchen test default.*1204

Test Kitchen 1.0 can take a regular expression for the instances to test. This will match the box default-ubuntu-12.04. I could also just say 12 as that will match the single entry in my kitchen list (above).
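
The matching can be illustrated in plain Ruby. This is only a sketch of regular expression selection, not Test Kitchen's own code; `select_instances` is a hypothetical helper, and apart from default-ubuntu-1204 the instance names are made up for the example.

```ruby
# Instance names are assumptions for illustration; only
# default-ubuntu-1204 appears in the output in this post.
instances = %w[default-ubuntu-1204 default-ubuntu-1004 default-centos-63]

# Hypothetical helper: return the instance names matching the pattern.
def select_instances(instances, pattern)
  instances.grep(Regexp.new(pattern))
end

puts select_instances(instances, 'default.*1204').inspect
puts select_instances(instances, '12').inspect
```

With this made-up list, both patterns select only default-ubuntu-1204, because it is the only name containing "12".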

It will take a few minutes to run Test Kitchen. Those familiar with Chef know that if it encounters an unhandled exception, it exits with a non-zero return code. This is important, because we know at the end of a successful run, Chef did the right thing, assuming our recipe is the right thing :-).

To recap the previous post, we have a run list like this:

["recipe[apt]", "recipe[minitest-handler]", "recipe[bluepill_test]"]

Let’s break down the output of our successful run. I’ll show the output first, and explain it after:

Starting Kitchen
Cleaning up any prior instances of <default-ubuntu-1204>
Destroying <default-ubuntu-1204>
Finished destroying <default-ubuntu-1204> (0m0.00s).
Testing <default-ubuntu-1204>
Creating <default-ubuntu-1204>

This is basic setup to ensure that “The Kitchen” is clean beforehand and we don’t have existing state interfering with the run.

[vagrant command] BEGIN (vagrant up default-ubuntu-1204 --no-provision)
[default-ubuntu-1204] Importing base box 'canonical-ubuntu-12.04'...
[default-ubuntu-1204] Matching MAC address for NAT networking...
[default-ubuntu-1204] Clearing any previously set forwarded ports...
[default-ubuntu-1204] Forwarding ports...
[default-ubuntu-1204] -- 22 => 2222 (adapter 1)

This will look familiar to Vagrant users. We’re just getting some basic setup from Vagrant initializing the box defined in the .kitchen.yml (passed to the Vagrantfile by the kitchen-vagrant plugin). This step does a vagrant up --no-provision.

[Berkshelf] installing cookbooks...
[Berkshelf] Using bluepill (2.2.2) at path: '/Users/jtimberman/Development/opscode/cookbooks/bluepill'
[Berkshelf] Using apt (1.8.4)
[Berkshelf] Using yum (2.0.0)
[Berkshelf] Using minitest-handler (0.1.2)
[Berkshelf] Using bluepill_test (0.0.1) at path: './test/cookbooks/bluepill_test'
[Berkshelf] Using rsyslog (1.5.0)
[Berkshelf] Using chef_handler (1.1.0)

Remember from the previous post that we’re using Berkshelf? This is the integration with Vagrant that ensures that the cookbooks are available. The first four, apt, yum, minitest-handler and bluepill_test, are defined in the Berksfile. The next, rsyslog, is a dependency of the bluepill cookbook (for rsyslog integration), and the last, chef_handler, is a dependency of minitest-handler. Berkshelf extracts the dependencies from the cookbook metadata of each cookbook defined in the Berksfile.
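
The dependency walk can be sketched in a few lines of plain Ruby. This is not how Berkshelf is implemented; it is a simplified breadth-first traversal over a hand-written dependency table that contains only the two dependencies mentioned above, and `resolve` is a hypothetical helper.

```ruby
require 'set'

# Only the dependencies named in this post; everything else is treated
# as having none, which is a simplification for illustration.
DEPENDENCIES = {
  'bluepill'         => ['rsyslog'],
  'minitest-handler' => ['chef_handler']
}.freeze

# Walk the dependency graph breadth-first and return every cookbook
# needed, in the order it was first discovered.
def resolve(cookbooks, dependencies = DEPENDENCIES)
  seen = Set.new
  queue = cookbooks.dup
  until queue.empty?
    name = queue.shift
    next unless seen.add?(name) # add? returns nil if already present
    queue.concat(dependencies.fetch(name, []))
  end
  seen.to_a
end

puts resolve(%w[bluepill apt yum minitest-handler bluepill_test]).inspect
```

Starting from the cookbook itself plus the four Berksfile entries, the walk picks up rsyslog and chef_handler at the end, which is the same set of cookbooks shown in the Berkshelf output.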

[default-ubuntu-1204] Creating shared folders metadata...
[default-ubuntu-1204] Clearing any previously set network interfaces...
[default-ubuntu-1204] Running any VM customizations...
[default-ubuntu-1204] Booting VM...
[default-ubuntu-1204] Waiting for VM to boot. This can take a few minutes.
[default-ubuntu-1204] VM booted and ready for use!
[default-ubuntu-1204] Setting host name...
[default-ubuntu-1204] Mounting shared folders...
[default-ubuntu-1204] -- v-root: /vagrant
[default-ubuntu-1204] -- v-csc-1: /tmp/vagrant-chef-1/chef-solo-1/cookbooks
[vagrant command] END (0m48.76s)
Vagrant instance <default-ubuntu-1204> created.
Finished creating <default-ubuntu-1204> (0m53.12s).

Again, this is familiar output to Vagrant users, where Vagrant is making the cookbooks available to the instance.

Converging <default-ubuntu-1204>
[vagrant command] BEGIN (vagrant ssh default-ubuntu-1204 --command 'should_update_chef() {\n...')
Installing Chef Omnibus (11.4.0)
Downloading Chef 11.4.0 for ubuntu...
Installing Chef 11.4.0
Selecting previously unselected package chef.
(Reading database ... 60513 files and directories currently installed.)
Unpacking chef (from .../chef_11.4.0_amd64.deb) ...
Setting up chef (11.4.0-1.ubuntu.11.04) ...
Thank you for installing Chef!
[vagrant command] END (0m34.85s)
[vagrant command] BEGIN (vagrant provision default-ubuntu-1204)
[Berkshelf] installing cookbooks...
[Berkshelf] Using bluepill (2.2.2) at path: '/Users/jtimberman/Development/opscode/cookbooks/bluepill'
[Berkshelf] Using apt (1.8.4)
[Berkshelf] Using yum (2.0.0)
[Berkshelf] Using minitest-handler (0.1.2)
[Berkshelf] Using bluepill_test (0.0.1) at path: './test/cookbooks/bluepill_test'
[Berkshelf] Using rsyslog (1.5.0)
[Berkshelf] Using chef_handler (1.1.0)

This part is interesting, in that we’re going to install the Full Stack Chef (Omnibus) package. This means it doesn’t matter what the underlying base box has installed; we get the right version of Chef. The version is defined in the .kitchen.yml, and the install is done through vagrant ssh (second line of the output). Then, Test Kitchen does vagrant provision. The provisioning step is where Berkshelf happens, so we do see this happen again (perhaps a bug?).

[default-ubuntu-1204] Running provisioner: Vagrant::Provisioners::ChefSolo...
[default-ubuntu-1204] Generating chef JSON and uploading...
[default-ubuntu-1204] Running chef-solo...
INFO: *** Chef 11.4.0 ***
INFO: Setting the run_list to ["recipe[apt]", "recipe[minitest-handler]", "recipe[bluepill_test]"] from JSON
INFO: Run List is [recipe[apt], recipe[minitest-handler], recipe[bluepill_test]]
INFO: Run List expands to [apt, minitest-handler, bluepill_test]
INFO: Starting Chef Run for default-ubuntu-1204.vagrantup.com

This is the start of the actual Chef run, using Chef Solo by Vagrant’s provisioner. Note that we have our suite’s run list. I’m going to skip a lot of the Chef output because it isn’t required. Note that a few resources in the minitest-handler will report as failed, but they can be ignored because it means that those tests were simply not implemented.

INFO: Processing directory[/var/chef/minitest/bluepill_test] action create (minitest-handler::default line 50)
INFO: directory[/var/chef/minitest/bluepill_test] created directory /var/chef/minitest/bluepill_test
INFO: Processing cookbook_file[tests-bluepill_test-default] action create (minitest-handler::default line 53)
INFO: cookbook_file[tests-bluepill_test-default] created file /var/chef/minitest/bluepill_test/default_test.rb
INFO: Processing remote_directory[tests-support-bluepill_test-default] action create (minitest-handler::default line 60)
INFO: remote_directory[tests-support-bluepill_test-default] created directory /var/chef/minitest/bluepill_test/support
INFO: Processing cookbook_file[/var/chef/minitest/bluepill_test/support/helpers.rb] action create (dynamically defined)
INFO: cookbook_file[/var/chef/minitest/bluepill_test/support/helpers.rb] mode changed to 644
INFO: cookbook_file[/var/chef/minitest/bluepill_test/support/helpers.rb] created file /var/chef/minitest/bluepill_test/support/helpers.rb

These are the relevant parts of the minitest-handler recipe, where it has copied the tests from the bluepill_test cookbook into place.

INFO: Processing gem_package[i18n] action install (bluepill::default line 20)
INFO: Processing gem_package[bluepill] action install (bluepill::default line 24)
INFO: Processing directory[/etc/bluepill] action create (bluepill::default line 34)
INFO: directory[/etc/bluepill] created directory /etc/bluepill
INFO: directory[/etc/bluepill] owner changed to 0
INFO: directory[/etc/bluepill] group changed to 0
INFO: Processing directory[/var/run/bluepill] action create (bluepill::default line 34)
INFO: directory[/var/run/bluepill] created directory /var/run/bluepill
INFO: directory[/var/run/bluepill] owner changed to 0
INFO: directory[/var/run/bluepill] group changed to 0
INFO: Processing directory[/var/lib/bluepill] action create (bluepill::default line 34)
INFO: directory[/var/lib/bluepill] created directory /var/lib/bluepill
INFO: directory[/var/lib/bluepill] owner changed to 0
INFO: directory[/var/lib/bluepill] group changed to 0
INFO: Processing file[/var/log/bluepill.log] action create_if_missing (bluepill::default line 41)
INFO: entered create
INFO: file[/var/log/bluepill.log] owner changed to 0
INFO: file[/var/log/bluepill.log] group changed to 0
INFO: file[/var/log/bluepill.log] mode changed to 755
INFO: file[/var/log/bluepill.log] created file /var/log/bluepill.log

Recall from the previous post that the bluepill_test recipe includes the bluepill recipe. This is the basic setup of bluepill.

INFO: Processing package[nc] action install (bluepill_test::default line 4)
INFO: Processing template[/etc/bluepill/test_app.pill] action create (bluepill_test::default line 16)
INFO: template[/etc/bluepill/test_app.pill] updated content
INFO: Processing bluepill_service[test_app] action enable (bluepill_test::default line 18)
INFO: Processing bluepill_service[test_app] action load (bluepill_test::default line 18)
INFO: Processing bluepill_service[test_app] action start (bluepill_test::default line 18)
INFO: Processing link[/etc/init.d/test_app] action create (/tmp/vagrant-chef-1/chef-solo-1/cookbooks/bluepill/providers/service.rb line 30)
INFO: link[/etc/init.d/test_app] created
INFO: Chef Run complete in 81.099185824 seconds

And this is the rest of the bluepill_test recipe. It sets up a test service that will basically be a netcat process listening on a port. Let’s take a moment here and discuss what we have.

First, we have successfully converged the default recipe in the bluepill cookbook via its inclusion in bluepill_test. This is awesome, because we know the recipe works exactly as we defined it, since Chef resources are declarative, and Chef exits if there’s a problem.

Second, we have successfully set up a service managed by bluepill itself using the LWRP included in the bluepill cookbook, bluepill_service. This means we know that the underlying provider configured all the resources correctly.

At this point, we could say “Ship it!” and release the cookbook, knowing it will do what we require. However, this may be disingenuous because we don’t know if the behavior of the system after all this runs is actually correct. Therefore we look to the next segment of output from Chef, from minitest:

INFO: Running report handlers
Run options: -v --seed 38794
# Running tests:
recipe::bluepill_test::default#test_0001_the_default_log_file_must_exist_cook_1295_ = 0.00 s = .
recipe::bluepill_test::default::create a bluepill configuration file#test_0001_anonymous = 0.00 s = .
recipe::bluepill_test::default::create a bluepill configuration file#test_0002_must_be_valid_ruby = 0.06 s = .
recipe::bluepill_test::default::runs the application as a service#test_0001_anonymous = 0.72 s = .
recipe::bluepill_test::default::runs the application as a service#test_0002_anonymous = 0.71 s = .
recipe::bluepill_test::default::spawn a netcat tcp client repeatedly#test_0001_should_receive_a_tcp_connection_from_netcat = 2.24 s = .
Finished tests in 3.746002s, 1.6017 tests/s, 1.8687 assertions/s.
6 tests, 7 assertions, 0 failures, 0 errors, 0 skips

This is performed by the minitest-handler, which runs the tests copied from the bluepill_test cookbook before. It’s outside the scope of this post to describe how to write minitest-chef tests, but we can talk about the output.

We have 6 separate tests that perform 7 assertions, and they all passed. The tests are asserting:

  1. The log file is created, and by the full name of the test, this is to check for a regression from COOK-1295.
  2. The .pill config file for the service must exist and be valid Ruby.
  3. The bluepill service must actually be enabled and running, thereby testing that those actions in the LWRP work.
  4. The running service, which listens on a TCP port, must be up and available, thereby testing that bluepill started the service correctly.

[vagrant command] END (1m29.24s)
Finished converging <default-ubuntu-1204> (2m15.45s).
Setting up <default-ubuntu-1204>
Finished setting up <default-ubuntu-1204> (0m0.00s).
Verifying <default-ubuntu-1204>
Finished verifying <default-ubuntu-1204> (0m0.00s).
Destroying <default-ubuntu-1204>
[vagrant command] BEGIN (vagrant destroy default-ubuntu-1204 -f)
[default-ubuntu-1204] Forcing shutdown of VM...
[Berkshelf] cleaning Vagrant's shelf
[default-ubuntu-1204] Destroying VM and associated drives...
[vagrant command] END (0m3.68s)
Vagrant instance <default-ubuntu-1204> destroyed.
Finished destroying <default-ubuntu-1204> (0m4.04s).
Finished testing <default-ubuntu-1204> (3m12.62s).
Kitchen is finished. (3m12.62s)

This output shows Test Kitchen cleaning up after itself. We destroy the Vagrant instance on a successful convergence and test run in Chef, because further investigation is not required. If the test failed for some reason, Test Kitchen leaves it running so you can log into the machine and poke around to find out what went wrong. Then simply correct the required part of the cookbook (recipes, tests, etc) and rerun Test Kitchen. For example:

% bundle exec kitchen login 1204
vagrant@ubuntu-1204$ ... run some commands
vagrant@ubuntu-1204$ ^D
% bundle exec kitchen converge 1204

My goal with these posts is to get some information out for folks to consider when examining Test Kitchen 1.0 alpha for their own projects. There’s a lot more to Test Kitchen, such as managing non-cookbook projects, or even using other kinds of tests. We’ll have more documentation and guides as we get the 1.0 release out.

Enjoy!

Anatomy of a Test Kitchen 1.0 Cookbook (Part 1)

DISCLAIMER Test Kitchen 1.0 is still in alpha at the time of this post.

Update Remove Gemfile and Vagrantfile

Let’s take a look at the anatomy of a cookbook set up with test-kitchen 1.0-alpha.

Note It is outside the scope of this post to discuss how to write minitest-chef tests or “test cookbook” recipes. Use the cookbook described below as an example to get ideas for writing your own.

This is the full directory tree of Opscode’s “bluepill” cookbook:

├── .kitchen.yml
├── Berksfile
├── CHANGELOG.md
├── CONTRIBUTING
├── LICENSE
├── README.md
├── TESTING.md
├── attributes
│   └── default.rb
├── metadata.rb
├── providers
│   └── service.rb
├── recipes
│   ├── default.rb
│   └── rsyslog.rb
├── resources
│   └── service.rb
├── templates
│   └── default
│       ├── bluepill_init.fedora.erb
│       ├── bluepill_init.freebsd.erb
│       ├── bluepill_init.rhel.erb
│       └── bluepill_rsyslog.conf.erb
└── test
    └── cookbooks
        └── bluepill_test
            ├── README.md
            ├── attributes
            │   └── default.rb
            ├── files
            │   └── default
            │       └── tests
            │           └── minitest
            │               ├── default_test.rb
            │               └── support
            │                   └── helpers.rb
            ├── metadata.rb
            ├── recipes
            │   └── default.rb
            └── templates
                └── default
                    └── test_app.pill.erb

I’ll assume the reader is familiar with basic components of cookbooks like “recipes,” “templates,” and the top-level documentation files, so let’s trim this down to just the areas of concern for Test Kitchen.

├── .kitchen.yml
├── Berksfile
└── test
    └── cookbooks
        └── bluepill_test
            ├── attributes
            │   └── default.rb
            ├── files
            │   └── default
            │       └── tests
            │           └── minitest
            │               ├── default_test.rb
            │               └── support
            │                   └── helpers.rb
            ├── recipes
            │   └── default.rb
            └── templates
                └── default
                    └── test_app.pill.erb

Note that this cookbook has a “test” cookbook. I’ll get to that in a minute.

First of all, we have the .kitchen.yml. This is the project definition that describes what is required to run test kitchen itself. This particular file tells Test Kitchen to bring up nodes of the platforms we’re testing with Vagrant, and defines the boxes with their box names and URLs to download. You can view the full .kitchen.yml in the Git repo. For now, I’m going to focus on the suite stanza in the .kitchen.yml. This defines how Chef will run when Test Kitchen brings up the Vagrant machine.

- name: default
  run_list:
  - recipe[minitest-handler]
  - recipe[bluepill_test]
  attributes: {bluepill: { bin: "/opt/chef/embedded/bin/bluepill" } }

Each platform has a recipe it will run with, in this case apt and yum. Then the suite’s run list is appended, so for example, the final run list of the Ubuntu 12.04 node will be:

["recipe[apt]", "recipe[minitest-handler]", "recipe[bluepill_test]"]

We have apt so the apt cache on the node is updated before Chef does anything else. This is pretty typical so we put it in the default run list of each Ubuntu box.
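
How the two run lists combine can be sketched in plain Ruby. This is an illustration of the append behavior described above, not Test Kitchen's implementation; `final_run_list` is a hypothetical helper, and the centos platform name is an assumption made for the example.

```ruby
# Platform run lists as described in this post; the centos platform
# name is an assumption for illustration.
PLATFORM_RUN_LISTS = {
  'ubuntu-12.04' => ['recipe[apt]'],
  'centos-6.3'   => ['recipe[yum]']
}.freeze

# The default suite's run list from the .kitchen.yml excerpt.
SUITE_RUN_LIST = ['recipe[minitest-handler]', 'recipe[bluepill_test]'].freeze

# Hypothetical helper: the platform's recipes come first, then the suite's.
def final_run_list(platform)
  PLATFORM_RUN_LISTS.fetch(platform) + SUITE_RUN_LIST
end

puts final_run_list('ubuntu-12.04').inspect
```

For the Ubuntu 12.04 platform this prints the same three-entry run list shown above.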

The minitest-handler recipe existing in the run list means that the Minitest Chef Handler will be run at the end of the Chef run. In this case, it will use the tests from the test cookbook, bluepill_test.

The bluepill cookbook itself does not depend on any of these cookbooks. So how does Test Kitchen know where to get them? Enter the next file in the list above, Berksfile. This informs Berkshelf which cookbooks to download. The relevant excerpt from the Berksfile is:

cookbook "apt"
cookbook "yum"
cookbook "minitest-handler"
cookbook "bluepill_test", :path => "./test/cookbooks/bluepill_test"

Based on the Berksfile, it will download apt, yum, and minitest-handler from the Chef Community site. It will also use the bluepill_test included in the bluepill cookbook. This is transparent to the user, as I’ll cover in a moment.

Test Kitchen’s Vagrant driver plugin handles all the configuration of Vagrant itself based on the entries in the .kitchen.yml. To get the Berkshelf integration in the Vagrant boxes, we need to install the vagrant-berkshelf plugin in Vagrant. Then, we automatically get Berkshelf’s Vagrant integration, meaning all the cookbooks defined in the Berksfile are going to be available on the box we bring up.

Remember the test cookbook mentioned above? It’s the next component. The default suite in .kitchen.yml puts bluepill_test in the run list. This particular recipe will include the bluepill default recipe, then it sets up a test service using the bluepill_service LWRP. This means that when the nodes brought up by Test Kitchen via Vagrant converge, they’ll have bluepill installed and set up, and then a running service whose final behavior we can test. Since Chef will exit with a non-zero return code if it encounters an exception, we know that a successful run means everything is configured as defined in the recipes, and we can run tests against the node.

The tests we’ll run are written with the Minitest Chef Handler. These are defined in the test cookbook’s files/default/tests/minitest directory. The minitest-handler cookbook (also in the default suite run list) will execute the default_test tests.

In the next post, we’ll look at how to run Test Kitchen, and what all the output means.

Last Check-in Time for Nodes

This one-liner uses the knife exec sub-command to iterate over all the node objects on the Chef Server, and print out their ohai_time attribute in a human-readable format.

knife exec -E 'nodes.all {|n| puts "#{n.name} #{Time.at(n[:ohai_time])}"}'

Let’s break this up a little.

knife exec -E

The exec plugin for knife executes a script or the given string of Ruby code in the same context as chef-shell (or shef in Chef 10 and earlier) when you start it up in its “main” context. Since it is knife, it will also use your .chef/knife.rb settings, so it knows about your user, key and Chef Server.

nodes.all

The chef-shell main context has helper methods to access the corresponding endpoints in the Chef Server API. Clearly we’re working with “nodes” here, and the #all method returns all the node objects from the Chef Server. This differs from search, which is subject to a commit delay between the time data is saved to the server and the time it is indexed by Solr. That delay is usually a few seconds, but depending on factors like the hardware you’re using and how many nodes are converging, it can take longer.

Anyway, we can pass a block to nodes.all and do something with each node object. The example above is a one-liner, so let’s make it more readable.

nodes.all do |n|
  puts "#{n.name} #{Time.at(n[:ohai_time])}"
end

We’re simply going to use n as the block variable for each node object, and we’ll print a string about the node. Each #{} in the string passed to puts is Ruby string interpolation. That is, everything inside the braces is a Ruby expression. First, the Chef::Node object has a method, #name, that returns the node’s name. This is usually the FQDN, but depending on your configuration (node_name in /etc/chef/client.rb or using the -N option for chef-client), it could be something else. Then, we’re going to use the node’s ohai_time attribute. Every time Chef runs and gathers data about the node with Ohai, it generates the ohai_time attribute, which is the Unix epoch timestamp of when Ohai ran. Because Chef saves the node data at the end of the run, this tells us approximately when the node last ran Chef. In this particular string, we’re converting the Unix epoch, like 1358962351.444405, to a human-readable timestamp like 2013-01-23 10:32:31 -0700.
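The conversion itself is plain Ruby and can be tried without a Chef Server. Here is a minimal sketch using the epoch value from above; the node name is made up for illustration, and the time is forced to UTC so the result doesn't depend on the local time zone (the -0700 example above is the same instant in US Mountain time).

```ruby
# The ohai_time attribute is a Unix epoch value with fractional seconds.
ohai_time = 1358962351.444405

# Time.at turns the epoch into a Time object; .utc pins the zone so the
# printed value is the same on every machine.
last_run = Time.at(ohai_time).utc

# "node1.example.com" is a placeholder for what n.name would return.
line = "node1.example.com #{last_run.strftime('%Y-%m-%d %H:%M:%S %z')}"
puts line
```

Without the .utc call, Time.at uses the workstation's local zone, which is what the one-liner does.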

Of course, you can get similar data from the Chef Server by using knife status:

knife status

The ohai_time attribute will be displayed as a relative time, e.g., “585 hours ago.” It will include some more data about the nodes, like IP addresses. This uses Chef’s search feature, so you can also pass in a query:

knife status "role:webserver"

The knife exec example is simple, but you can get a lot more data about the nodes than what knife status reports.
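As one idea of what "more" could look like, here is a sketch that picks out nodes that haven't checked in for a while. To keep it runnable anywhere, it uses plain hashes as stand-ins for node objects; the names, timestamps, and the 12 hour threshold are all made up. Inside knife exec, the same comparison could be applied to each node from nodes.all.

```ruby
# Hypothetical stand-ins for node objects: a name and an ohai_time epoch.
nodes = [
  { name: 'web1.example.com', ohai_time: 1358962351.444405 },
  { name: 'db1.example.com',  ohai_time: 1358875951.0 }
]

# A fixed "now" keeps the example reproducible; use Time.now for real checks.
now = Time.at(1358965951)
threshold_hours = 12

# Subtracting two Time objects gives seconds; divide to get hours.
stale = nodes.select do |n|
  (now - Time.at(n[:ohai_time])) / 3600.0 > threshold_hours
end

stale_names = stale.map { |n| n[:name] }
puts stale_names
```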

In either case, ohai_time isn’t 100% accurate, since it is generated at the beginning of the run, and depending on what you’re doing with Chef on your systems, it can take a long time before the node data is saved. However, it’s close enough for many use cases.

If more detailed or completely accurate information about the Chef run is required for your purposes, you should use a report handler, which does have more data about the run available, including whether the run was successful or not.

Install Chef 11 Server on CentOS 6

A few months ago, I posted briefly on how to install Chef 10 server on CentOS. This post revisits the process for Chef 11.

These steps were performed on a default CentOS 6.3 server install.

First, navigate to the Chef install page to get the package download URL. Use the form on the “Chef Server” tab to select the appropriate drop-down items for your system.

Install the package from the given URL.

rpm -Uvh https://opscode-omnitruck-release.s3.amazonaws.com/el/6/x86_64/chef-server-11.0.4-1.el6.x86_64.rpm

The package just puts the bits on disk (in /opt/chef-server). The next step is to configure the Chef Server and start it.

% chef-server-ctl reconfigure

This runs the embedded chef-solo with the included cookbooks, and sets up everything required - Erchef, RabbitMQ, PostgreSQL, etc.

Next, run the Opscode Pedant test suite. This will verify that everything is working.

% chef-server-ctl test

Copy the default admin user’s key and the validator key to your local workstation, where you have Chef client installed, and create a new user for yourself with knife. You’ll need Chef version 11.2.0 on the workstation. The key files on the Chef Server are readable only by root.

scp root@chef-server:/etc/chef-server/admin.pem .
scp root@chef-server:/etc/chef-server/chef-validator.pem .

Use knife configure -i to create an initial ~/.chef/knife.rb and new administrative API user for yourself. Use the FQDN of your newly installed Chef Server, with HTTPS. The validation key needs to be copied over from the Chef Server from /etc/chef-server/chef-validator.pem to ~/.chef to use it for automatically bootstrapping nodes with knife bootstrap.

% knife configure -i

The .chef/knife.rb file should look something like this:

log_level                :info
log_location             STDOUT
node_name                'jtimberman'
client_key               '/home/jtimberman/.chef/jtimberman.pem'
validation_client_name   'chef-validator'
validation_key           '/home/jtimberman/.chef/chef-validator.pem'
chef_server_url          'https://chef-server.example.com'
syntax_check_cache_path  '/home/jtimberman/.chef/syntax_check_cache'

Your Chef Server is now ready to use. Test connectivity as your user with knife:

% knife client list
chef-validator
chef-webui
% knife user list
admin
jtimberman

In previous versions of Open Source Chef Server, users were API clients. In Chef 11, users are separate entities on the Server.

The chef-server-ctl command is used on the Chef Server system for management. It has built-in help (-h) that will display the various sub-commands.

Chef and Net::SSH Dependency Broken

2nd UPDATE CHEF-3835 was opened by a member of the community; Chef versions 11.2.0 and 10.20.0 have been released by Opscode to resolve the issue.

UPDATE Opscode is working on getting a new release of the Chef gem with updated version constraints.

What Happened?

Earlier today (February 6, 2013), new versions of the various net-ssh RubyGems were published. These include:

  • net-ssh 2.6.4
  • net-ssh-multi 1.1.1
  • net-ssh-gateway 1.1.1

Chef’s dependencies have a pessimistic version constraint (~>) on net-ssh 2.2.2.

What’s the Problem?

So what is the problem?

It appears to lie with net-ssh-gateway. The version of net-ssh-gateway went from 1.1.0 (released in April 2011), to 1.1.1. It depends on net-ssh. In net-ssh-gateway 1.1.0, the net-ssh version constraint was >= 1.99.1, which is fine with Chef’s constraint against ~> 2.2.2. However, in net-ssh-gateway 1.1.1, the net-ssh version constraint was changed to >= 2.6.4, which is obviously a conflict with Chef’s constraint.
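The conflict can be reproduced with the Gem::Requirement and Gem::Version classes that ship with RubyGems. This is only a sketch of the constraint arithmetic, not of how the gem resolver works internally; the candidate version 2.2.2 is picked for illustration as a release that Chef's constraint allows.

```ruby
require 'rubygems'

# The constraints described above.
chef_constraint = Gem::Requirement.new('~> 2.2.2')   # Chef on net-ssh
old_gateway     = Gem::Requirement.new('>= 1.99.1')  # net-ssh-gateway 1.1.0
new_gateway     = Gem::Requirement.new('>= 2.6.4')   # net-ssh-gateway 1.1.1

# "~> 2.2.2" allows 2.2.2 and later patch releases, but nothing from 2.3 on.
candidate = Gem::Version.new('2.2.2')

both_old  = chef_constraint.satisfied_by?(candidate) && old_gateway.satisfied_by?(candidate)
both_new  = chef_constraint.satisfied_by?(candidate) && new_gateway.satisfied_by?(candidate)
newest_ok = chef_constraint.satisfied_by?(Gem::Version.new('2.6.4'))

puts "2.2.2 works with Chef and net-ssh-gateway 1.1.0: #{both_old}"
puts "2.2.2 works with Chef and net-ssh-gateway 1.1.1: #{both_new}"
puts "2.6.4 works with Chef: #{newest_ok}"
```

No net-ssh version can be both below 2.3 and at least 2.6.4, which is why the dependency resolution fails.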

What’s the Solution?

So, how can we fix it?

One solution is to use the Opscode Omnibus Package for Chef. This isn’t a solution for everyone, of course, but it does include and contain all the dependencies. This also doesn’t help if one wishes to install another gem that depends on Chef under the “Omnibus” Ruby environment along with Chef, because the conflict will still be found. For example, installing the minitest-chef-handler gem to run minitest-chef tests fails:

vagrant@ubuntu-12-04:~$ /opt/chef/embedded/bin/gem install minitest-chef-handler
ERROR: While executing gem ... (Gem::DependencyError)
    Unable to resolve dependencies: net-ssh-gateway requires net-ssh (>= 2.6.4)

Another solution is to relax / modify the constraint in Chef. This may be okay, but as of right now we don’t know if this will affect anything in the way that Chef uses net-ssh. We have tickets related to net-ssh version constraints in Chef:

  • http://tickets.opscode.com/browse/CHEF-2977
  • http://tickets.opscode.com/browse/CHEF-3156

Local-only Knife Configuration

In this post I want to discuss briefly an approach to setting up a shared Knife configuration file for teams using the same Chef Repository, while supporting customized configuration.

Background

Most infrastructures managed by Chef have multiple people working on them. Recently, several people in the Ruby community started working together on migrating RubyGems to Amazon EC2.

The repository has a shared .chef/knife.rb which sets some local paths where cookbooks and roles are located. In addition to this, I wanted to test building the infrastructure using a Chef Server and my own EC2 account.

The Approach

At Opscode, we believe in leveraging internal DSLs. The .chef/knife.rb (and Chef’s client.rb or solo.rb, etc) is no exception. While you can have a fairly simple configuration like this:

node_name        "jtimberman"
client_key       "/home/jtimberman/.chef/jtimberman.pem"
chef_server_url  "https://api.opscode.com/organizations/my_organization"
cookbook_path    "cookbooks"

You can also have something like this:

log_level     :info
log_location  STDOUT
node_name     ENV["NODE_NAME"] || "solo"
client_key    File.expand_path("../solo.pem", __FILE__)
cache_type    "BasicFile"
cache_options(path: File.expand_path("../checksums", __FILE__))
cookbook_path [ File.expand_path("../../chef/cookbooks", __FILE__) ]
if ::File.exist?(File.expand_path("../knife.local.rb", __FILE__))
  Chef::Config.from_file(File.expand_path("../knife.local.rb", __FILE__))
end

This is the knife.rb included in the RubyGems-AWS repo.

The main part of interest here is the last three lines.

if ::File.exist?(File.expand_path("../knife.local.rb", __FILE__))
  Chef::Config.from_file(File.expand_path("../knife.local.rb", __FILE__))
end

This says “if a file knife.local.rb exists, then load its configuration.” The Chef::Config class is what Chef uses for configuration files, and the #from_file method will load the specified file.
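The File.expand_path calls are what make these paths relative to knife.rb itself rather than to the current directory. A quick sketch of how they resolve, assuming a Unix-like system and a made-up repository location of /repo (inside the real file, __FILE__ supplies the path to knife.rb):

```ruby
# Hypothetical path standing in for __FILE__ inside .chef/knife.rb.
config_file = '/repo/.chef/knife.rb'

# The first ".." steps from the file name up to its directory, so the
# local file is looked up right next to knife.rb.
local_config = File.expand_path('../knife.local.rb', config_file)

# A second ".." climbs out of .chef to the repository root.
cookbooks = File.expand_path('../../chef/cookbooks', config_file)

puts local_config
puts cookbooks
```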

In this case, the content of my knife.local.rb is:

node_name                "jtimberman"
client_key               "/Users/jtimberman/.chef/jtimberman.pem"
validation_client_name   "ORGNAME-validator"
validation_key           "/Users/jtimberman/.chef/ORGNAME-validator.pem"
chef_server_url          "https://api.opscode.com/organizations/ORGNAME"
cookbook_path [
  File.expand_path("../../chef/cookbooks", __FILE__),
  File.expand_path("../../chef/site-cookbooks", __FILE__)
]
knife[:aws_access_key_id]      = "Some access key I like"
knife[:aws_secret_access_key]  = "The matching secret access key"

Here I’m setting my Opscode Hosted Chef credentials and server. I also set the cookbook_path to include the site-cookbooks directory (this should probably go in the regular knife.rb). Finally, I set the knife configuration options for my AWS EC2 account.

The configuration is parsed top-down, and knife.local.rb is loaded at the end of knife.rb, so the options here that overlap those in knife.rb will be used instead.
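To see the "last setting wins" behavior in isolation, here is a small sketch that doesn't need Chef at all. MiniConfig is a hypothetical stand-in for Chef::Config that supports just two settings; the real class does far more. The two files are written to a temporary directory purely for the demonstration.

```ruby
require 'tmpdir'

# Hypothetical stand-in for Chef::Config: each setting is a method that
# records its argument, so a later call replaces an earlier one.
class MiniConfig
  attr_reader :settings

  def initialize
    @settings = {}
  end

  def node_name(value)
    @settings[:node_name] = value
  end

  def cookbook_path(value)
    @settings[:cookbook_path] = value
  end

  # Evaluate a Ruby config file in the context of this object. Passing
  # the path as the second argument makes __FILE__ work inside the file.
  def from_file(path)
    instance_eval(File.read(path), path)
  end
end

config = MiniConfig.new
Dir.mktmpdir do |dir|
  shared = File.join(dir, 'knife.rb')
  local  = File.join(dir, 'knife.local.rb')

  # The local file only overrides node_name.
  File.write(local, %(node_name "jtimberman"\n))

  # The shared file sets defaults, then loads the local file at the end.
  File.write(shared, <<-RUBY)
node_name "solo"
cookbook_path ["cookbooks"]
local = File.expand_path("../knife.local.rb", __FILE__)
from_file(local) if File.exist?(local)
  RUBY

  config.from_file(shared)
end

puts config.settings.inspect
```

The node_name from the local file replaces the shared default, while cookbook_path, which the local file doesn't mention, keeps its shared value.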

In the Repository

In the repository, commit only the .chef/knife.rb and not the .chef/knife.local.rb. I recommend adding the local file to the .gitignore or VCS equivalent.

% echo .chef/knife.local.rb >> .gitignore
% git add .chef/knife.rb .gitignore
% git commit -m 'keep general knife.rb, local config is ignored'

Conclusion

There are many approaches to solving the issue of having shared Knife configuration for multiple people in a single repository. The real benefit here is that the configuration file is Ruby, which provides a lot of flexibility. Of course, when using someone else’s configuration examples, one should always read the code and understand it first :-).

Local Templates for Application Configuration

Today I joined the Food Fight Show for a conversation about Application Deployment. Along the way, the question came up about where to store application-specific configuration files. Should they be stored in a Chef cookbook for setting up the system for the application? Or should they be stored in the application codebase itself?

The answer is either, as far as Chef is concerned. Chef’s template resource can render a template from a local file on disk, or retrieve the template from a cookbook. The latter is the most common pattern, so let’s examine the former, using a local file on disk.

For sake of discussion, let’s use a Rails application that needs a database.yml file rendered. Also, we’ll assume that information about the application (database user, password, server) we need is stored in a Chef data bag. Finally, we’re going to assume that the application is already deployed on the system somehow and we just want to render the database.yml.

The application source tree looks something like this:

myapp/
-> config/
    -> database.yml.erb

Note that there should not be a database.yml (non-.erb) here, as it will be rendered with Chef. The deployment of the app will end up in /srv, so the full path of this template is, for example, /srv/myapp/current/config/database.yml.erb. The content of the template may look like this:

<%= @rails_env %>:
  adapter: <%= @adapter %>
  host: <%= @host %>
  database: <%= @database %>
  username: <%= @username %>
  password: <%= @password %>
  encoding: 'utf8'
  reconnect: true

The Chef recipe looks like this. Note we’ll use a search to find the first node that should be the database master (there should only be one). For the adapter, we may have set an attribute in the role that selects the adapter to use.

results = search(:node, "role:myapp_database_master AND environment:#{node.chef_environment}")
db_master = results[0]

template "/srv/myapp/shared/database.yml" do
  source "/srv/myapp/current/config/database.yml.erb"
  local true
  variables(
    :rails_env => node.chef_environment,
    :adapter => db_master['myapp']['db_adapter'],
    :host => db_master['fqdn'],
    :database => "myapp_#{node.chef_environment}",
    :username => "myapp",
    :password => "SUPERSECRET",
  )
end

The rendered template, /srv/myapp/shared/database.yml, will look like this:

production:
  adapter: mysql
  host: domU-12-31-39-14-F1-C3.compute-1.internal
  database: myapp_production
  username: myapp
  password: SUPERSECRET
  encoding: 'utf8'
  reconnect: true
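Since the template lives in the application codebase, it can be handy to preview it without a Chef run. Here is a rough sketch using Ruby's standard ERB library and a shortened copy of the template. TemplateContext is a hypothetical helper that imitates how the variables show up as instance variables in the template; it is not how Chef renders templates internally, and the host name is a placeholder.

```ruby
require 'erb'

# A shortened copy of database.yml.erb, inlined to keep the sketch
# self-contained.
template = <<-ERB
<%= @rails_env %>:
  adapter: <%= @adapter %>
  host: <%= @host %>
ERB

# Hypothetical helper: turns a hash into instance variables so the
# template can refer to @rails_env, @adapter and so on.
class TemplateContext
  def initialize(vars)
    vars.each { |key, value| instance_variable_set("@#{key}", value) }
  end

  def get_binding
    binding
  end
end

context = TemplateContext.new(
  :rails_env => 'production',
  :adapter   => 'mysql',
  :host      => 'db.example.com'
)

rendered = ERB.new(template).result(context.get_binding)
puts rendered
```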

This post is only part of the puzzle, mainly to explain what I mentioned on the Food Fight Show today. There are a number of unanswered questions like,

  • Should database.yml be .gitignore’d?
  • How do developers run the app locally?
  • How do I use this with Chef Solo?

As mentioned on the show, there’s currently a thread related to this topic on the Chef mailing list.