jtimberman's Code Blog

Chef, Ops, Ruby, Linux/Unix. Opinions are mine, not my employer's (Chef).

Load_current_resource and Chef-shell

This post will illustrate load_current_resource and a basic use of chef-shell.

The chef-shell is an irb-based REPL (read-eval-print loop). Everything I do in it is Ruby code, just like in Chef recipes or other cookbook components. I'm going to use a package resource example, so I need privileged access (sudo).

% sudo chef-shell

The chef-shell program loads its configuration, determines the session type, and displays a banner. In this case, we're taking all the defaults, which means no special configuration and a standalone session.

loading configuration: none (standalone session)
Session type: standalone
Loading...done.

This is the chef-shell.
 Chef Version: 11.14.0.rc.2
 http://www.opscode.com/chef
 http://docs.opscode.com/

run `help' for help, `exit' or ^D to quit.

Ohai2u jtimberman@jenkins.int.housepub.org!

To evaluate resources as we’d write them in a recipe, we need to switch to recipe mode.

chef > recipe_mode

I can do anything here that I can do in a recipe. I could paste in my own recipes. Here, I’m just going to add a package resource to manage the vim package. Note that this works like the “compile” phase of a chef-client run. The resource will be added to the Chef::ResourceCollection object. We’ll look at this in a little more detail shortly.

chef:recipe > package "vim"
 => <package[vim] @name: "vim" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :install, :upgrade, :remove, :purge, :reconfig] @action: :install @updated: false @updated_by_last_action: false @supports: {} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):1:in `irb_binding'" @guard_interpreter: :default @elapsed_time: 0 @sensitive: false @candidate_version: nil @options: nil @package_name: "vim" @resource_name: :package @response_file: nil @response_file_variables: {} @source: nil @version: nil @timeout: 900 @cookbook_name: nil @recipe_name: nil>

I’m done adding resources/writing code to test, so I’ll initiate a Chef run with the run_chef method (this is a special method in chef-shell).

chef:recipe > run_chef
[2014-07-21T09:04:51-06:00] INFO: Processing package[vim] action install ((irb#1) line 1)
[2014-07-21T09:04:51-06:00] DEBUG: Chef::Version::Comparable does not know how to parse the platform version: jessie/sid
[2014-07-21T09:04:51-06:00] DEBUG: Chef::Version::Comparable does not know how to parse the platform version: jessie/sid
[2014-07-21T09:04:51-06:00] DEBUG: package[vim] checking package status for vim
vim:
  Installed: 2:7.4.335-1
  Candidate: 2:7.4.335-1
  Version table:
 *** 2:7.4.335-1 0
        500 http://ftp.us.debian.org/debian/ testing/main amd64 Packages
        100 /var/lib/dpkg/status
[2014-07-21T09:04:51-06:00] DEBUG: package[vim] current version is 2:7.4.335-1
[2014-07-21T09:04:51-06:00] DEBUG: package[vim] candidate version is 2:7.4.335-1
[2014-07-21T09:04:51-06:00] DEBUG: package[vim] is already installed - nothing to do

Let’s take a look at what’s happening. Note that we have INFO and DEBUG output. By default, chef-shell runs with Chef::Log#level set to :debug. In a normal Chef Client run with :info output, we see the first line, but not the others. I’ll show each line, and then explain what Chef did.

[2014-07-21T09:04:51-06:00] INFO: Processing package[vim] action install ((irb#1) line 1)

There is a timestamp, the resource, package[vim], the action install that Chef will take, and the location in the recipe where the resource was encountered. I didn't specify an action in the resource; :install is the default action for package resources. The (irb#1) line 1 just means it was entered on the first line of the recipe-mode irb session.

[2014-07-21T09:04:51-06:00] DEBUG: Chef::Version::Comparable does not know how to parse the platform version: jessie/sid
[2014-07-21T09:04:51-06:00] DEBUG: Chef::Version::Comparable does not know how to parse the platform version: jessie/sid

Chef chooses the default provider for each resource based on a mapping of platforms and their versions. It uses an internal class, Chef::Version::Comparable, to do this. The system I'm using is a Debian "testing" system, which has the codename jessie but no specific release number. Chef knows to use the apt package provider for all debian family platforms, so that'll do here.
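Chef::Version::Comparable is internal to Chef and I won't reproduce it here, but RubyGems' version parsing illustrates the same limitation: dotted numeric strings parse as versions, while release codenames do not.

```ruby
require 'rubygems'

# Not Chef's code -- just an analogy. Dotted numeric strings are valid
# versions; a release codename like Debian testing's "jessie/sid" is not.
['6.0.1', '7.4', 'jessie/sid'].each do |v|
  puts "#{v.inspect} parseable? #{Gem::Version.correct?(v)}"
end
# "6.0.1" parseable? true
# "7.4" parseable? true
# "jessie/sid" parseable? false
```

When the platform version can't be parsed, Chef logs the DEBUG message above and falls back to the platform-family mapping.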

[2014-07-21T09:04:51-06:00] DEBUG: package[vim] checking package status for vim
vim:
  Installed: 2:7.4.335-1
  Candidate: 2:7.4.335-1
  Version table:
 *** 2:7.4.335-1 0
        500 http://ftp.us.debian.org/debian/ testing/main amd64 Packages
        100 /var/lib/dpkg/status
[2014-07-21T09:04:51-06:00] DEBUG: package[vim] current version is 2:7.4.335-1
[2014-07-21T09:04:51-06:00] DEBUG: package[vim] candidate version is 2:7.4.335-1

This output is the load_current_resource method implemented in the apt package provider.

The check_package_state method does all the heavy lifting. It runs apt-cache policy and parses the output looking for the version numbers. If we used the :upgrade action and the installed version wasn't the same as the candidate version, Chef would install the candidate version.
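The parsing itself is straightforward. Here is a rough standalone sketch of the idea; this is not the provider's actual code, and the method name and hash keys are made up for illustration.

```ruby
# Rough sketch of the kind of parsing check_package_state does on
# `apt-cache policy` output. NOT Chef's actual implementation.
# "(none)" means the package is not currently installed.
def parse_policy(output)
  installed = output[/^\s*Installed:\s*(\S+)/, 1]
  candidate = output[/^\s*Candidate:\s*(\S+)/, 1]
  {
    current_version:   installed == '(none)' ? nil : installed,
    candidate_version: candidate
  }
end

sample = <<-POLICY
vim:
  Installed: (none)
  Candidate: 2:7.4.335-1
POLICY

state = parse_policy(sample)
puts "current:   #{state[:current_version].inspect}"  # current:   nil
puts "candidate: #{state[:candidate_version]}"        # candidate: 2:7.4.335-1
```

A nil current version with a non-nil candidate is exactly the state that tells the provider an :install action has work to do.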

Chef resources are convergent: they are only updated if they need to be. In this case, the vim package is already installed, which satisfies the implicitly specified :install action, so we see the following line:

[2014-07-21T09:04:51-06:00] DEBUG: package[vim] is already installed - nothing to do

Nothing to do, Chef finishes its run.

Modifying Existing Resources

We can manipulate the state of the resources in the resource collection. This isn't common in most recipes, but it's required for certain development patterns such as "wrapper" cookbooks. As an example, I'm going to modify the resource object so I don't have to log into the system and run apt-get remove vim by hand before the next section.

First, I’m going to create a local variable in the context of the recipe. This is just like any other variable in Ruby. For its value, I’m going to use the #resources() method to look up a resource in the resource collection.

chef:recipe > local_package_variable = resources("package[vim]")
 => <package[vim] @name: "vim" @noop: nil @before: nil @params: {} @provider: nil @allowed_actions: [:nothing, :install, :upgrade, :remove, :purge, :reconfig] @action: :install @updated: false @updated_by_last_action: false @supports: {} @ignore_failure: false @retries: 0 @retry_delay: 2 @source_line: "(irb#1):1:in `irb_binding'" @guard_interpreter: :default @elapsed_time: 0.029617095 @sensitive: false @candidate_version: nil @options: nil @package_name: "vim" @resource_name: :package @response_file: nil @response_file_variables: {} @source: nil @version: nil @timeout: 900 @cookbook_name: nil @recipe_name: nil>

The return value is the package resource object:

chef:recipe > local_package_variable.class
 => Chef::Resource::Package

(#class is a method on the Ruby Object class that returns the class of the object)

To remove the vim package, I use the #run_action method (available to all Chef::Resource subclasses), specifying the :remove action as a symbol:

chef:recipe > local_package_variable.run_action(:remove)
[2014-07-21T09:11:50-06:00] INFO: Processing package[vim] action remove ((irb#1) line 1)
[2014-07-21T09:11:52-06:00] INFO: package[vim] removed

There is no additional debug to display. Chef will run apt-get remove vim to converge the resource with this action.

Load Current Resource Redux

Now that the package has been removed from the system, what happens if we run Chef again? Well, Chef is convergent, and it takes idempotent actions on the system to ensure that the managed resources are in the desired state. That means it will install the vim package.

chef:recipe > run_chef
[2014-07-21T09:11:57-06:00] INFO: Processing package[vim] action install ((irb#1) line 1)

We’ll see some familiar messages here about the version, then:

[2014-07-21T09:11:57-06:00] DEBUG: package[vim] checking package status for vim
vim:
  Installed: (none)
  Candidate: 2:7.4.335-1
  Version table:
     2:7.4.335-1 0
        500 http://ftp.us.debian.org/debian/ testing/main amd64 Packages
[2014-07-21T09:11:57-06:00] DEBUG: package[vim] current version is nil
[2014-07-21T09:11:57-06:00] DEBUG: package[vim] candidate version is 2:7.4.335-1

This is load_current_resource working as expected. As we can see from the apt-cache policy output, the package is not installed, and as the action to take is :install, Chef will do what we think:

Reading package lists...
Building dependency tree...
Reading state information...
The following packages were automatically installed and are no longer required:
  g++-4.8 geoclue geoclue-hostip geoclue-localnet geoclue-manual
  geoclue-nominatim gstreamer0.10-plugins-ugly libass4 libblas3gf libcolord1
  libcolorhug1 libgeoclue0 libgnustep-base1.22 libgnutls28 libminiupnpc8
  libpoppler44 libqmi-glib0 libstdc++-4.8-dev python3-ply xulrunner-29
Use 'apt-get autoremove' to remove them.
Suggested packages:
  vim-doc vim-scripts
The following NEW packages will be installed:
  vim
0 upgraded, 1 newly installed, 0 to remove and 28 not upgraded.
Need to get 0 B/905 kB of archives.
After this operation, 2,088 kB of additional disk space will be used.
Selecting previously unselected package vim.
(Reading database ... 220338 files and directories currently installed.)
Preparing to unpack .../vim_2%3a7.4.335-1_amd64.deb ...
Unpacking vim (2:7.4.335-1) ...
Setting up vim (2:7.4.335-1) ...
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/vim (vim) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/vimdiff (vimdiff) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/rvim (rvim) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/rview (rview) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/vi (vi) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/view (view) in auto mode
update-alternatives: using /usr/bin/vim.basic to provide /usr/bin/ex (ex) in auto mode

This should be familiar to anyone who uses Debian/Ubuntu: it's standard apt-get install output. Of course, this is a development system, so I have some cruft, but we'll ignore that ;).

If we run_chef again, we get the output we saw in the original example in this post:

[2014-07-21T09:50:06-06:00] DEBUG: package[vim] is already installed - nothing to do

ChefDK and Ruby

Recently, Chef released ChefDK, the "Chef Development Kit." This is a self-contained package of everything required to run Chef and work with Chef cookbooks; it includes best-of-breed community tools, test frameworks, and other utility programs commonly used when managing infrastructure as code with Chef. ChefDK version 0.1.0 was released last week. A new feature mentioned in the README.md is, in my opinion, very important.

Using ChefDK as your primary development environment

What does that mean?

It means that if the only reason you have Ruby installed on your local system is to do Chef development or otherwise work with Chef, you no longer have to maintain a separate Ruby installation. That means you won’t need any of these:

  • rbenv
  • rvm
  • chruby (*note)
  • “system ruby” (e.g., OS X’s included /usr/bin/ruby, or the ruby package from your Linux distro)
  • poise-ruby

(*note: You can optionally use chruby with ChefDK if it’s part of your workflow and you have other Rubies installed.)

Do not misunderstand me: These are all extremely good solutions for getting and using Ruby on your system. They definitely have their place if you do other Ruby development, such as web applications. This is especially true if you have to work with multiple versions of Ruby. However, if you’re like me and mainly use Ruby for Chef, then ChefDK has you covered.

In this post, I will describe how I have set up my system with ChefDK, and use its embedded Ruby by default.

Getting Started

Download ChefDK from the downloads page. At the time of this blog post, the available builds are limited to OS X and Linux (Debian/Ubuntu or RHEL), but Chef is working on Windows packages.

For example, here’s what I did on my Ubuntu 14.04 system:

wget https://opscode-omnibus-packages.s3.amazonaws.com/ubuntu/13.10/x86_64/chefdk_0.1.0-1_amd64.deb
sudo dpkg -i chefdk_0.1.0-1_amd64.deb

OS X users will be happy to know that the download is a .DMG, which includes a standard OS X .pkg (complete with developer signing). Simply install it like many other products on OS X.

For either Linux or OS X, in omnibus fashion, the post-installation creates several symbolic links in /usr/bin for tools that are included in ChefDK:

% ls -l /usr/bin | grep chefdk
lrwxrwxrwx 1 root root 21 Apr 30 22:13 berks -> /opt/chefdk/bin/berks
lrwxrwxrwx 1 root root 20 Apr 30 22:13 chef -> /opt/chefdk/bin/chef
lrwxrwxrwx 1 root root 26 Apr 30 22:13 chef-apply -> /opt/chefdk/bin/chef-apply
lrwxrwxrwx 1 root root 27 Apr 30 22:13 chef-client -> /opt/chefdk/bin/chef-client
lrwxrwxrwx 1 root root 26 Apr 30 22:13 chef-shell -> /opt/chefdk/bin/chef-shell
lrwxrwxrwx 1 root root 25 Apr 30 22:13 chef-solo -> /opt/chefdk/bin/chef-solo
lrwxrwxrwx 1 root root 25 Apr 30 22:13 chef-zero -> /opt/chefdk/bin/chef-zero
lrwxrwxrwx 1 root root 23 Apr 30 22:13 fauxhai -> /opt/chefdk/bin/fauxhai
lrwxrwxrwx 1 root root 26 Apr 30 22:13 foodcritic -> /opt/chefdk/bin/foodcritic
lrwxrwxrwx 1 root root 23 Apr 30 22:13 kitchen -> /opt/chefdk/bin/kitchen
lrwxrwxrwx 1 root root 21 Apr 30 22:13 knife -> /opt/chefdk/bin/knife
lrwxrwxrwx 1 root root 20 Apr 30 22:13 ohai -> /opt/chefdk/bin/ohai
lrwxrwxrwx 1 root root 23 Apr 30 22:13 rubocop -> /opt/chefdk/bin/rubocop
lrwxrwxrwx 1 root root 20 Apr 30 22:13 shef -> /opt/chefdk/bin/shef
lrwxrwxrwx 1 root root 22 Apr 30 22:13 strain -> /opt/chefdk/bin/strain
lrwxrwxrwx 1 root root 24 Apr 30 22:13 strainer -> /opt/chefdk/bin/strainer

These should cover the 80% use case of ChefDK: using the various Chef and Chef Community tools so users can follow their favorite workflow, without shaving the yak of managing a Ruby environment.

But, as I noted, the thesis of this post is that one could use the Ruby environment included in ChefDK as their own! So where is it?

ChefDK’s Ruby

Tucked away in every “omnibus” package is a directory of “embedded” software – the things that were required to meet the end goal. In the case of Chef or ChefDK, this is Ruby, openssl, zlib, libpng, and so on. This is a fully contained directory tree, complete with lib, share, and yes indeed, bin.

% ls /opt/chefdk/embedded/bin
(there's a bunch of commands here, trust me)

Of particular note are /opt/chefdk/embedded/bin/ruby and /opt/chefdk/embedded/bin/gem.

To use ChefDK's Ruby as your default, simply edit your $PATH.

export PATH="/opt/chefdk/embedded/bin:${HOME}/.chefdk/gem/ruby/2.1.0/bin:$PATH"

Add that, or its equivalent, to a login shell profile/dotrc file, and rejoice. Here’s what I have now:

$ which ruby
/opt/chefdk/embedded/bin/ruby
$ which gem
/opt/chefdk/embedded/bin/gem
$ ruby --version
ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-linux]
$ gem --version
2.2.1
$ gem env
RubyGems Environment:
  - RUBYGEMS VERSION: 2.2.1
  - RUBY VERSION: 2.1.1 (2014-02-24 patchlevel 76) [x86_64-linux]
  - INSTALLATION DIRECTORY: /opt/chefdk/embedded/lib/ruby/gems/2.1.0
  - RUBY EXECUTABLE: /opt/chefdk/embedded/bin/ruby
  - EXECUTABLE DIRECTORY: /opt/chefdk/embedded/bin
  - SPEC CACHE DIRECTORY: /home/ubuntu/.gem/specs
  - RUBYGEMS PLATFORMS:
    - ruby
    - x86_64-linux
  - GEM PATHS:
     - /opt/chefdk/embedded/lib/ruby/gems/2.1.0
     - /home/ubuntu/.chefdk/gem/ruby/2.1.0
  - GEM CONFIGURATION:
     - :update_sources => true
     - :verbose => true
     - :backtrace => false
     - :bulk_threshold => 1000
     - "install" => "--user"
     - "update" => "--user"
  - REMOTE SOURCES:
     - https://rubygems.org/
  - SHELL PATH:
     - /opt/chefdk/embedded/bin
     - /home/ubuntu/.chefdk/gem/ruby/2.1.0/bin
     - /usr/local/sbin
     - /usr/local/bin
     - /usr/sbin
     - /usr/bin
     - /sbin
     - /bin
     - /usr/games
     - /usr/local/games

Note that this is the current stable release of Ruby, version 2.1.1 patchlevel 76, and the (almost) latest version of RubyGems, version 2.2.1. Also note the GEM PATHS entries: the first is the embedded gems path, which is where gems installed by root with the chef gem command will go. The other is in my home directory; ChefDK is set up so that gems can be installed as a non-root user under the ~/.chefdk/gem directory.

Installing Gems

Let’s see this in action. Install a gem using the gem command.

$ gem install knife-solve
Fetching: knife-solve-1.0.1.gem (100%)
Successfully installed knife-solve-1.0.1
Parsing documentation for knife-solve-1.0.1
Installing ri documentation for knife-solve-1.0.1
Done installing documentation for knife-solve after 0 seconds
1 gem installed

And as I said, this will be installed in the home directory:

$ gem contents knife-solve
/home/ubuntu/.chefdk/gem/ruby/2.1.0/gems/knife-solve-1.0.1/LICENSE
/home/ubuntu/.chefdk/gem/ruby/2.1.0/gems/knife-solve-1.0.1/README.md
/home/ubuntu/.chefdk/gem/ruby/2.1.0/gems/knife-solve-1.0.1/Rakefile
/home/ubuntu/.chefdk/gem/ruby/2.1.0/gems/knife-solve-1.0.1/lib/chef/knife/solve.rb
/home/ubuntu/.chefdk/gem/ruby/2.1.0/gems/knife-solve-1.0.1/lib/knife-solve.rb
/home/ubuntu/.chefdk/gem/ruby/2.1.0/gems/knife-solve-1.0.1/lib/knife-solve/version.rb

Using Bundler

ChefDK also includes bundler. As a “non-Chef, Ruby use case”, I installed octopress for this blog.

% bundle install --path vendor --binstubs
Fetching gem metadata from https://rubygems.org/.......
Fetching additional metadata from https://rubygems.org/..
Installing rake (0.9.6)
Installing RedCloth (4.2.9)
Installing chunky_png (1.2.9)
Installing fast-stemmer (1.0.2)
Installing classifier (1.3.3)
Installing fssm (0.2.10)
Installing sass (3.2.12)
Installing compass (0.12.2)
Installing directory_watcher (1.4.1)
Installing haml (3.1.8)
Installing kramdown (0.14.2)
Installing liquid (2.3.0)
Installing maruku (0.7.0)
Installing posix-spawn (0.3.6)
Installing yajl-ruby (1.1.0)
Installing pygments.rb (0.3.7)
Installing jekyll (0.12.1)
Installing rack (1.5.2)
Installing rack-protection (1.5.0)
Installing rb-fsevent (0.9.3)
Installing rdiscount (2.0.7.3)
Installing rubypants (0.2.0)
Installing sass-globbing (1.0.0)
Installing tilt (1.4.1)
Installing sinatra (1.4.3)
Installing stringex (1.4.0)
Using bundler (1.5.2)
Updating files in vendor/cache
  * classifier-1.3.3.gem
  * fssm-0.2.10.gem
  * sass-3.2.12.gem
  * compass-0.12.2.gem
  * directory_watcher-1.4.1.gem
  * haml-3.1.8.gem
  * kramdown-0.14.2.gem
  * liquid-2.3.0.gem
  * maruku-0.7.0.gem
  * posix-spawn-0.3.6.gem
  * yajl-ruby-1.1.0.gem
  * pygments.rb-0.3.7.gem
  * jekyll-0.12.1.gem
  * rack-1.5.2.gem
  * rack-protection-1.5.0.gem
  * rb-fsevent-0.9.3.gem
  * rdiscount-2.0.7.3.gem
  * rubypants-0.2.0.gem
  * sass-globbing-1.0.0.gem
  * tilt-1.4.1.gem
  * sinatra-1.4.3.gem
  * stringex-1.4.0.gem
Your bundle is complete!
It was installed into ./vendor

Then I can use, for example, the Rake preview task while writing this post.

$ ./bin/rake preview
Starting to watch source with Jekyll and Compass. Starting Rack on port 4000
directory source/stylesheets/
   create source/stylesheets/screen.css
[2014-05-07 21:46:35] INFO  WEBrick 1.3.1
[2014-05-07 21:46:35] INFO  ruby 2.1.1 (2014-02-24) [x86_64-linux]
[2014-05-07 21:46:35] INFO  WEBrick::HTTPServer#start: pid=10815 port=4000

Conclusion

I've used Chef since before it was even released. As the project has evolved, and as the Ruby community around it has established new best practices for installing and maintaining Ruby development environments, I've followed along. I've used all the version managers listed above. I've spent untold hours getting the right set of gems installed, only to have to upgrade everything again and debug my workstation. I've written blog posts and wiki pages, and helped countless users do this on their own systems.

Now, we have an all-in-one environment that provides a great solution. Give ChefDK a whirl on your workstation – I think you’ll like it!

Evolution of Cookbook Development

In this post, I will explore some development patterns that I've seen (and done!) with Chef cookbooks, and then explain how we can evolve to a new level of cookbook development. The examples here come from Chef's new chef-splunk cookbook, which is a refactored version of our old splunk42 cookbook. While there is a public splunk cookbook on the Chef community site, it shares some of the issues I saw with our old one, and those issues are part of the subject matter of this post.

Anyway, on to the evolution!

Sub-optimal patterns

These are the general patterns I’m going to address.

  • Composing URLs from multiple local variables or attributes
  • Large conditional logic branches like case statements in recipes
  • Not using definitions when it is best to do so
  • Knowledge of how node run lists are composed for search, or searching for "role:some-server"
  • Repeated resources across multiple orthogonal recipes
  • Plaintext secrets in attributes or data bag items

Cookbook development is a wide and varied topic, so there are many other patterns to consider, but these are the ones most relevant to the refactored cookbook.

Composing URLs

It may seem like a good idea to compose URL strings as attributes or local variables in a recipe, based on other attributes and local variables. For example, in our splunk42 cookbook we have this:

splunk_root = "http://download.splunk.com/releases/"
splunk_version = "4.2.1"
splunk_build = "98164"
splunk_file = "splunkforwarder-#{splunk_version}-#{splunk_build}-linux-2.6-amd64.deb"
os = node['os'].gsub(/\d*/, '')

These get used in the following remote_file resource:

remote_file "/opt/#{splunk_file}" do
  source "#{splunk_root}/#{splunk_version}/universalforwarder/#{os}/#{splunk_file}"
  action :create_if_missing
end

We reused the filename variable, and composed the URL to the file to download. Then to upgrade, we can simply modify the splunk_version and splunk_build, as Splunk uses a consistent naming theme for their package URLs (thanks, Splunk!). The filename itself is built from a case statement (more on that in the next section). We could further make the version and build attributes, so users can update to newer versions by simply changing the attribute.

So what is bad about this? Two things.

  1. This is in the splunk42::client recipe, and repeated again in the splunk42::server recipe with only minor differences (the package name, splunk vs splunkforwarder).
  2. Ruby has excellent libraries for manipulating URIs and paths as strings, and it is easier to break up a string than compose a new one.

How can this be improved? First, we can set attributes for the full URL. The actual code for that is below, but suffice it to say, it will look like this (note the version is different because the new cookbook installs a newer Splunk version).

default['splunk']['forwarder']['url'] = 'http://download.splunk.com/releases/6.0.1/universalforwarder/linux/splunkforwarder-6.0.1-189883-linux-2.6-amd64.deb'

Second, we have helper libraries distributed with the cookbook that break up the URI so we can return just the package filename.

def splunk_file(uri)
  require 'pathname'
  require 'uri'
  Pathname.new(URI.parse(uri).path).basename.to_s
end
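To see what this returns, you can paste the helper into chef-shell (or plain irb) and call it with the attribute's URL. Nothing here is Chef-specific; it's plain Ruby (the requires are hoisted to the top to make this a standalone script):

```ruby
require 'pathname'
require 'uri'

# the helper from the cookbook's libraries directory
def splunk_file(uri)
  Pathname.new(URI.parse(uri).path).basename.to_s
end

url = 'http://download.splunk.com/releases/6.0.1/universalforwarder/linux/splunkforwarder-6.0.1-189883-linux-2.6-amd64.deb'
puts splunk_file(url)
# => splunkforwarder-6.0.1-189883-linux-2.6-amd64.deb
```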

The previous remote_file resource is rewritten like this:

remote_file "/opt/#{splunk_file(node['splunk']['forwarder']['url'])}" do
  source node['splunk']['forwarder']['url']
  action :create_if_missing
end

As a bonus, the helper methods are available in other places, such as other recipes and cookbooks, rather than being confined to the local scope of a recipe's variables.

Conditional Logic Branches

One of the wonderful things about Chef is that simple Ruby conditionals can be used in recipes to selectively set values for resource attributes, define resources that should be used, and other decisions. One of the horrible things about Chef is that simple Ruby conditionals can be used in recipes and often end up being far more complicated than originally intended, especially when handling multiple platforms and versions.

In the earlier example, we had a splunk_file local variable set in a recipe. I mentioned it was built from a case statement, which looks like this, in full:

splunk_file = case node['platform_family']
  when "rhel"
    if node['kernel']['machine'] == "x86_64"
      splunk_file = "splunkforwarder-#{splunk_version}-#{splunk_build}-linux-2.6-x86_64.rpm"
    else
      splunk_file = "splunkforwarder-#{splunk_version}-#{splunk_build}.i386.rpm"
    end
  when "debian"
    if node['kernel']['machine'] == "x86_64"
      splunk_file = "splunkforwarder-#{splunk_version}-#{splunk_build}-linux-2.6-amd64.deb"
    else
      splunk_file = "splunkforwarder-#{splunk_version}-#{splunk_build}-linux-2.6-intel.deb"
    end
  when "omnios"
    splunk_file = "splunkforwarder-#{splunk_version}-#{splunk_build}-solaris-10-intel.pkg.Z"
  end

Splunk itself supports many platforms, and not all of them are covered by this conditional, so it’s easy to imagine how this can get further out of control and make the recipe even harder to follow. Also consider that this is just the client portion for the splunkforwarder package, this same block is repeated in the server recipe, for the splunk package.

So why is this bad? There are three reasons.

  1. We have a large block of conditionals that sit in front of a user reading a recipe.
  2. This logic isn’t reusable elsewhere, so it has to be duplicated in the other recipe.
  3. This is only the logic for the package filename, but we care about the entire URL. I’ve also covered that composing URLs isn’t delightful.

What is a better approach? Use the full URL as I mentioned before, and set it as an attribute. We will still have the gnarly case statement, but it will be tucked away in the attributes/default.rb file, and hidden from anyone reading the recipe (which is the thing they probably care most about reading).

case node['platform_family']
when 'rhel'
  if node['kernel']['machine'] == 'x86_64'
    default['splunk']['forwarder']['url'] = 'http://download.splunk.com/releases/6.0.1/universalforwarder/linux/splunkforwarder-6.0.1-189883-linux-2.6-x86_64.rpm'
    default['splunk']['server']['url'] = 'http://download.splunk.com/releases/6.0.1/splunk/linux/splunk-6.0.1-189883-linux-2.6-x86_64.rpm'
  else
    default['splunk']['forwarder']['url'] = 'http://download.splunk.com/releases/6.0.1/universalforwarder/linux/splunkforwarder-6.0.1-189883.i386.rpm'
    default['splunk']['server']['url'] = 'http://download.splunk.com/releases/6.0.1/splunk/linux/splunk-6.0.1-189883.i386.rpm'
  end
when 'debian'
  # ...

The complete case block can be viewed in the repository. Also, since this is an attribute, consumers of this cookbook can set the URL to whatever they want, including a local HTTP server.

Another example of gnarly conditional logic looks like this, also from the splunk42::client recipe.

case node['platform_family']
when "rhel"
  rpm_package "/opt/#{splunk_file}" do
    source "/opt/#{splunk_file}"
  end
when "debian"
  dpkg_package "/opt/#{splunk_file}" do
    source "/opt/#{splunk_file}"
  end
when "omnios"
  # tl;dr, this was more lines than you want to read, and
  # will be covered in the next section.
end

Why is this bad? After all, we're selecting the proper package resource to install from a local file on disk. The main issue is that the conditional creates different resources on different platforms, which can't be looked up uniformly in the resource collection. Our recipe doesn't do such a lookup, but a wrapper cookbook might, and the consumer wrapping the cookbook would have to duplicate this logic in their own. Instead, it is better to select the provider for a single package resource.

package "/opt/#{splunk_file(node['splunk']['forwarder']['url'])}" do
  case node['platform_family']
  when 'rhel'
    provider Chef::Provider::Package::Rpm
  when 'debian'
    provider Chef::Provider::Package::Dpkg
  when 'omnios'
    provider Chef::Provider::Package::Solaris
  end
end

Definitions Aren’t Bad

Definitions are simply recipe "macros." They are not actually Chef resources themselves; they just look like them, and they contain their own Chef resources. This has some disadvantages, such as the lack of metaparameters (like action), which has led people to prefer the "Lightweight Resource/Provider" (LWRP) DSL instead. In fact, some feel that definitions are bad, and that one should feel bad for using them. I argue that they have their place. One advantage is their relative simplicity.

In our splunk42 cookbook, the client and server recipes duplicate a lot of logic. As mentioned, a lot of this is the case statements for the Splunk package file. They also repeat the same logic for choosing the provider to install the package. I snipped the content from the when "omnios" block earlier, but it looks like this:

cache_dir = Chef::Config[:file_cache_path]
splunk_pkg = splunk_file.gsub(/\.Z/, '')

execute "uncompress /opt/#{splunk_file}" do
  not_if { ::File.exists?(splunk_cmd) }
end

cookbook_file "#{cache_dir}/splunk-nocheck" do
  source "splunk-nocheck"
end

file "#{cache_dir}/splunkforwarder-response" do
  content "BASEDIR=/opt"
end

pkgopts = ["-a #{cache_dir}/splunk-nocheck",
           "-r #{cache_dir}/splunkforwarder-response"]

package "splunkforwarder" do
  source "/opt/#{splunk_pkg}"
  options pkgopts.join(' ')
  provider Chef::Provider::Package::Solaris
end

(Note: the logic for setting the provider is required since we’re not using the default over-the-network package providers, and installing from a local file on the system.)

This isn't too bad on its own, but it needs to be repeated again in the server recipe if one wanted to run a Splunk server on OmniOS. The actual difference between the client and server package installation is the package name: splunkforwarder vs splunk. The earlier URL attribute example established a forwarder and server attribute. Using a definition, named splunk_installer, allows us to simplify the package installation used by the client and server recipes to look like this:

splunk_installer 'splunkforwarder' do
  url node['splunk']['forwarder']['url']
end
splunk_installer 'splunk' do
  url node['splunk']['server']['url']
end

How is this better than an LWRP? Simply that there was less ceremony in creating it, and less cognitive load for a cookbook developer to worry about. Definitions, by virtue of containing ordinary Chef resources, are already idempotent and convergent with no additional effort. They also automatically support why-run mode, whereas in an LWRP that must be done by the developer. Finally, notifications may be sent between resources in the definition and the rest of the Chef run.

Contrast this with an LWRP: we need resources and providers directories, and the attributes of the resource need to be defined in the resource. Then the action methods need to be written in the provider. If we're using inline resources (which we are), we need to declare them so any notifications work. Finally, we should ensure that why-run works properly.

The actual definition is ~40 lines, and can be viewed in the cookbook repository. I don’t have a comparable LWRP for this, but suffice to say that it would be longer and more complicated than the definition.
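I won’t reproduce the real definition here, but a stripped-down sketch of the shape it takes might look like the following. This is hypothetical and much shorter than the actual ~40 lines, which also handle the response file and provider selection shown earlier:

```ruby
# definitions/splunk_installer.rb -- sketch only, not the real definition
define :splunk_installer, :url => nil do
  pkg_url  = params[:url]
  pkg_file = ::File.basename(pkg_url)
  cached   = "#{Chef::Config[:file_cache_path]}/#{pkg_file}"

  # Download the package from the URL attribute...
  remote_file cached do
    source pkg_url
  end

  # ...then install it from the local file.
  package params[:name] do
    source cached
  end
end
```

The resources inside the define block are expanded into the calling recipe’s resource collection, which is why notifications to and from them work without extra effort.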

Reasonability About Search

Search is one of the killer features of running a Chef Server. Dynamically configuring load balancer configuration, or finding the master database server is simple with a search. Because we often think about the functionality a service provides based on the role it serves, we end up doing searches that look like this:

splunk_servers = search(:node, "role:splunk-server")

Then we do something with splunk_servers, like send it to a template. What if someone doesn’t like the role name? Then we have to do something like this:

splunk_servers = search(:node, "role:#{node['splunk']['server_role']}")

Then consumers of the cookbook can use whatever server role name they want, and just update the attribute for it. But, the internet has said that roles are bad, so we shouldn’t use them (even though they aren’t ;)). So instead, we need something like one of these queries:

splunk_servers = search(:node, "recipes:splunk42\:\:server")
#or
splunk_servers = search(:node, "#{node['splunk']['server_search_query']}")

The problem with the first is similar to the problem with the role-based query (role:splunk-server): we need knowledge of the run list in order to search properly. The problem with the second is that we now have to worry about constructing a query as a string that gets interpolated correctly.

How can we improve this? I think it is more “Chef-like” to use an attribute on the server’s node object itself that expresses the intention that the node is in fact a Splunk server. In our chef-splunk cookbook, we use node['splunk']['is_server']. The query looks like this:

splunk_servers = search(:node, "splunk_is_server:true")

This reads clearly, and the is_server attribute can be set in one of 15 places (for good or bad, but that’s a different post).
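The attribute itself is just boolean data on the node. Assuming the cookbook’s attributes file and a server recipe (a sketch, not the literal chef-splunk source), it might be set like this:

```ruby
# attributes/default.rb -- every node defaults to not being a server
default['splunk']['is_server'] = false

# recipes/server.rb -- flips the flag, so the node is indexed as a server
node.default['splunk']['is_server'] = true
```

Once the node is saved at the end of its Chef run, the splunk_is_server:true query above matches it, regardless of how its run list is composed.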

Repeating Resources, Composable Recipes

In the past, it was deemed okay to repeat resources across recipes when those recipes were not included on the same node. For example, client and server recipes that have similar resource requirements, but may pass in separate data. Another example is in the haproxy cookbook I wrote, where one recipe statically manages the configuration files, and the other uses a Chef search to populate the configuration.

As I have mentioned above, a lot of code was duplicated between the client and server recipes for our splunk42 cookbook: user and group, the case statements, package resources, execute statements (that haven’t been shared here), and the service resource. It is definitely important to ensure that all the resources needed to converge a recipe are defined, particularly when using notifications. That is why sometimes a recipe will have a service resource with no actions like this:

service 'mything'

However, Chef 11 will generate a warning about cloned resources when they are repeated in the same Chef run.

Why is this bad? Well, CHEF-3694 explains in more detail that particular issue, of cloned resources. The other reason is that it makes recipes harder to reuse when they have a larger scope than absolutely necessary. How can we make this better? A solution to this is to write small, composable recipes that contain resources that may be optional for certain use cases. For example, we can put the service resource in a recipe and include that:

service 'splunk' do
  supports :status => true, :restart => true
  provider Chef::Provider::Service::Init
  action :start
end

Then when we need to make sure we have the service resource available (e.g., for notifications):

template "#{splunk_dir}/etc/system/local/outputs.conf" do
  source 'outputs.conf.erb'
  mode 0644
  variables :splunk_servers => splunk_servers
  notifies :restart, 'service[splunk]'
end
include_recipe 'chef-splunk::service'

Note that the service is included after the resource that notifies it. This is a feature of the notification system: the notified resource can appear anywhere in the resource collection. It also brings up another excellent practice, which is to declare service resources after the other resources that affect their configuration. This prevents a race condition where, if a bad config is deployed, the service would attempt to start, fail, and cause the Chef run to exit before Chef could correct the config file.

Making recipes composable in this way means that users can pick and choose the ones they want. Our chef-splunk cookbook has a prescriptive default recipe, but the client and server recipes mainly include the others they need. Someone who doesn’t share our opinion for their use case can select only the pieces that fit. Perhaps they have the splunk user and group created on systems through some other means; they won’t need the chef-splunk::user recipe, and can write their own wrapper to handle that. Overall this is good, though it does mean there are multiple places a user must look to follow a recipe.

Plaintext Secrets

Managing secrets is one of the hardest problems to solve in system administration and configuration management. In Chef, it is very easy to simply set attributes, or use data bag items for authentication credentials. Our old splunk42 cookbook had this:

splunk_password = node[:splunk][:auth].split(':')[1]

Where node[:splunk][:auth] was set in a role with the username:password. This isn’t particularly bad since our Chef server runs on a private network and is secured with HTTPS and RSA keys, but a defense in depth security posture has more controls in place for secrets.

How can this be improved? At Chef, we started using Chef Vault to manage secrets. I wrote a post about chef-vault a few months ago, so I won’t dig too deep into the details here. The current chef-splunk cookbook loads the authentication information like this:

splunk_auth_info = chef_vault_item(:vault, "splunk_#{node.chef_environment}")['auth']
user, pw = splunk_auth_info.split(':')

execute "#{splunk_cmd} edit user #{user} -password '#{pw}' -role admin -auth admin:changeme" do
  not_if { ::File.exists?("#{splunk_dir}/etc/.setup_#{user}_password") }
end

file "#{splunk_dir}/etc/.setup_#{user}_password" do
  content "true\n"
  owner 'root'
  group 'root'
  mode 00600
end

The first line loads the authentication information from the encrypted-with-chef-vault data bag item. Then we make a couple of convenient local variables, and change the password from Splunk’s built-in default. Then, we control convergence of the execute by writing a file that indicates that the password has been set.

The advantage of this over attributes or data bag items is that the content is encrypted. The advantage over regular encrypted data bags is that we don’t need to distribute the secret key out to every system, we can update the list of nodes that have access with a knife command.

Conclusion

Neither Chef (the company) nor I are here to tell anyone how to write cookbooks. One of the benefits of Chef (the product) is its flexibility, allowing users to write blocks of Ruby code in recipes that quickly solve an immediate problem. That’s how we got to where we were with splunk42, and we certainly have other cookbooks that can be refactored similarly. When it comes to sharing cookbooks with the community, though, well-factored code that is easy to follow, understand, and use is preferred.

Many of the ideas here came from community members like Miah Johnson, Noah Kantrowitz, Jamie Winsor, and Mike Fiedler. I owe them thanks for challenging me over the years on a lot of the older patterns that I held onto. Together we can build better automation through cookbooks, and a strong collaborative community. I hope this information is helpful to those goals.

Managing Multiple AWS Account Credentials

UPDATE: All non-default profiles must have their profile name start with “profile.” Below, this is “profile nondefault.” The ruby code is updated to reflect this.

In this post, I will describe my local setup for using the AWS CLI, the AWS Ruby SDK, and of course the Knife EC2 plugin.

The general practice I’ve used is to set the appropriate shell environment variables that are used by default by these tools (and the “legacy” ec2-api-tools, the java-based CLI). Over time and between tools, there have been several environment variables set:

AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_DEFAULT_REGION
AWS_SSH_KEY
AMAZON_ACCESS_KEY_ID
AMAZON_SECRET_ACCESS_KEY
AWS_ACCESS_KEY
AWS_SECRET_KEY

There is now a config file (ini-flavored) that can be used to set credentials, ~/.aws/config. Each ini section in this file is a different account’s credentials. For example:

[default]
aws_access_key_id=MY_DEFAULT_KEY
aws_secret_access_key=MY_DEFAULT_SECRET
region=us-east-1
[profile nondefault]
aws_access_key_id=NOT_MY_DEFAULT_KEY
aws_secret_access_key=NOT_MY_DEFAULT_SECRET
region=us-east-1

I have two accounts listed here. Obviously, the actual keys are not listed :). I source a shell script that sets the environment variables with these values. Before, I maintained a separate script for each account. Now, I install the inifile RubyGem and use a one-liner for each of the keys.

export AWS_ACCESS_KEY_ID=`ruby -rinifile -e "puts IniFile.load(File.join(File.expand_path('~'), '.aws', 'config'))['default']['aws_access_key_id']"`
export AWS_SECRET_ACCESS_KEY=`ruby -rinifile -e "puts IniFile.load(File.join(File.expand_path('~'), '.aws', 'config'))['default']['aws_secret_access_key']"`
export AWS_DEFAULT_REGION="us-east-1"
export AWS_SSH_KEY='jtimberman'

This will load the specified file, ~/.aws/config with the IniFile.load method, retrieving the default section’s aws_access_key_id value. Then repeat the same for the aws_secret_access_key.
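Under the hood, IniFile.load returns a hash-like object keyed by section name. A minimal pure-Ruby sketch of that parsing (a hypothetical parse_aws_config helper, not the inifile gem itself) shows what the one-liners rely on:

```ruby
# Sketch of ini parsing for this use case: sections in brackets,
# key=value pairs below them. parse_aws_config is a made-up name.
def parse_aws_config(text)
  sections = {}
  current = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#', ';')
    if line =~ /\A\[(.+)\]\z/
      current = Regexp.last_match(1)   # e.g. "default" or "profile nondefault"
      sections[current] = {}
    elsif current && line.include?('=')
      key, value = line.split('=', 2)
      sections[current][key.strip] = value.strip
    end
  end
  sections
end

config = parse_aws_config(<<~INI)
  [default]
  aws_access_key_id=MY_DEFAULT_KEY
  aws_secret_access_key=MY_DEFAULT_SECRET
  region=us-east-1
  [profile nondefault]
  aws_access_key_id=NOT_MY_DEFAULT_KEY
  region=us-east-1
INI

puts config['default']['aws_access_key_id']            # MY_DEFAULT_KEY
puts config['profile nondefault']['aws_access_key_id'] # NOT_MY_DEFAULT_KEY
```

The real gem does more (nested access, writing files back out), but the section-name-to-hash lookup is the part the export lines depend on.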

To use the nondefault profile:

export AWS_ACCESS_KEY_ID=`ruby -rinifile -e "puts IniFile.load(File.join(File.expand_path('~'), '.aws', 'config'))['profile nondefault']['aws_access_key_id']"`
export AWS_SECRET_ACCESS_KEY=`ruby -rinifile -e "puts IniFile.load(File.join(File.expand_path('~'), '.aws', 'config'))['profile nondefault']['aws_secret_access_key']"`

Note that this uses ['profile nondefault'].

Since different tools historically have used slightly different environment variables, I export those too:

export AMAZON_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID
export AMAZON_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY
export AWS_ACCESS_KEY=$AWS_ACCESS_KEY_ID
export AWS_SECRET_KEY=$AWS_SECRET_ACCESS_KEY

I create a separate config script for each account.

The AWS CLI tool will automatically use ~/.aws/config, and can load different profiles with the --profile option. The aws-sdk Ruby library, however, will use the environment variables, so authentication in a Ruby script is set up automatically.

require 'aws-sdk'
iam = AWS::IAM.new

Without this, it would be:

require 'aws-sdk'
iam = AWS::IAM.new(:access_key_id => 'YOUR_ACCESS_KEY_ID',
                   :secret_access_key => 'YOUR_SECRET_ACCESS_KEY')

Which is a little onerous.

To use this with knife-ec2, I have the following in my .chef/knife.rb:

knife[:aws_access_key_id]      = ENV['AWS_ACCESS_KEY_ID']
knife[:aws_secret_access_key]  = ENV['AWS_SECRET_ACCESS_KEY']

Naturally, since knife.rb is Ruby, I could use Inifile.load there, but I only started using that library recently, and I have my knife configuration setup already.

Preview Chef Client Local Mode

Opscode Developer John Keiser mentioned that a feature for Chef Zero he’s been working on, “local mode,” is now in Chef’s master branch. This means it should be in the next release (11.8). I took the liberty to check this unreleased feature out.

Let’s just say, it’s super awesome and John has done some amazing work here.

PREVIEW

This is a preview of an unreleased feature in Chef. All standard disclaimers apply :).

Install

This is in the master branch of Chef, not released as a gem yet. You’ll need to get the source and build a gem locally. This totally assumes you’ve installed a sane ruby and bundler on your system.

git clone git://github.com/opscode/chef.git
cd chef
bundle install
bundle exec rake gem
gem install pkg/chef-11.8.0.alpha.0.gem

Note Alpha!

Setup

Next, point it at a local repository. I’ll use a simple example.

git clone git://github.com/opscode/chef-repo.git
cd chef-repo
knife cookbook create zero -o ./cookbooks
vi cookbooks/zero/recipes/default.rb

I created a fairly trivial example recipe to show that this will support search and data bag items:

a = search(:node, "*:*")
b = data_bag_item("zero", "fluff")

file "/tmp/zerofiles" do
  content a[0].to_s
end

file "/tmp/fluff" do
  content b.to_s
end

This simply searches for all nodes, and uses the content of the first node (the one we’re running on presumably) for a file in /tmp. It also loads a data bag item (which I created) and uses it for the content of another file in /tmp.

mkdir -p data_bags/zero
vi data_bags/zero/fluff.json

The data bag item:

{
  "id": "fluff",
  "clouds": "Are fluffy"
}

Converge!

Now, converge the node:

chef-client -z -o zero

The -z, or --local-mode, argument is the magic that sets up Chef Zero and loads all the contents of the repository. The -o zero tells Chef to use a one-time run list of the “zero” recipe.

[2013-10-10T23:53:32-06:00] WARN: No config file found or specified on command line, not loading.
Starting Chef Client, version 11.8.0.alpha.0
[2013-10-10T23:53:36-06:00] WARN: Run List override has been provided.
[2013-10-10T23:53:36-06:00] WARN: Original Run List: [recipe[zero]]
[2013-10-10T23:53:36-06:00] WARN: Overridden Run List: [recipe[zero]]
resolving cookbooks for run list: ["zero"]
Synchronizing Cookbooks:
  - zero
Compiling Cookbooks...
Converging 2 resources
Recipe: zero::default
  * file[/tmp/zerofiles] action create
    - create new file /tmp/zerofiles
    - update content in file /tmp/zerofiles from none to 0a038a
        --- /tmp/zerofiles      2013-10-10 23:53:36.368059768 -0600
        +++ /tmp/.zerofiles20131010-6903-10cvytu        2013-10-10 23:53:36.368059768 -0600
        @@ -1 +1,2 @@
        +node[jenkins.int.housepub.org]
  * file[/tmp/fluff] action create
    - create new file /tmp/fluff
    - update content in file /tmp/fluff from none to d46bab
        --- /tmp/fluff  2013-10-10 23:53:36.372059683 -0600
        +++ /tmp/.fluff20131010-6903-1l3i1h     2013-10-10 23:53:36.372059683 -0600
        @@ -1 +1,2 @@
        +data_bag_item[fluff]
Chef Client finished, 2 resources updated

The diff output from each of the file resources shows that the content does in fact come from the search (a node object was returned) and a data bag item (a data bag item object was returned).

What’s Next?

Since this is a feature of Chef, it will be documented and released, so look for that in the next version of Chef.

I can see this used for testing purposes, especially for recipes that make use of combinations of data bags and search, such as Opscode’s nagios cookbook.

Questions

  • Does it work with Berkshelf?

I don’t know. Probably not (yet).

  • Does it work with Test Kitchen?

I don’t know. Probably not (yet). Provisioners in test-kitchen would need to be (re)written.

  • Should I use this in production?

This is an unreleased feature in the master branch. What do you think? :)

  • When will this be released?

I don’t know the schedule for 11.8.0. Soon?

  • Where do I find out more, or get involved?

Join #chef-hacking in irc.freenode.net, the chef-dev mailing list, or attend the Chef Community Summit (November 12-13, 2013 in Seattle).

Switching MyOpenID to Google OpenID

You may be aware that MyOpenID is shutting down in February 2014.

The next best thing to use, IMO, is Google’s OpenID, since they have 2-factor authentication. Google doesn’t really expose the OpenID URL in a way that makes it as easy to use as “username.myopenid.com.” Fortunately, it’s relatively simple to add to a custom domain hosted by, for example, GitHub Pages. My coworker, Stephen Delano, pointed me to this pro-tip.

The requirement is to put a <link> tag in the HTML header of the site. It should look like this:

<link rel="openid2.provider" href="https://www.google.com/accounts/o8/ud?source=profiles" />
<link rel="openid2.local_id" href="http://www.google.com/profiles/A_UNIQUE_GOOGLE_PROFILE_ID" />

Obviously you need a Google Profile, but anyone interested in doing this probably has a Google+ account for Google Hangouts anyway :).

If you’re like me and have your custom domain hosted as an Octopress blog, this goes in source/_includes/custom/head.html. Then deploy the site and in a few moments you’ll be able to start using your site as an OpenID.

Managing Secrets With Chef Vault

Two years ago, I wrote a post about using Chef encrypted data bags for SASL authentication with Postfix. At the time, my ISP didn’t allow non-authenticated SMTP, so I had to find a solution so I could get cronspam and other vital email from my servers at home. I’ve since switched ISPs to one that doesn’t care so much about this, so I’m not using any of that code anymore.

However, that doesn’t mean I don’t have secrets to manage! I actually don’t for my personal systems due to what I’m managing with Chef now, but we certainly do for Opscode’s hosted Enterprise Chef environment. The usual suspects for any web application are required: database passwords, SSL certificates, service API tokens, etc.

We’re evaluating chef-vault as a possible solution. This blog post will serve as notes for me so I can remember what I did when my terminal history is gone, and hopefully information for you to be able to use in your own environment.

Chef Vault

Chef Vault is an open source project published by Nordstrom. It is distributed as a RubyGem. You’ll need it installed on your local workstation so you can encrypt sensitive secrets, and on any systems that need to decrypt said secrets. Since the workstation is where we’re going to start, install the gem. I’ll talk about using this in a recipe later.

% gem install chef-vault

Use Cases

Now, for the use cases, I’m going to take two fairly simple examples, and explain how chef-vault works along the way.

  1. A username/password combination. The vaultuser will be created on the system with Chef’s built-in user resource.
  2. A file with sensitive content. In this case, I’m going to use a junk RSA private key for vaultuser.

Secrets are generally one of these things. Either a value passed into a command-line program (like useradd) or a file that should live on disk (like an SSL certificate or RSA key).

Command-line Structure

Chef Vault includes knife plugins to allow you to manage the secrets from your workstation, uploading them to the Chef Server just like normal data bags. The secrets themselves live in Data Bags on the Chef Server. The “bag” is called the “vault” for chef-vault.

After installation, the encrypt and decrypt sub-commands will be available for knife.

knife encrypt create [VAULT] [ITEM] [VALUES] --mode MODE --search SEARCH --admins ADMINS --json FILE
knife encrypt delete [VAULT] [ITEM] --mode MODE
knife encrypt remove [VAULT] [ITEM] [VALUES] --mode MODE --search SEARCH --admins ADMINS
knife rotate secret [VAULT] [ITEM] --mode MODE
knife encrypt update [VAULT] [ITEM] [VALUES] --mode MODE --search SEARCH --admins ADMINS --json FILE
knife decrypt [VAULT] [ITEM] [VALUES] --mode MODE

The README and Examples document these quite well.

Mode: Solo vs Client

I’m using Chef with a Chef Server (Enterprise Chef), so I’ll specify --mode client for the knife commands.

It is important to note the MODE in the chef-vault knife plugin commands affects where the encrypted data bags will be saved. Chef supports data bags with both Solo and Client/Server use. When using chef-solo, you’ll need to configure data_bag_path in your knife.rb. That is, even if you’re using Solo, since these are knife plugins, the configuration is for knife, not chef-solo. I’m using a Chef Server though, so I’m going to use --mode client.

Create a User with a Password

The user I’m going to create is the arbitrarily named vaultuser, with the super secret password, chef-vault. I’m going to use this on a Linux system with SHA512 hashing, so first I generate a password using mkpasswd:

% mkpasswd -m sha-512
Password: chef-vault
$6$VqEIDjsp$7NtPMhA9cnxvSMTE9l7DMmydJJEymi9b4t1Vhk475vrWlfxMgVb3bDLhpk/RZt0J3X7l5H8WnqFgvq3dIa9Kt/

Note: This is the mkpasswd(1) command from the Ubuntu 10.04 mkpasswd package.

Create the Item

The command I’m going to use is knife encrypt create since this is a new secret. I’ll show two examples. First, I’ll pass in the raw JSON data as “values”. You would do this if you’re not going to store the unencrypted secret on disk or in a repository. Second, I’ll pass a JSON file. You would do this if you want to store the unencrypted secret on disk or in a repository.

% knife encrypt create secrets vaultuser \
  '{"vaultuser":"$6$VqEIDjsp$7NtPMhA9cnxvSMTE9l7DMmydJJEymi9b4t1Vhk475vrWlfxMgVb3bDLhpk/RZt0J3X7l5H8WnqFgvq3dIa9Kt/"}' \
  --search 'role:base' \
  --admins jtimberman --mode client

The [VALUES] in this command is raw JSON that will be created in the data bag item by chef-vault. The --search option tells chef-vault to use the public keys of the nodes matching the SOLR query for encrypting the value. Then during the Chef run, chef-vault uses those node’s private keys to decrypt the value. The --admins option tells chef-vault the list of users on the Chef Server who are also allowed to decrypt the secret. This is specified as a comma separated string for multiple admins. Finally, as I mentioned, I’m using a Chef Server so I need to specify --mode client, since “solo” is the default.

Here’s the equivalent, using a JSON file named secrets_vaultuser.json. It has the content:

{"vaultuser":"$6$VqEIDjsp$7NtPMhA9cnxvSMTE9l7DMmydJJEymi9b4t1Vhk475vrWlfxMgVb3bDLhpk/RZt0J3X7l5H8WnqFgvq3dIa9Kt/"}

The command is:

% knife encrypt create secrets vaultuser \
  --json secrets_vaultuser.json
  --search 'role:base' \
  --admins jtimberman --mode client

Now, let’s see what has been created on the Chef Server. I’ll use the core Chef knife plugin, data bag item show for this.

% knife data bag show secrets
vaultuser
vaultuser_keys

I now have a “secrets” data bag, with two items. The first, vaultuser is the one that contains the actual secret. Let’s see:

% knife data bag show secrets vaultuser
id:        vaultuser
vaultuser:
  cipher:         aes-256-cbc
  encrypted_data: j+/fFM7ist6I7K360GNfzSgu6ix63HGyXN2ZAd99R6H4TAJ4pQKuFNpJXYnC
  SXA5n68xn9frxHAJNcLuDXCkEv+F/MnW9vMlTaiuwW/jO++vS5mIxWU170mR
  EgeB7gvPH7lfUdJFURNGQzdiTSSFua9E06kAu9dcrT83PpoQQzk=
  iv:             cu2Ugw+RpTDVRu1QaaAfug==
  version:        1

As you can see, I have encrypted data. I also told chef-vault that my user can decrypt this. I need to use the knife plugin to do so:

% knife decrypt secrets vaultuser 'vaultuser' --mode client
secrets/vaultuser
  vaultuser: $6$VqEIDjsp$7NtPMhA9cnxvSMTE9l7DMmydJJEymi9b4t1Vhk475vrWlfxMgVb3bDLhpk/RZt0J3X7l5H8WnqFgvq3dIa9Kt/

The 'vaultuser' in quotes is the key from the hash of JSON data that I specified earlier. As you can see, the password is that which was generated from the mkpasswd command earlier.

But what nodes have access to decrypt this password? That’s what chef-vault stored in the vaultuser_keys item. Let’s look:

% knife data bag show secrets vaultuser_keys
admins:              jtimberman
clients:
  os-945926465950316
  os-2790002246935003
id:                  vaultuser_keys
jtimberman:          0Q2bhw/kJl2aIVEwqY6wYhrrfdz9fdsf8tCiIrBih2ZORvV7EEIpzzKQggRX
4P4vnVQjMjfkRwIXndTzctCJONQYF50OSZi5ByXWqbich9iCWvVIbnhcLWSp
z5mQoSTNXyZz/JQZGnubkckh4wGLBFDrLJ6WKl6UNXH1dRwqDNo5sEK7/3Wn
b4ztVSRxzB01wVli0wLvFSZzGsKYJYINBcidnbIgLh/xGYGtBJVlgG2z/7TV
uN0b/qvGj8VlhbS6zPlwh39O3mexDdkLwry/+gbO1nj8qKNkKDKaix5zypwE
XdmdfMKNYGaM6kzG8cwuKZXLAgGAgblVUB1HP8+8kQ==

os-2790002246935003: kGQLsxsFmBe9uPuWxZpKiNBnqJq55hQZJLgaKdjG2Vvivv98RrFGz1y8Xbwe
uzeSgPgAURCZmxpNxpHrwvvKcvL77sBOL6TTKiNzs8n5B3ZOawy17dsuG24v
41R0cRMnYLgbLcjln9dpVe4Esr4goPxko+1XqBPik1SBapthQq/pLUJ1BIKh
Fxu1QVGj1w4HPUftLaUzeS33jKbtfvgZyZsYZBdVCVEVidOxC90WRf4wtkd6
Ueyj+0gd1QKv84Q387O1R5LtRMS6u+17PJinrcRIkVNZ6P1z6oT2Dasfvrex
rK3s5vD7v6jpkUW12Wj74Lz3Z6x3sKuIDzCtvEUnWw==

os-945926465950316:  XzTJrJ3TZZZ1u9L9p6DZledf3bo2ToH2yrLGZQKPV6/ANzElHXGcYrEdtP0q
14Nz1NzsqEftzviAebUUnc6ke91ltD8s6hNQQrPJRqkUoDlM7lNEwiUiz/dD
+sFI6CSzQptO3zPrUbAlUI1Zog5h7k/CCtiYtmFRD6wbAWnxmCqvLhO1jwqL
VNJ1vfjlFsG77BDm2HFw7jgleuxRGYEgBfCCuBuW70FAdUTvNHIAwKQVkfU/
Am75UYm7N4N0E+W76ZwojLoYtXXTV/iOGG1cw3C75SVAmCsBOuxUK/otub67
zsNDsKToKa+laxzXGylrmkTricYXIqVpIQO8OL5nnw==

As we can see, I have two nodes that are API clients with access to decrypt the data bag items. These values are all generated by chef-vault, and I’ll talk about how to update the list and rotate secrets later in this post.

Manage a User Password

Let’s manage a user resource with a password set to the value from our encrypted data bag using Chef Vault.

First, I created a cookbook named vault, and added it to the base role. It contains the following recipe:

chef_gem "chef-vault"
require "chef-vault"

vault = ChefVault::Item.load("secrets", "vaultuser")

user "vaultuser" do
  password vault['vaultuser']
  home "/home/vaultuser"
  supports :manage_home => true
  shell "/bin/bash"
  comment "Chef Vault User"
end

Let me break this down.

chef_gem "chef-vault"
require "chef-vault"

chef-vault is distributed as a RubyGem, and I want to use it in my recipe(s), so here I use the chef_gem resource. Then, I require it like any other Ruby library.

vault = ChefVault::Item.load("secrets", "vaultuser")

This is where the decryption happens. If I do this under a chef-shell, I can see:

chef:recipe > vault = ChefVault::Item.load("secrets", "vaultuser")
 => data_bag_item["secrets", "vaultuser", {"id"=>"vaultuser", "vaultuser"=>"$6$VqEIDjsp$7NtPMhA9cnxvSMTE9l7DMmydJJEymi9b4t1Vhk475vrWlfxMgVb3bDLhpk/RZt0J3X7l5H8WnqFgvq3dIa9Kt/"}]

ChefVault::Item.load takes two arguments, the “vault” or data bag, in this case secrets, and the “item”, in this case vaultuser. It returns a data bag item. Then in the user resource, I use the password:

user "vaultuser" do
  password vault['vaultuser']
  home "/home/vaultuser"
  supports :manage_home => true
  shell "/bin/bash"
  comment "Chef Vault User"
end

The important resource attribute here is password, where I’m using the local variable vault and the vaultuser key from the item as decrypted by ChefVault::Item.load. When Chef runs, it will look like this:

Recipe: vault::default
  * chef_gem[chef-vault] action install
    - install version 2.0.1 of package chef-vault
  * chef_gem[chef-vault] action install (up to date)
  * user[vaultuser] action create
    - create user user[vaultuser]

Now, I can su to vaultuser using the password I created:

ubuntu@os-2790002246935003:~$ su - vaultuser
Password: chef-vault
vaultuser@os-2790002246935003:~$ id
uid=1001(vaultuser) gid=1001(vaultuser) groups=1001(vaultuser)
vaultuser@os-2790002246935003:~$ pwd
/home/vaultuser

Yay! To show that the user was created with the right password, here’s the DEBUG log output:

INFO: Processing user[vaultuser] action create ((irb#1) line 12)
DEBUG: user[vaultuser] user does not exist
DEBUG: user[vaultuser] setting comment to Chef Vault User
DEBUG: user[vaultuser] setting password to $6$VqEIDjsp$7NtPMhA9cnxvSMTE9l7DMmydJJEymi9b4t1Vhk475vrWlfxMgVb3bDLhpk/RZt0J3X7l5H8WnqFgvq3dIa9Kt/
DEBUG: user[vaultuser] setting shell to /bin/bash
INFO: user[vaultuser] created

Next, I’ll create a secret that is a file rendered on the system.

Create a Private SSH Key

Suppose this vaultuser is to be used for deploying code by cloning a repository. It will need a private SSH key to authenticate, so I’ll create one, with an empty passphrase in this case.

% ssh-keygen -b 4096 -t rsa -f vaultuser-ssh
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in vaultuser-ssh.
Your public key has been saved in vaultuser-ssh.pub.

Get the SHA256 checksum of the private key. I use SHA256 because that’s what Chef uses for file content. We’ll use this to verify content later.

% sha256sum vaultuser-ssh
a83221c243c9d39d20761e87db6c781ed0729b8ff4c3b330214ebca26e2ea89d  vaultuser-ssh
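Chef computes the same SHA-256 digest for file content, so a checksum from sha256sum can be compared with one computed in Ruby via the standard library. A small sketch (the empty-string digest is a well-known constant, handy for sanity checks):

```ruby
require 'digest'

# Digest of an empty string; matches `printf '' | sha256sum`.
puts Digest::SHA256.hexdigest("")
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

# For a file on disk, the equivalent of sha256sum is:
#   Digest::SHA256.file("vaultuser-ssh").hexdigest
```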

Assume that I also created the SSH key on GitHub for this user.

In order to have the file’s contents be a JSON value for the data bag item, I’ll let JSON.generate escape the newlines (\n) while generating the JSON:

ruby -rjson -e 'puts JSON.generate({"vaultuser-ssh-private" => File.read("vaultuser-ssh")})' \
  > secrets_vaultuser-ssh-private.json
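This works because JSON.generate escapes the embedded newlines rather than discarding them, so the multi-line key becomes a single-line JSON string. A quick sketch with made-up key material shows the round trip:

```ruby
require 'json'

# Made-up key material standing in for the real private key.
key_material = "-----BEGIN RSA PRIVATE KEY-----\nMIIJJw...\n-----END RSA PRIVATE KEY-----\n"

# JSON.generate escapes the newlines, producing a single-line string.
json = JSON.generate({ "vaultuser-ssh-private" => key_material })
puts json.include?("\\n")  # true: newlines are stored as \n escapes

# Parsing restores the original multi-line content byte for byte.
puts JSON.parse(json)["vaultuser-ssh-private"] == key_material  # true
```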

Now, create the secret on the Chef Server:

knife encrypt create secrets vaultuser-ssh-private \
  --search 'role:base' \
  --json secrets_vaultuser-ssh-private.json \
  --admins jtimberman \
  --mode client

Let’s verify the server has what we need:

% knife data bag show secrets vaultuser-ssh-private
id:                    vaultuser-ssh-private
vaultuser-ssh-private:
  cipher:         aes-256-cbc
  encrypted_data: mRRToM2N/0F+OyJxkYlHo/cUtHSIuy69ROAKuGoHIhX9Fr5vFTCM4RyWQSTN
  trimmed for brevity even though scrollbars
% knife decrypt secrets vaultuser-ssh-private 'vaultuser-ssh-private' --mode client
secrets/vaultuser-ssh-private
  vaultuser-ssh-private: -----BEGIN RSA PRIVATE KEY-----
trimmed for brevity even though scrollbars

Manage the Key File

Now, I’ll manage the private key file with the vault cookbook.

vault_ssh = ChefVault::Item.load("secrets", "vaultuser-ssh-private")

directory "/home/vaultuser/.ssh" do
  owner "vaultuser"
  group "vaultuser"
  mode 0700
end

file "/home/vaultuser/.ssh/id_rsa" do
  content vault_ssh["vaultuser-ssh-private"]
  owner "vaultuser"
  group "vaultuser"
  mode 0600
end

Again, let’s break this up a bit. First, load the item from the encrypted data bag like we did before.

vault_ssh = ChefVault::Item.load("secrets", "vaultuser-ssh-private")

Next, make sure that the vaultuser has an .ssh directory with the correct permissions.

directory "/home/vaultuser/.ssh" do
  owner "vaultuser"
  group "vaultuser"
  mode 0700
end

Finally, manage the content of the private key file with a file resource and the content resource attribute. The value of vault_ssh["vaultuser-ssh-private"] will be a string, with \n’s embedded, but when it’s rendered on disk, it will display properly.

file "/home/vaultuser/.ssh/id_rsa" do
  content vault_ssh["vaultuser-ssh-private"]
  owner "vaultuser"
  group "vaultuser"
  mode 0600
end

And now run chef on a target node:

Recipe: vault::default
  * chef_gem[chef-vault] action install (up to date)
  * user[vaultuser] action create (up to date)
  * directory[/home/vaultuser/.ssh] action create
    - create new directory /home/vaultuser/.ssh
    - change mode from '' to '0700'
    - change owner from '' to 'vaultuser'
    - change group from '' to 'vaultuser'

  * file[/home/vaultuser/.ssh/id_rsa] action create
    - create new file /home/vaultuser/.ssh/id_rsa with content checksum a83221
        --- /tmp/chef-tempfile20130909-1918-1v5hezo   2013-09-09 22:41:21.887239999 +0000
        +++ /tmp/chef-diff20130909-1918-xwbmsn    2013-09-09 22:41:21.883240065 +0000
        @@ -0,0 +1,51 @@
        +-----BEGIN RSA PRIVATE KEY-----
        +MIIJJwIBAAKCAgEAtZmwFTlVOBbr2ZfG+cDtUGx04xCcgaa0p0ISmeyMEoGYH/CP
        (output trimmed because it’s long, even though there are scrollbars again)

Note the content checksum, a83221. It matches the checksum of the source file from earlier (scroll up!), and of the file rendered on the node:

ubuntu@os-2790002246935003:~$ sudo sha256sum /home/vaultuser/.ssh/id_rsa
a83221c243c9d39d20761e87db6c781ed0729b8ff4c3b330214ebca26e2ea89d  /home/vaultuser/.ssh/id_rsa
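The abbreviated a83221 in the chef-client output and the full digest from sha256sum are the same SHA-256 checksum; Chef just truncates it for logging (to the best of my knowledge, to six hex characters). A quick Ruby illustration with a stand-in string rather than the real key:

```ruby
require "digest"

content = "example file content\n"       # stand-in for the rendered private key
full = Digest::SHA256.hexdigest(content) # what sha256sum prints (64 hex chars)
abbrev = full[0, 6]                      # the short form chef-client logs

# abbrev is simply a prefix of full, so the two outputs can be compared directly.
```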

Yay! Now we can SSH to GitHub (note: this is a fake GitHub, for example purposes).

ubuntu@os-2790002246935003:~$ su - vaultuser
Password: chef-vault
vaultuser@os-2790002246935003:~$ ssh -i .ssh/id_rsa github@172.31.7.15
$ hostname
os-945926465950316
$ id
uid=1002(github) gid=1002(github) groups=1002(github)

Updating a Secret

What happens if we need to update a secret? For example, if an administrator leaves the organization, we will want to change the vaultuser password (and SSH private key).

% mkpasswd -m sha-512
Password: gone-user
$6$zM5STNtXdmsrOSm$svJr0tauijqqxTjnMIGJGJPv5V3ovMFCQo.ZDBleiL.yOxcngRqh9yAjpMAsMBA7RlKPv5DKFd1aPZm/wUoKs.
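The $6$...$... output is the glibc SHA-512 crypt format: $6$, then the salt, then the hash. On Linux, Ruby can produce the same format through String#crypt, a thin wrapper over the system crypt(3) (note: deprecated in recent Rubies, and the $6$ scheme is glibc-specific; the salt below is just an example value):

```ruby
# Example salt; mkpasswd generates a random one on each run.
salt = "zM5STNtXdmsrOSm"
hashed = "gone-user".crypt("$6$#{salt}$")

# crypt(3) echoes the scheme and salt back as the prefix of its result,
# which is how /etc/shadow entries can later be verified.
```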

The encrypt create command will return an error if the target already exists:

% knife encrypt create secrets vaultuser --search 'role:base' --json secrets_vaultuser.json --admins jtimberman --mode client
ERROR: ChefVault::Exceptions::ItemAlreadyExists: secrets/vaultuser already exists, use 'knife encrypt remove' and 'knife encrypt update' to make changes.

So, I need to use encrypt update. Make sure that the contents of the JSON file are valid JSON.

% knife encrypt update secrets vaultuser --search 'role:base' --json secrets_vaultuser.json --admins jtimberman --mode client

encrypt update only updates the things that change, so I can also shorten this:

% knife encrypt update secrets vaultuser --json secrets_vaultuser.json --mode client

This works because the search and the admins didn’t change.

Verify it:

% knife decrypt secrets vaultuser 'vaultuser' --mode client
secrets/vaultuser
  vaultuser: $6$zM5STNtXdmsrOSm$svJr0tauijqqxTjnMIGJGJPv5V3ovMFCQo.ZDBleiL.yOxcngRqh9yAjpMAsMBA7RlKPv5DKFd1aPZm/wUoKs.

Now, just run Chef on any nodes affected.

Recipe: vault::default
  * chef_gem[chef-vault] action install (up to date)
  * user[vaultuser] action create
    - alter user user[vaultuser]

  * directory[/home/vaultuser/.ssh] action create (up to date)
  * file[/home/vaultuser/.ssh/id_rsa] action create (up to date)
Chef Client finished, 1 resources updated

And su to the vault user with the gone-user password:

ubuntu@os-2790002246935003:~$ su - vaultuser
Password: gone-user
vaultuser@os-2790002246935003:~$

Managing Access to Items

There are four common scenarios which require managing access to an item in the vault.

  1. A system needs to be taken offline, or otherwise prevented from accessing the item(s).
  2. A new system comes online that needs access.
  3. An admin user has left the organization.
  4. A new admin user has joined the organization.

Suppose we have a system that we need to take offline for some reason, so we want to disable its access to a secret. Or perhaps an admin user has left the organization. We can handle these cases in a few ways.

Update the Vault Item

The most straightforward way to manage access to an item is to use the update or remove sub-commands.

Remove a System

Suppose I want to remove the node DEADNODE. I can qualify the search to exclude it:

% knife encrypt update secrets vaultuser \
  --search 'role:base NOT name:DEADNODE' \
  --json secrets_vaultuser.json \
  --admins jtimberman --mode client

Note that, as before, the admins didn’t change, so the --admins argument could have been omitted here.

Add a New System

If the node has run Chef and is indexed on the Chef Server already, simply rerun the update command with the search:

% knife encrypt update secrets vaultuser \
  --search 'role:base' \
  --json secrets_vaultuser.json \
  --admins jtimberman --mode client

There’s a bit of a “Chicken and Egg” problem here, in that a new node might not be indexed for search if it tried to load the secret during a bootstrap beforehand. For example, if I create an OpenStack instance with the base role in its run list, the node doesn’t exist for the search yet. A solution here is to create the node with an empty run list, allowing it to register with the Chef Server, and then use knife bootstrap to rerun Chef with the proper run list. This is annoying, but no one claimed that chef-vault would solve all problems with shared secret management :–).

Remove an Admin

The admins argument takes a list. Earlier, I only had my userid as an admin. Suppose I created the item with “bofh” as an admin too:

% knife encrypt create secrets vaultuser \
  --search 'role:base' \
  --json secrets_vaultuser.json \
  --admins "jtimberman,bofh" --mode client

To remove the bofh user, use the encrypt remove subcommand. In this case, the --admins argument is the list of admins to remove, rather than add.

% knife encrypt remove secrets vaultuser --admins bofh --mode client

Add a New Admin

I want to add “mandi” as an administrator because she’s awesome and will help manage our secrets. As above, I just pass a comma-separated string, "jtimberman,mandi" to the --admins argument.

% knife encrypt update secrets vaultuser \
  --search 'role:base' \
  --json secrets_vaultuser.json \
  --admins "jtimberman,mandi" --mode client

Regenerate the Client

The heavy-handed way to remove access is to regenerate the API client on the Chef Server. For example, say I want to remove one of my nodes, os-945926465950316:

% knife client reregister os-945926465950316
-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEAybzwv53tDLIzW+GHRJwLthZmiGTfZVyqQX6m6RGuZjemEIdy
trim trim

If you’re familiar with Chef Server’s authentication cycle, you’ll know that until that private key is copied to the node, it will completely fail to authenticate. However, once the /etc/chef/client.pem file is updated with the content from the knife command, we’ll see that the node fails to read the Chef Vault item:

================================================================================
Recipe Compile Error in /var/chef/cache/cookbooks/vault/recipes/default.rb
================================================================================


OpenSSL::PKey::RSAError
-----------------------
padding check failed


Cookbook Trace:
---------------
  /var/chef/cache/cookbooks/vault/recipes/default.rb:4:in `from_file'


Relevant File Content:
----------------------
/var/chef/cache/cookbooks/vault/recipes/default.rb:

  1:  chef_gem "chef-vault"
  2:  require "chef-vault"
  3:
  4>> vault = ChefVault::Item.load("secrets", "vaultuser")
  5:
  6:  user "vaultuser" do
  7:    password vault["vaultuser"]
  8:    home "/home/vaultuser"
  9:    supports :manage_home => true
 10:    shell "/bin/bash"
 11:    comment "Chef Vault User"
 12:  end
 13:

Note that I call this heavy-handed because if you make a mistake, you need to re-upload every single secret that this node needs access to.

Removing Users

We can also remove user access from Enterprise Chef simply by disassociating that user from the organization on the Chef Server. I won’t show an example of that here, though, since I’m using Opscode’s hosted Enterprise Chef server and I’m the only admin :–).

Backing Up Secrets

To back up the secrets, as encrypted data from the Chef Server, use knife-essentials (it ships with Chef 11+, and is available as a RubyGem for Chef 10).

% knife download data_bags/secrets/
Created data_bags/secrets/vaultuser_keys.json
Created data_bags/secrets/vaultuser.json
Created data_bags/secrets/vaultuser-ssh-private_keys.json
Created data_bags/secrets/vaultuser-ssh-private.json

For example, the vaultuser.json file looks like this:

{
  "id": "vaultuser",
  "vaultuser": {
    "encrypted_data": "3yREwInxdyKpf8nuTIivXAeuEzHt7o4vF4FsOwmVLHmMWol5nCBoMWF0YdaW\n3P3NpEAAAxYEYeJYdVkrdLqjjB2kTJdx0+ceh/RBHBWqmSeHOWFH9pCRGjV8\nfS5XaTueShb320b/+Ia8iqUJJWg6utnbJCDx+VMcGNggPXgPKC8=\n",
    "iv": "EI+y74Uj2uwq7EVaP+0K6Q==\n",
    "version": 1,
    "cipher": "aes-256-cbc"
  }
}
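Each field of such an item is produced by encrypting the value with AES-256-CBC under the shared secret, then Base64-encoding the ciphertext alongside a random IV. Here is a rough stdlib-only round-trip sketch in the shape of the item above; it illustrates the format, not the exact code Chef or chef-vault uses (key derivation and JSON wrapping differ):

```ruby
require "openssl"
require "base64"

key = OpenSSL::Random.random_bytes(32)   # stand-in for the shared secret key

cipher = OpenSSL::Cipher.new("aes-256-cbc")
cipher.encrypt
iv = cipher.random_iv
cipher.key = key
ciphertext = cipher.update("my secret value") + cipher.final

# This mirrors the fields stored in the data bag item; Base64.encode64
# inserts the "\n" line breaks visible in the JSON above.
item = {
  "encrypted_data" => Base64.encode64(ciphertext),
  "iv"             => Base64.encode64(iv),
  "version"        => 1,
  "cipher"         => "aes-256-cbc",
}

# Decryption reverses the steps using the same shared key.
decipher = OpenSSL::Cipher.new(item["cipher"])
decipher.decrypt
decipher.key = key
decipher.iv  = Base64.decode64(item["iv"])
plaintext = decipher.update(Base64.decode64(item["encrypted_data"])) + decipher.final
```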

Since these are encrypted using a strong cipher (AES-256), they should be safe to store in a repository. Unless you think the NSA has access to that repository ;–).

Conclusion

Secrets management is hard! Especially when you need to store secrets that are used by multiple systems, services, and people. Chef’s encrypted data bag feature isn’t a panacea, but it certainly helps. Hopefully, this blog post was informative. While I don’t always respond, I do read all comments posted here via Disqus, so let me know if something is out of whack, or needs an update.

Getting Started With Zones on OmniOS

I’ve become enamored with IllumOS recently. Years ago, I used Solaris (2.5.1 through 8) at IBM. Unfortunately (for me), I stopped using it before Solaris 10 brought all the cool toys to the yard – zones, zfs, dtrace, SMF. Thanks to OmniTI’s excellent IllumOS distribution, OmniOS, I’m getting acclimated with the awesomeness. I plan to write more about my experiences here.

First up, I spent today playing with zones. Zones are a kernel-level container technology similar to Linux containers/cgroups or BSD jails. They’re fast and lightweight. Here are two of the plans I have for them:

  1. Segregating the services on my home-server.
  2. Adding support to various tools in Chef’s ecosystem.

The following is basically a compilation of several different blog posts and documentation collections I’ve been poring over. Like most technical blog writers, I’m posting this so I can find it later :–).

Hardware

I have a number of options for learning OmniOS: spare hardware, VMware, or OmniTI’s Vagrant box. I’m using all three, but the main use will be on physical hardware, as I plan to port the aforementioned home-server to OmniOS (#1, above).

The details of the hardware are not important, except that I have a hard disk device c3t1d0, and a physical NIC device nge1 that are devoted to zones. To adapt these instructions for your own installation, change those device names where appropriate.

You can find the name of the disk device to use in your system with the format command.

root@menthe:~# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c3t0d0 <ATA-WDCWD1500AHFD-0-7QR5 cyl 18238 alt 2 hd 255 sec 63>
          /pci@0,0/pci1043,cb84@d/disk@0,0
       1. c3t1d0 <ATA-SAMSUNG HD501LJ-0-12-465.76GB>
          /pci@0,0/pci1043,cb84@d/disk@1,0
Specify disk (enter its number): ^D

Here I wanted to use the Samsung disk.

Use dladm to find the network devices:

root@menthe:~# dladm show-phys
LINK         MEDIA                STATE      SPEED  DUPLEX    DEVICE
nge0         Ethernet             up         1000   full      nge0
nge1         Ethernet             up         1000   full      nge1

Setup

The example zone here is named base; replace base with any zone name you wish, e.g. webserver37 or noodlebarn. I’m going to use DHCP rather than static networking. There are plenty of guides out there for static networking, but I had to hunt around for DHCP. Note that all of this was performed right after installing the OS.

First, create a zpool to use for zones. This is a 500G disk, so I have plenty of space.

zpool create zones c3t1d0

Next, create a VNIC on the interface which is devoted to zones (nge1). It can be named anything, but must end with a number.

dladm create-vnic -l nge1 vnicbase0

Rather than use the zonecfg REPL, I used the following configuration file, for repeatability.

create -b
set zonepath=/zones/base
set ip-type=exclusive
set autoboot=false
add net
set physical=vnicbase0
end
commit

Use this config file to configure the zone with zonecfg.

zonecfg -z base -f base.conf

Now we’re ready to install the OS in the new zone. This may take a while, as all the packages need to be downloaded.

zoneadm -z base install

The default nsswitch.conf(4) does not use DNS for hosts. This is fairly standard for Solaris/IllumOS. Also, the resolv.conf(4) is not configured automatically, which is a departure from automagic Linux distributions (and a thing I agree with).

cp /etc/nsswitch.dns /etc/resolv.conf /zones/base/root/etc

OmniOS does not use sysidcfg, so the way to make the new zone boot up with an interface configured for DHCP is to write out the ipadm.conf configuration for ipadm. The following is base.ipadm.conf that I used, with the vnicbase0 VNIC created with dladm earlier.

_ifname=vnicbase0;_family=2;
_ifname=vnicbase0;_family=26;
_ifname=vnicbase0;_aobjname=vnicbase0/v4;_dhcp=-1,no;

Copy this file to the zone.

cp base.ipadm.conf /zones/base/root/etc/ipadm/ipadm.conf

Now, boot the zone.

zoneadm -z base boot

Now you can log into the newly created zone and verify that things are working, and do any further configuration required.

zlogin -e ! base

I use ! as the escape character because I’m logging into my global zone over SSH. This means you disconnect with !. instead of ~..

Once complete, the zone can be cloned.

Clone a Zone

I’m going to clone the base zone to clonebase. Again, rename this to whatever you like.

First, a zone must be halted before it can be cloned.

zoneadm -z base halt

Now, create a new VNIC for the zone. Since the cloned configuration below is generated with sed 's/base/clonebase/g', the VNIC name vnicbase0 becomes vnicclonebase0, so create the VNIC under that derived name (VNIC names must end with a number).

dladm create-vnic -l nge1 vnicclonebase0

Read the base zone’s configuration, and replace base with clonebase.

zonecfg -z base export | sed 's/base/clonebase/g' | tee clonebase.conf

Then, create the new zone configuration, and clone the base zone.

zonecfg -z clonebase -f clonebase.conf
zoneadm -z clonebase clone base

Again, ensure that the network configuration to use DNS is available.

cp /etc/nsswitch.dns /etc/resolv.conf /zones/clonebase/root/etc

Create the ipadm.conf config for the new zone; I named it clonebase.ipadm.conf.

sed 's/base/clonebase/g' base.ipadm.conf > clonebase.ipadm.conf
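Both sed invocations are plain global substitutions. A Ruby equivalent on the ipadm config makes the effect visible: every occurrence of base is replaced, so vnicbase0 becomes vnicclonebase0 (which is why the clone’s VNIC needs to exist under that derived name):

```ruby
base_ipadm = <<~CONF
  _ifname=vnicbase0;_family=2;
  _ifname=vnicbase0;_family=26;
  _ifname=vnicbase0;_aobjname=vnicbase0/v4;_dhcp=-1,no;
CONF

# Equivalent of: sed 's/base/clonebase/g' base.ipadm.conf
clone_ipadm = base_ipadm.gsub("base", "clonebase")
# Every interface reference now reads vnicclonebase0.
```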

Now copy this to the zone.

cp clonebase.ipadm.conf /zones/clonebase/root/etc/ipadm/ipadm.conf

Finally, boot the new zone.

zoneadm -z clonebase boot

Login and verify the new zone.

zlogin -e ! clonebase

Cleaning Up

Use the following to clean up the zone when it’s not needed anymore.

zone=clonebase
zoneadm -z $zone halt
zoneadm -z $zone uninstall -F
zonecfg -z $zone delete -F

Sans Prose

This gist contains all the things I did above minus the prose.

What’s Next?

I have a few goals in mind for this system. First of all, I want to manage the zones with Chef, of course. Some of the functions of the zones may be:

  • IPS package repository
  • Omnibus build system for OmniOS
  • Adding OmniOS support to cookbooks

I also want to facilitate plugins and the ecosystem around Chef for IllumOS, including zone-based knife, vagrant, and test-kitchen plugins.

Finally, I plan to convert my Linux home-server to OmniOS. There are a couple things I’m running that will require Linux (namely Plex), but fortunately, OmniOS has KVM thanks to SmartOS.

References

The following links were helpful in composing this post, and of course for the reference material they contain.

Starting ChefSpec Example

This is a quick post to introduce what I’m starting on testing with ChefSpec. This is from Opscode’s Java cookbook. While the recipe tested is really trivial, it actually has some nuances that require detailed testing.

First off, the whole thing is in this gist. I’m going to break it down into sections below. The file is spec/default_spec.rb in the java cookbook (not committed/pushed yet).

The chefspec gem is where all the magic comes from. You can read about ChefSpec on its home page. You’ll need to install the gem, and from there, run rspec to run the tests.

require 'chefspec'

Next, we’re going to describe the default recipe. We’re using the regular rspec “let” block to set up the runner to converge the recipe. Then, because we know/assume that the openjdk recipe is the default, we can say that this chef run should include the java::openjdk recipe.

describe 'java::default' do
  let(:chef_run) { ChefSpec::ChefRunner.new.converge('java::default') }
  it 'should include the openjdk recipe by default' do
    chef_run.should include_recipe 'java::openjdk'
  end

Next, this cookbook supports Windows. However, we have to set up the runner with the correct platform and version (this comes from fauxhai), and then set attributes that are required for it to work.

  context 'windows' do
    let(:chef_run) do
      runner = ChefSpec::ChefRunner.new(
        'platform' => 'windows',
        'version' => '2008R2'
        )
      runner.node.set['java']['install_flavor'] = 'windows'
      runner.node.set['java']['windows']['url'] = 'http://example.com/windows-java.msi'
      runner.converge('java::default')
    end
    it 'should include the windows recipe' do
      chef_run.should include_recipe 'java::windows'
    end
  end

Next are the contexts for other install flavors. The default recipe will include the right recipe based on the flavor, which is set by an attribute. So we set up an rspec context for each recipe, then set the install flavor attribute, and test that the right recipe was included.

  context 'oracle' do
    let(:chef_run) do
      runner = ChefSpec::ChefRunner.new
      runner.node.set['java']['install_flavor'] = 'oracle'
      runner.converge('java::default')
    end
    it 'should include the oracle recipe' do
      chef_run.should include_recipe 'java::oracle'
    end
  end
  context 'oracle_i386' do
    let(:chef_run) do
      runner = ChefSpec::ChefRunner.new
      runner.node.set['java']['install_flavor'] = 'oracle_i386'
      runner.converge('java::default')
    end
    it 'should include the oracle_i386 recipe' do
      chef_run.should include_recipe 'java::oracle_i386'
    end
  end

Finally, a recent addition to this cookbook is support for IBM’s Java. In addition to setting the install flavor, we must set the URL where the IBM Java package is (see the README in the commit linked in that ticket for detail), and we can see that the ibm recipe is in fact included.

  context 'ibm' do
    let(:chef_run) do
      runner = ChefSpec::ChefRunner.new
      runner.node.set['java']['install_flavor'] = 'ibm'
      runner.node.set['java']['ibm']['url'] = 'http://example.com/ibm-java.bin'
      runner.converge('java::default')
    end
    it 'should include the ibm recipe' do
      chef_run.should include_recipe 'java::ibm'
    end
  end
end

This is just the start of the testing for this cookbook. We’ll need to test each individual recipe. However as I’ve not written that code yet, I don’t have examples. Stay tuned!

Test Kitchen and Jenkins

I’ve been working more with test-kitchen 1.0 alpha lately. The most recent thing I’ve done is set up a Jenkins build server to run test-kitchen on cookbooks. This post will describe how I did this for my own environment, and how you can use my new test-kitchen cookbook in yours… if you’re using Jenkins, anyway.

This is all powered by a relatively simple cookbook, and some click-click-clicking in the Jenkins UI. I’ll walk through what I did to set up my Jenkins system.

First, I started with Debian 7.0 (stable, released this past weekend). I installed the OS on it, and then bootstrapped with Chef. The initial test was to make sure everything installed correctly, and the commands were functioning. This was done in a VM, and is now handled by test-kitchen itself (how meta!) in the cookbook, kitchen-jenkins.

The cookbook, kitchen-jenkins, is available on the Chef Community site. I started with a recipe, but extracted it to a cookbook to make it easier to share with you all. This is essentially a site cookbook that I use to customize my Jenkins installation so I can run test-kitchen builds.

I apply the recipe with a role, because I love the roles primitive in Chef :–). Here is the role I’m using:

{
  "name": "jenkins",
  "description": "Jenkins Build Server",
  "run_list": [
    "recipe[kitchen-jenkins]"
  ],
  "default_attributes": {
    "jenkins": {
      "server": {
        "home": "/var/lib/jenkins",
        "plugins": ["git-client", "git"],
        "version": "1.511",
        "war_checksum": "7e676062231f6b80b60e53dc982eb89c36759bdd2da7f82ad8b35a002a36da9a"
      }
    }
  },
  "json_class": "Chef::Role",
  "chef_type": "role"
}

The run list here is only slightly different from my actual role; mine has a few other site-specific recipes in it. Don’t worry about those now. The jenkins attributes are set to ensure the right plugins are available, and the right version of Jenkins is installed.

(I’m going to leave out the details such as uploading cookbooks and roles, if you’re interested in test-kitchen, I’ll assume you’ve got that covered :–).)

Once Chef completes on the Jenkins node, I can reach the Jenkins UI, conveniently enough, via “http://jenkins:8080” (because I’ve made a DNS entry, of course). The next release of the Jenkins cookbook will have a resource for managing jobs, but for now I’m just going to create them in the webui.

For this example, I want to have two kinds of cookbook testing jobs. The first is to simply run foodcritic and fail on any correctness matches. The second is to actually run test-kitchen.

A foodcritic job is simple:

  1. New job –> Build a free-style software project “foodcritic-COOKBOOK”.
  2. Source Code Management –> Git, supply the repository and the master branch.
  3. Set a build trigger to Poll SCM: every 5 minutes, once an hour, or whenever you like.
  4. Add a build step to execute a shell, “foodcritic . -f correctness”

I created a view for foodcritic jobs, and added them all to the view for easy organizing.

Next, I create a test-kitchen job:

  1. New job –> Copy existing job “foodcritic-COOKBOOK”, name the new job “test-COOKBOOK”.
  2. Uncheck Poll SCM, check “Build after other projects are built” and enter “foodcritic-COOKBOOK”.
  3. Replace the foodcritic command in the build shell command with “kitchen test”.

Now, the test kitchen test will only run if the foodcritic build succeeds. If the cookbook has any correctness lint errors, then the foodcritic build fails, and the kitchen build won’t run. This will help conserve resources.

Hopefully the kitchen-jenkins cookbook is helpful and this blog post will give you some ideas how to go about adding cookbook tests to your CI system, even if it’s not Jenkins.