jtimberman's Code Blog

Chef, Ops, Ruby, Linux/Unix. Opinions are mine, not my employer's (CHEF).

Awesome Syntax Highlighting in Keynote

I am working on my presentation for ChefConf. I plan to have quite a lot of code samples. Getting code samples with nice syntax highlighting into slides has been a lackluster endeavour: GUI editors like TextMate, Sublime, and Atom have “Copy as RTF” plugins, but none of them are easily customizable.

So I did a quick Google search and happened on a gist I hadn’t seen before. It describes the following steps:

  1. Install Homebrew (done, I have that covered with Chef ;)).
  2. Install “highlight” with brew install highlight.
  3. Use highlight to transform the source code file to RTF and copy it to the clipboard.
  4. Paste the clipboard into Keynote.app.

This isn’t much different than the other solutions, except for one super cool thing I learned about highlight.

It has styles.

highlight -w

<OMG LIST OF EIGHTY TWO DIFFERENT STYLES!!!>

That’s right, at least at the time of this writing highlight has 82 different styles available, including my favorites, solarized light and dark. Note that the --help output says that this option is deprecated in the version I’ve installed (3.18_1), but the styles are in /usr/local/Cellar/highlight/VERSION/share/themes.

Highlight knows the syntax for a lot of languages; the definitions are in /usr/local/Cellar/highlight/VERSION/share/langDefs. For example, I can get my Ruby recipe highlighted with this:

highlight -s solarized-light -O rtf recipes/client.rb | pbcopy

This will use Courier New as the font, and depending on the theme/style used, some of the highlighting may be bold or italic. This is easy enough to change in Keynote though.
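With many samples in a talk, the same pipeline can be generated per file. Here is a minimal Ruby sketch that only builds the command strings (it does not run them). It uses just the flags shown above; the helper name highlight_command and the second file name are made up for illustration.

```ruby
require 'shellwords'

# Build the shell pipeline from the post for one source file.
# Only flags that appear in the post are used; the helper name is hypothetical.
def highlight_command(source, style: 'solarized-light')
  highlight = ['highlight', '-s', style, '-O', 'rtf', source].shelljoin
  "#{highlight} | pbcopy"
end

# recipes/server.rb is a hypothetical second sample.
%w[recipes/client.rb recipes/server.rb].each do |file|
  puts highlight_command(file)
end
```

Each printed line can be pasted into a terminal one at a time, followed by a paste into Keynote. Shell-escaping the file name keeps paths with spaces from breaking the command.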

For up to date documentation and information about highlight, visit the author’s page.

Quick Tip: Create a Provisioner Node

This quick tip is brought to you by my preparation for my ChefConf talk about using Chef Provisioning to build a Chef Server Cluster, which is based on my blog post about the same. In the blog post I used chef-zero as my Chef Server, but for the talk I’m using Hosted Chef.

In order for the Chef Provisioning recipe to work the provisioning node – the node that runs chef-client – needs to have the appropriate permissions to manage objects on the Chef Server. This is easy with chef-zero – there are no ACLs at all. However in Hosted Chef, like any regular Chef Server, the ACLs don’t allow nodes’ API clients to modify other nodes, or API clients.

Fortunately we can do all the work necessary using knife, with the knife-acl plugin. In this quick tip, I’ll create a group for provisioning nodes, and give that group the proper permissions for the Chef Provisioning recipe to create the machines’ nodes and clients.

First of all, I’m using ChefDK, and it’s my Ruby environment too, so install the gem:

chef gem install knife-acl

Next, use the knife group subcommand to create the new group. Groups are a number of users and/or API clients. By default, an organization on Hosted Chef will have admins, billing-admins, clients, and users. Let’s create provisioners now.

knife group create provisioners

The Role-based access control (RBAC) system in the Chef Server allows us to assign read, create, update, grant, and delete permissions to various objects in the organization. Containers are a special holder of other types of objects, in this case we need to add permissions for the clients and nodes containers. This is what allows the Chef Provisioning recipe’s machine resources to have their Chef objects created.

for i in read create update grant delete
do
  knife acl add containers clients $i group provisioners
done

for i in read create update grant delete
do
  knife acl add containers nodes $i group provisioners
done
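The two loops cover every combination of container and permission. As a rough sketch, the same list of commands can be generated in Ruby to review before running anything. The constant and method names are hypothetical, and the argument order is copied from the commands above; other versions of the plugin may expect a different order, so check its README.

```ruby
PERMISSIONS = %w[read create update grant delete].freeze
CONTAINERS  = %w[clients nodes].freeze

# Returns one knife-acl command line per container/permission pair,
# using the argument order shown in the post.
def acl_commands(group)
  CONTAINERS.product(PERMISSIONS).map do |container, perm|
    "knife acl add containers #{container} #{perm} group #{group}"
  end
end

puts acl_commands('provisioners')
```

Printing the commands first makes it easy to see that ten ACL entries will be added, five for each container.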

Next, we need the API client that will be used by the Chef Provisioning node to authenticate with the Chef Server, and the node needs to be created as well. By default the client will automatically have permissions for the node object that has the same name.

knife client create -d chefconf-provisioner > ~/.chef/chefconf-provisioner.pem
knife node create -d chefconf-provisioner

Finally, we need to put the new API client into the provisioners group that was created earlier. First we need to get a mapping of the actors in the organization. Then we can add the client to the group.

knife actor map
knife group add actor provisioners chefconf-provisioner

The knife actor map command will generate a YAML file like this:

---
:user_map:
  :users:
    jtimberman: 12345678901234567890123456780123
  :usags:
    12345678901234567890123456780123: jtimberman
:clients:
  chefconf-provisioner: chefconf-provisioner
  jtimberman-chefconf-validator: jtimberman-chefconf-validator

This maps users to their USAG and stores a list of clients. More information about this is in the knife-acl README.
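Before adding a client to the group, it can be handy to confirm that the client shows up in the map. The following is a minimal sketch that parses the sample above with Ruby's YAML library. The method name client_known? is hypothetical, and the layout is assumed to be exactly what the sample shows.

```ruby
require 'yaml'

# Sample copied from the post; the top-level layout is taken as shown there.
ACTOR_MAP = <<~YAML
  ---
  :user_map:
    :users:
      jtimberman: 12345678901234567890123456780123
    :usags:
      12345678901234567890123456780123: jtimberman
  :clients:
    chefconf-provisioner: chefconf-provisioner
    jtimberman-chefconf-validator: jtimberman-chefconf-validator
YAML

# The keys are written as Ruby symbols (":clients:"), so Symbol has to be
# permitted explicitly when loading safely.
def client_known?(yaml_text, client_name)
  actors = YAML.safe_load(yaml_text, permitted_classes: [Symbol])
  actors.fetch(:clients, {}).key?(client_name)
end

puts client_known?(ACTOR_MAP, 'chefconf-provisioner')
```

The sketch reads only the :clients section, since that is all the group membership step needs.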

At this point, we have a node, with the private key in ~/.chef that can be used with the Chef Server to use Chef Provisioning’s machine resource. We can also perform additional tasks that require having a node object, such as create secrets as Chef Vault items:

knife vault create secrets dnsimple -M client -J data_bags/secrets/dnsimple.json -A jtimberman -S 'name:chefconf-provisioner'

The entire series of commands is below.

chef gem install knife-acl
knife group create provisioners

for i in read create update grant delete
do
  knife acl add containers clients $i group provisioners
done

for i in read create update grant delete
do
  knife acl add containers nodes $i group provisioners
done

knife client create -d chefconf-provisioner > ~/.chef/chefconf-provisioner.pem
knife node create -d chefconf-provisioner
knife actor map
knife group add actor provisioners chefconf-provisioner

knife vault create secrets dnsimple -M client -J data_bags/secrets/dnsimple.json -A jtimberman -S 'name:chefconf-provisioner'

Hopefully this helps you out with your use of Chef Provisioning, and a non-zero Chef server. If you have further questions, find me at ChefConf!

Quick Tip: Define Resources to Notify in LWRPs

In this quick tip, I’ll explain why you may need to create resources to notify in a provider, even if the resource exists in a recipe, when using use_inline_resources in Chef’s LWRP DSL.

I’ll use an example cookbook, notif, to illustrate. First, I’ve created cookbooks/notif/resources/default.rb, with the following content.

actions :write
default_action :write

Then, I have written cookbooks/notif/providers/default.rb like this:

use_inline_resources

action :write do
  log 'notifer' do
    notifies :create, 'file[notified]'
  end
end

Then the default recipe, where I’ll use the resource automatically generated from the resource directory, notif.

file 'notified' do
  content 'something'
  action :nothing
end

notif 'doer'

When I run Chef, I’ll get an error like this:

 Recipe: notif::default
   * file[notified] action nothing (skipped due to action :nothing)
   * notif[doer] action write

     ================================================================================
     Error executing action `write` on resource 'notif[doer]'
     ================================================================================

     Chef::Exceptions::ResourceNotFound
     ----------------------------------
     resource log[notifer] is configured to notify resource file[notified] with action create, but file[notified] cannot be found in the resource collection. log[notifer] is defined in /tmp/kitchen/cookbooks/notif/providers/default.rb:4:in `block in class_from_file'

     Resource Declaration:
     ---------------------
     # In /tmp/kitchen/cookbooks/notif/recipes/default.rb

      12: notif 'doer'

     Compiled Resource:
     ------------------
     # Declared in /tmp/kitchen/cookbooks/notif/recipes/default.rb:12:in `from_file'

     notif("doer") do
       action :write
       retries 0
       retry_delay 2
       default_guard_interpreter :default
       declared_type :notif

       recipe_name "default"
     end

To fix this, I define the file resource in the provider:

use_inline_resources

action :write do
  log 'notifer' do
    notifies :create, 'file[notified]'
  end

  file 'notified' do
    content new_resource.name
  end
end

Then when I run Chef, it will converge and notify the file resource to be configured.

Recipe: notif::default
  * file[notified] action nothing (skipped due to action :nothing)
  * notif[doer] action write
    * log[notifer] action write

    * file[notified] action create
      - create new file notified
      - update content in file notified from none to 935e8e
      --- notified       2015-01-18 05:47:49.186399317 +0000
      +++ ./.notified20150118-15795-om5fiw       2015-01-18 05:47:49.186399317 +0000
      @@ -1 +1,2 @@
      +doer
    * file[notified] action create (up to date)

Running handlers:
Running handlers complete
Chef Client finished, 3/4 resources updated in 1.298990565 seconds

Why does this happen?

This happens because use_inline_resources tells Chef that the provider’s resources are added to their own run context, with their own resource collection. The provider doesn’t have access to the resource collection from the recipe. Even though the file[notified] resource exists in the recipe, it isn’t inherited by the provider’s run context, which raises the error we saw before.
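The separate-collection behavior can be pictured with a toy model in plain Ruby. The classes here are invented for illustration and are not Chef’s internals; the only point is that a lookup in one collection never consults the other.

```ruby
# Toy model only: these classes are invented for illustration and are not
# Chef's internals.
class ResourceNotFound < StandardError; end

class ResourceCollection
  def initialize
    @resources = {}
  end

  def add(name)
    @resources[name] = true
    self
  end

  # Look up a notification target in this collection only.
  def find(name)
    raise ResourceNotFound, "#{name} cannot be found in the resource collection" unless @resources.key?(name)

    name
  end
end

# The recipe's collection holds the file resource, but the provider never sees it.
recipe_collection = ResourceCollection.new.add('file[notified]').add('notif[doer]')
puts "recipe: #{recipe_collection.find('file[notified]')}"

# With inline resources, the provider gets a fresh collection of its own.
inline_collection = ResourceCollection.new.add('log[notifer]')

begin
  inline_collection.find('file[notified]')
rescue ResourceNotFound => e
  puts "inline: #{e.message}"
end

# Declaring the file inside the provider puts it where the lookup happens.
inline_collection.add('file[notified]')
puts "inline after declaring: #{inline_collection.find('file[notified]')}"
```

In this model the failed lookup mirrors the ResourceNotFound error from the Chef run, and declaring the resource in the provider mirrors the fix.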

We can turn off use_inline_resources by removing it, and the custom resource will be configured:

action :write do
  log 'notifer' do
    notifies :create, 'file[notified]'
  end
end

Then run Chef:

Recipe: notif::default
  * file[notified] action nothing (skipped due to action :nothing)
  * notif[doer] action write (up to date)
  * log[notifer] action write
  * file[notified] action create
    - update content in file notified from 935e8e to 3fc9b6
    --- notified 2015-01-18 05:47:49.186399317 +0000
    +++ ./.notified20150118-16159-r18q7z 2015-01-18 05:50:57.832140405 +0000
    @@ -1,2 +1,2 @@
    -doer
    +something

Notice that the file[notified] resource wasn’t updated at the start of the run, when it was encountered in the recipe, but it was when notified by the log resource in the provider action, changing the content.

Use inline compile mode!

The use_inline_resources method in the lightweight provider DSL is strongly recommended. It makes it easier to send notifications from the custom resource itself to other resources in the recipe’s resource collection. Read more about the inline compile mode in the Chef docs.

Also, define the resources that you need to notify when you’re doing this in your provider’s actions. A common example is within a provider that writes configuration for a service, and needs to tell that service to restart.

Quick Tip: Testing Conditionals in ChefSpec

This tip is brought to you by the homebrew cookbook.

ChefSpec is a great way to create tests for Chef recipes to catch regressions. Sometimes recipes end up having branching conditional logic that can have very different outcomes based on external factors – attributes, existing system state, or cross-platform support.

The homebrew cookbook only supports OS X, so we don’t have cross-platform support to test there. However, its default recipe has four conditionals to test. You can read the entire default_spec.rb for full context, I’m going to focus on just one aspect here:

  • Installing homebrew should only happen if the brew binary does not exist.

This is a common use case in Chef recipes. Sometimes the best way to converge a node to the desired state involves running some arbitrary command. In this case, it’s the installation of Homebrew itself. Normally for installations we want to use an idempotent, convergent resource like package. However, since homebrew is to be our package management system, we have to do something else. As it turns out, the homebrew project provides an installation script, and that script will install a binary, /usr/local/bin/brew. We will assume that if Chef converged on a node after running the script, and the brew binary exists, then we don’t need to attempt reinstallation. There are more robust ways to go about it (e.g., checking that running brew gives some desired output), but this works for example purposes today.

From the recipe, here’s the resource:

execute 'install homebrew' do
  command homebrew_go
  user node['homebrew']['owner'] || homebrew_owner
  not_if { ::File.exist? '/usr/local/bin/brew' }
end

The command is a script, called homebrew_go, which is a local variable set to a path in Chef::Config[:file_cache_path]. It is retrieved in the recipe with remote_file. The resource used to be declared as execute homebrew_go, but ChefSpec runs in a random temporary directory whose name we cannot predict.

The astute observer will note that the user parameter has another conditional (designated by the ||). That’s actually the subject of another post. In this post, I’m concerned only with testing the guard, not_if.

The not_if is a Ruby block, which means the Ruby code is evaluated inline during the Chef run. How we go about testing that is the subject of this post.

First, we need to mock the return result of sending the #exist? method to the File class. There are two reasons. First, we want to control the conditional so we can write a test for each outcome. Second, someone running the test (like me) might have already installed homebrew on their local system (which I have), and so /usr/local/bin/brew will exist. To do this, in our context, we have a before block that stubs the return to false:

before(:each) do
  allow_any_instance_of(Chef::Resource).to receive(:homebrew_owner).and_return('vagrant')
  allow_any_instance_of(Chef::Recipe).to receive(:homebrew_owner).and_return('vagrant')
  allow(File).to receive(:exist?).and_return(false)
  stub_command('which git').and_return(true)
end

There are some other mocked values here. I’ll talk about the vagrant user for homebrew_owner in a moment, though again, that’s the subject of another post.

The actual spec will test that the installation script will actually get executed when we run chef, and as the vagrant user.

it 'runs homebrew installation as the default user' do
  expect(chef_run).to run_execute('install homebrew').with(
    :user => 'vagrant'
  )
end

When rspec runs, we see this is the case:

homebrew::default
  default user
    runs homebrew installation as the default user

If I didn’t mock the user, it would be jtimberman, as that is the user that is running Chef via rspec/ChefSpec. The test would fail. If you’re looking at the full file, there’s some other details we’re going to look at shortly. If I didn’t mock the return for File.exist?, the execute wouldn’t run at all.

To test what happens when /usr/local/bin/brew exists, I set up a new context in rspec, and create a new before block.

context '/usr/local/bin/brew exists' do
  before(:each) do
    allow(File).to receive(:exist?).and_return(true)
    stub_command('which git').and_return(true)
  end

  it 'does not run homebrew installation' do
    expect(chef_run).to_not run_execute('install homebrew')
  end
end

We don’t need the vagrant mocks earlier, but we do need to stub File.exist?. This test would pass on my system without it, but not on, e.g., a Linux system that doesn’t have homebrew.

Then running rspec, we see:

homebrew::default
  /usr/local/bin/brew exists
    does not run homebrew installation
  default user
    runs homebrew installation as the default user

In a coming post, I will walk through the conditionals related to the homebrew_owner.

Quick Tip: Serverspec Spec_helper in Test Kitchen

Recently, I’ve started refactoring some old cookbooks I wrote ages ago. I’m adding Serverspec coverage that can be run with kitchen verify. In this quicktip, I’ll describe how to create a spec_helper that can be used in all the specs. This is a convention used by many in the Ruby community to add configuration for RSpec.

For Chef, we can run integration tests after convergence with Test Kitchen and Serverspec. To do that, we need to require Serverspec, and then set its backend. In some cookbooks, the author/developer may have written spec_helper files in the various test/integration/SUITE/serverspec/ directories, but this tip uses a single shared file for them all. Let’s get started.

In the .kitchen.yml, add the data_path configuration directive in the provisioner.

provisioner:
  name: chef_zero
  data_path: test/shared

Then, create the test/shared directory in the cookbook, and create the spec_helper.rb in it.

mkdir test/shared
$EDITOR test/shared/spec_helper.rb

Minimally, it should look like this:

require 'serverspec'

set :backend, :exec

Then in your specs, for example test/integration/default/serverspec/default_spec.rb, require the spec_helper. On the instances under test, the file will be copied to /tmp/kitchen/data/spec_helper.rb.

require_relative '../../../kitchen/data/spec_helper'
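The relative path only makes sense once you know where the spec itself lives on the instance. Here is a small sketch of the resolution, assuming the specs are copied to /tmp/busser/suites/serverspec. That directory is an assumption about the Busser layout rather than something stated above, and the method name is hypothetical.

```ruby
# Assumed location of the spec files on the instance; adjust if your setup differs.
SPEC_DIR = '/tmp/busser/suites/serverspec'

# require_relative resolves against the directory of the requiring file,
# which File.expand_path can mimic.
def resolved_helper(spec_dir, relative = '../../../kitchen/data/spec_helper')
  File.expand_path(relative, spec_dir)
end

puts resolved_helper(SPEC_DIR)
```

Under that assumption, three levels up from the spec directory is /tmp, so the path lands on /tmp/kitchen/data/spec_helper, which is where the data_path contents are copied.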

That’s it, now when running kitchen test, or kitchen verify on a converged instance, the helper will be used.

Quick Tip: Chef 12 Homebrew User Mixin

OS X is an interesting operating system. It is a Unix, but is primarily used for workstations. As such, many system settings can, and should, be done as a non-privileged user. Some tasks, however, require administrative privileges. OS X uses sudo to escalate privileges. This is done by a nice GUI pop-up requesting the user password when done through another GUI element. However, one must use sudo $COMMAND when working at the Terminal.

The Homebrew package manager tries to do everything as a non-privileged user. The installation script will invoke some commands with sudo – namely to create and set the correct permissions on /usr/local (its default installation location). Once that is complete, brew install will not require privileged access for installing packages. In fact, the Homebrew project recommends never using sudo with the brew commands.

In Chef 12 the default provider for the package resource is homebrew. This originally came from the homebrew cookbook. In order to not use sudo when managing packages, there’s a helper method (mixin) that attempts to determine what non-privileged user should run the brew install command. This is also ported to Chef 12. The method can also take an argument that specifies a particular user that should run the brew command.

When managing an OS X system with Chef, it is often easier to just run chef-client as root, rather than be around when sudo prompts for a password. This means that we need a way to execute other commands for managing OS X as a non-privileged user. We can reuse the mixin to do this. I’ll demonstrate this using plain old Ruby with pry, which is installed in ChefDK, and I’ll start it up with sudo. Then, I’ll show a short recipe with chef-apply.

% which pry
/opt/chefdk/embedded/bin/pry
% sudo pry

Paste in the following Ruby code:

require 'chef'
include Chef::Mixin::HomebrewUser
include Chef::Mixin::ShellOut

find_homebrew_uid #=> 501

The method find_homebrew_uid is the helper we want. As we can see, rather than returning 0 (for root), it returns 501, which is the UID of the jtimberman user on my system. To prove that I’m executing in a process owned by root:

Process.uid #=> 0

Or, I can shell out to the whoami command using Chef’s shell_out method – which is the same method Chef would use to run brew install.

shell_out('whoami').stdout #=> "root\n"

The shell_out method can take a :user attribute:

shell_out('whoami', :user => find_homebrew_uid).stdout #=> "jtimberman\n"

So this can be used to install packages with brew, and is exactly what Chef 12 does.

shell_out('brew install coreutils', :user => find_homebrew_uid)

Or, it can be used to run defaults(1) settings that require running as a specific user, rather than root:

# Turn off iPhoto face detection, please
shell_out('defaults write com.apple.iPhoto PKFaceDetectionEnabled 0',
          :user => find_homebrew_uid)

# before...
jtimberman@localhost% defaults read com.apple.iPhoto PKFaceDetectionEnabled
1
# after!
jtimberman@localhost% defaults read com.apple.iPhoto PKFaceDetectionEnabled
0

Putting this together in a Chef recipe that gets run by root, we can disable face detection in iPhoto like this:

Chef::Resource::Execute.send(:include, Chef::Mixin::HomebrewUser)

execute 'defaults write com.apple.iPhoto PKFaceDetectionEnabled 0' do
  user find_homebrew_uid
end

The first line makes the method available on all execute resources. To make the method available to all resources, use Chef::Resource.send, and to make it available across everything in all recipes, use Chef::Recipe.send. Otherwise we would get a NoMethodError exception.

The execute resource takes a user attribute, so we use the find_homebrew_uid method here to set the user. And we can observe the same results as above:

jtimberman@localhost% defaults write com.apple.iPhoto PKFaceDetectionEnabled 1
jtimberman@localhost% defaults read com.apple.iPhoto PKFaceDetectionEnabled
1
jtimberman@localhost% sudo chef-apply nofaces.rb
Recipe: (chef-apply cookbook)::(chef-apply recipe)
* execute[defaults write com.apple.iPhoto PKFaceDetectionEnabled 0] action run
- execute defaults write com.apple.iPhoto PKFaceDetectionEnabled 0
jtimberman@localhost% defaults read com.apple.iPhoto PKFaceDetectionEnabled
0
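For intuition about how a helper could arrive at a non-root UID while running as root, one plausible approach is to look at who owns an executable on disk. The sketch below shows only that idea in plain Ruby, with a temporary file standing in for the brew binary. It is an assumption about the general technique, not a copy of the mixin, and the name owner_uid is hypothetical.

```ruby
require 'tempfile'

# Illustration only: return the UID of whoever owns the file at +path+.
# The real mixin may do more than this.
def owner_uid(path)
  File.stat(path).uid
end

# A temporary file stands in for the brew executable in this sketch.
Tempfile.create('brew-stand-in') do |f|
  puts "owner uid: #{owner_uid(f.path)}, process uid: #{Process.euid}"
end
```

A file created by the current process is owned by that process’s user, so the two numbers match here. For a brew binary installed by a regular user, the owner would be that user even when the calling process is root.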

Those who have read the workstation management posts on this blog in the past may be aware that I have a cookbook that can manage OS X “defaults(1)” settings. I plan to make updates to the resource in that cookbook that will leverage this method.

Quick Tip: Deleting Attributes

I have a new goal for 2015, and that is to write at least one “Quick Tip” per week about Chef. I’ve added the category “quicktips” to make these easier to find.

In this quick tip, I want to talk about a new feature of Chef 12. The new feature is the ability to remove an attribute from all levels (default, normal, override) on a node so it doesn’t get saved back to the Chef Server. This was brought up in Chef RFC 23. The reason I don’t want to save the attribute in question back to the server is that it is a secret that I have in a Chef Vault item.

I’m using Datadog for my home systems, and the wonderful folks at Datadog have a cookbook to set it up. The documentation requires that you set two attributes to authenticate, the API key, and the application key:

node.default['datadog']['api_key'] = 'Secrets In Plain Text Attributes??'
node.default['datadog']['application_key'] = 'It is probably fine.'

I prefer to use chef-vault because I think it’s the best way to manage shared secrets in Chef recipes. I still need to set the attributes for Datadog’s recipe to work, however. In order to accomplish the goal here, I will use a custom cookbook, housepub-datadog. It has one recipe that looks like this:

include_recipe 'chef-vault'

node.default['datadog']['api_key'] = chef_vault_item(:secrets, 'datadog')['data']['api_key']
node.default['datadog']['application_key'] = chef_vault_item(:secrets, 'datadog')['data']['chef']

include_recipe 'datadog::dd-agent'

ruby_block 'smash-datadog-auth-attributes' do
  block do
    node.rm('datadog', 'api_key')
    node.rm('datadog', 'application_key')
  end
  subscribes :create, 'template[/etc/dd-agent/datadog.conf]', :immediately
end

Let’s take a closer look at the recipe.

include_recipe 'chef-vault'

Here, the chef-vault recipe is included to ensure everything works, and I have a dependency on chef-vault in my cookbook’s metadata. Next, we see the attributes set:

node.default['datadog']['api_key'] = chef_vault_item(:secrets, 'datadog')['data']['api_key']
node.default['datadog']['application_key'] = chef_vault_item(:secrets, 'datadog')['data']['chef']

The secrets/datadog item looks like this in plaintext:

{
  "id": "datadog",
  "data": {
    "api_key": "My datadog API key",
    "chef": "Application key for the 'chef' application"
  }
}

When Chef runs, it will load the vault-encrypted data bag item, and populate the attributes that will be used in the template. This template comes from the datadog::dd-agent recipe, which is included next. The template from that recipe looks like this:

template '/etc/dd-agent/datadog.conf' do
  owner 'root'
  group 'root'
  mode 0644
  variables(
    :api_key => node['datadog']['api_key'],
    :dd_url => node['datadog']['url']
  )
end

Now, for the grand finale of this post, I delete the attributes that were set using a ruby_block resource. The timing here is important, because these attributes must be deleted after Chef has converged the template. This does get updated every run, because the ruby block is not convergent, and this is okay because the attributes are updated every run, too. I could write additional logic to make this convergent, but I’m okay with the behavior. The subscribes ensures that as soon as the template is written, the node object is updated to remove the attributes. Otherwise, this happens next after the dd-agent recipe.

ruby_block 'smash-datadog-auth-attributes' do
  block do
    node.rm('datadog', 'api_key')
    node.rm('datadog', 'application_key')
  end
  subscribes :create, 'template[/etc/dd-agent/datadog.conf]', :immediately
end
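To picture what removing an attribute “from all levels” means, here is a toy model in plain Ruby. Three hashes stand in for the default, normal, and override levels. The ToyNode class is invented for this sketch and is not how Chef stores attributes.

```ruby
# Toy model: three plain hashes stand in for the precedence levels.
# This is not Chef's attribute implementation.
class ToyNode
  LEVELS = %i[default normal override].freeze

  attr_reader :levels

  def initialize
    @levels = LEVELS.to_h { |level| [level, {}] }
  end

  # Remove the attribute at +path+ from every level that has it.
  def rm(*path)
    *parents, leaf = path
    @levels.each_value do |attrs|
      holder = parents.empty? ? attrs : attrs.dig(*parents)
      holder.delete(leaf) if holder.is_a?(Hash)
    end
    nil
  end
end

node = ToyNode.new
node.levels[:default]['datadog'] = { 'api_key' => 'secret', 'url' => 'https://example.invalid' }
node.levels[:override]['datadog'] = { 'api_key' => 'other' }
node.rm('datadog', 'api_key')
p node.levels
```

After the call, no level contains api_key any more, while unrelated keys such as url are left alone. That is the property the recipe relies on to keep the secret from being saved with the node.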

Let’s see this in action:

managed-node$ chef-client
...
Recipe: housepub-datadog::default
  * ruby_block[smash-datadog-auth-attributes] action run
    - execute the ruby block smash-datadog-auth-attributes
...
workstation% knife node show managed-node -a datadog.api_key -a datadog.application_key
managed-node:
  datadog.api_key:
  datadog.application_key:

Bonus quick tip! knife node show can take the -a option multiple times to display more attributes. I just discovered this in writing this post, and I don’t know when it was added. For sure in Chef 12.0.3, so you should just upgrade anyway ;).

Update: This feature was added by Awesome Chef Ranjib Dey.

Chef 12: Fix Untrusted Self Sign Certs

Scenario: You’ve started up a brand new Chef Server using version 12, and you have installed Chef 12 on your local system. You log into the Management Console to create a user and organization (or do this with the command-line chef-server-ctl commands), and you’re ready to rock with this knife.rb:

node_name                'jtimberman'
client_key               'jtimberman.pem'
validation_client_name   'tester-validator'
validation_key           'tester-validator.pem'
chef_server_url          'https://chef-server.example.com/organizations/tester'

However, when you try to check things out with knife:

% knife client list
ERROR: SSL Validation failure connecting to host: chef-server.example.com - SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
ERROR: OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed

This is because Chef client 12 has SSL verification enabled by default for all requests. Since the certificate generated by the Chef Server 12 installation is self-signed, there isn’t a signing CA that can be verified, and this fails. Never fear, intrepid user, for you can get the SSL certificate from the server and store it as a “trusted” certificate. To find out how, use knife ssl check.

Connecting to host chef-server.example.com:443
ERROR: The SSL certificate of chef-server.example.com could not be verified
Certificate issuer data: /C=US/ST=WA/L=Seattle/O=YouCorp/OU=Operations/CN=chef-server.example.com/emailAddress=you@example.com

Configuration Info:

OpenSSL Configuration:
* Version: OpenSSL 1.0.1j 15 Oct 2014
* Certificate file: /opt/chefdk/embedded/ssl/cert.pem
* Certificate directory: /opt/chefdk/embedded/ssl/certs
Chef SSL Configuration:
* ssl_ca_path: nil
* ssl_ca_file: nil
* trusted_certs_dir: "/Users/jtimberman/Downloads/chef-repo/.chef/trusted_certs"

TO FIX THIS ERROR:

If the server you are connecting to uses a self-signed certificate, you must
configure chef to trust that server's certificate.

By default, the certificate is stored in the following location on the host
where your chef-server runs:

  /var/opt/chef-server/nginx/ca/SERVER_HOSTNAME.crt

Copy that file to your trusted_certs_dir (currently: /Users/jtimberman/Downloads/chef-repo/.chef/trusted_certs)
using SSH/SCP or some other secure method, then re-run this command to confirm
that the server's certificate is now trusted.

(note, this chef-server location is incorrect, it’s /var/opt/opscode)

There is a fetch plugin for knife too. Let’s download the certificate to the automatically preconfigured trusted certificate location mentioned in the output above.

% knife ssl fetch
WARNING: Certificates from chef-server.example.com will be fetched and placed in your trusted_cert
directory (/Users/jtimberman/Downloads/chef-repo/.chef/trusted_certs).

Knife has no means to verify these are the correct certificates. You should
verify the authenticity of these certificates after downloading.

Adding certificate for chef-server.example.com in /Users/jtimberman/Downloads/chef-repo/.chef/trusted_certs/chef-server.example.com.crt

The downloaded certificate should be verified to be the same as the certificate on the Chef Server. For example, I compared SHA256 checksums:

% ssh ubuntu@chef-server.example.com sudo sha256sum /var/opt/opscode/nginx/ca/chef-server.example.com.crt
043728b55144861ed43a426c67addca357a5889158886aee50685cf1422b5ebf  /var/opt/opscode/nginx/ca/chef-server.example.com.crt
% gsha256sum .chef/trusted_certs/chef-server.example.com.crt
043728b55144861ed43a426c67addca357a5889158886aee50685cf1422b5ebf  .chef/trusted_certs/chef-server.example.com.crt
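The same comparison can be scripted. The following is a minimal sketch using Ruby’s standard Digest library. The method name same_sha256? is hypothetical, and temporary files stand in for the two certificate copies.

```ruby
require 'digest'
require 'tempfile'

# True when both files have the same SHA256 digest.
def same_sha256?(path_a, path_b)
  Digest::SHA256.file(path_a).hexdigest == Digest::SHA256.file(path_b).hexdigest
end

# Stand-in files; in practice these would be the fetched certificate and a
# copy obtained from the server over a channel you already trust.
Tempfile.create('server-crt') do |server|
  Tempfile.create('fetched-crt') do |fetched|
    [server, fetched].each do |f|
      f.write("not a real certificate\n")
      f.flush
    end
    puts same_sha256?(server.path, fetched.path)
  end
end
```

The comparison is only as trustworthy as the way the server-side copy or checksum was obtained, which is why the example above reads it over SSH.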

Now check knife client list again.

% knife client list
tester-validator

Victory!

Now, we need to get the certificate out to every node in the infrastructure in its trusted_certs_dir – by default this is /etc/chef/trusted_certs. The simplest way to do this is to use knife ssh to run knife on the target nodes.

% knife ssh 'name:*' 'sudo knife ssl fetch -c /etc/chef/client.rb'
node-output.example.com WARNING: Certificates from chef-server.example.com will be fetched and placed in your trusted_cert
node-output.example.com directory (/etc/chef/trusted_certs).
node-output.example.com
node-output.example.com Knife has no means to verify these are the correct certificates. You should
node-output.example.com verify the authenticity of these certificates after downloading.
node-output.example.com
node-output.example.com Adding certificate for chef-server.example.com in /etc/chef/trusted_certs/chef-server.example.com.crt

The output will be interleaved for all the nodes returned by knife ssh. Of course, we should verify the SHA256 checksums like before, which can be done again with knife ssh.

Reflecting on Six Years With Chef

It actually started a bit over seven years ago. I saw the writing on the wall at IBM; my job was soon to be outsourced. I found an open position with the SANS institute, accepted an offer there, and was due to start work in a couple of weeks. Around the same time, my friends Adam Jacob and Nathan Haneysmith had started HJK Solutions. They invited me to join them then, but it wasn’t the right time for me. Adam told me that at SANS I should at least use the automation tools and general infrastructure management model they planned to use. It turned out this was sage advice, for a number of reasons.

Around April 2008, Adam told me he was working on “Chef,” a Ruby-based configuration management and system integration framework. I was excited about its potential, and a few months later on July 2, 2008, I started with HJK Solutions as a Linux system administration consultant. I got familiar with HJK’s Puppet-based stack, and ancillary Ruby tools like iClassify while working on their customer infrastructures over the coming months. After Opscode was founded and we released Chef 0.5, my primary focus was porting HJK’s Puppet modules to Chef cookbooks.

opscode/cookbooks

Adam had started the repository to give new users a place to begin using Chef with full working examples. I continued their development, and had the opportunity to solve hard problems of integrating web application stacks with them. There were three important reasons for the repository to exist:

  1. We have a body of knowledge as a tribe, and that can be codified.
  2. Infrastructure as code is real, and it can be reusable.
  3. The best way to learn Chef is to use Chef, and I had a goal to know Chef well enough to teach it to new users and companies.

The development of general purpose cookbooks ended up being harder than any of us really imagined, I think. Every platform is different, so not only did I have to learn Chef, I had to learn how different platforms behave for common (and uncommon) pieces of software in web operations stacks. Over the years of managing these cookbooks, I learned a lot about how the community was developing workflows for using Chef, and how they differed from our opinions. I also learned how to manage and contribute to open source projects at a rather large scale, and how to have compassion and empathy for new or frustrated users.

Training and Services

In my time at CHEF, née Opscode, I’ve had several job role changes. After several months of working on cookbooks, I added package and release management (RIP, apt.opscode.com) to my repertoire. I then switched to technical evangelism and training. With mentorship from John Willis, I drafted the initial version of Chef Fundamentals, and delivered our inaugural training class in Seattle.

I worked with the team John built to deliver training, speak at conferences, and work directly with customers to help make them successful with Chef. Eventually, John left the company to build an awesome team at Enstratius. I took on the role of Director of the team, but eventually I discovered that the management track was not the future of my career.

Open Source and Community

I came back to working on the cookbooks, which I had previously split into separate repositories. I was also working more directly in the community, doing public training classes only (our consulting team did private/onsite classes), participating in our IRC channels and mailing lists. We had some organization churn, and I was moved around between four different managers, eventually reporting to the inimitable Nathen Harvey.

During one of our 1-1 discussions, he said, “You know, Joshua. You write a lot of cookbooks to automate infrastructure. But you haven’t actually worked on any infrastructure in years. You should do something about that.”

Around that time, there was a “senior system administrator” job posting on our very own careers site. I talked to our VP of Operations, and after a brief transition period, moved completely over to the ops team. I was able to bring with me the great practices from the community for developing cookbooks: testing with chefspec and serverspec, code consistency with rubocop and foodcritic, and wrapping it all up with test kitchen.

The Future

I’ve had the privilege to do work that I love, which is automating hard problems using Chef. I’ve also had the privilege of being part of the web operations, infrastructure as code, devops, and Chef communities over the past six years. I’ve been to all four Chef summits, and all three ChefConfs. A thing I’ve noticed over the years is that many conversations keep coming up at the summits and ChefConf. Fresh on my mind because the last summit was so recent is the topic of cookbook reusability. See, during the time that I managed opscode/cookbooks, I eventually saw the point people in the community were making about these being real software repositories that need to be managed like other complex software projects. We split up the repository into individual repositories per cookbook. We started adding test coverage, and conforming to consistency via syntax and style lint checking. That didn’t by itself make cookbooks more reusable, but it lowered the barrier of contribution, which in turn made them more reusable as more use cases could be covered. I got to be a part of that evolution, and it’s been awesome.

While using Chef is one of my favorite technical things to do, I have come to the conclusion that based on my experience the best thing I can do is be a facilitator of stronger technical discipline with regard to using Chef. Primarily, this means improving how CHEF uses Chef to build Chef for our community and customers. We’re already really good at using Chef to build Chef (the product), and run Hosted Chef (the service). However, awesome tools from the community such as Test Kitchen, Berkshelf, ChefSpec, and Foodcritic did not exist when we started out. Between new, awesome tools, and growing our organization with new awesome people, we need to improve on getting our team members up to speed on the process and workflow that helps us deliver higher quality products.

That is why I’m moving into a new role at CHEF. The sixth year marks as good a time as any to make a change, and I’m no stranger to that. I’m joining a team of quality advocacy led by Joseph Smith, as part of Jez Humble’s “Office of Continuous Improvement and Velocity.” In this new role, I will focus on improving our overall technical excellence so we can deliver better products to our community and customers, and so we can have awesome use cases and examples for managing Chef Server and its add-ons at scale.

My first short term goal in this new role is a workstation automation cookbook that can be used and extended by our internal teams for ensuring everyone has a consistent set of tools to work on the product. This will be made an open source project that the community can use and extend as well. We’ll have more information about this as it becomes “a thing.”

Next, I want to improve how we work on incidents. We’ve had sporadic blog posts about issues in Hosted Chef and Supermarket, and I’d like to see this get better.

I’m also interested in managing Chef Server 12 clusters, including all the add-ons. Recently I worked on the chef-server-cluster cookbook, which will become the way CHEF deploys and manages Hosted Chef using the version 12 packages. In the earliest days of opscode/cookbooks, I maintained cookbooks to set up the open source Chef Server. Longtime users may remember the “chef solo bootstrap” stack. Since then, CHEF has continued to iterate on that idea, and the “ctl” management commands largely use chef-solo under the hood. The new cookbook combines and wraps up manual processes and the “ctl” commands to enable us, our community, and our customers to build scalable Chef Server clusters using the omnibus packages. The cookbook uses chef-provisioning to do much of the heavy lifting.

It should be easy for organizations to be successful with Chef. That includes CHEF! My goal in my new position is to fuel the love of Chef internally and externally, whip up awesome, and stir up more delight. I also look forward to seeing what our community and customers do with Chef in their organizations.

Thank you

I’d like to thank the friends and mentors I’ve had along this journey. You’re all important, and we’ve shared some good times and code, and sometimes hugs. It’s been amazing to see so many people become successful with Chef.

Above all, I’d like to thank Adam Jacob: for the opportunity to join in this ride, for inspiration to be a better system administrator and operations professional, for mentorship along the way, and for writing Chef in the first place. Cheers, my friend.

Here’s to many more years of whipping up awesome!

Chef Reporting API and Resource Updates

Have you ever wanted to find a list of nodes that updated a specific resource in a period of time? Such as “show me all the nodes in production that had an application service restart in the last hour”? Or, “which nodes have updated their apt cache recently?” For example,

% knife report resource 'role:supermarket-app AND chef_environment:supermarket-prod' execute 'apt-get update'
execute[apt-get update] changed in the following runs:
app-supermarket1.example.com 2230cf30-6d95-4e43-be18-211137eaf802 @ 2014-10-07T14:07:03Z
app-supermarket2.example.com c5e4d7bf-95a6-4385-9d8e-c6f5617ed79b @ 2014-10-07T14:14:04Z
app-supermarket3.example.com c4c4b4bb-91b6-4f73-9876-b24b093c7f1e @ 2014-10-07T14:09:54Z
app-supermarket4.example.com 3eb09034-7539-4a3c-af6d-5b01d35bc63f @ 2014-10-07T13:31:56Z
app-supermarket5.example.com aa48c1d3-da91-4031-a43d-582a577cbf2d @ 2014-10-07T13:35:15Z
Use `knife runs show` with the run UUID to get more info

I have released a new knife plugin to do that, but first some background.

At CHEF, we run the community’s cookbook site, Supermarket. We monitor the systems that run the site with Sensu. The current infrastructure runs instances on Amazon Web Services EC2, with an Elastic Load Balancer (ELB) in front of them. As a corrective action for a Supermarket outage, CHEF’s operations team added a new check for elevated HTTP 500 responses from the application servers behind the ELB. One thing we found was that when Supermarket was deployed, and the unicorn server restarted, we would see elevated 500s, but the site often wouldn’t actually be impacted.

The Sensu check is run from a “relay” node. That is, it isn’t run on the application servers or the Sensu server – it’s run out of band since it’s for the ELB. One might imagine we could have similar checks for other services that aren’t run on “managed nodes,” but that’s neither here nor there. The issue is that we get an alert message that looks like this:

Sensu Alerts  ALERT - [i-d1dfd5d9/check-elb-backend-500] - CheckELBMetrics CRITICAL: supermarket-elb; Sum of HTTPCode_Backend_5XX is 2538.0. (expected lower than 30.0); (HTTPCode_Backend_5XX '2538.0' within 300 seconds between 2014-08-19 13:33:36 +0000 to 2014-08-19 13:38:36 +0000) [Playbook].

The first part, [i-d1dfd5d9/check-elb-backend-500] is the node name and the check that alerted. The node name here is the monitoring relay that runs the check, not the actual node or nodes where Supermarket was deployed and restarted. This is where Chef Reporting comes into play. In Chef Reporting, we can view information about recent Chef client runs, which gives us a graph like this.

If we go look at the reports in the Chef Manage console, we can drill down to something like this.

This shows that unicorn was restarted in this run. That’s great, but if I’m getting this alert at a time when I’m not particularly coherent (e.g., 2 AM), I want a command in a playbook that I can run to get more information quickly without having to log into the webui and click around imprecisely. CHEF publishes a knife-reporting gem that has a couple of handy sub-commands to retrieve this run data. For example, we can list runs.

% knife runs list
node_name:  i-3022aa3b
run_id:     9eccd8f6-876b-4a57-87ac-0b3e7b7ef1e7
start_time: 2014-08-21T17:03:56Z
status:     started

node_name:  i-a09424a8
run_id:     f2b7871a-149b-4fd3-abdc-d74a838d719a
start_time: 2014-08-21T17:00:23Z
status:     success

Or, we can display a specific run.

% knife runs show eecb04fb-11df-438a-8e81-dd610eb66616
run_detail:
  data:
  end_time:          2014-08-20T17:50:12Z
  node_name:         i-9f22aa94
  run_id:            eecb04fb-11df-438a-8e81-dd610eb66616
  run_list:          ["role[base]","role[supermarket-app]"]
  start_time:        2014-08-20T17:45:37Z
  status:            success
  total_res_count:   261
  updated_res_count: 17
run_resources:
  cookbook_name:    supermarket
  cookbook_version: 2.7.2
  duration:         209
  final_state:
    enabled: false
    running: true
  id:               unicorn
  initial_state:
    enabled: false
    running: true
  name:             unicorn
  result:           restart
  type:             service
  uri:              https://api.opscode.com/organizations/supermarket/reports/org/runs/eecb04fb-11df-438a-8e81-dd610eb66616/15

This is handy, but a little limited. What if I want to display only the runs containing the service[unicorn] resource?

That’s where my knife-report-resource plugin helps. At first, it was very much specific to finding unicorn restarts on Supermarket app servers. However, I wanted to make it more general purpose as I think people would want to be able to find when arbitrary resources were updated. This is how it works:

  1. Query the Chef Server for a particular set of nodes. For example, 'role:supermarket-app AND chef_environment:supermarket-prod'.
  2. Get all the Chef client runs for a specified time period up until the current time. By default, it starts from one hour ago, but we can pass an ISO8601 timestamp.
  3. Iterate over all the runs looking for runs by the nodes that were returned by the search query, gathering the specified resource type and name.
  4. Display some nice output with the node’s FQDN, the run’s UUID, and a timestamp.
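The filtering in steps 2 through 4 can be sketched in plain Ruby. This is not the plugin's source, only an illustration under stated assumptions: the `Run` and `Resource` structs, the `runs_with_resource` and `report_lines` helpers, and the sample data are hypothetical stand-ins for what the Chef Server search and the Reporting API return, and comparing against each run's end time is my assumption about how the time window is applied.

```ruby
require 'time'

# Hypothetical, simplified stand-ins for Reporting API run data.
Resource = Struct.new(:type, :name)
Run = Struct.new(:node_name, :run_id, :end_time, :resources)

# Select runs that belong to one of the searched nodes, ended at or after
# `since` (default: one hour ago), and updated the resource type[name].
def runs_with_resource(runs, node_names, type, name, since = Time.now.utc - 3600)
  runs.select do |run|
    node_names.include?(run.node_name) &&
      Time.iso8601(run.end_time) >= since &&
      run.resources.any? { |r| r.type == type && r.name == name }
  end
end

# One line per matching run: node name, run UUID, and timestamp.
def report_lines(runs)
  runs.map { |run| "#{run.node_name} #{run.run_id} @ #{run.end_time}" }
end

# Made-up sample data; node names and run ids are illustrative only.
runs = [
  Run.new('app1.example.com', 'run-a', '2014-10-07T14:07:03Z',
          [Resource.new('execute', 'apt-get update')]),
  Run.new('app2.example.com', 'run-b', '2014-10-07T12:00:00Z',
          [Resource.new('execute', 'apt-get update')]),
  Run.new('db1.example.com', 'run-c', '2014-10-07T14:10:00Z',
          [Resource.new('execute', 'apt-get update')]),
  Run.new('app1.example.com', 'run-d', '2014-10-07T14:20:00Z',
          [Resource.new('service', 'unicorn')])
]

since = Time.iso8601('2014-10-07T13:30:00Z')
searched_nodes = %w[app1.example.com app2.example.com]
matches = runs_with_resource(runs, searched_nodes, 'execute', 'apt-get update', since)
puts report_lines(matches)
```

In this sample only the first run is reported: the second is outside the time window, the third belongs to a node the search did not return, and the fourth updated a different resource.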

From the earlier example:

% knife report resource 'role:supermarket-app AND chef_environment:supermarket-prod' execute 'apt-get update'
execute[apt-get update] changed in the following runs:
app-supermarket1.example.com 2230cf30-6d95-4e43-be18-211137eaf802 @ 2014-10-07T14:07:03Z
app-supermarket2.example.com c5e4d7bf-95a6-4385-9d8e-c6f5617ed79b @ 2014-10-07T14:14:04Z
app-supermarket3.example.com c4c4b4bb-91b6-4f73-9876-b24b093c7f1e @ 2014-10-07T14:09:54Z
app-supermarket4.example.com 3eb09034-7539-4a3c-af6d-5b01d35bc63f @ 2014-10-07T13:31:56Z
app-supermarket5.example.com aa48c1d3-da91-4031-a43d-582a577cbf2d @ 2014-10-07T13:35:15Z
Use `knife runs show` with the run UUID to get more info

Then, we can drill down further into one of these runs with the knife-reporting plugin.

% knife runs show 2230cf30-6d95-4e43-be18-211137eaf802
run_detail:
  data:
  end_time:          2014-10-07T14:07:03Z
  node_name:         i-d7fed0df
  run_id:            2230cf30-6d95-4e43-be18-211137eaf802
  run_list:          ["role[base]","role[supermarket-app]"]
  start_time:        2014-10-07T14:03:59Z
  status:            success
  total_res_count:   271
  updated_res_count: 12
run_resources:
  cookbook_name:    chef-client
  cookbook_version: 3.6.0
  duration:         99
  final_state:
    enabled: true
    running: false
  id:               chef-client
  initial_state:
    enabled: true
    running: true
  name:             chef-client
  result:           enable
  type:             runit_service
  uri:              https://api.opscode.com/organizations/supermarket/reports/org/runs/2230cf30-6d95-4e43-be18-211137eaf802/0
...
  cookbook_name:    supermarket
  cookbook_version: 2.11.0
  duration:         8506
  final_state:
  id:               apt-get update
  initial_state:
  name:             apt-get update
  result:           run
  type:             execute
  uri:              https://api.opscode.com/organizations/supermarket/reports/org/runs/2230cf30-6d95-4e43-be18-211137eaf802/5

Hopefully you find this plugin useful! It is a RubyGem, and is available on RubyGems.org, and the source is available on GitHub.