Topics include: Opscode Chef, configuration management, Ruby, Linux/Unix administration. Opinions expressed here are my own and do not reflect those of my employer (Opscode, Inc.).
I created a plugin for knife that will display a specified option from
Chef’s configuration object, Chef::Config. It operates with the
scope of the automatically detected
knife configuration file,
or by passing the -c option with a configuration file.
Show the “knife” configuration, which includes things like cloud
provider authentication. This doesn’t currently support showing
sub-keys (like knife[aws_access_key_id]).
Most commonly, Vagrant’s
Vagrantfile describes only a single VM. That’s fine, but most
environments separate functionality to different servers (e.g.,
database and web app). For this reason, Vagrantfiles can be set up for
multi-VM arrangements.
However, I’m going to describe a different use case for multiple VMs.
Testing Multiple Distributions
As a cookbook developer for a variety of platforms and platform
versions, I have to ensure changes do not break existing functionality
across supported platforms - namely current releases of Ubuntu and
CentOS (and their parents, Debian and RHEL).
Enter Vagrant’s Multi-VM Vagrantfile.
While I posted recently
about testing with VMware Fusion, I am using Vagrant more, primarily
because Opscode uses Vagrant internally:
Seth Chisamore built a multi-VM
Vagrantfile for bringing up our full stack. The specifics in his
Vagrantfile are tuned to that particular use case which is different
than mine. However, there are similar patterns that I adapted.
Vagrantfile
The Vagrantfile configures four virtual machines:
CentOS 5.7 and 6.2
Ubuntu 10.04 and 11.10
I built my VMs with veewee
templates that install Chef via the
omnibus built chef-full package.
That way I have a consistent installation that reflects what Opscode
will ship as the easiest and best supported way to install Chef.
Part of the magic of this configuration is that I’m going to reuse my
knife configuration. The Vagrantfile itself goes into my cookbook
testing Chef Repository.
Next, I’m going to describe data about the virtual machines that I’m
going to run. This is a hash of named VMs, centos5, lucid, etc. I
assign their hostname, and give them a host only IP address. I also
set an initial run list, since Vagrant will (noisily) complain if the
run list is empty in a Chef provisioner.
Note I have a base role as a holder, the actual relevant things
are in the base_redhat and base_debian roles. The details really
don’t matter, though.
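A hash along these lines would fit that description. This is only a sketch: the hostnames, IP addresses, and key names (:hostname, :ipaddress, :run_list) are assumptions for illustration, as are the VM names centos6 and oneiric. Only centos5, lucid, and the role names come from the text above.

```ruby
# Hypothetical VM data: hostnames, IP addresses, and key names are
# illustrative assumptions, not the actual Vagrantfile contents.
cookbook_testers = {
  :centos5 => {
    :hostname  => "centos5.vm.example.com",
    :ipaddress => "172.16.13.5",
    :run_list  => "role[base],role[base_redhat]"
  },
  :centos6 => {
    :hostname  => "centos6.vm.example.com",
    :ipaddress => "172.16.13.6",
    :run_list  => "role[base],role[base_redhat]"
  },
  :lucid => {
    :hostname  => "lucid.vm.example.com",
    :ipaddress => "172.16.13.10",
    :run_list  => "role[base],role[base_debian]"
  },
  :oneiric => {
    :hostname  => "oneiric.vm.example.com",
    :ipaddress => "172.16.13.11",
    :run_list  => "role[base],role[base_debian]"
  }
}

# Walk the hash the way a multi-VM Vagrantfile would, one VM at a time.
cookbook_testers.each do |name, options|
  puts "#{name}: #{options[:hostname]} at #{options[:ipaddress]} runs #{options[:run_list]}"
end
```

Keeping per-VM data in one hash means any extra tunable only needs a new key here, plus one line where the VM is configured.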
I disable the shared folder, since I’m going to use a Chef Server, and
my recipes will download what they need from remote repositories, not
my local system.
Set up some basic configuration for the box. Modify this to suit your
environment. This section is on a per-VM basis. If particular tunables
were required, I’d create additional config in the cookbook_testers
hash above, and use those values here.
Note: name will be a symbol, but only in some contexts of
execution.
Now I set up the Chef provisioner. Again, I’m using Chef with a Server
(Opscode Hosted Chef, of course). I
use the chef_server_url, and validation_client_name settings from
my knife.rb.
The nodes’ names will be NAME-cookbook-test, rather than their FQDN.
I use this with a rake task that nukes them all from orbit
consistently :).
The run list is going to be combined from the run lists defined from
the cookbook_testers hash above, and a shell environment variable,
CHEF_RUN_LIST, which is simply a comma-separated list of run list
items, similar to that used by knife bootstrap.
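The combination could be done roughly like this. It is a minimal sketch in plain Ruby, and the method name combined_run_list is hypothetical. It assumes both the per-VM run list and CHEF_RUN_LIST are comma-separated strings.

```ruby
# Combine a VM's initial run list with the items in CHEF_RUN_LIST.
# Both inputs are comma-separated strings of run list items.
# env defaults to the real environment; a Hash can be passed for testing.
def combined_run_list(initial, env = ENV)
  extra = env.fetch("CHEF_RUN_LIST", "")
  [initial, extra]
    .flat_map { |list| list.split(",") }
    .map(&:strip)
    .reject(&:empty?)
end

example_env = { "CHEF_RUN_LIST" => "recipe[apache2],recipe[apache2::mod_ssl]" }
puts combined_run_list("role[base],role[base_debian]", example_env).inspect
```

When CHEF_RUN_LIST is unset, only the initial run list is returned, so the provisioner never sees an empty run list.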
To use the Vagrantfile, I export the shell variable with the
role(s)/recipe(s) I am testing, then run vagrant up.
% export CHEF_RUN_LIST="recipe[apache2],recipe[apache2::mod_ssl]"
% vagrant up
Vagrant will bring up each VM one at a time, going through the full
cycle of provisioning. If there’s an unhandled exception that causes
Chef to exit, then Vagrant also halts execution. If vagrant up is
rerun, then Vagrant continues to the next VM. To reprovision a
failed VM, it can be specified:
% vagrant provision centos5
Without the VM name, vagrant would reprovision all the VMs. Likewise,
vagrant ssh NAME can be used to open an SSH connection to the named
VM. This is useful to reprovision a VM that failed early, while
Vagrant is continuing on with the others.
Full Vagrantfile
The Vagrantfile is split up in the earlier section, but you can see
the full thing below.
Fact: GitHub is classy. This isn’t just because
Scott Chacon works there, either. Their
handling of a security issue today was very professional. That said, I
have some words to say about the issue itself and the aftermath, and
things you as an application developer can do to help, and to avoid
this kind of problem.
Disclaimer: I’m not an application developer. I am a sysadmin with a
diverse background in operating system security, including previously
held GSEC and GCUX certifications from the SANS Institute/GIAC.
Second disclaimer: This is not an anti-Rails post. All web frameworks
need to be conscious of security, and take bug reports for security
issues seriously.
Issue
Update - Preface: I am not talking about a security vulnerability (a
la an exploit) in Rails. I am talking about a feature that allows
automatically generated code to do things that are not secure and it
is apparently on purpose and by design. This is the wrong thing to
do. Deny by default with whitelisting is the right thing to do.
There is a security issue in
Ruby on Rails. The bug
is closed, but I haven’t dug in to find out if the actual problem is
fixed.
The bug itself is very serious. It allows a malicious user to send
arbitrary parameters to a Rails application without requiring
whitelisting up front. In fact,
three months ago an issue
was opened in the Rails project to force new applications to enforce
whitelist mode by default. That bug was subsequently closed just a few
days ago. I’m not going to do an analysis of the issue itself; you can go
read the linked tickets and do further research on the issue.
Update I missed clarifying this. There is a second issue at hand.
GitHub resolved the mass assignment bug by fixing their application.
The second issue is that they had a vulnerability in their public key
form update.
Resolution and Aftermath
You can read all about the initial resolution and retrospective on the
GitHub blog.
You can also read their
follow-up post
on responsible disclosure.
The manner in which they handled the situation is a class act: they
behaved like professionals. Here’s why:
A user reported a problem with their app.
They worked with the user to resolve the problem.
The same user exploited another vulnerability to prove a point to
the Rails project, which is against the GitHub terms of service.
GitHub suspended the user’s account in accordance with their terms
of service.
As an unauthorized breach of a computer system, what Egor Homakov did
is illegal in the United States. It was also irresponsible and
unprofessional. However, his intent was not malicious. I think GitHub
did the right thing by giving him another chance in reinstating
Mr. Homakov’s account several hours after the incident.
GitHub has issued two apologies about this incident. First for the
vulnerability existing in the first place, and second for not being
clear how customers and users can responsibly disclose security
vulnerabilities. They also committed to doing a security audit of
their code base.
GitHub is classy.
Security
After the above, I feel compelled to say some more things about
security in general. You are responsible for a lot of things regarding
the web applications in your infrastructure. One of those is security,
and you should do everything you can to write stable, secure code.
That is a fact. However, web frameworks should provide sane, secure
settings by default. Those settings should be modifiable by the
end-user, the developer. If a developer wishes to disable those
controls, they totally have that right. I think that they need to
understand the potential risk that they are accepting, and what impact
that might have on the business/organization implementing the
application.
This is exactly like the default setting of Red Hat Enterprise Linux
to enable SELinux by default on new installations. Whether you love or
hate this default, it is sane and secure. System administrators can
then use the system as is, or disable SELinux if that is an acceptable
level of risk.
Clearly in this incident, it is not an acceptable level of risk for
GitHub, as they have repaired their application. It would have been better
to do that long ago, but at least it’s fixed now.
Security and convenience are quite often polar opposites and mutually
exclusive. This is not always the case, but it is true much of the
time, if not most of the time. The choice for Rails to not have
whitelisting by default is in favor of developer convenience. Yes, it
is up to the developer to make their application secure, but that was
already the case. This simply creates extra work for them to do so.
Deny-all by default is the sane, correct and secure posture to take
when building systems. This is the practice of many tools and
operating system defaults - SELinux as mentioned, or “no open ports”
per Ubuntu’s practice. You don’t have to agree with it, and you
certainly can change it, but that doesn’t change the fact that it is
correct and sane.
Vulnerability Disclosure
There is an entire field in information technology devoted to
vulnerability disclosure. This is typically done by people performing
“ethical hacking” and is one thing done when a code base goes through
a security audit. This is a field that has responsible professionals
participating in a variety of companies, and if it sounds interesting
to you, I recommend the variety of courses offered by the SANS
Institute.
First of all, understand the security guidelines and best practices
for the programming language you’re using: writing type-safe code,
avoiding buffer overflows, that kind of thing. Also
understand and follow the security guidelines and best practices for
the web framework you’re using. The Ruby on Rails project has a fairly
detailed
security guide. If
you’re taking shortcuts, understand the possible risks with that. If
you don’t know the risks, or understand the guidelines, please ask
someone in the community for help.
I strongly recommend you also learn the security guidelines and
associated best practices for the operating system or distribution
that your application will run on in production. If your organization
has operations staff, I’m sure they can help you learn and understand.
If they don’t, they’re not doing their job :).
Every organization and every application is different. The security
implications are going to vary by industry. Talk to the business
owners and find out what level of security risk they are
comfortable accepting.
Above all, be a professional. Don’t flippantly close security bugs.
Don’t be a dick on discussions about security topics.
This is awesome news for those of us who only have (or had) Xcode
installed to install RubyGems that compile native extensions, or for
installing software with Homebrew, MacPorts or similar. You can
download them by logging into the
Developer Download site.
This appears to be work started by
Kenneth Reitz
with his
OSX GCC Installer
project. I did try that project out, but ran into issues I didn’t
resolve right away, so I reverted to using Xcode proper. However with
the package from Apple I don’t seem to have any issues so far.
If you already have Xcode installed, you may want to remove it first.
My understanding is that you can remove the /Developer*
director(y|ies) when complete. I had ollllld Xcode on the system where
I first did this.
Next, download and install the package from Apple. It’s about 170M and
takes only a couple minutes to install; sorry I don’t have a Chef
recipe for this ;).
I did run into an issue with Homebrew where it wasn’t finding the
right gcc binary. I had to run the following commands to fix
that issue.
You wouldn’t think that something like an 8G installation would matter
in 2012. However, disk space is a precious commodity on MacBook Airs
and systems that have SSDs as the root volume. This is a very welcome
change for me, especially since it means that future Mac OS X
installations do not require a large download before I can start doing
things that get my
system ready to use.
It is no secret that
I use GNU Emacs
as my default text editor. It is perhaps less evident but no less
relevant that I use Emacs 24. I really like the built-in color theme
support and the package management system for getting the various
modes I like to use.
Recently, I revamped my
Emacs configuration. This
post isn’t about that topic. Instead, this post is about how I made
sure that all the systems I want to use Emacs on have the latest
version available.
Unfortunately, Emacs 24 is still unreleased, so it is not available as
the default package on the distributions I use for my personal systems
(Ubuntu/Debian flavors). I wrote a recipe to install Emacs
from source. This is easy enough to do, but since I already automate
everything on my home network with Chef, it was a natural fit for a
new recipe. I simply added this to my local emacs cookbook, in
recipes/source24.rb.
Then I updated my base role to replace recipe[emacs] with
recipe[emacs::source24] and ran Chef. It took about 25 minutes to do
the build, but now I have the same version of Emacs everywhere, and
there was much rejoicing.
And yes, you’re absolutely right, I could just build a package and
install that. However, I don’t want to set up and maintain a package
management repository for my small network, as
easy as that may be.
My OS X systems are a special case because I’m using Homebrew, but the
homebrew cookbook does not [yet?] support
install-time options, and I didn’t spend the time adding support for
building the OS X Emacs w/ cocoa support from git. When I tackle that,
I’ll make another post, so stay tuned!
Mac OS X Lion introduced a new nifty feature called AirDrop. This
allows users on a local network to drag and drop files to each other
with Finder.
While it seems that this would be useful, there are security
implications. After looking through
Google Search Results
on the topic, I found some un-helpful information in a random forum
post (unsurprising). A little more
review of the search results
resulted in finding the actual defaults(1) command to do so:
If you change a class name in your library, do a major version
change! You don’t know who is using your library, even the
undocumented parts.
Background
Recently, we had a post on the
Chef mailing list that using bluepill for
the Chef Server daemon process(es) was broken. Upon further
investigation of the output from
Chef debug output, it appeared to
be an issue with
Bluepill itself, but
after looking into that ticket, the daemons gem had made a change.
Out of curiosity, I was compelled to find out what the source of the
change in the daemons gem was. This led to a yak shave, as first, I
had to look up the daemons gem on
RubyGems.org to try and find the source. The author of the gem still
uses RubyForge rather than GitHub. That’s fine, but it means I have to
do some link-spelunking to find where the source code lives.
Now I take a look at the change log:
Release 1.1.8: February 7, 2012
rename to daemonization.rb to daemonize.rb (and Daemonization to Daemonize) to ensure compatibility.
Release 1.1.7: February 6, 2012
start_proc: Write out the PID file in the newly created proc to avoid race conditions.
daemonize.rb: remove to simplify licensing (replaced by daemonization.rb).
Release 1.1.6: January 18, 2012
Add the :app_name option for the "call" daemonization mode.
Release 1.1.5: December 19, 2011
Catch the case where the pidfile is empty but not deleted and restart
the app (thanks to Rich Healey)
I then went to the ticket tracker to find out what the source of the
changes might be. Fortunately, there was an open issue that I could
reference.
My question (which I posted to the ticket) is why wouldn’t renaming a
class cause the author to do a new major version? This way other Gems
that rely on this as a dependency could use the paranoia operator,
~>, so the broken class name wouldn’t break usage elsewhere.
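To illustrate how that operator behaves, here is a small check using the RubyGems classes that ship with Ruby. The 2.0.0 version is hypothetical: it stands in for what a major version bump of daemons would have looked like.

```ruby
require "rubygems"

# "~> 1.1" allows any 1.x release from 1.1 upward, but nothing from 2.0 on.
pessimistic = Gem::Requirement.new("~> 1.1")

# 2.0.0 is a hypothetical release number used only for illustration.
%w[1.1.6 1.1.8 2.0.0].each do |version|
  allowed = pessimistic.satisfied_by?(Gem::Version.new(version))
  puts "daemons #{version}: #{allowed ? 'allowed' : 'rejected'}"
end
```

A gem that depends on daemons with a constraint like this would keep resolving to 1.x releases, and would only pick up a renamed class when its maintainer deliberately moved to the new major version.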
I’m glad that the daemons gem author did the right thing and yanked
the broken version. Open source worked well here. The process of
finding this was a bit slower than it should have been, and I think
that the bluepill maintainers moved too quickly to “resolve” the
issue, rather than post their concern about the class naming. Kudos to
thuehlinger for fixing his gem, though.
I primarily use this to manage the VMware Fusion virtual machines I
use for testing Opscode’s Chef Cookbooks.
This post is rather light on specific details about things that are
either “common” knowledge or documented elsewhere. Particularly, I
won’t tell you how to set up the virtual machines, other than a few
notes that I think make it easier to manage virtual machines in this
way. In other words, you’re smart and can figure them out.
Install and Configure
The first step to use the fission RubyGem is to install it. If you
don’t like RubyGems, then create a package or grab the source from the
GitHub link, above.
gem install fission
Test that it is detecting your VMware Fusion VMs:
fission status
Fission has a configuration file, ~/.fissionrc, which is yaml
format. If the status command fails, you may need to configure fission
to find the vmrun command. Here’s the example from my system:
If you’re reading this, I presume you know how to install an OS in a
VMware virtual machine. I do a number of tasks during the installation
to make it easy and consistent to work with all my test VMs.
Use bridged networking with DHCP
This usually results in the least amount of hassle for connecting to
the VM without any tunneling or port forwarding tomfoolery.
Give it a simple VM name with alphanumeric characters only.
Use the same hostname during installation as the VM name
In the examples below, I use my “guineapig” system. I also have other
systems like “ubuntu1110”, “freebsd82” and “centos6”. This is the name
you’ll use to refer to the VM with fission, so it should be short,
easy to type and clearly identifiable.
Set the root password, even if the OS doesn’t use the root account
Make sure SSH as root is enabled
Some Linux distributions such as Ubuntu do not enable the root login.
This is for testing, so I really don’t care, and I can always write a
Chef recipe to lock things down (as I would in production) if
required.
I also set a simple password that I can use with -P to knife
bootstrap without the shell doing anything with special characters.
Use NTP
Install the NTP package for your operating system. The workflow here
(and the whole point really) is to make heavy use of VMware Fusion
snapshots and rollback, so it is important that the system time is
correct. I customized the bootstrap templates I use to add ntpdate.
Using Fission
And now the moment you’ve been waiting for. First, see the fission
README for full detail on the commands available. I’m going to focus
here on how I use it.
After the install and post install tasks are done, I create a
new snapshot for the VM.
% fission status
guineapig [running]
% fission snapshot create guineapig base
The name is “base” because that’s a good name for a baseline. It can be
useful to create specifically named snapshots for particular purposes.
I use Opscode Hosted Chef as my server and I already have my local
workstation set up with the validation key, a Chef repository and have
uploaded the cookbook(s) I use for testing. I’ll use “knife
bootstrap” to kick off a run on my VM:
% knife bootstrap 10.1.1.129 -x root -Pvanilla -r 'recipe[apache2]'
...
INFO: service[apache2] restarted
INFO: Chef Run complete in 44.324473 seconds
Sweet, it worked. However, if the Chef run fails, I can log in as
root, fix the bug and rerun, or whatever else may need to be done.
Then once I’m ready to reset the VM, fission comes back to play.
% fission snapshot revert guineapig base
Reverting to snapshot 'base'
Reverted to snapshot 'base'
Note that fission will power off the VM when reverting the snapshot.
Turn it on again with the start command.
% fission status
guineapig [not running]
% fission start guineapig
Starting 'guineapig'
VM 'guineapig' started
% fission status
guineapig [running]
And after logging in, we can see that apache2 is not installed as it
should not be after the snapshot is restored.
% ssh root@guineapig
root@guineapig's password:
root@guineapig:~# dpkg -l apache2
No packages found matching apache2.
The VM is now ready to do my bidding once again.
Cleanup
Note that reverting the snapshot doesn’t delete the Chef node or
client objects. Since fission is a Ruby library, a simple knife plugin
can wrap up all the fission revert, restart and Chef cleanup, though.
I called mine nukular.
% knife nukular guineapig base guineapig.int.example.com
And here’s the plugin I’m using:
require 'chef/knife'

module KnifePlugins
  class Nukular < Chef::Knife

    deps do
      require 'fission'
      require 'chef/node'
      require 'chef/api_client'
    end

    banner "knife nukular VM SNAPSHOT [NODE]"

    def run
      vm, snapshot = @name_args
      node = @name_args[2].nil? ? vm : @name_args[2]
      Fission::Command::SnapshotRevert.new(args=[vm, snapshot]).execute
      Fission::Command::Start.new(args=[vm]).execute
      Chef::Node.load(node).destroy
      Chef::ApiClient.load(node).destroy
    end
  end
end
The command-line usage takes 2 or 3 arguments. The first two must be
the VM name and the snapshot name, e.g. guineapig and base. If the
node name is different than the VM name, then specify it.
% knife nukular guineapig base guineapig.int.example.com
Note that the plugin has zero error handling or any other sensible
things. You may want to modify it before you use it. Or not, these are
just test systems after all.
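If you do want a little error handling, the argument parsing is an easy place to start. The following is a minimal sketch in plain Ruby, separate from the plugin itself. The method name nukular_args is hypothetical, and it only covers argument validation, not failures from fission or the Chef Server.

```ruby
# Split knife-style name arguments into VM, snapshot, and node name.
# The node name defaults to the VM name, as in the plugin.
def nukular_args(name_args)
  unless (2..3).cover?(name_args.length)
    raise ArgumentError, "usage: knife nukular VM SNAPSHOT [NODE]"
  end
  vm, snapshot, node = name_args
  [vm, snapshot, node || vm]
end

p nukular_args(%w[guineapig base])
p nukular_args(%w[guineapig base guineapig.int.example.com])
```

Failing early on a bad argument count avoids reverting a snapshot and then discovering there is no node name to clean up.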
Full example
Minus the output from the commands, here is the full sequence for testing
the Opscode apache2 cookbook on my guinea pig. Assume that all the
required things from my Chef Repository have been uploaded to the Chef
Server and the knife configuration is correct. Also, the chef-full
bootstrap template specified here is a customized version of the
template in the
Chef source master branch template
that has ntpdate -u pool.ntp.org in it.
Earlier this month, I completed a
switch to DNSimple for my
domain’s DNS provider. I am still happy with the switch, and finally,
just now, got around to writing a recipe to have my systems
automatically register themselves in DNS.
In the post, I described automatically adding the DNS entries with the
dnsimple cookbook.
I did this as a proof of concept, but I didn’t add it to all my nodes,
instead using my existing data bag-driven solution.
That said, this post serves as a brief document on how you can mimic
this behavior with your own environment.
Encrypted Data Bag
I put my DNSimple credentials in an
encrypted data bag.
Since I have to decrypt and read the entire thing anyway, I also store
the relevant data there. I keep my encrypted data bags in a bag called
secrets. The structure looks like this:
{
  "id": "dnsimple",
  "api_token": "DNSimple API Token Here",
  "domain": "your-domain.example.com",
  "username": "DNSimple username",
  "password": "DNSimple password"
}
Replace the values with your values. Encrypting the data is optional,
but requires that you create a secret key or key file. Read my
previous post on the topic
for more information.
The first command just uploads the data bag item; the second encrypts
it. Note that I manage my workstation with Chef, so I will use the
same secret file as the Chef default. The secret file needs to be
copied to each system that will need it.
As
mentioned in the previous blog post,
the encrypted_data_bag_item method is in a library. Either add that
library to your cookbook, or use the class directly.
If you’re not using an encrypted data bag, then the item can be
accessed with the normal method:
dnsimple = data_bag_item("bagname", "dnsimple")
The real work happens in the dnsimple_record LWRP, which will add an
“A” record for the system running the recipe. Note that the actual
entry is going to use the int subdomain, and it will use the domain
stored in the data bag item. It also will use the default IP address
of the node, which means the IP for the interface with the default
route.
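The values that end up in the record can be sketched in plain Ruby. This is not the dnsimple_record resource itself: the method name a_record_for and the hash keys are hypothetical, chosen only to show how the hostname, the int subdomain, the data bag domain, and the node's default IP address fit together.

```ruby
# Build the values for an "A" record from node data and the data bag item.
# Key names here are illustrative, not the LWRP's actual attributes.
def a_record_for(hostname, ipaddress, item)
  {
    :name    => "#{hostname}.int",   # record lives under the int subdomain
    :domain  => item["domain"],      # domain comes from the data bag item
    :type    => "A",
    :content => ipaddress            # the node's default IP address
  }
end

# Example values only; the IP address and hostname are made up.
item = { "id" => "dnsimple", "domain" => "your-domain.example.com" }
record = a_record_for("guineapig", "10.1.1.129", item)
puts "#{record[:name]}.#{record[:domain]} #{record[:type]} #{record[:content]}"
```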
A new “which tool is best” battle is raging in the internets amongst
developers and system administrators. The contestants are screen
and tmux, and the
jury is still out.
This is very much an argument over what color to paint the bikeshed,
but with the latest version of
iTerm2, I think tmux is even more
compelling. Personally, I chose tmux a while ago.
At my day job, I worked with a customer that
uses tmux for remote pairing between developers. At the time, tmux had
better customizability, and better split-pane support (screen didn’t
yet have vertical split). I stuck with tmux ever since, and was very
pleased when an iTerm2 update announced integration with tmux.
iTerm2
For those who aren’t aware, iTerm2 is an alternate terminal program
for Mac OS X. It is actually an updated codebase from the original,
iTerm, which is effectively
unmaintained. iTerm2 offers
a lot of excellent features like split panes, Growl support, and
many more.
One of the excellent new features is integration with tmux.
iTerm2’s tmux integration
If you already have iTerm2 installed, you may have seen the update
check prompt you to update. You also need to install a special version
of tmux that has the integration patched. The iTerm2 author is working
with the tmux author to get this into the latest tmux codebase, so
hopefully the custom compiled version won’t be necessary soon.
Using the new feature is relatively straightforward. Start up iTerm2
like normal. Then run tmux -C to open a new iTerm2 window that works
like tmux.
Use the tmux menu in iTerm2 to open new windows in tmux. Note that there
are keyboard shortcuts for each of these, and they are not the same as
the tmux window commands.
You can also attach to a tmux session running in iTerm2. In the
screenshot, this is running on the same system, for example purposes.
However, since OS X has SSH, this can be useful if you want to SSH to
another system in the local network and connect to the running
session. For example, the system shown below is my wife’s iMac over
screensharing, but I wouldn’t need to use screenshare (or participate
in its lag) to connect to this anymore. The same holds true for
connecting to my work laptop if necessary.
In this final example screenshot, you can see that I have multiple
panes split in one iTerm2 tab. These correspond to the split windows
in the attached tmux in the other window. Also, the two tabs in the
iTerm2 window are separate tmux windows (0:zsh and 1:zsh).
And now, I can SSH to that system and attach to the tmux session
started by iTerm2.
Automating Installation with Chef
Installing OS X apps is quite easy, but I automate them
with Chef
anyway. While it is a simple “install update and restart”, with a
couple commands to install the update, I do have three systems I want
this on. I updated my
iterm2 cookbook to
support installing the tmux integration for iTerm2. This is disabled
by default, so it needs to be enabled via a node attribute. For
example, I have this in my workstation role applied to my OS X
workstations.
name "workstation"
description "Mac OS X workstations"
run_list("recipe[tmux]")
default_attributes(
  "iterm2" => {
    "tmux_enabled" => true
  }
)
Check out the iterm2 cookbook’s README for more information.