Razor OSS #2 : Public Release of MK

Shortly after the open-source release of Razor in late May, we started to see requests from users for more information about the Microkernel that is used by Razor.  What distribution was used as the basis for this Microkernel?  What services does it provide?  What is involved in building a custom Microkernel that will support my hardware?  Will the Microkernel be open-sourced as well?  If so, will it be part of the Razor project or a separate (but related) project?

At the same time that these requests started coming in from the Razor community, we saw our first Razor issue that was directly linked to the Razor Microkernel.  The issue involved a network card that we hadn’t seen before (the Broadcom NetXtreme II card): the Microkernel wasn’t checking in with the Razor Server on machines that used this card because it couldn’t connect to the underlying network.  In the end, the issue turned out to be that the firmware needed to support this network card was not included in the Microkernel (even though firmware for this card that would work with our Microkernel was readily available).

Our intention all along was to make this project publicly available, but we still hadn’t worked out the last remaining issues around automating the build of a Microkernel ISO from the Razor Microkernel project itself (at the time of the Razor release late last month, the process of building a new version of the Microkernel ISO “from source” was still fairly tedious, manual, and error-prone).  We have finally resolved the last of these issues, and are proud to announce that the Razor-Microkernel project is now publicly available as an open-source project that is hosted by Puppet Labs.  The source code for this project is freely available under a GPLv2 license (which is the license that governs the underlying Linux kernel that the Razor-Microkernel project is based on), and the project itself can be found here.

Given the general interest in the Razor Microkernel, Nick Weaver also asked me if I would be interested in guest writing a blog post that provides users with a bit more background about it (what it is, how Razor uses it, and how it can be customized).  At the end of this post there are links to pages on the Razor-Microkernel Project Wiki, where you can find more in-depth information about the Microkernel (including a guide that will help you if you decide that you would like to build your own version of the Razor Microkernel to support your special needs).

A personal introduction

Perhaps I should start out by introducing myself (since most of you don’t know me).  My name is Tom McSweeney, and I’m one of the co-creators of Razor.  While Nick was primarily focused on developing the Razor Server over the past few months, my primary focus has been on developing the Razor Microkernel (and helping out with the development of the Razor Server in my spare time).  In terms of my background, I’ve been working as a Sr. Technologist in the Office of the CTO at EMC for about 5 years now.  Prior to that, I worked for a number of years as a software architect in one of the Java Centers at Sun Microsystems.  Overall, I’ve spent many years designing, developing, and deploying large-scale software systems for a variety of platforms, from servers to back-end telecommunications gear to embedded systems (and even handsets).  In almost every case, discovery and provisioning of devices across the network was one of the harder (and in many cases critical) issues that had to be resolved.

A bit of history

Last fall, as part of an internal project at EMC, Nick and I were tasked with selecting a framework that could be used for power-control and bare-metal provisioning of servers in a modern datacenter environment.  Ideally, the framework that we selected would support both bare-metal and “virtual bare-metal” provisioning, but we knew that provisioning an OS onto physical hardware was an absolute necessity for the use cases being considered for that project.  After fighting with several existing frameworks that each claimed to have already solved this problem for us (including Baracus and Crowbar), we decided that it would probably be easier to build our own framework for this task than to try to make one of the existing frameworks do what we needed them to do.

Once we started putting together a design for our own solution (the solution that would eventually become Razor), one of the first issues that we had to resolve was exactly how we would discover new nodes  in the network (so that the Razor server could start managing them).  Whatever tool we used for node discovery would have to be able to provide the Razor Server with a view into the capabilities of those nodes (either physical or virtual) so that the Razor Server could use that meta-data to decide exactly what it should do with the nodes that were discovered.

After discussing alternatives, we decided that the best approach would be to use a small, in-memory Linux kernel for this node discovery process.  There are a number of alternatives available today when it comes to small, in-memory Linux kernels (Damn Small Linux, SliTaz, Porteus, and Puppy Linux all come to mind), so we narrowed our choices down to just distributions with a total size smaller than 256MB (to speed up delivery of the image to the node) that were under active development, and that included a relatively recent Linux kernel (i.e. distributions built using a v3.0.x Linux kernel).  As an additional constraint, we knew that we would be using Facter during the node discovery process, so we searched for distributions that included pre-built versions of Ruby and the system-level commands that Facter uses (like dmidecode).  Finally, we knew that we would want to build custom extensions for our Microkernel (perhaps even commercial versions of these extensions) so we looked at distributions that provided an easy mechanism for building custom extensions and that were licensed under a “commercial friendly” open-source license.

Once we applied all of these constraints to the various distributions that we were comparing, there was one distribution that clearly stood out on the list, and that distribution was Tiny Core Linux.  Tiny Core Linux (or TCL) easily met all of our constraints (even a few constraints that we hadn’t thought of initially):

  1. TCL is very small (the “Core” distribution is an ISO that is only 8MB in size) and is designed to run completely in memory  (the default configuration assumes no local storage exists and only takes up about 20MB of system memory when fully booted)
  2. TCL is built using a (very) recent kernel (as of this writing, the latest release of TCL uses a v3.0.21 Linux kernel, and that release was posted less than two weeks ago), so we knew that it would provide support for most of the hardware that we were likely to see.
  3. TCL can easily be extended (either during the boot process or dynamically, while the kernel is running) by installing TCL Extensions (which we will call TCEs for short).  An extensive set of pre-built TCEs are available for download and installation (including Ruby).  The complete set of extensions can be found here.
  4. It is relatively simple to build your own TCE mirror, allowing for download and installation of TCEs from a local server (rather than having to pull down the extensions you need across the network).
  5. Tools exist to build your own TCEs if you can’t find a pre-built TCE for a package that you might need.
  6. The licensing terms under which TCL is available (GPLv2) are relatively “commercial friendly”, allowing for later development of commercial extensions for the Microkernel (as long as those extensions are not bundled directly into the ISO).  This would not be the case if a distribution that used a GPLv3 license were used instead.

Now that we had selected the distribution that we were going to use to build our Microkernel, it was time to turn our attention to the additional components that we would be deploying within that distribution to support the node discovery process.

Components that make up the Razor Microkernel

In order to successfully perform node discovery, a number of standard TCL extensions (and their dependencies) are installed during the Microkernel boot process:

  • ruby.tcz – an extension that provides everything needed to run Ruby (v1.8.7) within the Microkernel; all of the services written for the Microkernel are Ruby-based services, and this package provides the framework needed to run those services (and the classes they depend on).
  • bash.tcz – an extension containing the ‘bash’ shell; installed in case the ‘bash’ shell is needed (out of the box, only the ‘ash’ shell is provided by the TCL “Core” distribution)
  • dmidecode.tcz – an extension containing the dmidecode UNIX command; this command is used by Facter (and, as such, by the Microkernel Controller) during the node discovery process
  • scsi-3.0.21-tinycore.tcz – an extension that provides the tools, drivers, and kernel modules needed to access SCSI disks; without this extension any SCSI disks attached to the node are not visible to the Microkernel
  • lshw.tcz – an extension containing the lshw UNIX command; this command is used by the Microkernel Controller during the discovery process
  • firmware-bnx2.tcz – an extension that provides the firmware files necessary to access the network using a Broadcom NetXtreme II networking card during the system boot process; without this extension the network cannot be accessed using this type of NIC (which is fairly common on some newer servers).
  • openssh.tcz – an extension containing the OpenSSH daemon; this extension is only included in “development” Microkernel images, on a “production” Microkernel image this package is not included (to prevent unauthorized access to the underlying systems via SSH).

These extensions (which we’ll refer to as the “built-in extensions”) are set up to automatically install during the boot process and, as such, are readily available during the Microkernel setup and initialization process.
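As a rough illustration of how the set of built-in extensions differs between “development” and “production” images, the list above could be assembled along the following lines.  This is only a sketch in Python (the Microkernel’s own services are written in Ruby, and the real ISO build is driven by the project’s scripts); the function name `builtin_extensions` and the `development` flag are hypothetical, and only the extension file names are taken from the list above.

```python
# Illustrative sketch only; not the project's actual build logic.
# The extension file names come from the list above; the function name
# and the "development" flag are assumptions made for this example.

COMMON_EXTENSIONS = [
    "ruby.tcz",
    "bash.tcz",
    "dmidecode.tcz",
    "scsi-3.0.21-tinycore.tcz",
    "lshw.tcz",
    "firmware-bnx2.tcz",
]


def builtin_extensions(development=False):
    """Return the built-in extensions for a given (hypothetical) image type."""
    extensions = list(COMMON_EXTENSIONS)
    if development:
        # openssh.tcz is only bundled into "development" images, so that
        # "production" images do not expose SSH access.
        extensions.append("openssh.tcz")
    return extensions


if __name__ == "__main__":
    print("production: ", builtin_extensions())
    print("development:", builtin_extensions(development=True))
```

Keeping the SSH extension behind a single flag mirrors the intent described above: a “production” image never contains the package at all, rather than containing it and leaving it disabled.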

In addition to these “built-in extensions”, the Razor Microkernel also downloads and installs a set of “additional extensions”.  These additional extensions are downloaded and installed from a TCE mirror (rather than being installed from a local directory in the Microkernel filesystem) at the end of the Microkernel boot process (rather than during the boot process).  In the current release of the Razor Microkernel, there is only one “additional extension” that might be installed during the system initialization process, an extension that installs the Open VM Tools package (this extension is only installed when the Microkernel is deployed to a VM running in a VMware-related environment).

Additional extensions can also be provided by an external TCE mirror (perhaps even by the Razor Server itself), and it is a simple configuration change (on the Razor Server) to point the Microkernel at a different TCE mirror containing additional extensions that it should install.  If additional extensions are installed from an external TCE mirror, they will be installed in addition to (not instead of) those that are installed from the internal TCE mirror after the boot process completes.
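The “in addition to, not instead of” behavior can be pictured with a small sketch.  The Python below is purely illustrative: the function name `extensions_to_install` and the extension file names (`open-vm-tools.tcz`, `custom-driver.tcz`) are hypothetical placeholders, not names taken from the Razor-Microkernel project.

```python
# Illustrative sketch only. The function name and the extension file
# names used here are hypothetical placeholders.

def extensions_to_install(internal_mirror, external_mirror=None):
    """Merge the extension lists offered by the two mirrors.

    Extensions from the internal mirror are always kept; extensions from
    an external mirror (if one is configured) are appended to that list
    rather than replacing it.
    """
    combined = list(internal_mirror)
    for extension in external_mirror or []:
        if extension not in combined:
            combined.append(extension)
    return combined


if __name__ == "__main__":
    internal = ["open-vm-tools.tcz"]   # hypothetical file name
    external = ["custom-driver.tcz"]   # hypothetical file name
    print(extensions_to_install(internal, external))
```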

As part of the Microkernel boot process, the Ruby Gems package is also installed (“from source”, using a gzipped tarfile that is bundled into the ISO itself).  Once this package is installed, several “Ruby Gems” are then installed as part of this same system initialization process.  Currently, this list of gems includes the following four gems:

  1. daemons – a gem that provides the capability to wrap existing Ruby classes/scripts as daemon processes (that can be started, stopped, restarted, etc.); this gem is used primarily to wrap the Razor Microkernel Controller as a daemon process.
  2. facter – provides us with access to Facter, a cross-platform Ruby library that is used by the Razor Microkernel to gather together many of the “facts” about the systems that it is deployed to (other “facts” are discovered using the lshw and lscpu UNIX commands).
  3. json_pure – provides the functionality needed to parse/construct JSON requests, which is critical when interacting with the Razor Server; the json_pure gem is used because it is purely Ruby based, so we don’t have to install any additional packages (like we would have to do if we were to use the more “performant”, but partly C-based, json gem instead).
  4. stomp – used by the MCollective daemon to provide external access to its agents via an ActiveMQ message queue

Which gems are actually installed is determined using a list that is “burned into the Microkernel ISO”.  The list itself is actually a part of the Razor-Microkernel project, and the gems are meant to be downloaded (from a local gem repository) during the process of building the ISO (although currently this list is used to bundle a fixed set of gems into the ISO during the Microkernel ISO build process).

The final components that make up the Razor Microkernel are a set of key services that are started automatically during system initialization.  This set of services includes the following:

  1. The Microkernel Controller – a Ruby-based daemon process that interacts with the Razor Server via HTTP
  2. The Microkernel TCE Mirror – a WEBrick instance that provides a completely internal web-server that can be used to obtain TCL extensions that should be installed once the boot process has completed. As was mentioned previously, the only extension that is currently provided by this mirror is the Open VM Tools extension (and its dependencies).
  3. The Microkernel Web Server – a WEBrick instance that can be used to interact with the Microkernel Controller via HTTP; currently this server is only used by the Microkernel Controller itself to save configuration changes it might receive from the Razor Server as part of the “checkin response” (this action actually triggers a restart of the Microkernel Controller by this web server instance), but in the future we feel that this server is also the most-likely interaction point between MCollective and the Microkernel Controller.
  4. The MCollective Daemon – this process is not currently used, but it is available for future use
  5. The OpenSSH Daemon – only installed and running if we are in a “development” Microkernel; in a “production” Microkernel this daemon process is not started (in fact, the package containing this daemon process isn’t even installed, as was noted above).

Once the system is fully initialized, the components that are running (and the connections between them) look something like this:

 

Often, when we talk about the Razor Microkernel, we’re actually referring to the Microkernel Controller that is running within the Razor Microkernel (since that’s the component that interacts directly with the Razor Server) but, as is shown in this diagram, there are actually several services that all work together to provide the full functionality of the complete Razor Microkernel.

Interactions with the Razor Server

There are two basic operations that the Microkernel Controller performs when interacting with the Razor Server:

  1. Node Checkin – during this process, the Razor Microkernel “checks in” with the Razor Server to determine what, if anything, the Razor Server would like that Microkernel instance to do
  2. Node Registration – during this process, the Razor Microkernel reports the meta-data that it has gathered about the platform it is deployed on (its node) to the Razor Server; this meta-data is then used by the Razor Server to determine what should be done with that node.

The Node Checkin process is periodic (with the timing defined as part of the Razor Server configuration).  The Razor Microkernel simply sends a “checkin request” to the Razor Server every N seconds, and the Razor Server looks for that node in the list of nodes that it is managing.  Based on what the Razor Server finds, one of three things might happen:

  1. If it finds the node, and if the information for that node looks like it is up to date, then the Razor Server sends back an acknowledge command in its reply to this checkin request (an “acknowledge” command is basically a no-op).
  2. If the node cannot be found, or if the information for that node looks like it might be out of date, then the Razor Server sends a register command back to the Microkernel in the reply to that checkin request instead (and the Microkernel will start the process of node registration).
  3. Finally, if the node needs to be transitioned to a new state (to install a new OS onto the node, for example), the Razor Server can send a reboot command back to the Microkernel in the reply to the checkin request instead (and the Microkernel will reboot immediately).
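The three outcomes above can be sketched as a small decision function.  This is a minimal Python illustration under stated assumptions, not the Razor Server’s actual (Ruby) implementation: the function name `choose_checkin_command`, the `up_to_date` and `pending_transition` fields, and the order in which the conditions are checked are all hypothetical.  Only the three command names (acknowledge, register, reboot) come from the description above.

```python
# Illustrative sketch only; the Razor Server's real logic is not shown
# in this post. Field names and the order of the checks are assumptions.

def choose_checkin_command(node_id, managed_nodes):
    """Pick the command to return in the reply to a checkin request.

    managed_nodes maps a node id to a dict describing what the server
    knows about that node (a hypothetical structure for this example).
    """
    record = managed_nodes.get(node_id)

    # Unknown node, or stale information: ask the Microkernel to register.
    if record is None or not record.get("up_to_date", False):
        return "register"

    # Node must move to a new state (e.g. an OS install): tell it to reboot.
    if record.get("pending_transition", False):
        return "reboot"

    # Nothing to do; "acknowledge" is basically a no-op.
    return "acknowledge"


if __name__ == "__main__":
    nodes = {
        "node-a": {"up_to_date": True},
        "node-b": {"up_to_date": False},
        "node-c": {"up_to_date": True, "pending_transition": True},
    }
    for node in ("node-a", "node-b", "node-c", "node-d"):
        print(node, "->", choose_checkin_command(node, nodes))
```

Checking for a missing or stale node before checking for a pending transition is a choice made for this sketch; the description above does not say which condition takes precedence.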

If the node registration process is triggered (either because the Razor Server has sent back a register command in the response to a checkin request or because the Microkernel itself has detected that the current facts gathered during the node checkin process are different from the facts that it last reported to the Razor Server), then a new set of facts for that node is reported to the Razor Server in a Node Registration request.  This set of facts contains the latest information gathered by the Microkernel Controller (using Facter, combined with information gathered using the lshw and lscpu commands).
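On the Microkernel side, the two triggers for registration can be sketched as follows.  Again, this is only a Python illustration of the logic described above (the real Microkernel Controller is a Ruby daemon); the function name `next_action`, the returned action strings, and the example fact names and values are hypothetical.

```python
# Illustrative sketch only; the function name, the action strings, and
# the example facts are assumptions made for this example.

def next_action(command, current_facts, last_reported_facts):
    """Decide what the Microkernel should do after a checkin reply.

    command             -- command received from the server
                           ("acknowledge", "register", or "reboot")
    current_facts       -- facts gathered during this checkin
    last_reported_facts -- facts most recently sent to the server
                           (None if nothing has been reported yet)
    """
    if command == "reboot":
        return "reboot"

    # Register if the server asked for it, or if the facts have changed
    # since they were last reported.
    if command == "register" or current_facts != last_reported_facts:
        return "register"

    return "none"


if __name__ == "__main__":
    reported = {"memorysize": "4.00 GB", "processorcount": "2"}
    changed = {"memorysize": "8.00 GB", "processorcount": "2"}
    print(next_action("acknowledge", reported, reported))  # facts unchanged
    print(next_action("acknowledge", changed, reported))   # facts changed
    print(next_action("register", reported, reported))     # server request
```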

Summary

In this posting we have described what the Razor Microkernel is and we’ve shown how the Razor Microkernel is used by the Razor Server for node discovery.  We’ve shown how the Microkernel is constructed from the Tiny Core Linux “Core” distribution (the basis of the Razor Microkernel) and also broken down the Microkernel in a bit more detail in order to show how the services running within the Microkernel are organized internally.

More Information

If you are looking for a more detailed view into the Razor Microkernel, we invite you to visit the Razor-Microkernel project page itself.  That project is now freely accessible through the Puppet Labs GitHub site, and can be found here.  This project contains the source code for the services that were described above, as well as the scripts that you need to build your own versions of the Razor Microkernel.  The project site also includes several Wiki pages that provide more detailed information about the Razor Microkernel than we can provide in a blog posting like this.  Of particular interest might be the following pair of pages:

  • An Overview of the Razor Microkernel – provides users with a high-level overview of the Razor Microkernel itself, including detailed discussions of the interactions between the Razor Microkernel and the Razor Server
  • Building a Microkernel ISO – describes the process of building your own Microkernel ISO (in detail) using the tools that are provided by the Razor-Microkernel project

Once again, we’d like to welcome you all to the new Razor-Microkernel project.  As always, comments and feedback are welcome.
