Check out that title. Pretty awesome way to sound smart, right? Well, this blog post is another one of my long-winded ones and concerns my recent six-week side project. So a little warning in advance: this is a long read and a mind-bender in spots. Have a hot or cool drink and some time before you start. I think you will enjoy the ending.
I am a firm believer that virtualization and cloud computing are creating new paradigms to approach innovation, operation, and execution within information technology. I find myself inspired by ideas and concepts that would be impossible before the advent of virtualization as a common approach to logical abstraction of x86 compute, storage, and networking. In my feeble mind, I see endless possibilities, not only in automation but also in creating intelligent systems that are able to respond in a way much more organic than we may have thought possible.
It is from this belief that this new idea came to me. The lifecycle of applications and infrastructure has been both a very manual and a managed process. Creation, changes, and death (decommissioning) are all things that can be automated, but they require prerequisite knowledge to orchestrate correctly. You would specifically know the quantity, scope, and configuration of physical or virtual servers prior to building for an application. Likewise, configured settings and metadata for the application would have been tested and discovered through intense integration and regression cycles by development/quality teams beforehand. All of this would be wrapped around processes and models (ITIL, COBIT) with the goal of ensuring control and accountability.
And I am not proposing that the model above is inherently wrong. Rather, I just think that it may be possible we are missing opportunities for the infrastructure to work for us.
This idea has actually been with me for a while. The ability for virtualization to convert the paradigm of a rigid physical infrastructure in the form of a server, into a logical construct known as the virtual server opens a wealth of options. The lifecycle of the virtual server is something much different. Unlike the physical server, it can be created, modified, and decommissioned without physical access and through multiple methods. Even more interesting is the ability to clone copies of virtual machines with the *children* possessing the same makeup as the *parent*. I look at this and see a similar model to common creatures in our world around us. An animal is a product of its parents. It is a genetic clone with mixed attributes and minor mutations that make it unique.
So the question in my mind is:
If I took a virtual machine, used the ability to clone as a basis, and added the ability to spawn generations that not only inherit attributes from the parent generations but also introduce mutations, what would I have?
Would I be able to see virtual machines become more sophisticated or efficient over time? How would they evolve?
But then I ran into a brick wall. Random mutations and multiple generations of virtual machines would, in the end, produce nothing. There is no goal, no end game. This would result in nothing but virtual white noise.
Then my thought process went even further. Creatures in our world are exposed to and live within an environment. Just as an environment enforces life and death, favoring the strong and killing the weak, what if I could create an environment in which certain virtual machines were allowed to thrive and others not, based on random mutations within each generation? I could contrive a goal, using either a single linear metric on which every generation is graded or even a multidimensional map that would judge each virtual machine.
And that is how I came up with the idea for Virtual Selection. It is simply taking the unique constructs that represent virtualization and applying the model of real-world Natural Selection, with the object of allowing for unassisted evolution of virtual machines towards a goal.
But to create this I first had to decide on a few strict rules to follow:
1. The creatures (virtual machines) would have no ability to know the goal.
They must only produce an asymmetric result without any input from the environment and only a genetic map from their parent. Mutation must be random and varied in both rate and effect.
2. The environment would have no ability to enforce a result other than selection by reproduction.
The environment would choose which virtual machines procreate and which do not based on how well their genetics compared against an *ideal* synthetic genetic code. It cannot change the mutation rate, specific attributes, or anything *within* the creature.
3. I would use as little code as possible and as much of the virtualization construct as possible.
This entire process can be, and has been, done with logical code constructs. But what is important about the test is that our goal is not just an abstract test. It is an idea that a virtual machine can become *better* by applying Natural Selection concepts to it. If I wrote a bunch of code that did this all in memory, I would have made it irrelevant to a virtual machine model and ultimately a cloud construct. This test must be useful against a virtual machine that would include an operating system, settings, and applications. The idea is to prove that this is possible in our virtual infrastructure world.
Now I had some real problems to solve. First, how would I represent a synthetic genetic map that was inheritable, mutable, and something my environment could evaluate? How would I be able to have this exist inside a virtual machine? And more importantly: how could I make this genetic map something someone could observe as an evolution through generations?
The answer came to me while staring at a restaurant menu. It is both simple and perfect. I should use an image. An image is a two-dimensional map of pixels, each of which takes one of a set of possible values. This is inheritable, as each generation can inherit the image of its parent. This is mutable, as individual pixels can mutate within the image for each generation. This can be evaluated by the environment using a *goal* image which represents the ideal genetic map for the environment. And this is easily relatable as a progression to outside viewers, as images are a visual representation of information.
I still had to build the pieces but I had the overall design ready in my mind. I would have creatures as virtual machines. These would have a genetic map that they would take on boot and build their version. I would have an environment as an orchestration application that would evaluate each creature and choose which creatures would be allowed to replicate for the next generation. And all of this would run on a lab environment running VMware’s vSphere 4.1 and using the vSphere Web SDK.
I will go through each section below and talk about the design behind each piece and the lessons learned. And I will finish this post with both the results and some ideas on how this can be further extended into business relevant ideas.
To build my creatures I had to think in small terms. This entire test was to run on my single Intel Core i7, 12GB of RAM, and SSD-based system. That is not a massive amount of power, which meant I had to be as frugal as possible. After starting with several Ubuntu builds I realized I needed smaller and faster VMs. I needed to run some generations (more on that below) in quantities of 32-64+ creatures at a time, and using 6GB creatures was too much for my poor SSD.
I ended up using Embedded Debian (Embedian) to build an 800MB creature with all the components I needed. This included PHP for all the genetic functionality and VMware tools (200MB+) to allow the environment to manage and maintain unique identification of the genetic models. I also ended up utilizing Linked-Clones to enable greater space savings and quicker cloning.
Within the creature itself I used PHP to load up the genetic map (image) and replicate using randomized mutation rates. Instead of a preconfigured rate of mutation, I tested every piece of genetic code (pixel) against a mutation rate that was randomly established on creation (boot). I also wanted the scope of the mutation rate to vary somewhat, so I added algorithms that made a percentage of each generation either especially prone to mutation or free of mutation altogether. I found it was especially important to make sure that each generation would contain a few creatures with very little mutation. This would prevent any positive progress in a generation from being completely reset by unproductive mutation across all creatures.
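To make the mutation step concrete, here is a minimal Python sketch of the idea. It is not the PHP that ran inside the creatures. The function names, the share of calm and wild creatures, and the rate ceilings are all illustrative assumptions, and the genetic map is modeled as a plain list of rows of integers rather than an image file.

```python
import random

def pick_mutation_rate(rng=random, calm_share=0.1, wild_share=0.1,
                       base_max=0.01, wild_max=0.2):
    """Pick one mutation rate for a creature at 'boot'.

    All four tuning values are assumptions for illustration only.
    """
    roll = rng.random()
    if roll < calm_share:
        return 0.0                              # this creature does not mutate
    if roll < calm_share + wild_share:
        return rng.uniform(base_max, wild_max)  # especially prone to mutation
    return rng.uniform(0.0, base_max)           # ordinary, small rate

def mutate(genome, rate, levels=256, rng=random):
    """Return a child copy of the genome; each pixel is independently
    replaced by a random value with probability `rate`."""
    return [
        [rng.randrange(levels) if rng.random() < rate else pixel
         for pixel in row]
        for row in genome
    ]

# Usage: a child inherits the parent's map and mutates it blindly.
parent = [[0] * 4 for _ in range(4)]
child = mutate(parent, pick_mutation_rate())
```

Note that neither function ever sees the goal image, which is what rule #1 requires.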
The funny thing about spawning hundreds of thousands of creatures (virtual machines) is that because they inherit all the details of their parent, I had to build in resets of log files and incremental things to prevent file space shortages. I also had to deal with operating system errors that only occur once in a few thousand boots. Needless to say I had quite a stable build by the end.
The end result is a very robust small creature (virtual machine) that performs a specific function each boot and is not impacted by cloning. Each creature took what it inherited by being cloned and mutated ever so slightly and randomly to become a unique creature.
The first problem to solve was how the environment would know the uniqueness of the creatures. It must be able to evaluate each creature to determine which *thrived* against the goal genetic model. To do this I used DHCP and a private Class A range (10.x.x.x) with a one-hour lease time. Each creature would boot, obtain a unique IP, and produce its genetic map. The environment would use this IP address as a unique key for the creatures. The environment possessed several key components that allowed evaluation.
1. Genetic map drop
This was a network share that allowed each creature to drop its unique genetic map keyed with its unique identity. Since this location was asynchronous (drop only), it maintains rules #1 and #2 above. For simplicity, this is the only touch point between genetic maps and the environment.
2. Evaluation algorithm (written in C#)
This was functionality that would inventory both the entire generation of creatures (virtual machines through vSphere SDK) and the corresponding genetic maps (image files). Then it would grade against the goal comparing images and produce a map of the results.
3. Life & Death engine (also C#)
This was an orchestration layer that would be fed the map of generational results from the step above and would clone, kill, and power on the next generation. It had to manage the genetic map drop, the clone/delete/power on/power off integrations, and the linked-clones. It would also take the image comparison scores and choose who would procreate and who would not. And finally, it had to handle errors and recover should any error occur in any of the other components.
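The evaluation and selection steps in the list above can be sketched roughly as follows. This is Python rather than the original C#, and it makes two assumptions the post does not confirm: that grading counts exactly matching pixels, and that the drop can be modeled as a dictionary from creature key (the IP address) to genetic map. The names `score` and `select_parents` are hypothetical.

```python
def score(genome, goal):
    """Count the pixels that exactly match the goal image (higher is better).

    Exact matching is an assumed metric; a distance measure would also work.
    """
    return sum(
        1
        for row, goal_row in zip(genome, goal)
        for pixel, target in zip(row, goal_row)
        if pixel == target
    )

def select_parents(drop, goal, keep=2):
    """Rank every creature in the drop and return the keys of the best `keep`.

    `drop` maps a creature's unique key (for example its IP) to its genome.
    """
    ranked = sorted(drop, key=lambda key: score(drop[key], goal), reverse=True)
    return ranked[:keep]

# Usage with three tiny 2x2 creatures keyed by IP address.
goal = [[1, 2], [3, 4]]
drop = {
    "10.0.0.5": [[1, 2], [3, 4]],
    "10.0.0.6": [[1, 0], [0, 0]],
    "10.0.0.7": [[1, 2], [0, 0]],
}
winners = select_parents(drop, goal)
```

Only the environment holds `goal`; the creatures contribute nothing but their dropped maps, which keeps rule #2 intact.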
The environmental process is quite simple. First the environment L&D engine is enabled/started. It examines the existing creatures (first generation) using the evaluation algorithm. Then it chooses the best performers and clones a set of children. The size of this set is also slightly random. It has a defined minimum to make sure there is always a next generation and a defined maximum to ensure that I do not crash my workstation. The actual value is influenced by the variance of change. If there is little change from generation to generation then the number of children increases. Otherwise it will remain the same, or shrink when there is a massive amount of mutation. This serves as an environmental balance to maintain progressive mutations while not overproducing children.
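As a rough illustration of that balancing rule, the sketch below picks the next generation's size from the change in the best score. The thresholds, the doubling and halving, the jitter of plus or minus two, and the bounds of 8 and 64 are all assumptions chosen for the example; the post does not give the real values.

```python
import random

def next_brood_size(current, prev_best, new_best, minimum=8, maximum=64,
                    stall=0.001, surge=0.05, rng=random):
    """Choose how many children to clone for the next generation.

    Every numeric default here is a hypothetical tuning value.
    """
    change = abs(new_best - prev_best) / max(abs(prev_best), 1)
    if change < stall:
        target = current * 2    # little change: produce more children
    elif change > surge:
        target = current // 2   # massive change: produce fewer children
    else:
        target = current        # moderate change: hold steady
    target += rng.randint(-2, 2)  # the set size is "slightly random"
    return max(minimum, min(maximum, target))  # clamp to the defined bounds
```

For example, `next_brood_size(16, 100, 100)` lands somewhere between 30 and 34, while a stalled generation that is already at 64 children stays at 64.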
While the new children are cloned, every previous generation is destroyed. This ensures that every generation is unique and no comparison is done cross-generation. The use of linked-clones here makes the delete/clone process more difficult, but saves on space and time.
After all this is complete, the L&D engine goes idle and watches the environment for when the children have completed their cycle. Once this has happened, it evaluates this new generation and starts again from the top.
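The whole cycle can be compressed into a toy, in-memory Python loop. To be clear, this is exactly the kind of all-in-memory stand-in that rule #3 rules out for the real experiment, so treat it only as a way to read the control flow: lists replace virtual machines, a single best parent replaces the set of best performers, and `run_generations` with all its defaults is a hypothetical name and configuration.

```python
import random

def run_generations(goal, generations=200, brood=16, rate=0.02,
                    levels=4, seed=0):
    """Toy stand-in for the clone, boot, evaluate, destroy cycle."""
    rng = random.Random(seed)
    height, width = len(goal), len(goal[0])

    def fitness(genome):
        # Environment-side grading: count pixels that match the goal.
        return sum(p == t
                   for row, goal_row in zip(genome, goal)
                   for p, t in zip(row, goal_row))

    # First generation: pure white noise.
    parent = [[rng.randrange(levels) for _ in range(width)]
              for _ in range(height)]
    history = [fitness(parent)]
    for _ in range(generations):
        # "Clone and boot": each child copies the parent and mutates blindly.
        children = [
            [[rng.randrange(levels) if rng.random() < rate else p
              for p in row] for row in parent]
            for _ in range(brood - 1)
        ]
        # Keep one unmutated child so progress is never thrown away.
        children.append([row[:] for row in parent])
        # "Evaluate and select": only the environment sees the goal.
        parent = max(children, key=fitness)
        # "Destroy": the previous generation is simply dropped.
        history.append(fitness(parent))
    return parent, history

goal = [[(x + y) % 4 for x in range(8)] for y in range(8)]
best, history = run_generations(goal, generations=50)
```

Because one child per generation is an exact copy, the best score in `history` never decreases from one generation to the next.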
I chose a picture that was 100 pixels by 100 pixels with a possible grayscale range of 8 bits (256 shades). That is 10,000 pixels with 256 possible values each, so the odds of one random change hitting the right pixel with the right value are 1 in 2,560,000. In hindsight this was probably too ambitious a genetic map for my meager workstation. When I started I knew I would have no idea of the efficiency of the system until I tried to use it. I quickly realized that the closer I got to the goal, the slower it would go and the longer it would take. To complete the project I compressed the color range to compensate for the limited CPU and storage in my lab.
But even with the color range reduction I still had to find ways to optimize the process. I learned how to clone creatures as fast as possible, how to boot into Linux in half the time, and how to orchestrate vSphere in the correct order. A genetic map with 2,560,000 pixel-and-value combinations is quite a feat. However, because the learning process included the slow evolution of this genetic map, I stayed true to my goal and simply had this running 24 hours a day, 7 days a week.
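To see why shrinking the color range helps, consider a mutated pixel that draws its new value uniformly at random: it lands on the goal value with probability 1 in the number of levels. The Python sketch below shows one assumed way to bucket 8-bit values into fewer levels and compares the odds. The post does not say how the range was actually compressed or how many levels were kept, so the bucketing scheme and the choice of 4 levels are purely illustrative.

```python
from fractions import Fraction

def reduce_levels(image, levels, full_range=256):
    """Map 8-bit grayscale values onto `levels` evenly sized buckets."""
    bucket = full_range // levels
    return [[min(pixel // bucket, levels - 1) for pixel in row]
            for row in image]

def match_odds(levels):
    """Chance that a uniformly random value equals one specific goal value."""
    return Fraction(1, levels)

# Usage: four shades instead of 256.
compressed = reduce_levels([[0, 63, 64, 255]], 4)
speedup = match_odds(4) / match_odds(256)
```

With 4 levels instead of 256, a random replacement value is 64 times more likely to be the right one, at the cost of a coarser goal image.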
And the result is that it works quite well. I cannot really describe my own amazement as these little creatures’ genetic maps started looking more like the end goal. What started as white noise evolved, on its own, into a result that you might agree is pretty darn cool.
The video below is a segment of about 85% of the generations I was able to run against my original goal. Remember, this progress was completely random, with only procreation selection by the environment. The rate of change varied, both because I was constantly refining the process and because random mutations are, in fact, very random.
Each frame in the video represents the highest-scoring creature’s genetic map for that generation. I am only playing the generations with a noticeable change (with a few exceptions) to keep the video small. The closer to the end, the sparser the positive changes became. This is a symptom of a static goal. I am convinced that a dynamic, open-ended goal would see a more linear rate of change.
This whole experiment is meant to prove the possibility of using something like Virtual Selection to bring improvements to virtual infrastructure and cloud computing. I can imagine a development environment where modules of an Enterprise Architecture design can be loaded and each generation is an iteration of the previous one with a finite set of values mutated. Each generation of the model can be recorded and compared with Virtual Selection using metrics based on desired performance results. This would create a system that could organically improve upon natural development cycles in a way not possible in human processes. In addition, it is a way to do this using the existing virtualization constructs meant to house the end product (the virtual machine).
The possible applications are endless. To imagine a virtual machine running applications as a creature, and to apply Natural Selection as a model to influence it, is just one of the many applications cloud computing will enable. I am hoping this project will help open your minds to cool ideas and maybe inspire you to try something new.
I plan on writing a front-end report for this application and loading it on my site. The goal is to show a live view of Virtual Selection running with a simpler goal image for readers to watch and observe. Whether I get this done or end up on the next project, only time and my own natural selection will tell.
As always, comments/complaints/ideas are welcome below.