Orchestrating and Modularizing Business Processes

What is modularity?

Modularity is the ability to separate concerns in a process and hide the details of the different concerns in different containers. It is a service oriented view, in which different aspects of a problem are separated and turned into generic components that offer a service. We often talk about black boxes, grey boxes or white boxes depending on the extent to which the user of a service can see the details within the containers.

What is orchestration?

Orchestration is the ability to coordinate many different processes in a system so that the sum of those processes yields a harmonious result. Orchestration is not about centralized control, though this is a common misperception.

An orchestra does not manage to play a symphony because the conductor pulls every player's strings or blows every trumpet in person, but rather because each autonomous player has a copy of the script, knows what to do, and can use just the little additional information from the conductor to access a viewpoint that is not available to an individual. An orchestra is a weakly coupled expert system in which the management (conductor) provides a service to the players.

CFEngine works like an orchestra, which is why it scales so well. Each computer is an autonomous entity, getting its script and a few occasional pieces of information from the policy server (conductor). The coupling between the agents is weak: there is slack that makes the behaviour robust to minor errors in communication or timing.

How does cfengine deal with modularity and orchestration?

Promise Theory provides simple principles for hiding details: each agent exposes a kind of service interface to its peers, which it advertises by making a promise. We assume an agent exerts best effort in keeping its promises. Orchestration requires a promise to coordinate and a promise to use that coordination service. These basic ideas are built into CFEngine.

CFEngine provides containers called bundles for creating modular parts. Bundles can be independent (and therefore parallelizable) or they can be dependent (in which case the sequence in which they verify their promises matters).

In a computer centre with many different machines, there is an additional dimension to orchestration – multiple orchestras. Each machine has a number of resources that need to be orchestrated, and the different machines themselves might also need to cooperate because they provide services to one another. The principles are the same in both cases, but the confusion between them is typically the reason why large systems do not scale well.

High level services in cfengine

CFEngine is designed to handle high level simplicity (without sacrificing low level capability) by working with configuration patterns. After all, configuration is all about promising consistent patterns of system state in the resources of the system. Lists, for instance, are a particularly common kind of pattern: for each of the following... make a similar promise. There are several ways to organize patterns, using containers, lists and associative arrays. Let's look at how to configure a number of application services.

At the simplest or highest level, we can turn services into "genes" to switch on and off on your basic "stem cell" machines.

body common control
{
bundlesequence => {
                  webserver("on"),
                  dns("on"),
                  security_set("on"),
                  ftp("off")
                  };
}

This obviously looks simple, but this kind of simplicity is cheating as we are hiding all the details of what is going to happen – we don't know if they are hard-coded, or whether we can decide ourselves. Anyone can play that game! The true test is whether we can retain the power to decide the low-level details without having to program in a low level language like Ruby, Python or Perl. Let's peel back some of the layers, knowing that we can hide as many of the details as we like.

A simple, but low level approach to deploying a service, that veteran users will recognize, is the following. This is a simple example of orchestration between a promise to raise a signal about a missing process and another promise to restart said process once its absence has been discovered and signalled.

bundle agent application_services
{
processes:

  "sshd"  restart_class => "start_ssh";
  "httpd" restart_class => "start_apache";

commands:

 start_ssh::
   "/etc/init.d/sshd restart";

 start_apache::
   "/etc/init.d/apache restart";

}

But the first thing we see is that there is a repeated pattern, so we could rewrite this as a single promise for a list of services, at the cost of a loss of transparency. However, this is the power of abstraction.

bundle agent application_services
{
vars:

  "service" slist => { "ssh", "apache", "mysql" };

 #
 # Apply the following promises to this list...
 #

processes:

  "$(daemon[$(service)])" restart_class => canonify("start_$(service)");

commands:

   "$(start[$(service)])"
       ifvarclass => canonify("start_$(service)");

}

This assumes that we can define the necessary information about these services in array variables of the form $(array[index]). This is what other tools refer to as a resource abstraction layer, though in some other tools this layer has to be partially hard-coded. We can see some more approaches below, but let's look at this abstraction for a moment.

Hiding details

Resource abstraction, or hiding system specific details inside a kind of grey-box, is just another service as far as cfengine is concerned – and we generally map services to bundles.

Many system variables are discovered automatically by cfengine and provided "out of the box", e.g. the location of the filesystem table might be /etc/fstab, or /etc/vfstab or even /etc/filesystems, but cfengine allows you to refer simply to $(sys.fstab). Soft-coded abstractions, however, cannot be discovered by the system. So how do we create this mythical resource abstraction layer? It is simple: elsewhere in the policy, we define the basic settings ourselves.

bundle common res # abstraction layer
{
vars:

  solaris::

   "cfg_file[ssh]" string => "/etc/sshd_config";
   "daemon[ssh]"   string => "sshd";
   "start[ssh]"    string => "/etc/init.d/sshd restart";

  linux.SuSE::

   "cfg_file[ssh]" string => "/etc/ssh/sshd_config";
   "daemon[ssh]"   string => "sshd";
   "start[ssh]"    string => "/etc/init.d/sshd restart";

  default::

   "cfg_file[ssh]" string => "/etc/sshd_config";
   "daemon[ssh]"   string => "sshd";
   "start[ssh]"    string => "/etc/init.d/sshd restart";

classes:

  # Fall back to the defaults on anything that is neither SuSE nor Solaris
  "default" and => { "!SuSE", "!solaris" };
}
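
Because res is a common bundle, other bundles can refer to its variables by qualifying them with the bundle name. The following is only a minimal sketch, assuming the res bundle above is part of the policy and has been evaluated; the bundle name show_ssh_settings is hypothetical and used purely for illustration.

```cf3
# Hypothetical bundle: reads a value from the abstraction layer.
bundle agent show_ssh_settings
{
reports:

  any::

   # $(res.cfg_file[ssh]) resolves to the platform-specific path chosen in res
   "sshd configuration file on this host: $(res.cfg_file[ssh])";
}
```

The calling bundle never mentions an operating system; that decision stays inside res.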

Some of the attempts to recreate a CFEngine-like tool try to hard code many decisions, meaning that minor changes in operating system versions require basic recoding of the software. CFEngine does not make decisions for you without your permission.

Black, grey and white box encapsulation in cfengine

CFEngine's ability to abstract system decisions as promises also applies to bundles of promises. After all, we can package promises as bumper compendia, grouping related matters together in a single package. Naturally, cfengine never abandons its insistence on convergence merely for the sake of making things look simple. Using cfengine, you can create convergent orchestration.

bundle agent services
{
vars:
 "service" slist => { "dhcp", "ntp", "sshd" };
methods:
 "any" usebundle => fix_service("$(service)"),
         comment => "Make sure the basic application services are running";
}

The code above is all you really want to see. The rest can be hidden in libraries that you rarely look at. In cfengine, we want the intentions to shine forth and the low level details to be clear on inspection, but hidden from view.

We can naturally modularize the packaged bundle of fully convergent promises and keep it as library code for reuse. Notice that cfengine adds comments in the code that follow processes through execution, allowing you to see the full intentions behind the promises in logs and error messages. In commercial versions, you can trace these comments to see your process details.

bundle agent fix_service(service)
{
files:

  "$(res.cfg_file[$(service)])"

 #
 # reserved_word => use std templates, e.g. cp(), p(), or roll your own
 #
     copy_from => cp("$(g.masterfiles)/$(service)","policy_host.mydomain"),
         perms => p("0600","root","root"),
       classes => define("$(service)_restart", "failed"),
       comment => "Copy a stock configuration file template from repository";

processes:

  "$(res.daemon[$(service)])"

     restart_class => canonify("$(service)_restart"),
           comment => "Check that the server process is running...";

commands:

  "$(res.start[$(service)])"

           comment => "Method for starting this service",
        ifvarclass => canonify("$(service)_restart");

}
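
The bodies cp(), p() and define() used above are not shown in this document. If you roll your own, they might look roughly like the sketch below. The parameter names, and in particular the meaning given to the two arguments of define() (a class to set when the file was repaired, and a class to set when the repair failed), are assumptions made for illustration, not the definitions from any particular library.

```cf3
# Hypothetical bodies matching the calls made in fix_service.

body copy_from cp(from, server)
{
source  => "$(from)";          # path of the master copy
servers => { "$(server)" };    # host to fetch it from
compare => "digest";           # copy only when the content differs
}

body perms p(mode, user, group)
{
mode   => "$(mode)";
owners => { "$(user)" };
groups => { "$(group)" };
}

# Assumed meaning: first class on successful repair, second on failure.
body classes define(repaired, failed)
{
promise_repaired => { "$(repaired)" };
repair_failed    => { "$(failed)" };
}
```

Under this reading, an updated configuration file raises the same restart class as a missing process, so either event leads to the start command being run.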

Bulk operations are handled by repeating patterns over lists

The strength of cfengine lies in its ability to handle lists of similar patterns concisely. You can also wrap the whole experience in a method-bundle, and we can extend this kind of pattern to implement other interfaces, all without low level programming.

#
# Remove certain services from xinetd - for system hardening
#

bundle agent linux_harden_methods
{
vars:

   "services" slist => {
                       "chargen",
                       "chargen-udp",
                       "cups-lpd",
                       "finger",
                       "rlogin",
                       "rsh",
                       "talk",
                       "telnet",
                       "tftp"
                       };
methods:

    #
    # for each $(services) in @(services) do disable_xinetd($(services))
    #

   "any"  usebundle => disable_xinetd("$(services)");
}

In the library of generic templates, we may keep one or more methods for implementing service disablement. For example, this simple interface to Linux's chkconfig is one approach which, using CFEngine, need not be hard-coded in Ruby.

#
# For the standard library
#

bundle agent disable_xinetd(name)
{
vars:
   "status"

     string => execresult("/sbin/chkconfig --list $(name)", "useshell");

classes:
   "on"  expression => regcmp(".*on","$(status)");
   "off" expression => regcmp(".*off","$(status)");

commands:
   on::
      "/sbin/chkconfig $(name) off"
          comment => "disable $(name) service";

reports:
   on::
      "disable $(name) service.";
   off::
      "$(name) has already been disabled. No need to perform the action.";

}

Ordering operations in cfengine

Ordering of operations is less important than you probably think. We are taught to think of computing as a linear sequence of steps, but this ignores a crucial fact about distributed systems: that many parts are independent of each other and exist in parallel.

Nevertheless, there are sometimes cases of strong inter-dependency (which we strive to avoid, as they lead to most of the difficulties of system management) where order is important. In re-designing CFEngine, we have taken a pragmatic approach to ordering. Essentially, CFEngine takes care of ordering for you in most cases, and you can control the order at three levels:

CFEngine checks promises of the same type in the order in which they are defined, unless overridden.
Bulk ordering of composite promises (called bundles) is handled by an overall list, the bundlesequence (which replaces the actionsequence of earlier versions of CFEngine).
Dependency coupling through dynamic classes may be used to guarantee ordering in the few cases where this is required, as in the example in the section on overriding order.

Bundle ordering

There are two methods, working at different levels. At the top-most level there is the master bundlesequence:

body common control
{
bundlesequence => { "bundle_one", "bundle_two", "bundle_three" };
}

For simple cases this is good enough, but the main purpose of the bundlesequence is to make it easy to switch bundles on or off by commenting them out.
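
As a small illustration of switching a bundle off, here is a variant of the control body above with one entry commented out. It is meant as a replacement for that body, not an addition to it, and the choice of bundle_two is arbitrary.

```cf3
body common control
{
bundlesequence => {
                  "bundle_one",
                  # "bundle_two",     # switched off for now
                  "bundle_three"
                  };
}
```

Removing the comment marker restores the bundle to its original place in the sequence.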

A more flexible way of ordering bundles is to wrap the ordered process in a master-bundle. Then you can create new sequences of bundles (parameterized in more sophisticated ways) using methods promises. Methods promises are simply promises to re-use bundles, possibly with different parameters.

The default behaviour is to retain the order of these promises; the effect is to "execute" these bundles in the assumed order:

bundle agent a_bundle_subsequence
{
methods:
  classes::
   "any" usebundle => bundle_one("something");
   "any" usebundle => bundle_two("something");
   "any" usebundle => bundle_three("something");

}

Alternatively, the same effect can be achieved as follows.

bundle agent a_bundle_subsequence
{
methods:
  classes::
   "any" usebundle => generic_bundle("something","one");
   "any" usebundle => generic_bundle("something","two");
   "any" usebundle => generic_bundle("something","three");

}

Or ultimately:

bundle agent a_bundle_subsequence
{
vars:
  "list" slist => { "one", "two", "three"};

methods:
  classes::
   "any" usebundle => generic_bundle("something","$(list)");

}
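
The bundle generic_bundle is not defined in this document. A minimal stand-in, assuming it takes the two parameters shown in the calls above, could look like the following sketch. The parameter names what and which are hypothetical, and the report merely marks the place where the real promises would go.

```cf3
# Hypothetical stand-in for generic_bundle(); replace the report
# with the promises that each step should actually keep.
bundle agent generic_bundle(what, which)
{
reports:

  any::

   "Handling $(what), step $(which)";
}
```

Because the list is expanded in the methods promise, this bundle is invoked once per list element, in list order.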

Overriding order

CFEngine is designed to handle non-deterministic events, such as anomalies and unexpected changes to system state, so it needs to adapt. For this, there is no deterministic solution and approximate methods are required. Nevertheless, it is possible to make cfengine sort out dependent orderings, even when confounded by humans, as in this example:

bundle agent order

{
vars:

 "list" slist => { "three", "four" };

commands:

 ok_later::
   "/bin/echo five";

 any::

  "/bin/echo one"     classes => define("ok_later");
  "/bin/echo two";
  "/bin/echo $(list)";

}

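Although the promise to echo five is written first, it is guarded by the class ok_later, which is only defined once the promise to echo one has been kept. It is therefore skipped at first and picked up on a later pass over the bundle, so the output becomes: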

Q: ".../bin/echo one": one
Q: ".../bin/echo two": two
Q: ".../bin/echo three": three
Q: ".../bin/echo four": four
Q: ".../bin/echo five": five

Distributing Ordering between hosts with cfengine Nova

CFEngine Nova adds many powerful features to CFEngine, including a decentralized approach to coordinating activities across multiple hosts. Some tools try to approach this by centralizing data from the network in a single location, but this has two problems:

It leads to a bottleneck by design that throttles performance seriously.
It relies on the network being available.

With CFEngine Nova there are both decentralized network approaches to this problem and probabilistic methods that do not require the network at all.
