CFengine 3 Tutorial Part 5 Client promises.cf and regular cf-agent operation

Finally, the moment we’ve been waiting for. Let’s take a crack at promises.cf and what we have defined in it.


 $ cat -n /var/cfengine/inputs/promises.cf 
     1  # promises.cf -- clients only
     2  
     3  body common control
     4  {
     5          bundlesequence  => { 
     6                          "update",
     7                          "garbage_collection",
     8                          "smf_update",
     9                          "verify_root",
    10                          "general_global_configs",
    11                          };
    12  
    13          inputs          => {
    14                          "update.cf",
    15                          "cfengine_stdlib.cf",
    16                          "cf-execd.cf",
    17                          "smf_update.cf",
    18                          "verify_root.cf",
    19                          "general_global_configs.cf",
    20                          };
    21  }
    22  
    23  #######################################################
    24  
    25  bundle common g
    26  {
    27  # Define some global variables
    28  vars:
    29          "masterfiles" string => "/var/cfengine/masterfiles";
    30          "inputs" string => "/var/cfengine/inputs";
    31          "workdir" string => "/var/cfengine";
    32          "phost" string => "192.168.1.10";
    33  classes:
    34          # Define global or local zones
    35          "solaris_zone_type"     expression      =>      usemodule("init_zone-type","");
    36  }
    37  
    38  #######################################################
    39  
    40  body agent control
    41  {
    42  # if default runtime is 5 mins we need this for long jobs
    43  ifelapsed => "15";
    44  
    45  # Allow us to use DNS
    46  skipidentify => "false";
    47  }
    48  
    49  #######################################################
    50  
    51  body monitor control
    52  {
    53  forgetrate => "0.7";
    54  histograms => "true";
    55  monitorfacility => "LOG_DAEMON";
    56  }
    57  
    58  #######################################################
    59  
    60  body reporter control
    61  
    62  {
    63  reports => { "all" };
    64  build_directory => "$(sys.workdir)/reports";
    65  report_output => "text";
    66  time_stamps => "true";
    67  }
    68  
    69  #######################################################

* Line 3 defines “body common control”, the main() function that drives everything else in the policy.
* Lines 13-20 pull in additional CFEngine policy files.
* Lines 5-11 define the bundlesequence, the order in which bundles are executed. The bundles themselves are “included” into this policy by the CFEngine policy files pulled in on lines 13-20.

Note, the first action cf-agent will execute in “normal operation” is to update itself from the master policy server.

* Line 15 imports the standard CFEngine library. You can find this file wherever you installed CFEngine during the compilation process (if you didn’t use a --prefix of /var/cfengine, it’s probably under /usr/local). The library contains pre-written bundles and bodies for a TON of different operations. If you want to add more of your own, this might be a good place to put them. Once the library is imported, you can call anything it defines from the policies listed later in promises.cf.


$ find /var/cfengine -name cfengine_stdlib.cf
/var/cfengine/share/doc/cfengine/cfengine_stdlib.cf

And some examples of what these functions can do...

$ egrep 'bundle|body' /var/cfengine/inputs/cfengine_stdlib.cf 
bundle edit_line comment_lines_matching(regex,comment)
bundle edit_line uncomment_lines_matching(regex,comment)
bundle edit_line delete_lines_matching(regex)
bundle edit_line append_if_no_line(str)
bundle edit_line append_if_no_lines(list)
bundle edit_line resolvconf(search,list)
bundle edit_line set_variable_values(v)
bundle edit_line append_users_starting(v)
bundle edit_line append_groups_starting(v)
bundle edit_line set_user_field(user,field,val)
bundle edit_line append_user_field(group,field,allusers)
bundle edit_line expand_template(templatefile)
body edit_field quoted_var(newval,method)
body edit_field col(split,col,newval,method)
body replace_with value(x)
body select_region INI_section(x)
body edit_defaults std_defs
body edit_defaults empty
body location start
body replace_with comment(c)
body replace_with uncomment
body action if_elapsed(x)
body action measure_performance(x)
body action warn_only
body action bg(elapsed,expire)
body contain silent
body contain in_dir(s)
body contain silent_in_dir(s)
body contain in_shell
body contain setuid(x)
body contain setuid_sh(x)
body contain jail(owner,root,dir)
body classes if_desired(a)
body classes if_repaired(x)
body classes if_else(yes,no)
body classes if_notkept(x)
body classes if_ok(x)
body copy_from secure_cp(from,server)
body copy_from remote_cp(from,server)
body copy_from local_cp(from)
body copy_from no_backup_cp(from)
body copy_from no_backup_rcp(from,server)
body link_from ln_s(x)
body link_from linkchildren(tofile)
body perms m(mode)
body perms mo(mode,user)
body perms mog(mode,user,group)
body perms og(u,g) 
body perms owner(user)
body depth_search recurse(d)
body depth_search recurse_ignore(d,list)
body delete tidy
body rename disable
body rename rotate(level)
body rename to(file)
body file_select name_age(name,age)
body file_select days_old(days)
body file_select size_range(from,to)
body file_select exclude(name)
body file_select plain
body file_select dirs
body file_select ex_list(names)
body changes detect_all_change
body changes detect_content
body package_method zypper
body package_method apt
body package_method yum
body package_method solaris (pkgname, spoolfile, adminfile)
body package_method freebsd
body volume min_free_space(free)
body mount nfs(server,source)
body mount nfs_p(server,source,perm)
body mount unmount
body select_process exclude_procs(x)
body process_count check_range(name,lower,upper)

By importing this library, a lot of the complexity of lower-level policy writing is taken care of for you. You can also view a web-based version of what these policies contain here.

Looking back at promises.cf, on lines 34-35 we define a custom class using a module. Remember from before that when cf-agent executes in verbose mode, we can see which classes were discovered. Using basic shell scripts, we can extend that list with whatever classes we want. Let’s take a look at the module we’ve defined.


$ cat /var/cfengine/modules/init_zone-type 
#!/bin/bash

PATH=/usr/bin:/usr/sbin

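# zonename prints the name of the zone we are running in. In a global zone,
# "zoneadm list -cp" lists every configured zone, so its output cannot be
# compared against a single name.
TYPE=`zonename`

if [ "$TYPE" = 'global' ]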
then
        echo '+global_zone'
else
        echo '+local_zone'
fi

So this is super easy. We can execute whatever shell commands we want and use basic logic to decide whether the machine belongs to a class. If it does, echo “+[class_name]” and cf-agent will pick it up. You could create a class for machines with Python installed, for machines running a specific process, and so on. The possibilities here are endless. Use modules to define classes of systems. Don’t modify system state in a module; just collect data and decide which classes a machine should belong to. Use CFEngine policies to execute changes.
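As a rough illustration of the Python example mentioned above, a module could look something like the sketch below. The helper emit_class_if_present and the class name has_python are made up for this example; the only thing taken from the module shown earlier is the “+class_name” output convention.

```shell
#!/bin/sh
# Hypothetical module sketch. The helper name and the class name
# "has_python" are assumptions for illustration, not part of CFEngine.

# emit_class_if_present CLASS COMMAND
# Prints "+CLASS" if COMMAND is found on the PATH, otherwise prints nothing.
emit_class_if_present() {
    if command -v "$2" >/dev/null 2>&1
    then
        echo "+$1"
    fi
}

# Read-only check: the module inspects the system but never changes it.
emit_class_if_present has_python python
```

Dropped into the modules directory next to init_zone-type, a script like this could be called with usemodule() in the same way. Check the command name against what your hosts actually ship (python, python3, and so on).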

Finally, the last interesting part of promises.cf is on line 16, where we pull in a separate file for cf-execd. In his example tutorial, Neil Watson split the configuration settings for the separate daemons into their own files. It makes sense, so I followed his example here. Remember, cf-execd actually drives the execution of cf-agent, so let’s take a look at this file.


$ cat -n /var/cfengine/inputs/cf-execd.cf 
     1  body executor control 
     2  {
     3          # Splaytime is a critical variable that determines the "back off" time in minutes for cf-agent
     4          # to check in with cf-serverd.  Setting this to a higher value like "10" will allow thousands
     5          # of clients to pull updates from a single policy server without hammering the network too bad all
     6          # at once.
     7          splaytime               =>      "10";
     8          mailto                  =>      "fatkitty@sinatra.com";
     9          mailfrom                =>      "root@$(sys.host)";
    10          smtpserver              =>      "localhost";
    11          schedule                =>       { "Min00_Min10",  "Min15", "Min20", "Min25", "Hr07.Min30", "Min35", "Min40", "Min45", "Min50", "Min55" };
    12          executorfacility        =>      "LOG_DAEMON";
    13  
    14          # This is the command that actually drives cf-execd to execute cf-agent on the schedule above.
    15          exec_command            =>      "${sys.workdir}/bin/cf-agent -f failsafe.cf && ${sys.workdir}/bin/cf-agent";
    16  }
    17  ##########################################
    18  bundle agent garbage_collection
    19  {
    20  files:
    21  
    22    "$(sys.workdir)/outputs" 
    23  
    24      delete => tidy,
    25      file_select => days_old("3"),
    26      depth_search => recurse("inf");
    27  }

* Line 7 is critical. This is the tunable for how hard the clients are going to pound cf-serverd. Each client hashes its hostname to pick a delay within the splaytime window, so check-ins are spread across that time frame instead of all arriving at once.
* Line 8 is the address that the output of “reports:” type promises is mailed to.
* Line 9 uses the internal variable $(sys.host) to substitute the host name of the box.
* Line 11 is critical. By default, cf-execd will fire cf-agent off every 5 minutes. It doesn’t have to be this way. If you only want cf-agent to execute once an hour, then define the time classes here. Use periods to “combine” classes together. For example, Hr07.Min30 will only execute at 07:30. The example Min00_Min10 is meant to execute throughout the entire span between the two values. Note that the verbose output in part 3 lists interval classes in the form Min15_20, so check the exact class names against what your own cf-agent reports. You get the idea.
* Line 15 is critical. This is the command cf-execd will execute based upon its schedule. It runs cf-agent against failsafe.cf so configs are updated from the master policy servers, then runs it again in “normal mode” against promises.cf.
** The second command on line 15 has no -f option, so cf-agent falls back to its default policy file, promises.cf.
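To get a feel for what splaytime does, here is a minimal sketch of the idea: hash the hostname and use the result to pick a delay inside the window. This is not CFEngine’s actual algorithm. cksum is only a stand-in hash, splay_delay is a made-up helper, and web01 / db01 are invented host names.

```shell
#!/bin/sh
# Illustration only: cksum stands in for whatever hash cf-execd really uses.

# splay_delay HOSTNAME SPLAY_MINUTES
# Prints a per-host delay in seconds, somewhere in [0, SPLAY_MINUTES * 60).
# SPLAY_MINUTES must be greater than zero.
splay_delay() {
    crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    echo $(( crc % ($2 * 60) ))
}

# The same host always gets the same delay; different hosts usually differ.
for host in sinatra web01 db01
do
    echo "$host waits $(splay_delay "$host" 10) seconds"
done
```

Because the delay is derived from the name rather than drawn at random on every run, each client keeps a stable slot while the fleet as a whole is spread over the ten minutes.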

That’s it in a nutshell! We’ve traced through normal client operations and how to construct general CFEngine policies. To add more functionality to what cf-agent executes, just include more *.cf files in the inputs and bundlesequence statements!

CFEngine is an extremely powerful tool for controlling the configuration of hundreds / thousands / tens of thousands of machines. It’s used at enterprises like Facebook to drive system management. ALWAYS test changes in a dev/test environment BEFORE pushing new policies into production. With a powerful tool, you can damage machines easily!

CFengine 3 Tutorial — Part 4 — Client failsafe.cf and update.cf

As stated in part 1 of this tutorial series, in normal client-side operation cf-agent will:
1. Execute against /var/cfengine/inputs/failsafe.cf (which calls update.cf)
2. Execute against /var/cfengine/inputs/promises.cf

We stated that we don’t want to break failsafe.cf or update.cf. When we write new CFEngine policies, we import and call them from promises.cf. If we make a mistake and break the syntax, failsafe.cf / update.cf are still in a pristine state. That lets the clients self-recover from the breakage once we commit the fix through SVN.

Let’s take a look at failsafe.cf.


$ cat -n /var/cfengine/inputs/failsafe.cf 
     1  # failsafe.cf
     2  
     3  # Whatever you do, DO NOT MODIFY THIS FILE.  If you do, you can break the whole
     4  # CFEngine infrastructure as failsafe.cf and update.cf is the "failsafe" to restore things back into a working
     5  # state.  With a broken update.cf or failsafe.cf, you will be touching boxes manually
     6  # to recover.  If you break CFEngine, may shame be brought upon you and your offspring for generations.
     7  # You _should_ be modifying promises.cf to add additional bundlesequences and input files to extend CFEngine.
     8  
     9  body common control
    10  {
    11          bundlesequence  =>      { "update" };
    12          inputs          =>      { "update.cf" };
    13  }
    14  
    15  ############################################
    16  bundle common g
    17  {
    18  # Define Master Policy Servers
    19  vars:
    20          "phost" string  =>      "192.168.1.10";
    21  }
    22  
    23  ############################################
    24  
    25  body depth_search recurse(d)
    26  
    27  {
    28  depth => "$(d)";
    29  }

Nice. Breaking this down line-by-line again.

1. Lines 9-13 define our “body common control” stanza. Remember, this is like the main() function in Java / C / Python. This is the starting point for our policy.
2. Line 12 pulls “update.cf” into this policy. By using inputs statements, we can break configurations out into multiple files. Think of this like importing Apache SSL configurations in /etc/www/conf/httpd.conf from an external file like /etc/www/conf/ssl.conf.
3. Line 11 starts execution from the “update” bundle, which is located in update.cf.
4. Lines 16-21 define a global variable to be used throughout this policy. Here, phost is defined as a single IP address: that of our Master Policy Server. If we wanted to extend our CFEngine infrastructure over multiple Master Policy Hosts, we would extend that network information here.
5. Lines 25-29 define a function called “recurse” that takes an argument. This argument tells recurse what depth to search to. We’ll see this being called in update.cf.

So, this is all pretty straightforward stuff. Let’s see the interesting bits in update.cf.


$ cat -n /var/cfengine/inputs/update.cf 
     1  # update.cf
     2  
     3  bundle agent update
     4  {
     5  vars:
     6  
     7  "master_location" string        =>      "/var/cfengine/masterfiles/client_inputs";
     8  "master_modules" string         =>      "/var/cfengine/masterfiles/client_modules";
     9  
    10  files:
    11          # /var/cfengine should remain 0700.  Nobody but the root user should be poking around in here
    12          "/var/cfengine/"
    13                  perms           =>      m_u_g("0700","root","root"),
    14                  depth_search    =>      avoid_inputs_recurse("inf"),
    15                  action          =>      immediate;
    16  
    17          # Update the config files from the policy master servers
    18          "/var/cfengine/inputs"
    19                  perms           =>      m_u_g("0600","root","root"),
    20                  copy_from       =>      remote_copy("$(master_location)","$(g.phost)"),
    21                  depth_search    =>      recurse("inf"),
    22                  action          =>      immediate;
    23  
    24          # Update the modules from the policy master servers
    25          "/var/cfengine/modules"
    26                  perms           =>      m_u_g("0700","root","root"),
    27                  copy_from       =>      remote_copy("$(master_modules)","$(g.phost)"),
    28                  depth_search    =>      recurse("inf"),
    29                  action          =>      immediate;
    30  
    31          # Update the binaries from the sbin directory
    32          "/var/cfengine/bin"
    33                   perms          =>      m_u_g("0700","root","root"),
    34                  copy_from       =>      mycopy("/var/cfengine/sbin","localhost"),
    35                  depth_search    =>      recurse("inf"),
    36                  action          =>      immediate;
    37  }
    38  ############################################
    39  body perms m_u_g(m,u,g)
    40  {
    41          mode            =>      "$(m)";
    42          owners          =>      { "$(u)" };
    43          groups          =>      { "$(g)" };
    44  }
    45  #########################################################
    46  body copy_from mycopy(from,server)
    47  {
    48          source          =>      "$(from)";
    49          compare         =>      "digest";
    50          purge           =>      "true";
    51  }
    52  
    53  #########################################################
    54  body action immediate
    55  {
    56          ifelapsed       =>      "1";
    57  }
    58  
    59  #########################################################
    60  body copy_from remote_copy(sourcedir,sourceserver)
    61  {
    62          source          =>      "$(sourcedir)";
    63          servers         =>      { "$(sourceserver)" };
    64          copy_backup     =>      "true";
    65          purge           =>      "true";
    66          trustkey        =>      "true";
    67          compare         =>      "digest";
    68          encrypt         =>      "true";
    69          verify          =>      "true";
    70  }
    71  body depth_search avoid_inputs_recurse(d)
    72  {
    73          depth           =>      "$(d)";
    74          exclude_dirs    =>      { "/var/cfengine/inputs", "/var/cfengine/state" };
    75  }

Finally. Some really interesting CFEngine stuff to talk about. Lots of things are happening in those 75 lines. Let’s break it down line by line again.

* Line 3 defines “bundle agent update” which is what we imported into bundlesequence from failsafe.cf on line 11. This is how the logic / flow of control is passed between different config files in CFEngine.
* Lines 5-8 define some string variables. All sorts of variables can be defined in CFEngine (integers, strings, arrays, etc.).
* Line 12 defines some actions we want taken on the /var/cfengine directory tree.
** Line 13 calls the m_u_g function that is defined on line 39. We pass it the UNIX mode, user, and group we want applied.
*** Note that “perms” appears as the attribute on line 13 and again as the body type of m_u_g on line 39. This is by design: the type of a body has to match the attribute it is attached to.
** Lines 14 and 15 also call other functions defined lower down in update.cf.
* Lines 17-22 updates /var/cfengine/inputs on the clients from data on the Master Policy Servers. This is what will update promises.cf should we break it.
** Line 20 calls remote_copy, a function that we define starting on line 60.
** Line 20 calls remote_copy with two variables. The first, $(master_location), is defined in “bundle agent update”, i.e. the current scope, so we can use it directly. The variable $(g.phost) is “outside” the current scope. We defined phost in failsafe.cf on line 20 in “bundle common g”. To reach it, we prefix the name of the variable with the bundle it was defined in. We called the bundle “g” on line 16 of failsafe.cf, hence the use of $(g.phost) to say which bundle / variable we want to access.
*** Lines 60-70 define the remote_copy function. Note that we encrypt the transport, we compare config files using an MD5 digest, verify the files were transferred correctly, etc.
*** Line 63 accepts an array of strings. We passed g.phost into this function, where it arrives as “sourceserver” and finally lands in “servers”. Instead of a single entry, we could list several servers and line 63 would accept all of them. This isn’t a string variable. It’s an array of strings.
* Lines 24-29 populates /var/cfengine/modules on the clients. Modules are shell scripts used to define custom classes on clients. More on this in part 5 of this tutorial.
* Lines 31-36 verify that the contents of /var/cfengine/bin are identical to /var/cfengine/sbin. The sbin directory is considered the “pristine” binary directory. If cf-agent detects a change to the binaries which get executed from /var/cfengine/bin, they will be re-populated from the pristine source. This is an attempt to make the clients auto-recover in a resilient, hands-off fashion should one of the binaries be corrupted.
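The bin / sbin check on lines 31-36 boils down to “compare a digest of each file and re-copy on mismatch”. A rough shell sketch of that idea follows. It is not how cf-agent implements it: cksum stands in for the real digest, the function names are invented, and the sketch only reports drift. It does not copy anything or handle the purge setting.

```shell
#!/bin/sh
# Illustration only: cksum stands in for the digest that
# compare => "digest" uses; the function names are made up.

# needs_repair PRISTINE_FILE LIVE_FILE
# Succeeds (exit 0) when LIVE_FILE is missing or its checksum differs.
needs_repair() {
    [ -f "$2" ] || return 0
    want=$(cksum < "$1")
    have=$(cksum < "$2")
    [ "$want" != "$have" ]
}

# report_drift PRISTINE_DIR LIVE_DIR
# Walks the pristine directory and lists files that would be re-copied.
report_drift() {
    for src in "$1"/*
    do
        [ -f "$src" ] || continue
        name=$(basename "$src")
        if needs_repair "$src" "$2/$name"
        then
            echo "would restore $2/$name from $src"
        fi
    done
}
```

Running report_drift /var/cfengine/sbin /var/cfengine/bin on a client prints nothing when the two directories match, which is the quiet, converged state the policy aims for.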

That’s it! We’ve looked at our first complex CFEngine policy and seen how cf-agent is going to behave. This policy will be used to download new policies and to auto-recover in the case of a broken promises.cf.

One more note: Before we run this for the first time to “phone home” to the master policy servers, we need to generate a public / private key pair for this client machine. This works exactly like SSH public key authentication. On first contact from the client, the server saves a copy of the client’s public key. Every later attempt from the client authenticates using this key. Data is encrypted between cf-serverd and cf-agent using this key. The client also saves a copy of the master policy server’s keys.

This key only needs to be generated once in the client’s lifetime. To generate the key (and I execute this from a postinstall package script) execute the cf-key binary.


 $ /var/cfengine/bin/cf-key 
Making a key pair for cfengine, please wait, this could take a minute...

$ ls -l /var/cfengine/ppkeys/localhost.p*
-rw-------   1 root     root        1743 Jul  2 12:33 /var/cfengine/ppkeys/localhost.priv
-rw-------   1 root     root         426 Jul  2 12:33 /var/cfengine/ppkeys/localhost.pub

$ cat /var/cfengine/ppkeys/localhost.pub 
-----BEGIN RSA PUBLIC KEY-----
MIIBCAKCAQEA57cGTBfsqTwfuawgyO9K9tLt7IOvns7lAku/8XcyUkJ0AY0AATVK
TVjI7E1HT/moTvvLo+t6QuCD6Eo3+K++OaeP4pmSXhcGRFhuK4IVSLjfuDtYfwmn
Kd730gP2KONQZiIiVkQfsd1ADMTxTtldv/UR1COG49wexKA3f13iBNEj7d6YehHy
PFabbFpjcGmelg5yu0nDopUrGGg402BLAc8Z9H/7QxrzktH9uVrFuLitGE8reyJQ
2A8wQErRgtgpVBC2M1NFo4bWIk6mkLCukF6EOuUxzEgUjcToCc8p5sr5j2kpj+Vi
n5m16pcjkQo+EX+t7wbnRFy1PK0d98SrfwIBIw==
-----END RSA PUBLIC KEY-----

Keys are saved in /var/cfengine/ppkeys. If you rebuild a client, or regenerate keys, you’ll need to remove the old public key entry on the master policy servers. Clients also cache the public key of the master policy servers, so if the server is rebuilt or its keys are regenerated, the old key will need to be removed from all the clients so they can re-cache the new one. In short: treat these keys like you would SSH keys. If you lose / damage a private key, it could be a PITA to recover from.

One last note: let’s execute failsafe.cf and watch corrections take place. Note that we do not specify an absolute path with -f because, by default, cf-agent looks in /var/cfengine/inputs (where failsafe.cf and update.cf live). We run in --inform mode using -I so we see the changes that cf-agent makes, and with -K so we aren’t held up by locks.


/var/cfengine $ touch mike
/var/cfengine $ chmod 777 mike 
/var/cfengine $ /var/cfengine/bin/cf-agent -f failsafe.cf -I -K
 -> Object /var/cfengine/mike had permission 777, changed it to 700


/var/cfengine $ rm /var/cfengine/inputs/promises.cf 
/var/cfengine $ echo 'this is a garbage statement' > /var/cfengine/inputs/promises.cf 
/var/cfengine $ /var/cfengine/bin/cf-agent -f failsafe.cf -I -K
 -> Updated /var/cfengine/inputs/promises.cf from source /var/cfengine/masterfiles/client_inputs/promises.cf on 192.168.1.10

CFengine 3 Tutorial — Part 3 — Hello World

So up to this point, we’ve had a high-level, 10,000ft introduction to how CFEngine works. Hopefully we’ve gotten the needed bits built and packaged up to bootstrap our infrastructure. As with any other programming language, let’s begin with the most basic “Hello World” type policy.


/var/tmp $ cat -n /var/tmp/hello_world.cf
     1  body common control
     2  {
     3          bundlesequence  =>      { "hello_prefetch_net_friends" };
     4  }
     5  bundle agent hello_prefetch_net_friends
     6  {
     7  reports:
     8          cfengine_3::
     9                  "Hello World";
    10  
    11          sunos_5_10::
    12                  "I am a Solaris 10 host";
    13  }

Great. Let’s execute. Use the -f flag to point cf-agent at the policy we want to execute. If we don’t supply -f, or give it only a bare file name, cf-agent looks in /var/cfengine/inputs. Since we created this example in /var/tmp, we need to point cf-agent at the policy using the absolute path.


/var/tmp $ /var/cfengine/bin/cf-agent -f /var/tmp/hello_world.cf   
R: Hello World
R: I am a Solaris 10 host
/var/tmp $ /var/cfengine/bin/cf-agent -f /var/tmp/hello_world.cf 
/var/tmp $
/var/tmp $ /var/cfengine/bin/cf-agent -f /var/tmp/hello_world.cf -K
R: Hello World
R: I am a Solaris 10 host

Excellent! Let’s break the sample policy above down line by line.

* Lines 1-4 define “body common control”. Think of this as the “main” function in Python / C / Java / etc. This function is the “driver” that decides which other operations run, and in which order. Every CFEngine policy needs a “body common control” statement. Here, we define bundlesequence. This is a list of “functions” to execute in order.
* Line 5 defines the “bundle agent hello_prefetch_net_friends” stanza. Bundle agent is the most commonly used type of CFEngine policy stanza. You can define files to be operated upon, commands to be executed, reports to be generated, etc. Check out the reference manual here for more of what bundle agent can do.
* Line 7 defines the “type” of promise we are making here: a “reports:” promise. Reports is a type of promise that will print some output to stdout or can send an email to an administrator with the actions executed.
* Line 8 defines the “class” in the promise that will be affected. Here, we state that we only want to execute the actions if the host is in the “cfengine_3” class. If we belong to the “cfengine_3” class, then we print hello world.
* Line 11 defines a block of actions to execute upon if we match the “sunos_5_10” class. If we do, lets print that we are a Solaris 10 host.

So, this example forms the backbone of how CFEngine policies are interpreted and executed. We listed a single bundle in our bundlesequence, and in that bundle we made promises of the reports “type”, guarded by the classes of machine that cf-agent discovered. Nice! But wait.. When we executed the same command again immediately after, nothing was printed to stdout. Why?

By default, cf-agent will not repeat a promise within one minute of the last time it was checked. When we specify the -K flag on the command line at the 3rd execution, the -K instructs cf-agent to ignore this “one minute rule.”
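The three runs above can be mimicked with a few lines of shell. This is only a sketch of the locking idea, not how cf-agent stores its locks: a timestamp file plays the role of the lock database, run_if_elapsed and its arguments are invented for this example, and it assumes a date command that supports +%s.

```shell
#!/bin/sh
# Illustration only: a timestamp file stands in for cf-agent's lock
# database. The function name, arguments and file layout are made up.

# run_if_elapsed STAMP_FILE MIN_SECONDS IGNORE_LOCK MESSAGE
# Prints MESSAGE unless the last run was less than MIN_SECONDS ago.
# IGNORE_LOCK=yes skips the check, much like -K does.
run_if_elapsed() {
    now=$(date +%s)
    last=0
    if [ -s "$1" ]
    then
        last=$(cat "$1")
    fi
    if [ "$3" != "yes" ] && [ $(( now - last )) -lt "$2" ]
    then
        return 0    # too soon: stay silent, like the second run above
    fi
    echo "$now" > "$1"
    echo "R: $4"
}

stamp="${TMPDIR:-/tmp}/hello_world.stamp.$$"
run_if_elapsed "$stamp" 60 no  "Hello World"   # first run: prints
run_if_elapsed "$stamp" 60 no  "Hello World"   # within a minute: silent
run_if_elapsed "$stamp" 60 yes "Hello World"   # lock ignored: prints
rm -f "$stamp"
```

The sketch prints “R: Hello World” twice, for the first and third calls, which is the same pattern we saw from cf-agent with and without -K.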


$ /var/cfengine/bin/cf-agent --help | grep '\-K'
--no-lock     , -K       - Ignore locking constraints during execution (ifelapsed/expireafter) if "too soon" to run

Nice! I want to see all the gory details of what cf-agent actually does. Show me the money! Throw in the -v flag for cf-agent to execute verbosely.


$ /var/cfengine/bin/cf-agent -f /var/tmp/hello_world.cf -v
     1  cf3 Cfengine - autonomous configuration engine - commence self-diagnostic prelude
     2  cf3 ------------------------------------------------------------------------
     3  cf3 Work directory is /var/cfengine
     4  cf3 Making sure that locks are private...
     5  cf3 Checking integrity of the state database
     6  cf3 Checking integrity of the module directory
     7  cf3 Checking integrity of the PKI directory
     8  cf3 Looking for a source of entropy in /var/cfengine/randseed
     9  cf3 Loaded /var/cfengine/ppkeys/localhost.priv
    10  cf3 Loaded /var/cfengine/ppkeys/localhost.pub
    11  cf3 No registered cfengine service, using default
    12  cf3  !!! System error for getservbyname: "Error 0"
    13  cf3 Setting cfengine default port to 5308 = 5308
    14  cf3 Reference time set to Fri Jul  2 11:15:21 2010
    15  cf3 Cfengine - 3.0.4 (C) Cfengine AS 2008-
    16  cf3 ------------------------------------------------------------------------
    17  cf3 Host name is: sinatra
    18  cf3 Operating System Type is sunos
    19  cf3 Operating System Release is 5.10
    20  cf3 Architecture = i86pc
    21  cf3 Using internal soft-class solarisx86 for host sinatra
    22  cf3 The time is now Fri Jul  2 11:15:21 2010
    23  cf3 ------------------------------------------------------------------------
    24  cf3 # Extended system discovery is only available in version Nova and above
    25  cf3 Additional hard class defined as: 32_bit
    26  cf3 Additional hard class defined as: sunos_5_10
    27  cf3 Additional hard class defined as: sunos_i86pc
    28  cf3 Additional hard class defined as: sunos_i86pc_5_10
    29  cf3 Additional hard class defined as: i386
    30  cf3 Additional hard class defined as: i86pc
    31  cf3 GNU autoconf class from compile time: compiled_on_solaris2_10
    32  cf3 Address given by nameserver: 172.18.33.58
    33  cf3 Adding alias loghost..
    34  cf3 Trying to locate my IPv6 address
    35  cf3 Looking for environment from cf-monitor...
    36  cf3 Loading environment...
    37  cf3 Environment data loaded
    38  cf3 ***********************************************************
    39  cf3  Loading persistent classes
    40  cf3 ***********************************************************
    41  cf3 ***********************************************************
    42  cf3  Loaded persistent memory
    43  cf3 ***********************************************************
    44  cf3  > Verifying the syntax of the inputs...
    45  cf3   > Parsing file /var/tmp/hello_world.cf
    46  cf3 Initiate variable convergence...
    47  cf3 Initiate control variable convergence...
    48  cf3 Initiate variable convergence...
    49  cf3 # Knowledge map reporting feature is only available in version Nova and above
    50  cf3  -> Defined classes = { 172_18_33_58 32_bit Day2 Friday GMT_Hr18 Hr11 Hr11_Q2 July Lcycle_0 Min15 Min15_20 Morning Q2 Yr2010 agent any cfengine_3 cfengine_3_0 cfengine_3_0_4 community_edition compiled_on_solaris2_10 corp diskfree_low_dev1 entropy_cfengine_in_low entropy_dns_in_low entropy_dns_out_low entropy_ftp_in_low entropy_ftp_out_low entropy_icmp_in_low entropy_icmp_out_low entropy_irc_in_low entropy_irc_out_low entropy_misc_in_low entropy_misc_out_low entropy_netbiosdgm_in_low entropy_netbiosdgm_out_low entropy_netbiosns_in_low entropy_netbiosns_out_low entropy_netbiosssn_in_low entropy_netbiosssn_out_low entropy_nfsd_in_low entropy_nfsd_out_low entropy_smtp_in_low entropy_smtp_out_low entropy_ssh_out_low entropy_tcpack_in_low entropy_tcpack_out_low entropy_tcpfin_in_low entropy_tcpfin_out_low entropy_tcpsyn_in_low entropy_tcpsyn_out_low entropy_udp_in_low entropy_udp_out_low entropy_www_in_low entropy_www_out_low entropy_wwws_in_low entropy_wwws_out_low sinatra sinatra i386 i86pc ipv4_172 ipv4_172_18 ipv4_172_18_33 ipv4_172_18_33_58 loghost net_iface_e1000g833000_2 net_iface_lo0_2 rootprocs_high_normal solarisx86 sunos_5_10 sunos_i86pc sunos_i86pc_5_10 sunos_i86pc_5_10_Generic_127128_11 verbose_mode }
    51  cf3  -> Negated Classes = { }
    52  cf3 Initiate variable convergence...
    53  cf3 Initiate control variable convergence...
    54  cf3  -> Immunizing against parental death
    55  cf3 -> Bundlesequence =>  {'hello_prefetch_net_friends'}
    56  cf3 
    57  cf3 *****************************************************************
    58  cf3 BUNDLE hello_prefetch_net_friends
    59  cf3 *****************************************************************
    60  cf3 
    61  cf3 
    62  cf3      +  Private classes augmented:
    63  cf3 
    64  cf3      -  Private classes diminished:
    65  cf3 
    66  cf3 
    67  cf3 
    68  cf3    =========================================================
    69  cf3    reports in bundle hello_prefetch_net_friends (1)
    70  cf3    =========================================================
    71  cf3 
    72  cf3 Verifying SQL table promises is only available with Cfengine Nova or above
    73  cf3  XX Nothing promised here [lock.hello_prefetch_net_friend] (0/1 minutes elapsed)
    74  cf3  XX Nothing promised here [lock.hello_prefetch_net_friend] (0/1 minutes elapsed)
    75  cf3 
    76  cf3      +  Private classes augmented:
    77  cf3 
    78  cf3      -  Private classes diminished:
    79  cf3 
    80  cf3 
    81  cf3 
    82  cf3    =========================================================
    83  cf3    reports in bundle hello_prefetch_net_friends (2)
    84  cf3    =========================================================
    85  cf3 
    86  cf3 Verifying SQL table promises is only available with Cfengine Nova or above
    87  cf3 
    88  cf3      +  Private classes augmented:
    89  cf3 
    90  cf3      -  Private classes diminished:
    91  cf3 
    92  cf3 
    93  cf3 
    94  cf3    =========================================================
    95  cf3    reports in bundle hello_prefetch_net_friends (3)
    96  cf3    =========================================================
    97  cf3 
    98  cf3 Verifying SQL table promises is only available with Cfengine Nova or above
    99  cf3 Outcome of version (not specified) (agent-0): Promises observed to be kept 100%, Promises repaired 0%, Promises not repaired 0%
   100  cf3 Estimated system complexity as touched objects = 0, for 2 promises

Let’s break down, line by line, what cf-agent is telling us happened.

* Lines 1-15 are some basic bootstrap operations for cf-agent to come online. Lines 8-10 contain encryption stuff. More on that later.
* Lines 17-23 are some basic facts that cf-agent discovered about the host it was executed on.
* Line 50 is the real meat of what we want to analyse here. It lists all of the classes that cf-agent discovered automatically. We can use these classes to build complex policies that will only execute on specific hosts. In this list we see:
** Time of day (CFengine can execute in a cron-like fashion where things will only execute at specific times of day / etc..)
** Subnet / VLAN the machine resides on
** Type of O/S the machine is running
** Hardware platform (x86 or sparc?)
** The kernel revision of Solaris 10 we’re currently on
** The version of CFengine the agent is running
* Lines 51-98 are the execution of our bundle hello_prefetch_net_friends.
* Lines 99 and 100 summarise the run: the percentage of promises kept, repaired, and not repaired, plus an estimate of how complex the operation was.
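
If you want to check from the shell whether a particular class was discovered, the class list on line 50 can be pulled out of the verbose output. This is only a minimal sketch: it assumes the "Defined classes = { ... }" line format shown above, and the function name has_class is made up for this example.

```shell
# has_class CLASS: read cf-agent verbose output on stdin and succeed if
# CLASS appears in the "Defined classes = { ... }" line.
# Hypothetical helper; it relies only on the output format shown above.
has_class() {
    wanted=$1
    sed -n 's/.*Defined classes = {\(.*\)}.*/\1/p' |
        tr ' ' '\n' |
        grep -Fqx -- "$wanted"
}

# Example (not run here; the policy path is a placeholder):
#   cf-agent -v -K -f /path/to/policy.cf | has_class solarisx86 && echo "x86 Solaris host"
```

The match is on whole class names, so asking for "solaris" will not match "solarisx86".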

By default, cf-agent only reports the actions it failed to take and the problems it encountered. This may not be the best of examples, but the -I flag (--inform) instructs cf-agent to also report the “good” things it did: changes made, commands executed, config files modified, etc. So a common invocation of cf-agent from the CLI when testing newly written policies is:


$ /var/cfengine/bin/cf-agent -f [absolute path to policy file] -I -K

Again, -I will inform us of “successfully executed” actions and -K will allow the policy to execute even if it’s been less than a minute since the previous run.
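
Since I type that invocation a lot, it can be wrapped in a small shell function. This is just a convenience sketch: the name test_policy and the CF_AGENT override variable are my own inventions, and the only cf-agent flags used are the -f, -I and -K already described.

```shell
# test_policy FILE: run a policy file once with -I (inform) and -K (ignore locks).
# CF_AGENT may be overridden; the default is the path used in this tutorial.
test_policy() {
    case $1 in
        /*) ;;  # absolute path, as cf-agent -f expects here
        *)  echo "test_policy: need an absolute path, got '$1'" >&2
            return 2 ;;
    esac
    "${CF_AGENT:-/var/cfengine/bin/cf-agent}" -f "$1" -I -K
}

# Example (not run here; the policy path is a placeholder):
#   test_policy /var/cfengine/inputs/hello.cf
```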

CFengine 3 Tutorial — Part 2 — Building Software and SMF Manifests / Scripts

CFEngine and dependencies
Building OpenSSL

$ echo $PATH
/usr/bin:/usr/sbin:/sbin:/bin:/usr/sfw/bin:/usr/local/bin:/usr/ccs/bin
/var/tmp/openssl-1.0.0 $ ./Configure solaris-x86-gcc shared --prefix=/usr/local/ssl --openssldir=/usr/local/ssl
/var/tmp/openssl-1.0.0 $ gmake
/var/tmp/openssl-1.0.0 $ gmake install

Building PCRE

/var/tmp/pcre-8.02 $ echo $LDFLAGS
-L/usr/sfw/lib -L/usr/sfw/lib
/var/tmp/pcre-8.02 $ ./configure --disable-cpp CFLAGS="-g -O3" CC=gcc --enable-utf8 --enable-unicode-properties
/var/tmp/pcre-8.02 $ gmake
/var/tmp/pcre-8.02 $ sudo gmake install

Building CFEngine

Note: I modify the Makefile to statically compile the BerkeleyDB, OpenSSL, and libpcre libraries into the CFEngine binaries. I also change the reference for pthread to pthreads for Solaris. These adjustments to the Makefile are dependent on how you built the software above. Caveat emptor.

/var/tmp/cfengine-3.0.4 $ export LD_LIBRARY_FLAGS=/usr/sfw/lib:/usr/local/lib:/usr/lib
/var/tmp/cfengine-3.0.4 $ export CC=/usr/sfw/bin/gcc
/var/tmp/cfengine-3.0.4 $ ./configure --prefix=/var/cfengine --with-openssl=/usr/local/ssl --without-sql --with-berkeleydb=/usr/local/BerkeleyDB/4.4 --enable-static
/var/tmp/cfengine-3.0.4 $ cd src
/var/tmp/cfengine-3.0.4/src $ perl -p -i.sav -e "s:-ldb:/usr/local/BerkeleyDB/4.4/lib/libdb.a:" Makefile
/var/tmp/cfengine-3.0.4/src $ perl -p -i.sav -e "s:-lcrypto:/usr/local/ssl/lib/libcrypto.a:" Makefile
/var/tmp/cfengine-3.0.4/src $ perl -p -i.sav -e "s:-lpcre:/usr/local/lib/libpcre.a:" Makefile
/var/tmp/cfengine-3.0.4/src $ perl -p -i.sav -e "s:-pthread:-pthreads:" Makefile
/var/tmp/cfengine-3.0.4/src $ cd ..
/var/tmp/cfengine-3.0.4 $ gmake
/var/tmp/cfengine-3.0.4 $ gmake install
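
After the install, it is worth checking that the Makefile edits really took effect. One way, sketched below, is to look at ldd output for the binaries and flag any remaining shared-library reference to libdb, libcrypto or libpcre. The function name dynamic_deps and the library name pattern are my own assumptions, so adjust them to match how your libraries are named.

```shell
# dynamic_deps: read `ldd` output on stdin and print any line that still
# references one of the libraries we meant to link statically.
# Exit status 0 means no such reference was found.
dynamic_deps() {
    if grep -E 'lib(db|crypto|pcre)[-.0-9]*\.so'; then
        return 1
    fi
    return 0
}

# Example (not run here):
#   ldd /var/cfengine/bin/cf-agent | dynamic_deps && echo "linked statically as intended"
```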

Subversion and dependencies.
Building apr

/var/tmp $ bunzip2 apr-1.4.2.tar.bz2
/var/tmp $ tar -xf apr-1.4.2.tar
/var/tmp $ cd apr-1.4.2
/var/tmp/apr-1.4.2 $ ./configure --prefix=/usr/local/apr
/var/tmp/apr-1.4.2 $ gmake
/var/tmp/apr-1.4.2 $ sudo gmake install

Building apr-util

/var/tmp $ bunzip2 apr-util-1.3.9.tar.bz2
/var/tmp $ tar -xf apr-util-1.3.9.tar
/var/tmp $ cd apr-util-1.3.9
/var/tmp/apr-util-1.3.9 $ ./configure --with-apr=/usr/local/apr
/var/tmp/apr-util-1.3.9 $ gmake
/var/tmp/apr-util-1.3.9 $ sudo gmake install

Building sqlite

/var/tmp $ tar -xf sqlite-amalgamation-3.6.23.1.tar
/var/tmp/sqlite-3.6.23.1 $ ./configure
/var/tmp/sqlite-3.6.23.1 $ gmake
/var/tmp/sqlite-3.6.23.1 $ gmake install

Building Subversion

/var/tmp $ tar -xf subversion-1.6.12.tar
/var/tmp/subversion-1.6.12 $ echo $LD_LIBRARY_PATH
/usr/lib:/usr/share/lib:/usr/ccs/lib:/usr/sfw/lib:/usr/dt/lib:/usr/openwin/lib:/usr/java/lib:/usr/local/lib:/usr/local/lib/mysql:/usr/local/ssl/lib:/usr/local/apr/lib
/var/tmp/subversion-1.6.12 $ ./configure --with-apr-util=/usr/local/apr --with-apr=/usr/local/apr --without-berkeley-db --with-sqlite=/usr/local
/var/tmp/subversion-1.6.12 $ gmake
/var/tmp/subversion-1.6.12 $ gmake install

SMF Manifests / Scripts

I’ve created SMF manifests for cf-execd, cf-monitord, and cf-serverd. Clients should run cf-monitord and cf-execd. Master Policy Servers should run all three daemons. The SMF manifests / scripts for all 3 daemons are exactly the same; just substitute the names of the daemons / descriptions. This is Solaris 10 specific.

$ svcs -a | grep cfengine
online Jun_30 svc:/application/cfengine/cf-serverd:default
online Jun_30 svc:/application/cfengine/cf-monitord:default
online Jun_30 svc:/application/cfengine/cf-execd:default
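
To spot a daemon that has dropped out of the online state, the same svcs listing can be filtered. A small sketch, assuming the three-column STATE / STIME / FMRI layout shown above; the helper name not_online is made up.

```shell
# not_online: read `svcs` output (STATE STIME FMRI columns) on stdin and
# print the FMRI of every cfengine service whose state is not "online".
not_online() {
    awk '$3 ~ /cfengine/ && $1 != "online" { print $3 }'
}

# Example (not run here):
#   svcs -a | not_online
```

No output means every cfengine service in the listing is online.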

$ svccfg export cf-serverd
<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='export'>
<service name='application/cfengine/cf-serverd' type='service' version='0'>
<create_default_instance enabled='true'/>
<single_instance/>
<exec_method name='start' type='method' exec='/var/cfengine/etc/cf-serverd.sh %m' timeout_seconds='60'>
<method_context>
<method_credential user='root'/>
</method_context>
</exec_method>
<exec_method name='restart' type='method' exec='/var/cfengine/etc/cf-serverd.sh %m' timeout_seconds='60'>
<method_context>
<method_credential user='root'/>
</method_context>
</exec_method>
<exec_method name='stop' type='method' exec='/var/cfengine/etc/cf-serverd.sh %m' timeout_seconds='60'>
<method_context>
<method_credential user='root'/>
</method_context>
</exec_method>
<property_group name='startd' type='framework'>
<propval name='duration' type='astring' value='contract'/>
</property_group>
<template>
<common_name>
<loctext xml:lang='C'>Cfengine server process</loctext>
</common_name>
<documentation>
<doc_link name='Further information' uri='http://www.cfengine.org/manuals/cf3-reference.html'/>
</documentation>
</template>
</service>
</service_bundle>

$ cat /var/cfengine/etc/cf-serverd.sh
#!/bin/bash

PATH=/sbin:/bin:/usr/sbin:/usr/bin

case "$1" in
start)
    echo "Starting cf-serverd"
    /var/cfengine/bin/cf-serverd
    ;;

stop)
    echo "Stopping cf-serverd"
    kill -15 `cat /var/cfengine/cf-serverd.pid`
    ;;

restart|force-reload)
    $0 stop
    sleep 1
    $0 start
    ;;

*)
    N=/var/cfengine/etc/cf-serverd.sh
    echo "Usage: $N {start|stop|restart|force-reload}" >&2
    exit 1
    ;;
esac
exit 0
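
One possible refinement, which is not part of the script above: the stop action fails noisily if the pid file is missing. A hedged sketch of a more defensive stop helper follows. The name stop_daemon and the optional WORKDIR argument are my own; the pid file location follows the /var/cfengine/cf-serverd.pid pattern used above.

```shell
# stop_daemon NAME [WORKDIR]: send SIGTERM to the process recorded in
# WORKDIR/NAME.pid, doing nothing if the pid file is missing or empty.
# WORKDIR defaults to /var/cfengine, as in the script above.
stop_daemon() {
    pidfile="${2:-/var/cfengine}/$1.pid"
    if [ ! -s "$pidfile" ]; then
        echo "$1: no pid file at $pidfile, nothing to stop" >&2
        return 0
    fi
    kill -15 "$(cat "$pidfile")"
}

# Example (not run here):
#   stop_daemon cf-serverd
```

Because the daemon name is a parameter, the same helper would serve cf-execd and cf-monitord as well.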

CFengine 3 Tutorial — Part 1 — System Architecture

I recently stood up a CFengine 3 configuration management infrastructure and took notes during the process to share with my team. This was my first attempt at using CFengine, so hopefully this multi-part overview will help others trying to bootstrap their environments as well. Many of these notes were taken from the CFengine 3 reference manual and tutorial found on the docs website. There is some excellent documentation on CFengine.org, so if you have more questions about something specific, be sure to check out the reference manuals!
Neil Watson has also compiled an excellent tutorial on his CFengine 3 setup. I organized some of the structure of my config files from his examples. There is also the CFengine help mailing list, whose archives can be browsed on the web. Some of the details in the following documentation (building software, SMF scripts) may be Solaris 10 specific as that was the platform I was working with.

High Level Architecture and Objectives
What are some examples of what CFEngine can do?

* Performing post-installation tasks such as configuring the network interface.
* Editing system configuration files and other files.
* Creating symbolic links.
* Checking and correcting file permissions and ownership.
* Deleting unwanted files.
* Compressing selected files.
* Distributing files within a network.
* Automatically mounting NFS file systems.
* Verifying the presence and integrity of important files and file systems.
* Executing commands and scripts.
* Applying security-related patches and similar system corrections.
* Managing system server processes.
* Makes sandwiches via sudo.

Fundamental concepts, rules, and terms CFEngine uses.

1. Host: Generally, a host is a single computer that runs an operating system like UNIX, Linux, or Windows. We will sometimes talk about machines too, and a host can also be a virtual machine supported by an environment such as VMware or Xen/Linux.
2. Policy: This is a specification of what we want a host to be like. Rather than being a computer program of any sort, a policy is essentially a piece of documentation that describes technical details and characteristics. CFEngine implements policies that are specified via directives.
3. Configuration: The configuration of a host is the actual state of its resources.
4. Operation: A unit of change is called an operation. CFEngine deals with changes to a system, and operations are embedded into the basic sentences of a cfengine policy. They tell us how policy constrains a host — in other words, how we will prevent a host from running away.
5. Convergence: An operation is convergent if it always brings the configuration of a host closer to its ideal state and has no effect if the host is already in that state.
6. Classes: A class is a way of slicing up and mapping out the complex environment of one or more hosts into regions that can then be referred to by a symbol or name. They describe scope: where something is to be constrained.
7. Autonomy: No cfengine component is capable of receiving information that it has not explicitly asked for itself.
8. Scalable distributed action: Each host is responsible for carrying out checks and maintenance on/for itself, based on its local copy of policy.
9. The fact that each cfengine agent keeps a local copy of policy (regardless of whether it was written locally or inherited from a central authority) means that cfengine will continue to function even if network communications are down.
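
Convergence is easier to see with a toy example. The shell function below is not CFEngine code, just an illustration of a convergent operation: it moves a file towards the desired state (a given line is present) and does nothing if the file is already there. The name ensure_line is made up.

```shell
# ensure_line FILE LINE: append LINE to FILE only if it is not already there.
# Running it once or a hundred times leaves FILE in the same state.
ensure_line() {
    grep -Fqx -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example (not run here; file and line are placeholders):
#   ensure_line /tmp/example.conf "some_setting yes"
```

Contrast this with a plain `echo ... >> file`, which adds another copy of the line on every run and so never settles.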

Critical CFEngine Daemons and Commands

1. cf-agent: Interprets policy promises and implements them in a convergent manner. The agent fetches data from cf-serverd running on the Master Policy Servers.
2. cf-execd: Executes cf-agent and logs its output (optionally sending a summary via email). It can be run in daemon (standalone) mode. We have configured Solaris’ SMF to keep cf-execd online, which drives cf-agent.
3. cf-serverd: Monitors the cfengine port: serves file data to cf-agent. Every bit of data that we transfer between cf-agent and cf-serverd is encrypted.
4. cf-monitord: Collects statistics about resource usage on each host for anomaly detection purposes. The information is made available to the agent in the form of cfengine classes so that the agent can check for and respond to anomalies dynamically.
5. cf-key: Generates public-private key pairs on a host. You normally run this program only once, as part of the cfengine software installation process.

On a client system, cf-agent will be executed automatically by the cf-execd daemon; the latter also handles logging during cf-agent runs. In addition, operations such as file copying between hosts are initiated by cf-agent on the local system, and they rely on the cf-serverd daemon on the Master Policy Server to obtain remote data.

High Level Architecture of pushing configurations

Image borrowed from the CFEngine tutorial.

* SVN becomes the source of truth for CFEngine. The architecture we are using allows us to start with only one “Master Policy Server” or “Distribution Server” per site, but we can easily scale to multiple machines if wanted.
* A cron entry on the Master Policy Server checks the SVN repository at svn:/// every minute. If an updated configuration is detected, it downloads the client configurations into /var/cfengine/masterfiles on the Master Policy Server.
* Depending upon the value configured for “splaytime” on the clients, they will check in randomly over a given period of, say, 10 minutes. The new policy file that was downloaded to /var/cfengine/masterfiles is served by cf-serverd on the Master Policy Server, transferred (with encryption) to the client by the cf-agent command, and placed in /var/cfengine/inputs.
* The client runs the cf-execd daemon through SMF. The cf-execd daemon periodically wakes up to execute cf-agent, which runs the policies in /var/cfengine/inputs. If a new policy was transferred to the client, cf-agent will execute it.
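
The cron job on the Master Policy Server could look roughly like the sketch below. Treat it as one possible shape, not the exact script in use: REPO_URL has to be supplied by you, STATE_FILE is a made-up location for remembering the last exported revision, and the function names are hypothetical. It relies only on the "Revision:" line printed by svn info and on svn export --force.

```shell
# Sketch of a cron job for the Master Policy Server.
# REPO_URL must be set by the caller; STATE_FILE is a made-up location.
MASTERFILES=${MASTERFILES:-/var/cfengine/masterfiles}
STATE_FILE=${STATE_FILE:-/var/cfengine/state/svn_revision}

# revision_of: read `svn info` output on stdin, print the "Revision:" number.
revision_of() {
    sed -n 's/^Revision: \([0-9][0-9]*\)$/\1/p'
}

# needs_update REMOTE LAST: succeed when REMOTE is newer than LAST
# (an empty LAST counts as revision 0).
needs_update() {
    [ "$1" -gt "${2:-0}" ]
}

# sync_masterfiles: export the repository into MASTERFILES if it has moved on.
sync_masterfiles() {
    remote=$(svn info "$REPO_URL" | revision_of)
    [ -n "$remote" ] || return 1
    last=$(cat "$STATE_FILE" 2>/dev/null)
    if needs_update "$remote" "$last"; then
        svn export --force -q "$REPO_URL" "$MASTERFILES" &&
            echo "$remote" > "$STATE_FILE"
    fi
}

# Example (not run here): set REPO_URL, then call sync_masterfiles from cron.
```

Recording the revision only after a successful export means a failed run is simply retried a minute later.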


The data flow on performing a change is as follows:
Pushing Configuration Changes

1. I make a config change on my local machine and push to SVN. push ---> SVN
2. Updated configuration detected. The cron script downloads the changes into /var/cfengine/masterfiles on the policy server. <--- pull from SVN
3. The policy server running cf-serverd now has updated configurations in /var/cfengine/masterfiles to serve to clients.
4. Clients running the cf-execd daemon execute cf-agent based upon a schedule (by default every 5 minutes).
5. The configured "splaytime" value determines how long to wait before contacting cf-serverd (compute a hash and check in at a pseudo-random point within the interval). This random “back off” time keeps the master policy server from being hammered all at once by thousands of clients. If we randomly check in over a 10 minute interval, then we have fewer bursts of network I/O, etc.
6. cf-agent contacts cf-serverd running on the Master Policy Server(s) and pulls updated policies / configs / etc. via an encrypted link. This happens via execution of failsafe.cf and update.cf. <--- pull from Master Policy Servers. **** Clients pull. Servers don’t “push”. Changes are done on the client opportunistically. If the network is down, nothing happens on the clients. The next time the client can contact the Master Policy Server, the change is executed. ****
7. cf-agent executes policies via promises.cf. Changes happen on the client here.
8. cf-execd records the details of the promises.cf run in /var/cfengine/outputs.
9. cf-monitord records the behavior of the machine in /var/cfengine/reports.
10. cf-execd is kept running / monitored by Solaris SMF on the client.
11. cf-monitord is kept running / monitored by Solaris SMF on the client.
12. cf-report is run manually through the CLI. It analyzes the data collected by cf-monitord in /var/cfengine/reports and outputs html / text / XML / etc.
13. The predefined schedule of XXX minutes passes again and cf-execd executes cf-agent again. Repeat from step 4.
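
The "hash, then spread over the interval" idea from step 5 can be illustrated in a few lines of shell. This only shows the principle: cksum stands in for whatever hash CFEngine really uses, and the function name splay_delay is made up.

```shell
# splay_delay HOST INTERVAL: map a host name to a stable delay in seconds
# in the range [0, INTERVAL). cksum (a CRC) is a stand-in hash here; the
# point is only that each host gets its own fixed offset.
splay_delay() {
    sum=$(printf '%s' "$1" | cksum | awk '{ print $1 }')
    echo $(( sum % $2 ))
}

# Example (not run here): delay for this host within a 10 minute window
#   splay_delay "$(hostname)" 600
```

Because the delay is derived from the host name, a given client always checks in at the same offset, while a large population of clients ends up spread across the whole window.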

Why does everything reside in /var/cfengine? How is CFengine resilient to failures?

* Cfengine likes to keep itself as resilient as possible. Some environments have /usr/local NFS mounted, so /var/cfengine was chosen as it was pretty much guaranteed to be kept locally on disk.
* Binaries that get executed reside in /var/cfengine/bin. Pristine copies of the binaries reside in /var/cfengine/sbin. Every time cf-agent executes failsafe.cf (which calls update.cf), it verifies that the MD5 digests of the binaries in /var/cfengine/bin match those in /var/cfengine/sbin. If the digests, permissions, or ownership don’t match, the binaries are automatically copied from /var/cfengine/sbin to /var/cfengine/bin. This is a fail-safe mechanism that lets CFEngine attempt to recover itself from some sort of corruption.
* If you look at the “Part 2” page on how I compiled CFEngine, you’ll see that we manually changed some settings in the Makefile. This was to ensure that libpcre, libgcc.so.1, and libcrypto.a were statically compiled into the CFEngine client binaries. We don’t want CFEngine to rely on software under /usr/sfw/lib or /usr/local/lib; it’s completely self-contained in /var/cfengine (other than general system libraries).
* cf-agent actually gets executed twice on each run. The first run is to update all policy files via execution of failsafe.cf from the master policy server, but not to actually execute the policies. The second run executes promises.cf and really performs the changes. We modify promises.cf. We never modify failsafe.cf or update.cf once in production.
* This lets clients recover in an automated fashion from syntax errors in promises.cf. If promises.cf is corrupt, we can’t actually execute policies. But if failsafe.cf and update.cf are in a good state, the clients will continue to poll the master policy server for updated copies of files.
* Once we correct the syntax error in promises.cf, clients pull the corrected file and the auto-recovery of the configs is complete.
* If you break failsafe.cf or update.cf on the clients, then the clients will have to be touched manually to recover. Don’t modify these configurations once in a production environment — or be extremely careful to test your changes if you absolutely must.
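
To make the bin / sbin self-repair idea concrete, here is a rough shell equivalent. It is not what update.cf does internally: cmp does a byte-for-byte comparison as a stand-in for the MD5 digest check, permissions and ownership are not examined, and the function name restore_binaries is made up.

```shell
# restore_binaries PRISTINE_DIR LIVE_DIR: copy every file from PRISTINE_DIR
# over its counterpart in LIVE_DIR when the two differ or the live copy is
# missing. cmp(1) stands in for the MD5 digest comparison described above.
restore_binaries() {
    for src in "$1"/*; do
        [ -f "$src" ] || continue
        dst="$2/$(basename "$src")"
        if ! cmp -s "$src" "$dst"; then
            cp -p "$src" "$dst" && echo "restored $dst"
        fi
    done
}

# Example (not run here):
#   restore_binaries /var/cfengine/sbin /var/cfengine/bin
```

Like the real mechanism, this is convergent: a second run right after a repair finds nothing to do.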