CFengine 3 Tutorial -- Part 1 -- System Architecture


I recently stood up a CFengine 3 configuration management infrastructure and took notes during the process to share with my team. This was my first attempt at using CFengine, so hopefully this multi-part overview will help others trying to bootstrap their environments as well. Many of these notes were taken from the CFengine 3 reference manual and tutorial found on the docs website here. There is some excellent documentation on CFengine.org, so if you have more questions about something specific, be sure to check out the reference manuals! Neil Watson has also compiled an excellent tutorial on his CFengine 3 setup, and I based some of the structure of my config files on his examples. There is also the CFengine help mailing list. You can browse the archives through the web here. Some of the details in the following documentation (building software, SMF scripts) may be Solaris 10 specific, as that was the platform I was working with.

High Level Architecture and Objectives

What are some examples of what CFEngine can do?

Fundamental concepts, rules, and terms CFEngine uses.

  1. Host: Generally, a host is a single computer that runs an operating system like UNIX, Linux, or Windows. We will sometimes talk about machines too, and a host can also be a virtual machine supported by an environment such as VMware or Xen/Linux.
  2. Policy: This is a specification of what we want a host to be like. Rather than being any sort of computer program, a policy is essentially a piece of documentation that describes technical details and characteristics. Cfengine implements policies that are specified via directives.
  3. Configuration: The configuration of a host is the actual state of its resources.
  4. Operation: A unit of change is called an operation. CFEngine deals with changes to a system, and operations are embedded into the basic sentences of a cfengine policy. They tell us how policy constrains a host — in other words, how we will prevent a host from running away.
  5. Convergence: An operation is convergent if it always brings the configuration of a host closer to its ideal state and has no effect if the host is already in that state.
  6. Classes: A class is a way of slicing up and mapping out the complex environment of one or more hosts into regions that can then be referred to by a symbol or name. They describe scope: where something is to be constrained.
  7. Autonomy: No cfengine component is capable of receiving information that it has not explicitly asked for itself.
  8. Scalable distributed action: Each host is responsible for carrying out checks and maintenance on/for itself, based on its local copy of policy.
  9. The fact that each cfengine agent keeps a local copy of policy (regardless of whether it was written locally or inherited from a central authority) means that cfengine will continue to function even if network communications are down.
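
To make the convergence idea concrete, here is a minimal Python sketch. It is not CFEngine code: the function name `ensure_line` and the sample settings are made up for illustration. It only shows the property described above, namely that an operation moves the host toward the desired state and does nothing if the host is already there.

```python
def ensure_line(lines, wanted):
    """Hypothetical convergent operation: add `wanted` only if it is missing.

    Returns (new_lines, changed). Applying it again to its own output
    changes nothing, which is what makes the operation convergent.
    """
    if wanted in lines:
        # Already in the desired state: no effect.
        return list(lines), False
    # Not yet compliant: move one step closer to the desired state.
    return list(lines) + [wanted], True


# Illustrative "configuration" of a host (sample values, not from a real system).
config = ["PermitRootLogin no"]

config, changed_first = ensure_line(config, "Protocol 2")
config, changed_second = ensure_line(config, "Protocol 2")

print(changed_first, changed_second)
print(config)
```

The first call repairs the configuration and the second one finds nothing to do, so it is safe to run the same operation on every scheduled pass.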

Critical CFEngine Daemons and Commands

  1. cf-agent: Interprets policy promises and implements them in a convergent manner. The agent fetches data from cf-serverd running on the Master Policy Servers.
  2. cf-execd: Executes cf-agent and logs its output (optionally sending a summary via email). It can be run in daemon (standalone) mode. We have configured Solaris’ SMF to keep cf-execd online, which drives cf-agent.
  3. cf-serverd: Monitors the cfengine port and serves file data to cf-agent. All data transferred between cf-agent and cf-serverd is encrypted.
  4. cf-monitord: Collects statistics about resource usage on each host for anomaly detection purposes. The information is made available to the agent in the form of cfengine classes so that the agent can check for and respond to anomalies dynamically.
  5. cf-key: Generates public-private key pairs on a host. You normally run this program only once, as part of the cfengine software installation process.

On a client system, cf-agent will be executed automatically by the cf-execd daemon; the latter also handles logging during cf-agent runs. In addition, operations such as file copying between hosts are initiated by cf-agent on the local system, and they rely on the cf-serverd daemon on the Master Policy Server to obtain remote data.

High Level Architecture of pushing configurations

SVN becomes the source of truth for CFEngine. The architecture we are using allows us to start with only one “Master Policy Server” or “Distribution Server” per site, and we can easily scale to multiple machines if needed.

Pushing Configuration Changes

The data flow when making a change is as follows:

  1. I make a config change on my local machine and push it to SVN. push --> SVN
  2. The updated configuration is detected, and a cron script downloads the changes into /var/cfengine/masterfiles on the policy server. <-- pull from SVN
  3. The Policy Server running cf-serverd now has the updated configurations in /var/cfengine/masterfiles, ready to serve to clients. <-- pulled from SVN by the cron script
  4. Clients running cf-execd daemon execute cf-agent based upon schedule (by default every 5 minutes)
  5. The configured “splaytime” variable determines how long the client waits before contacting cf-serverd (a hash is computed so that clients check in at scattered points over the interval). This random “back off” time keeps the master policy server from being hammered all at once by thousands of clients. If we randomly check in over a 10 minute interval, then we have fewer bursts of network I/O, etc.
  6. cf-agent contacts cf-serverd running on the Master Policy Server(s) and pulls updated policies / configs / etc. via an encrypted link. This happens via execution of failsafe.cf and update.cf. <-- pull from Master Policy Servers. Clients pull. Servers don’t “push”. Changes are done on the client opportunistically. If the network is down, no update happens on the clients. The next time the client can contact the Master Policy Server, the change is executed.
  7. cf-agent executes policies via promises.cf. Changes happen on the client here.
  8. cf-execd records the details of the execution of promises.cf in /var/cfengine/outputs.
  9. cf-monitord records the behavior of the machine in /var/cfengine/reports.
  10. cf-execd is kept running and monitored by Solaris SMF on the client.
  11. cf-monitord is kept running and monitored by Solaris SMF on the client.
  12. cf-report is run manually through the CLI. cf-report analyzes data collected by cf-monitord in /var/cfengine/reports and outputs to html / text / XML / etc.
  13. The predefined schedule of XXX minutes passes again and cf-execd executes cf-agent again. Repeat from step 4.
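
The splaytime step in the list above is easier to picture with a small sketch. This is plain Python, not CFEngine's actual implementation: hashing the host name with SHA-256, the function name `splay_delay_seconds`, and the example host names are all assumptions made for illustration. The point is only that a hash spreads clients across the interval while each client keeps the same offset from run to run.

```python
import hashlib


def splay_delay_seconds(hostname, splay_minutes):
    """Map a host name to a stable delay in [0, splay_minutes * 60) seconds.

    Assumption for this sketch: the delay is derived from a SHA-256 hash
    of the host name. CFEngine's real hash function may differ.
    """
    window = splay_minutes * 60
    if window <= 0:
        return 0  # no splay configured: contact the server immediately
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return bucket % window


# Hypothetical clients checking in over a 10 minute interval.
hosts = ["web01.example.com", "web02.example.com", "db01.example.com"]
delays = {host: splay_delay_seconds(host, 10) for host in hosts}

for host, delay in delays.items():
    print(f"{host}: wait {delay} seconds before contacting the policy server")
```

Because the delay depends only on the host name and the interval, a given client always lands at the same point in the window, and a large fleet ends up spread roughly evenly across it.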

Why does everything reside in /var/cfengine? How is CFengine resilient to failures?

Cfengine likes to keep itself as resilient as possible. Some environments have /usr/local NFS mounted, so /var/cfengine was chosen as it was pretty much guaranteed to be kept locally on disk.
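
The same resilience applies to policy itself: clients pull from the policy server when they can and otherwise carry on with what they already have on local disk. A rough Python sketch of that behavior follows. It is not how cf-agent is implemented; `refresh_policy`, `server_up`, `server_down`, and the version numbers are hypothetical names and values used only to show the fallback.

```python
def refresh_policy(fetch_from_server, local_policy):
    """Try to pull fresh policy; fall back to the local copy on failure.

    Returns (policy_to_run, updated). `fetch_from_server` is any callable
    that returns new policy or raises OSError when the server is unreachable.
    """
    try:
        return fetch_from_server(), True
    except OSError:
        # Network or server problem: keep running the policy we already have.
        return local_policy, False


def server_down():
    # Simulates an unreachable policy server (ConnectionError is an OSError).
    raise ConnectionError("policy server unreachable")


def server_up():
    # Simulates a successful pull of newer policy.
    return {"version": 2}


local = {"version": 1}

policy_offline, updated_offline = refresh_policy(server_down, local)
policy_online, updated_online = refresh_policy(server_up, local)

print(policy_offline, updated_offline)
print(policy_online, updated_online)
```

While the server is unreachable the client keeps enforcing its existing local policy, and it picks up the newer version on the first run where the pull succeeds.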

This article was posted by Matty on 2010-07-02 10:54:00 -0400