Searching in cfengine


Summary of contents

In this module you will learn about

  • How to select particular files in a search
  • How to select particular processes in the process table
  • How to find and edit text in files
  • How to combine powerful search criteria


1 Searching for files


1.1 Controlling Directory Tree Traversal

There are many cases in system configuration where we want to scan through a set of files and apply checks and controls to all or a few of them. A typical pattern is to specify a directory and apply a rule to all files and subdirectories. It is also possible to limit the search to include or exclude certain directories, or pick out specific files matching specified criteria.

There are two options which control how subdirectories are handled. By default, rules apply only to the items directly within specified directories; in other words, actions are not recursive by default.

recurse=depth
Perform recursive checks/operations, descending at most depth levels. Use the keyword inf to descend to the bottom of the directory tree.
xdev=false
Do not descend into subdirectories residing on different disk partitions. By default, partition boundaries are not crossed.

Here are some examples of these options:

     files:   # Check ownerships under /usr/local

         /usr/local
              owner=root
              group=admin
              mode=755
              recurse=inf


     tidy:    # Clear /tmp and subdirectories (>3 days old)

         /tmp
             age=3
             include=*
             rmdirs=sub
             exclude=.X11
             recurse=inf
             xdev=off

The first rule checks the owner, group and permissions of files in the /usr/local directory tree, reporting on any which are incorrectly set. The second rule removes files and empty subdirectories that have not been accessed in 3 days under /tmp (except the .X11 subdirectory), regardless of the disk partition on which the items reside.

See the discussion of the ignore option in the next section for another method of controlling directory tree traversal.
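
The effect of a depth limit and a partition check on a directory scan can be pictured with a small Python sketch. This is not how cfengine is implemented: the function walk_limited and its parameters max_depth and cross_devices are hypothetical names, chosen only to mirror the ideas behind recurse and xdev.

```python
import os

def walk_limited(root, max_depth=0, cross_devices=False):
    """Yield paths under root, descending at most max_depth levels.

    max_depth=0 yields only the items directly inside root (no recursion);
    max_depth=float("inf") descends to the bottom of the tree.
    """
    root_dev = os.lstat(root).st_dev
    stack = [(root, 0)]
    while stack:
        directory, depth = stack.pop()
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            yield path
            if not os.path.isdir(path) or os.path.islink(path):
                continue  # only real directories are descended into
            if depth >= max_depth:
                continue  # depth limit reached
            if not cross_devices and os.lstat(path).st_dev != root_dev:
                continue  # directory lives on another partition
            stack.append((path, depth + 1))
```

In this sketch, list(walk_limited("/usr/local", float("inf"))) corresponds roughly to a scan with recurse=inf, while the default max_depth=0 corresponds to a rule without recursion.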


1.2 Inclusion and Exclusion Patterns

The simplest way to limit a file search is to use the three pattern matching criteria below. These directives use simple shell pattern matching symbols or wildcards ‘*’ and ‘?’, not POSIX regular expressions.


1.2.1 Local criteria (per promise rule)

include=pattern
Include file-only items matching the specified patterns when selecting files for verification or modification. Patterns may include the shell wildcards * (match any characters, including no characters) and ? (match any one character).
exclude=pattern
Exclude file-only items matching the specified patterns when selecting files for verification or modification. Global exclusion lists can be specified for copying and linking operations via the ExcludeCopy and ExcludeLink settings in the control section (respectively).
ignore=pattern
Ignore items matching the specified patterns. In contrast to exclude, directories matching an item in the ignore list are not traversed during recursive operations. A global list of directories to ignore can be specified via the ignore stanza (see the example below).
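
The interplay of include and exclude patterns can be sketched in Python with the standard fnmatch module, which implements shell-style wildcards. The helper select is a hypothetical name used for illustration only, and the sketch assumes that a file is selected when it matches at least one include pattern and no exclude pattern.

```python
from fnmatch import fnmatchcase

def select(names, include=("*",), exclude=()):
    """Return the names matching any include pattern and no exclude pattern."""
    return [
        name for name in names
        if any(fnmatchcase(name, p) for p in include)
        and not any(fnmatchcase(name, p) for p in exclude)
    ]

files = ["notes.txt", "data.sav", "core", "a.out"]
print(select(files, include=["*"], exclude=["*.sav"]))  # ['notes.txt', 'core', 'a.out']
print(select(files, include=["?.out", "core"]))         # ['core', 'a.out']
```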


1.2.2 Global criteria

Here are some examples of global versions of settings and options:

     control:

         ExcludeCopy = ( *.bak *~ )

     ignore:

         .X11
         /usr/local

     tidy:    # Remove non-recent files from /tmp and /scratch

         /tmp age=1 include=* recurse=inf
         /scratch age=1 include=* exclude=*.sav

     copy:    # Update local documentation from server silo

        /masterdoc dest=/usr/local/doc server=silo

The example specifies a global exclusion list for copy operations and a list of subdirectories to ignore during recursive operations. The tidy rule will clean up files that haven't been accessed today from /tmp and all of its subdirectories except /tmp/.X11 (the location of X11 semaphores). It will also remove such files from the /scratch directory except ones having the extension .sav.

The copy rule copies all files from /masterdoc on remote host silo that are newer than the version in /usr/local/doc (if any). Using the global copy exclusion list, it skips any files whose names end in a tilde character (emacs backup files) or that have the extension .bak. Note that having /usr/local in the directory ignore list does not affect the file copying operation, since the ignore list applies only to directory traversal in recursive operations.


1.3 Wildcards and Regular expressions

HINT: Use the regex tester on the cfengine.com website to make sure your regular expressions behave as you expect.

Regular expressions may be used in many contexts. They should not be confused with shell wildcards or patterns, which are cruder pattern matching strings.

Regular expressions are used in cfengine particularly in connection with editfiles and processes to search for lines matching certain expressions. They can also be used in filters. A regular expression is a generalized wildcard. In cfengine wildcards, you can use the character '*' to match any number of characters and '?' to match any single character. Regular expressions are more complicated than wildcards, but have far more flexibility.

NOTE: the special characters * and ? used in wildcards do not have the same meanings in regular expressions!

Some regular expressions match only a single string. For example, every string which contains no special characters is a regular expression which matches only a string identical to itself. Thus the regular expression cfengine would match only the string "cfengine", not "Cfengine" or "cfengin" etc. Other regular expressions could match more general strings. For instance, the regular expression c* matches any number of c's (including none). Thus this expression would match the empty string, "c", "cccc", "ccccccccc", but not "cccx".
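
The difference between the two notations can be checked with a short Python sketch. It uses Python's fnmatch module as a stand-in for shell wildcards and Python's re module as a stand-in for regular expressions; both are assumptions made for illustration, and Python's regular expression dialect differs in some details from the one cfengine uses.

```python
import re
from fnmatch import fnmatchcase

# Shell wildcard: "c*" means "a c followed by anything".
print(fnmatchcase("cccx", "c*"))          # True

# Regular expression: "c*" means "zero or more c's".
print(bool(re.fullmatch("c*", "cccx")))   # False
print(bool(re.fullmatch("c*", "cccc")))   # True
print(bool(re.fullmatch("c*", "")))       # True
```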


1.3.1 Regular expressions

The wildcards belong to the shell. They are used for matching filenames. UNIX has a more general and widely used mechanism for matching strings: regular expressions.

Regular expressions are used by the egrep utility, by text editors like ed, vi and emacs, and by sed and awk. They are also used in the C programming language for matching input, as well as in the Perl programming language and the lex tokenizer. Here are some examples using the egrep command, which print lines from the file /etc/rc that match certain conditions. The construction '( )' is part of the egrep command: everything in between these symbols is a regular expression. Notice that the special shell symbols ! * & have to be preceded with a backslash \ in order to prevent the shell from expanding them!

     # Print all lines beginning with a comment #

     egrep '(^#)'           /etc/rc

     # Print all lines which DON'T begin with #

     egrep '(^[^#])'        /etc/rc

     # Print all lines beginning with e, f or g.

     egrep '(^[efg])'       /etc/rc

     # Print all lines beginning with uppercase

     egrep '(^[A-Z])'       /etc/rc

     # Print all lines NOT beginning with uppercase

     egrep '(^[^A-Z])'      /etc/rc

     # Print all lines containing ! * &

     egrep '([\!\*\&])'     /etc/rc

     # All lines containing ! * & but not starting #

     egrep '([^#][\!\*\&])' /etc/rc

These examples assume that the file /etc/rc exists. If it doesn't exist on the machine you are using, try to find an equivalent, for instance by replacing /etc/rc with /etc/rc*, which matches any file whose name begins with rc.

Regular expressions are made up of the following `atoms'.

.
Match any single character except the end of line.
^
Match the beginning of a line as the first character.
$
Match end of line as last character.
[..]
Match any character in the list between the square brackets (see below).
*
Match zero or more occurrences of the preceding expression.
+
Match one or more occurrences of the preceding expression.
?
Match zero or one occurrence of the preceding expression.

You can find a complete list in the Unix manual pages. The square brackets above are used to define a class of characters to be matched. Here are some examples,
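
As an illustration of character classes, the following Python sketch repeats some of the egrep searches shown earlier on a small in-memory list of lines. The helper grep and the sample lines are hypothetical; Python's re module is used as a stand-in for egrep and agrees with it for these simple patterns.

```python
import re

lines = ["# comment", "exec foo", "Global=1", "  indented", "go!"]

def grep(pattern, lines):
    """Return the lines in which pattern matches somewhere, like egrep."""
    return [line for line in lines if re.search(pattern, line)]

print(grep(r"^[efg]", lines))   # lines beginning with e, f or g
print(grep(r"^[A-Z]", lines))   # lines beginning with an uppercase letter
print(grep(r"^[^#]", lines))    # lines whose first character is not #
print(grep(r"[!*&]", lines))    # lines containing !, * or &
```

Inside square brackets, a leading ^ negates the class and a hyphen denotes a range, so [^A-Z] matches any single character that is not an uppercase letter.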


1.4 File Filters

Sometimes, the inclusion and exclusion options do not provide sufficient flexibility to select just the items we intend. For such cases, cfengine provides filters which can be used to build complex file and process selection expressions. A filter is a description of items that we would like to include. Filters are declared in separate stanzas in their own section of the cfagent.conf configuration and are attached to any number of rules, as attributes, using their identifier.

Each filter is parameterized by a number of matching criteria. Each filter has a result which is expressed as the logical combination of a number of criteria.

1.4.1 File Filter Parameters

The following components can be used to construct file filters:


1.4.2 Example file filter - by magic number

     filters:
         { badgif        # Look for executables disguised as GIF
             NameRegex:  ".*gif"
             ExecRegex: "/bin/file (.*ELF.*)"
             Result: "ExecRegex.NameRegex"
         }


1.4.3 Example file filter - by link destination

         { histnull      # Check if users set history to /dev/null
             NameRegex:   ".*history"
             IsSymLinkTo: "/dev/null"
             Result: "IsSymLinkTo.NameRegex"
             DefineClasses: "history"
         }


1.4.4 Example file filter - combined parameters

         { old_or_big    # Find .dat files that are old or big
            FromMtime: "date(2001,1,1,0,0,0)"
            ToMtime: "tminus(0,0,1,0,0,0)"
            FromSize: "5m"
            ToSize: "inf"
            NameRegex: "dat\$"
            Result: "NameRegex.(Mtime|Size)"
         }

In the final filter, Size is shorthand for (FromSize.ToSize), and similar abbreviations can be used for other numerical and time-period based items (e.g., Mtime in the same filter).
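
The way a result expression combines criteria can be sketched in Python. In cfengine expressions the period reads as logical AND and the vertical bar as logical OR, so the expression NameRegex.(Mtime|Size) selects files whose name matches and which are either old or big. The function old_or_big below is a hypothetical illustration of that logic, not part of cfengine; each argument states whether the corresponding criterion matched a given file.

```python
def old_or_big(name_regex, mtime, size):
    """Evaluate NameRegex.(Mtime|Size): '.' reads as AND, '|' as OR."""
    return name_regex and (mtime or size)

print(old_or_big(True, False, True))    # True: a big .dat file
print(old_or_big(True, False, False))   # False: neither old nor big
print(old_or_big(False, True, True))    # False: the name does not match
```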


1.4.5 Example file filter - setuid

     filters:
       { setuid
         Owner: "root"
         Mode: "+6000"
         Result: "Owner.Mode"
       }

     files:
         /home recurse=inf filter=setuid mode=-6002
           action=fixplain inform=on syslog=on


1.4.6 Example file filter - custom scanner

Another filter example shows how you might use a custom program or script to scan files, e.g. for testing for viruses or harmful content.

     filters:

     { virus

     Type: "reg"
     ExecRegex: "$(Grep) 'Content-.*EXE.*'  $(this) (.*)" # Look for EXE attachments
     Result:    "Type.ExecRegex"
     DefineClasses: "virus"
     }

     ###########################################################################

     files:

     /imap-dir r=inf filter=virus action=alert
     /var/mail r=inf filter=virus action=alert

     ###########################################################################

     shellcommands:

     virus::

      "/bin/echo Virus Alert on files"


1.4.7 Example file filter - tidy junk

     # cf.users

     control:

         # backup exclusions

         excludecopy = ( *.EXE *.avi *.ZIP *.AVI *.MP3
                         *.mp3 *.o *.dvi *.rar  )

         backupdirs = ( bkupAH:bkupIN:bkupOZ )

         SensibleCount = ( 20 )

     filters:

         { history  # Shell history = /dev/null
           NameRegex:     ".*history"
           IsSymLinkTo:   "/dev/null"
           Result:        "IsSymLinkTo.NameRegex"
           DefineClasses: "historyalert"
         }

         { setuid  # SetUID/SetGID
           Owner:   "root"
           Group:   "0"
           Mode:    "+6000"
           Result:  "(Owner|Group).Mode"
         }

     tidy:

       emergency|labs::  # emergency class used for ad-hoc runs
         /home include=.rhosts age=0 inform=on
         /home include=core r=inf age=1
         /home include=a.out r=inf age=1
         /home include=*.o r=inf age=7

         /home/.netscape/cache include=*
           recurse=inf age=3 type=atime

     # Make sure backup disks don't get full
        backupserver.Hr17.OnTheHour::
          /${backupdirs} include=* recurse=inf age=14


1.4.8 Example file filter - compress files

This example compresses all PDF and FDF files in /mydirectory and its subdirectories.

     control:

      actionsequence = ( files )

      CompressCommand = ( /usr/bin/gzip )

     filters:

         { pdf_files

         NameRegex:     ".*\.pdf|.*\.fdf"
         Result:        "NameRegex"
         }

     files:

       /mydirectory

         filter=pdf_files
         r=inf
         action=compress


1.4.9 Example file filter - find files changed

This example shows how to generate a list of the files that cfengine identifies, by promising an unrealistic condition which you know will not be satisfied by any file you are looking for (here, removing all permissions from the file).

     control:

      actionsequence = ( files )

     filters:

         { new_files

         # Files changed in the last 3 hr 30 mins
         FromMtime: "tminus(0,0,0,3,30,0)"
         ToMtime:   "inf"
         Result:    "Mtime"
         }

     files:

       /home/user

         mode=0           # trick to generate a warning
         filter=new_files
         r=inf
         action=warnall



1.5 Patterns self-test questions

  1. How do I add a filter to a file search?
  2. How do I specify what the filter is looking for?
  3. How can I see/test what files cfengine identifies?


2 Searching for text inside files

The contents of a file are an independent degree of freedom to configure. Most Unix files are traditionally line based (though increasingly Java has brought XML into play). CFEngine 2's editing features are based primarily on line-based files.


2.1 Line based editing patterns

There is a recurring pattern in editing features:

..IfNoSuchLine
Applies if there is no literal match between a line in the file and the quoted string.
..LineContaining
Applies if there is (no) literal match between a substring of a line and the quoted string.
..LineStarting
Applies if there is (no) literal match between the start of a line and the quoted string.
..LineMatching
Applies if a complete line of the file matches (or does not match) the quoted regular expression.

For example

     SetLine "mark woz 'ere"
     AppendIfNoLineMatching "mark .*"

will append the line set by SetLine if no line matches the regular expression. Here is a fuller example:

     editfiles:
         { /etc/hosts.allow  # Disable access for this domain
             HashCommentLinesContaining "bad-guys.org"
         }

         { /etc/xinetd.d
             # Make sure telnet is disabled
             BeginGroupIfFileExists "telnet"
               DeleteLinesMatching "disable *="
               GotoLastLine
               InsertLine "disable = yes"
             EndGroup

             # If no access control, limit to subnet
             SetLine "only_from = 192.168.9"
             AppendIfNoLineMatching "only_from"
         }


2.2 Replacing Text fragments

      ReplaceAll quoted-regex With quoted-string

Replace all instances of strings matching the regular expression in the first quotes with the exact string in the second set of quotes, throughout the current file. Note that cfengine matches on a left to right basis, with the first match taking precedence, so if your regular expression matches text ambiguously it is the first occurrence which is replaced. For example, if you replace ‘cf.*’ with ‘CFENGINE’ and cfengine encounters a line ‘hello cfengine cfengine’, then this will be replaced with ‘hello CFENGINE’ even though two possible strings match the regular expression. On the other hand if the expression is not ambiguous, say replacing ‘cfengine’ with ‘CFENGINE’, then the result would be ‘hello CFENGINE CFENGINE’.
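
The two cases described above can be reproduced with Python's re module, used here only as a stand-in for cfengine's matching. The greedy pattern swallows the rest of the line in a single match, while the unambiguous pattern is replaced at each occurrence.

```python
import re

line = "hello cfengine cfengine"

# Greedy pattern: one match runs from the first "cf" to the end of the line.
print(re.sub(r"cf.*", "CFENGINE", line))       # hello CFENGINE

# Unambiguous pattern: each occurrence is replaced separately.
print(re.sub(r"cfengine", "CFENGINE", line))   # hello CFENGINE CFENGINE
```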

Rather than replacing all occurrences, you might want to pick only the first.

      ReplaceFirst quoted-regex With quoted-string

For each line of the current file, replace the first string matching the regular expression in the first quotes (quoted-regex) with the string given in the second set of quotes (quoted-string). Matching is done left to right. For example, if you replace ‘YY = [[:digit:]][[:digit:]]’ with ‘YY = 04’ and cfengine encounters ‘YY = 03 but old YY = 70’, then it will be replaced with ‘YY = 04 but old YY = 70’.
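
Replacing only the first match on a line can be sketched in Python by limiting re.sub to one substitution. Python's re module does not support the POSIX class [[:digit:]], so this sketch writes [0-9] instead; that substitution is an adaptation for the illustration, not cfengine syntax.

```python
import re

line = "YY = 03 but old YY = 70"

# count=1 stops after the first (leftmost) match on the line.
print(re.sub(r"YY = [0-9][0-9]", "YY = 04", line, count=1))
# YY = 04 but old YY = 70
```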


2.2.1 Examples of text matching in editfiles

     editfiles:
         { /var/spool/cron/tabs/root
           AutoCreate
           AppendIfNoSuchLine
             "0,15,30,45 * * * * /var/cfengine/bin/cfexecd -F"
         }

         { /etc/sysconfig/apache2
           BeginGroupIfNoLineMatching "APACHE_SERVER_FLAGS=\"SVN_VIEWCVS\""
             ReplaceAll "APACHE_SERVER_FLAGS=\"\""
               With "APACHE_SERVER_FLAGS=\"SVN_VIEWCVS\""
           EndGroup

           BeginGroupIfNoLineMatching
           "APACHE_CONF_INCLUDE_FILES=\"/master/my-http.conf\""
             ReplaceAll "APACHE_CONF_INCLUDE_FILES=\"\""
               With "APACHE_CONF_INCLUDE_FILES=\"/master/my-http.conf\""
           EndGroup

           BeginGroupIfNoLineMatching ".*php4 dav dav_svn.*"
             ReplaceAll "php4" With "php4 dav dav_svn"
           EndGroup
         }

         { /etc/postfix/main.cf
           ReplaceAll "^mydomain =.*" With "mydomain = iu.hio.no"
           ReplaceAll "^relayhost =.*" With "relayhost = [nexus.iu.hio.no]"
           AppendIfNoSuchLine "relayhost = [nexus.iu.hio.no]"
           AppendIfNoSuchLine "mydomain = iu.hio.no"
         }

     # Default PHP memory model is too small

         { /etc/php.ini
           ReplaceAll "^memory_limit =.*" With "memory_limit = 16M"
           AppendIfNoSuchLine "memory_limit = 16M"
         }
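
The pairing of ReplaceAll with AppendIfNoSuchLine in the last two examples makes the edit convergent: repeating it leaves the file unchanged once the desired line is present. A minimal Python sketch of the same idea follows; the function ensure_setting and the sample file contents are hypothetical, and Python's re module stands in for cfengine's matching.

```python
import re

def ensure_setting(lines, pattern, wanted):
    """Rewrite text matching pattern to wanted; append wanted if it is absent."""
    out = [re.sub(pattern, lambda match: wanted, line) for line in lines]
    if wanted not in out:
        out.append(wanted)
    return out

php = ["; php settings", "memory_limit = 8M"]
once = ensure_setting(php, r"^memory_limit =.*", "memory_limit = 16M")
twice = ensure_setting(once, r"^memory_limit =.*", "memory_limit = 16M")
print(once)            # ['; php settings', 'memory_limit = 16M']
print(once == twice)   # True: the second pass changes nothing
```

The same function also handles a file with no memory_limit line at all, in which case the wanted line is appended exactly once.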


2.2.2 Replacing fields in tabular files

     ReplaceLinesMatchingField quoted-number

This command replaces lines in the current file with the current line (set by SetLine or ForEachLineIn) whenever the specified field of the two lines matches. Lines are split into fields (as in the password file) at the SplitOn character (':' by default), and the quoted number selects the field to compare.

The idea behind this command was to be able to override global passwords (from a file which gets distributed) by new passwords in a local file. Rather than maintaining the files separately, this simply overrides the entries with the new ones.
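
The field-based replacement can be sketched in Python as follows. The function name, the sample password entries, and the assumption that fields are numbered from 1 are all hypothetical and serve only to illustrate the idea of overriding entries by a matching field.

```python
def replace_lines_matching_field(lines, new_line, field, sep=":"):
    """Replace each line whose field-th column equals that of new_line.

    field is assumed to be 1-based in this sketch.
    """
    key = new_line.split(sep)[field - 1]
    out = []
    for line in lines:
        columns = line.split(sep)
        if len(columns) >= field and columns[field - 1] == key:
            out.append(new_line)   # override the distributed entry
        else:
            out.append(line)       # keep unrelated entries untouched
    return out

passwd = [
    "root:x:0:0:root:/root:/bin/sh",
    "mark:OLDHASH:100:100::/home/mark:/bin/sh",
]
local = "mark:NEWHASH:100:100::/home/mark:/bin/sh"
print(replace_lines_matching_field(passwd, local, 1))
```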


2.3 Editing self-test questions

  1. How can I search and replace text in editfiles?
  2. How do I ensure that the replacement is convergent?
  3. How can I edit fields in tabular files?
  4. How can I make conditional edits based on matched patterns?


3 Searching for processes

Process filters match common fields from ps command output.

Note that these names are all case sensitive.


3.1 Example process filter - by accumulated time

Here is an example process filter in action:

     filters:

     #  Processes owned by root with > 2 hrs CPU time
         { program_gone_bad
             Owner: "root"
             FromTTime: "accumulated(0,0,0,200,0,0)"
             ToTTime: "inf"
             Result: "Owner.TTime"
         }

     processes:

         "." filter=program_gone_bad action=warn

In this case, the regular expression searched for among the output from ps is any character (specified by a single period), so every process is a candidate and the filter alone decides which ones are reported.
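
The selection logic of such a process filter can be pictured with a Python sketch over a made-up process table. The Process record, its field names and the sample entries are hypothetical, and the CPU time threshold is passed in as a parameter rather than parsed from an accumulated(...) specification.

```python
from dataclasses import dataclass

@dataclass
class Process:
    owner: str
    cpu_seconds: int   # accumulated CPU time (TTime), in seconds
    command: str

def program_gone_bad(proc, min_cpu_seconds):
    """Owner.TTime: owner is root AND CPU time is at or above the lower bound."""
    return proc.owner == "root" and proc.cpu_seconds >= min_cpu_seconds

table = [
    Process("root", 9000, "/usr/sbin/runaway"),
    Process("root", 12, "/usr/sbin/cron"),
    Process("mark", 9000, "/usr/bin/simulation"),
]
limit = 2 * 3600  # two hours, as in the comment on the filter above
print([p.command for p in table if program_gone_bad(p, limit)])
# ['/usr/sbin/runaway']
```

An upper bound of inf, as in ToTTime above, means that only the lower bound needs to be tested.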


3.2 Example process filter - started recently

     control:

      actionsequence = ( processes )

     filters:

         { new_processes

         # New processes started in past hour and half

         FromSTime: "tminus(0,0,0,1,30,0)"
         ToSTime:   "inf"
         Result:    "STime"
         }

     processes:

      "." filter=new_processes action=warn


3.3 Processes self-test questions

  1. What is the difference between STime and TTime?
  2. Do I have to run process scripts as root?
  3. How can I make a more powerful killall command?


4 Troubleshooting Filters

Two common causes of error in constructing filters are the following:


5 Iteration

In all cases a search for a pattern implies some iteration over the possible matches. This section describes cfengine's iteration (i.e. looping) capabilities. In some cases we can think of iterating over files, processes or other objects. In other cases we iterate over the set of all possible parameters to a given configuration promise.


5.1 Iteration over lists as a pattern

Variables can be used as iterators in a limited number of scenarios in cfengine 2. The iteration was originally modelled on the shell list idea, as represented in PATH names, and the ‘IFS’ variable. Only in cfengine 3 has this poor decision been rectified.

Iteration is implicit. When list variables are substituted into expressions, iteration over the list elements is assumed if i) it makes sense, and ii) the limitations of cfengine 2 can cope with the iteration.

A loop is therefore made by substituting a variable that is a list. A list variable is one which consists of strings separated by a list separator. The default list separator is the colon ‘:’ character, as in the shell:

     control:

      listvar = ( one:two:three:four )

The action that contains a variable to be interpreted as a list appears as separate actions, one for each case:

     shellcommands:

       "/bin/echo $(listvar)"

is equivalent to

     shellcommands:

       "/bin/echo one"
       "/bin/echo two"
       "/bin/echo three"
       "/bin/echo four"

If multiple iterators are used, these are handled as nested loops:

     cfengine::/bin/echo one 1:     one 1
     cfengine::/bin/echo one 2:     one 2
     cfengine::/bin/echo one 3:     one 3
     cfengine::/bin/echo one 4:     one 4
     cfengine::/bin/echo two 1:     two 1
     cfengine::/bin/echo two 2:     two 2
     cfengine::/bin/echo two 3:     two 3
     cfengine::/bin/echo two 4:     two 4
     cfengine::/bin/echo three:     three 1
     cfengine::/bin/echo three:     three 2
     cfengine::/bin/echo three:     three 3
     cfengine::/bin/echo three:     three 4
     cfengine::/bin/echo four :     four 1
     cfengine::/bin/echo four :     four 2
     cfengine::/bin/echo four :     four 3
     cfengine::/bin/echo four :     four 4
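
The nested expansion can be sketched in Python with itertools.product. The function expand is a hypothetical illustration, and the variable names list1 and list2 with the values one:two:three:four and 1:2:3:4 are assumed from the output shown above; cfengine's own expansion code is not shown here.

```python
from itertools import product

def expand(template, **lists):
    """Expand $(name) references, iterating over colon-separated lists."""
    names = list(lists)
    values = [lists[name].split(":") for name in names]
    expanded = []
    for combination in product(*values):   # the first list is the outer loop
        text = template
        for name, value in zip(names, combination):
            text = text.replace("$(%s)" % name, value)
        expanded.append(text)
    return expanded

commands = expand("/bin/echo $(list1) $(list2)",
                  list1="one:two:three:four", list2="1:2:3:4")
print(len(commands))   # 16
print(commands[:5])
```

The first five commands are the four combinations with "one" followed by "/bin/echo two 1", which is the same order as the nested loops in the output above.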

Where iterators are not allowed, the implied lists are treated as scalars:

     alerts:

      amnexus::

       "do $(list1) $(list2)"

e.g.

     cfengine:: do one:two:three:four 1:2:3:4

Iterative expansion is currently restricted to:
