In this module you will learn about
There are many cases in system configuration where we want to scan through a set of files and apply checks and controls to all or a few of them. A typical pattern is to specify a directory and apply a rule to all files and subdirectories. It is also possible to limit the search to include or exclude certain directories, or pick out specific files matching specified criteria.
There are two options which control how subdirectories are handled. By default, rules apply only to the items directly within specified directories; in other words, actions are not recursive by default.
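The recurse option sets how far to descend; give it the value inf to descend to the bottom of the directory tree. The xdev option controls whether traversal may cross onto other disk partitions.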
Here are some examples of these options:
    files:

       # Check ownerships under /usr/local
       /usr/local owner=root group=admin mode=755 recurse=inf

    tidy:

       # Clear /tmp and subdirectories (>3 days old)
       /tmp age=3 include=* rmdirs=sub exclude=.X11 recurse=inf xdev=off

The first rule checks the user and group owners of files in the /usr/local directory tree, reporting on any which are incorrectly set. The second rule removes files and empty subdirectories that have not been accessed in 3 days under /tmp (except the .X11 subdirectory), regardless of the disk partition on which the items reside.
See the discussion of the ignore option in the next section for another method of controlling directory tree traversal.
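To make the idea of a recursion depth concrete, here is a minimal Python sketch of a depth-limited directory walk. It is not cfengine code: the function name `walk_limited` and the convention that `recurse=0` means "direct children only" are assumptions made for this illustration, and `float("inf")` stands in for the value inf.

```python
import os

def walk_limited(root, recurse=0):
    """Return all paths under root, descending at most `recurse` levels.

    recurse=0 lists only the items directly inside root (assumed convention);
    recurse=float("inf") descends to the bottom of the tree.
    """
    stack = [(root, 0)]
    found = []
    while stack:
        directory, depth = stack.pop()
        for entry in sorted(os.listdir(directory)):
            path = os.path.join(directory, entry)
            found.append(path)
            # Only descend into real directories, and only while depth allows.
            if os.path.isdir(path) and not os.path.islink(path) and depth < recurse:
                stack.append((path, depth + 1))
    return sorted(found)
```

Calling `walk_limited("/usr/local", float("inf"))` would list the whole tree, while the default call lists only the top level.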
The simplest way to limit a file search is to use the three pattern matching criteria below. These directives use simple shell pattern matching symbols or wildcards ‘*’ and ‘?’, not POSIX regular expressions.
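include, exclude: select or reject items whose names match a pattern. Patterns are built from the wildcards * (match any characters, including no characters) and ? (match any one character). Global exclusion lists for copy and link operations can be specified with the ExcludeCopy and ExcludeLink settings in the control section (respectively).
ignore: similar to exclude; directories matching an item in the ignore list are not traversed during recursive operations. A global list of directories to ignore can be specified via the ignore stanza (see the example below).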
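The wildcard semantics described above can be tried out with Python's standard `fnmatch` module, which implements the same `*` and `?` symbols. This is only a sketch: the helper `select` and the sample file names are invented for illustration, and it does not claim to reproduce every detail of cfengine's own matching.

```python
from fnmatch import fnmatchcase

def select(names, include="*", exclude=None):
    """Keep names matching the include pattern and not the exclude pattern."""
    kept = []
    for name in names:
        if not fnmatchcase(name, include):
            continue
        if exclude is not None and fnmatchcase(name, exclude):
            continue
        kept.append(name)
    return kept

# Hypothetical directory listing.
names = ["notes.txt", "data.sav", "core", "a.out", "ab.out"]
print(select(names, include="*", exclude="*.sav"))  # everything except data.sav
print(select(names, include="?.out"))               # only a.out: ? is exactly one character
```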
Here are some examples of global versions of settings and options:
    control:

       ExcludeCopy = ( *.bak *~ )

    ignore:

       .X11
       /usr/local

    tidy:

       # Remove non-recent files from /tmp and /scratch
       /tmp     age=1 include=* recurse=inf
       /scratch age=1 include=* exclude=*.sav

    copy:

       # Update local documentation from server silo
       /masterdoc dest=/usr/local/doc server=silo
The example specifies a global exclusion list for copy operations and a list of subdirectories to ignore during recursive operations. The tidy rule will clean up files that haven't been accessed in the past day from /tmp and all of its subdirectories except /tmp/.X11 (the location of X11 semaphores). It will also remove such files from the /scratch directory, except ones having the extension .sav.
The copy rule copies all files from /masterdoc on remote host silo that are newer than the version in /usr/local/doc (if any), excluding any whose names end in a tilde character (emacs backup files) or having the extension .bak, using the global copy exclusion list. Note that having /usr/local in the directory ignore list does not affect the file copying operation since the former applies only to directory traversal in recursive operations.
HINT: Use the regex tester on the cfengine.com website to make sure your regular expressions behave as you expect.
Regular expressions may be used in many contexts. They should not be confused with shell wildcards or patterns, which are cruder pattern matching strings.
Regular expressions are used in cfengine particularly in connection with editfiles and processes to search for lines matching certain expressions. They can also be used in filters.
A regular expression is a generalized wildcard. In cfengine wildcards, you can use the characters '*' and '?' to match any number of characters or any single character, respectively. Regular expressions are more complicated than wildcards, but far more flexible.
NOTE: the special characters * and ? used in wildcards do not have the same meanings as in regular expressions!
Some regular expressions match only a single string. For example, every string which contains no special characters is a regular expression which matches only a string identical to itself. Thus the regular expression cfengine would match only the string "cfengine", not "Cfengine" or "cfengin" etc. Other regular expressions could match more general strings. For instance, the regular expression c* matches any number of c's (including none). Thus this expression would match the empty string, "c", "cccc", "ccccccccc", but not "cccx".
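The difference between the two meanings of `*` can be seen side by side in a short Python sketch. It uses `re.fullmatch` so that the whole string must match, mirroring the examples in the text, and `fnmatchcase` for the wildcard interpretation; the helper name `regex_matches` is made up for this illustration.

```python
import re
from fnmatch import fnmatchcase

def regex_matches(pattern, text):
    """True if the regular expression matches the entire string."""
    return re.fullmatch(pattern, text) is not None

# Same pattern text "c*", two different interpretations.
for text in ["", "c", "cccc", "cccx"]:
    as_regex = regex_matches("c*", text)      # zero or more c's
    as_wildcard = fnmatchcase(text, "c*")     # a c followed by anything
    print(repr(text), "regex:", as_regex, "wildcard:", as_wildcard)
```

As a regular expression, `c*` accepts the empty string and rejects "cccx"; as a wildcard it does the opposite for both.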
The wildcards belong to the shell. They are used for matching filenames. UNIX has a more general and widely used mechanism for matching strings: regular expressions.
Regular expressions are used by the egrep utility, text editors like ed, vi and emacs, and by sed and awk.
They are also used in the C programming language for matching input, as well as in the Perl programming language and the lex tokenizer. Here are some examples using the egrep command, which print lines from the file /etc/rc that match certain conditions. The construction '( )' is part of egrep: everything in between these symbols is a regular expression. Notice that the special shell symbols ! * & have to be preceded with a backslash \ in order to prevent the shell from expanding them!
    # Print all lines beginning with a comment #
    egrep '(^#)' /etc/rc

    # Print all lines which DON'T begin with #
    egrep '(^[^#])' /etc/rc

    # Print all lines beginning with e, f or g.
    egrep '(^[efg])' /etc/rc

    # Print all lines beginning with uppercase
    egrep '(^[A-Z])' /etc/rc

    # Print all lines NOT beginning with uppercase
    egrep '(^[^A-Z])' /etc/rc

    # Print all lines containing ! * &
    egrep '([\!\*\&])' /etc/rc

    # All lines containing ! * & but not starting #
    egrep '([^#][\!\*\&])' /etc/rc
These examples assume that the file /etc/rc exists. If it doesn't exist on the machine you are using, try to find the equivalent by, for instance, replacing /etc/rc with /etc/rc*, which will try to find a match beginning with rc.
Regular expressions are made up of `atoms'; you can find a complete list in the Unix manual pages. The square brackets above are used to define a class of characters to be matched.
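As a rough illustration of character classes, the following Python sketch applies patterns in the spirit of the egrep examples to a handful of sample lines. The sample lines and the helper `grep` are invented for this example, and Python's `re` module is used in place of egrep, so minor syntax differences between the two should be expected.

```python
import re

# Hypothetical lines standing in for the contents of /etc/rc.
LINES = [
    "# startup script",
    "echo starting",
    "Hostname=alpha",
    "rm -f /tmp/*",
]

def grep(pattern, lines):
    """Return the lines in which the pattern is found anywhere."""
    return [line for line in lines if re.search(pattern, line)]

print(grep(r"^#", LINES))      # lines beginning with a comment character
print(grep(r"^[efg]", LINES))  # lines beginning with e, f or g
print(grep(r"^[A-Z]", LINES))  # lines beginning with an uppercase letter
print(grep(r"[!*&]", LINES))   # lines containing !, * or &
```

Inside square brackets, a leading `^` negates the class, so `^[^#]` selects lines whose first character is anything other than `#`.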
Sometimes, the inclusion and exclusion options do not provide sufficient flexibility to select just the items we intend. For such cases, cfengine provides filters which can be used to build complex file and process selection expressions. A filter is a description of items that we would like to include. Filters are declared in separate stanzas in their own section of the cfagent.conf configuration and are attached to any number of rules, as attributes, using their identifier.
Each filter is parameterized by a number of matching criteria. Each filter has a result which is expressed as the logical combination of a number of criteria.
The following components can be used to construct file filters:
Owner, Group: can use numerical ids or names, or "none" for users or groups which are undefined in the system passwd/group file.
Mode: applies only to file objects. It shares syntax with the mode= strings in the files command. This test returns true if the bits which are specified as `should be set' are indeed set, and those which are specified as `should not be set' are not set.
Atime, Ctime, Mtime: these specify time ranges (via the From and To prefixes; see the third example below). If the file's time stamps lie in the specified range, the expression evaluates to true. Times are specified by a six-component vector (year, month, day, hour, minutes, seconds), passed to one of two functions, date() or tminus(), which give absolute times and times relative to the current time respectively. In addition, the keywords now and inf (infinity) may be used.
Size: specifies the file's size (or a size range when the prefixes From or To are included). The keyword inf may also be used.
Type: applies only to file objects. It may be a list of file types which are to be matched. The list should be separated by the OR symbol |, since these types are mutually exclusive. Values include reg, link, dir, socket, fifo, door and char.
NameRegex: matches the name of the file with a regular expression.
IsSymLinkTo: applies only when the file object is a symbolic link. It is true if the regular expression matches the contents of the link.
ExecProgram: matches if the command returns successfully (with return code 0). Note that this feature introduces an implicit dependency on the command being called. This might be exploitable as a security weakness by advanced intruders.
ExecRegex: matches the parenthesized test string against the output of the specified command.
Result: a logical expression specifying the way in which the above elements are combined into a single filter.
    filters:

       { badgif

       # Look for executables disguised as GIF
       NameRegex: ".*gif"
       ExecRegex: "/bin/file (.*ELF.*)"
       Result:    "ExecRegex.NameRegex"
       }

       { histnull

       # Check if users set history to dev/null
       NameRegex:     ".*history"
       IsSymLinkTo:   "/dev/null"
       Result:        "IsSymLinkTo.NameRegex"
       DefineClasses: "history"
       }

       { old_or_big

       # Find .dat files that are old or big
       FromMtime: "date(2001,1,1,0,0,0)"
       ToMtime:   "tminus(0,0,1,0,0,0)"
       FromSize:  "5m"
       ToSize:    "inf"
       NameRegex: "dat\$"
       Result:    "NameRegex.(Mtime|Size)"
       }
In the final filter, Size is shorthand for (FromSize.ToSize), and similar abbreviations can be used for other numerical and time-period based items (e.g., Mtime in the same filter).
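To show how a Result expression combines individual criteria, here is a small Python sketch that evaluates strings such as "NameRegex.(Mtime|Size)" against a table of already-computed true/false values. It reads `.` as AND and `|` as OR, as in the filter examples; the function `evaluate`, the rule that AND binds tighter than OR, and the absence of error handling are assumptions of this sketch, not a description of cfengine's parser.

```python
import re

def evaluate(expression, criteria):
    """Evaluate a Result-style expression: '.' is AND, '|' is OR.

    `criteria` maps criterion names to booleans. Malformed input is not handled.
    """
    tokens = re.findall(r"[A-Za-z_]\w*|[().|]", expression)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            rhs = parse_and()  # always parse the right side, then combine
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == ".":
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            pos += 1  # skip the closing ")"
            return value
        return criteria[token]

    return parse_or()

# A .dat file that is not old but is big: the filter matches.
print(evaluate("NameRegex.(Mtime|Size)",
               {"NameRegex": True, "Mtime": False, "Size": True}))
```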
    filters:

       { setuid

       Owner:  "root"
       Mode:   "+6000"
       Result: "Owner.Mode"
       }

    files:

       /home recurse=inf filter=setuid mode=-6002 action=fixplain inform=on syslog=on
Another filter example shows how you might use a custom program or script to scan files, e.g. for testing for viruses or harmful content.
    filters:

       { virus

       # Look for EXE attachments
       Type:          "reg"
       ExecRegex:     "$(Grep) 'Content-.*EXE.*' $(this) (.*)"
       Result:        "Type.ExecRegex"
       DefineClasses: "virus"
       }

    ###########################################################################

    files:

       /imap-dir r=inf filter=virus action=alert
       /var/mail r=inf filter=virus action=alert

    ###########################################################################

    shellcommands:

       virus::

          "/bin/echo Virus Alert on files"
    # cf.users

    control:

       # backup exclusions
       excludecopy   = ( *.EXE *.avi *.ZIP *.AVI *.MP3 *.mp3 *.o *.dvi *.rar )
       backupdirs    = ( bkupAH:bkupIN:bkupOZ )
       SensibleCount = ( 20 )

    filters:

       { history

       # Shell history = /dev/null
       NameRegex:     ".*history"
       IsSymLinkTo:   "/dev/null"
       Result:        "IsSymLinkTo.NameRegex"
       DefineClasses: "historyalert"
       }

       { setuid

       # SetUID/SetGID
       Owner:  "root"
       Group:  "0"
       Mode:   "+6000"
       Result: "(Owner|Group).Mode"
       }

    tidy:

       emergency|labs::

          # emergency class used for ad-hoc runs
          /home include=.rhosts age=0 inform=on
          /home include=core  r=inf age=1
          /home include=a.out r=inf age=1
          /home include=*.o   r=inf age=7
          /home/.netscape/cache include=* recurse=inf age=3 type=atime

       # Make sure backup disks don't get full
       backupserver.Hr17.OnTheHour::

          /${backupdirs} include=* recurse=inf age=14
Example compressing all pdf files in sub-directories of /mydirectory.
    control:

       actionsequence  = ( files )
       CompressCommand = ( /usr/bin/gzip )

    filters:

       { pdf_files

       NameRegex: ".*.pdf|.*.fdf"
       Result:    "NameRegex"
       }

    files:

       /mydirectory filter=pdf_files r=inf action=compress
This example shows how to generate a list of the files that cfengine identifies, by promising an unrealistic condition which you know will not be satisfied by any file you are looking for (removing all permissions from the file).
    control:

       actionsequence = ( files )

    filters:

       { new_files

       # Files changed in the last 3 hr 30 mins
       FromMtime: "tminus(0,0,0,3,30,0)"
       ToMtime:   "inf"
       Result:    "Mtime"
       }

    files:

       /home/user
          mode=0            # trick to generate a warning
          filter=new_files
          r=inf
          action=warnall
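For readers who want to see what a relative time range amounts to, the following Python sketch lists files whose modification time falls within a window ending now. It is a loose analogue of the FromMtime/ToMtime pair above, not cfengine's implementation: the names `tminus_seconds` and `recently_modified` are invented, and only the day, hour, minute and second components are handled (years and months are left out because their length varies).

```python
import os
import time
from datetime import timedelta

def tminus_seconds(days=0, hours=0, minutes=0, seconds=0):
    """Convert a relative time offset into a number of seconds."""
    return timedelta(days=days, hours=hours,
                     minutes=minutes, seconds=seconds).total_seconds()

def recently_modified(root, max_age_seconds, now=None):
    """Return files under root modified within the last max_age_seconds."""
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat so that a symbolic link is judged by its own timestamp.
            if os.lstat(path).st_mtime >= cutoff:
                found.append(path)
    return sorted(found)
```

For example, `recently_modified("/home/user", tminus_seconds(hours=3, minutes=30))` would return the files changed in the last three and a half hours.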
The contents of a file are an independent degree of freedom to configure. Most Unix files are traditionally line based (though increasingly Java has brought XML into play). CFEngine 2's editing features are based primarily on line-based files.
There is a recurring pattern in editing features:
For example,

    SetLine "mark woz 'ere"
    AppendIfNoLineMatching "mark .*"

will append the line in SetLine if no line matches the regular expression.
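The append-only-if-absent idea can be sketched in a few lines of Python. This is an illustration of the pattern rather than cfengine's code: the function name is made up, and matching the regular expression against the whole line with `re.fullmatch` is an assumption of the sketch.

```python
import re

def append_if_no_line_matching(lines, pattern, new_line):
    """Return a copy of lines, with new_line appended unless a line matches."""
    if any(re.fullmatch(pattern, line) for line in lines):
        return list(lines)
    return list(lines) + [new_line]

before = ["root was here"]
once = append_if_no_line_matching(before, "mark .*", "mark woz 'ere")
twice = append_if_no_line_matching(once, "mark .*", "mark woz 'ere")
print(once)
print(once == twice)  # a second run changes nothing
```

Because the test and the edit go together, running the operation repeatedly leaves the file in the same state, which is what makes this style of editing safe to apply on every run.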
    editfiles:

       { /etc/hosts.allow

       # Disable access for this domain
       HashCommentLinesContaining "bad-guys.org"
       }

       { /etc/xinetd.d

       # Make sure telnet is disabled
       BeginGroupIfFileExists "telnet"
          DeleteLinesMatching "disable *="
          GotoLastLine
          InsertLine "disable = yes"
       EndGroup

       # If no access control, limit to subnet
       SetLine "only_from = 192.168.9"
       AppendIfNoLineMatching "only_from"
       }
ReplaceAll quoted-regex With quoted-string
Replace all instances of strings matching the regular expression in the first quotes with the exact string in the second set of quotes, throughout the current file. Note that cfengine matches on a left to right basis, with the first match taking precedence, so if your regular expression matches text ambiguously it is the first occurrence which is replaced. For example, if you replace ‘cf.*’ with ‘CFENGINE’ and cfengine encounters a line ‘hello cfengine cfengine’, then this will be replaced with ‘hello CFENGINE’ even though two possible strings match the regular expression. On the other hand if the expression is not ambiguous, say replacing ‘cfengine’ with ‘CFENGINE’, then the result would be ‘hello CFENGINE CFENGINE’.
Rather than replacing all occurrences, you might want to pick only the first.
ReplaceFirst quoted-regex With quoted-string
For each line of the current file, replace the first string matching the regular expression in the first quotes (quoted-regex) with the string given in the second set of quotes (quoted-string). Matching is done left to right. For example, if you replace ‘YY = [[:digit:]][[:digit:]]’ with ‘YY = 04’ and cfengine encounters ‘YY = 03 but old YY = 70’ then it will be replaced with ‘YY = 04 but old YY = 70’.
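The two replacement behaviours described above can be reproduced with Python's `re.sub`, which is used here purely as a stand-in for cfengine's matcher. The function names are invented for this sketch, and `[0-9]` replaces `[[:digit:]]` because Python's `re` module does not support POSIX class names.

```python
import re

def replace_all(line, regex, replacement):
    """Replace every non-overlapping match in the line."""
    return re.sub(regex, replacement, line)

def replace_first(line, regex, replacement):
    """Replace only the first (leftmost) match in the line."""
    return re.sub(regex, replacement, line, count=1)

# A greedy pattern swallows the rest of the line in one match.
print(replace_all("hello cfengine cfengine", "cf.*", "CFENGINE"))
# An unambiguous pattern replaces each occurrence separately.
print(replace_all("hello cfengine cfengine", "cfengine", "CFENGINE"))
# Only the leftmost match is touched.
print(replace_first("YY = 03 but old YY = 70", "YY = [0-9][0-9]", "YY = 04"))
```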
    editfiles:

       { /var/spool/cron/tabs/root

       AutoCreate
       AppendIfNoSuchLine "0,15,30,45 * * * * /var/cfengine/bin/cfexecd -F"
       }

       { /etc/sysconfig/apache2

       BeginGroupIfNoLineMatching "APACHE_SERVER_FLAGS=\"SVN_VIEWCVS\""
          ReplaceAll "APACHE_SERVER_FLAGS=\"\"" With "APACHE_SERVER_FLAGS=\"SVN_VIEWCVS\""
       EndGroup

       BeginGroupIfNoLineMatching "APACHE_CONF_INCLUDE_FILES=\"/master/my-http.conf\""
          ReplaceAll "APACHE_CONF_INCLUDE_FILES=\"\"" With "APACHE_CONF_INCLUDE_FILES=\"/master/my-http.conf\""
       EndGroup

       BeginGroupIfNoLineMatching ".*php4 dav dav_svn.*"
          ReplaceAll "php4" With "php4 dav dav_svn"
       EndGroup
       }

       { /etc/postfix/main.cf

       ReplaceAll "^mydomain =.*" With "mydomain = iu.hio.no"
       ReplaceAll "^relayhost =.*" With "relayhost = [nexus.iu.hio.no]"
       AppendIfNoSuchLine "relayhost = [nexus.iu.hio.no]"
       AppendIfNoSuchLine "mydomain = iu.hio.no"
       }

       # Default PHP memory model is too small

       { /etc/php.ini

       ReplaceAll "^memory_limit =.*" With "memory_limit = 16M"
       AppendIfNoSuchLine "memory_limit = 16M"
       }
ReplaceLinesMatchingField quoted-number
This command replaces any lines in the current file with the current line set by SetLine or ForEachLineIn, if the lines are split into fields (e.g. the password file) separated by the SplitOn character (':' by default), and the corresponding fields match.
The idea behind this command was to be able to override global passwords (from a file which gets distributed) by new passwords in a local file. Rather than maintaining the files separately, this simply overrides the entries with the new ones.
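The override mechanism can be pictured with a short Python sketch. It is only an approximation of the behaviour described: the function name, the 1-based field numbering and the sample password entries are assumptions made for the illustration.

```python
def replace_lines_matching_field(lines, new_line, field, sep=":"):
    """Replace each line whose field-th field (1-based) equals that of new_line."""
    key = new_line.split(sep)[field - 1]
    result = []
    for line in lines:
        fields = line.split(sep)
        if len(fields) >= field and fields[field - 1] == key:
            result.append(new_line)
        else:
            result.append(line)
    return result

# Hypothetical distributed password entries and a local override for "mark".
distributed = ["root:x:0:0::/root:/bin/sh", "mark:OLD:100:100::/home/mark:/bin/sh"]
local = "mark:NEW:100:100::/home/mark:/bin/sh"
print(replace_lines_matching_field(distributed, local, field=1))
```

Matching on the first field keeps the entry for the same user name while swapping in the locally maintained line.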
Process filters match common fields from ps command output.
PID: process ID (parameter is a quoted regular expression).
PPID: parent process ID (quoted regular expression).
PGID: process group ID (quoted regular expression).
RSize: resident size (quoted regular expression).
VSize: virtual memory size (quoted regular expression).
Status: status (quoted regular expression).
Command: CMD or COMMAND fields (quoted regular expression).
TTime: Total elapsed time in TIME field (accumulated time). The prefixes From and To may be used to specify a range.
STime: Starting time for process in STIME or START field (accumulated time). The prefixes From and To may be used to specify a range.
TTY: terminal type, or none (quoted regular expression).
Priority: PRI or NI field (quoted regular expression).
Threads: NLWP field for SVR4 (quoted regular expression).
Result: logical combination of above returned by filter (quoted regular expression).
Note that these names are all case sensitive.
Here is an example process filter in action:
    filters:

       # Processes owned by root with > 2 hrs CPU time

       { program_gone_bad

       Owner:     "root"
       FromTTime: "accumulated(0,0,0,200,0,0)"
       ToTTime:   "inf"
       Result:    "Owner.TTime"
       }

    processes:

       "." filter=program_gone_bad action=warn
In this case, the regular expression searched for among the output from ps is any character (specified by a single period).
    control:

       actionsequence = ( processes )

    filters:

       { new_processes

       # New processes started in past hour and half
       FromSTime: "tminus(0,0,0,1,30,0)"
       ToSTime:   "inf"
       Result:    "STime"
       }

    processes:

       "." filter=new_processes action=warn
Two common causes of error in constructing filters are the following:
FromMtime
.
FromSize "100m" will not work without the colon; write FromSize: "100m".
In all cases a search for a pattern implies some iteration over the possible matches. This is where cfengine's iteration (i.e. looping) capabilities come in. In some cases we can think of iterating over files, processes or other objects. In other cases we iterate over the set of all possible parameters to a given configuration promise.
Variables can be used as iterators in a limited number of scenarios in cfengine 2. The iteration was originally modelled on the shell list idea, as represented in PATH names, and the ‘IFS’ variable. Only in cfengine 3 has this poor decision been rectified.
Iteration is implicit. When list variables are substituted into expressions, iteration over the list elements is assumed if i) it makes sense, and ii) the limitations of cfengine 2 can cope with the iteration.
A loop is therefore made by substituting a variable that is a list. A list variable is one which consists of strings separated by a list separator. The default list separator is the colon ‘:’ character, as in the shell:
    control:

       listvar = ( one:two:three:four )
The action that contains a variable to be interpreted as a list appears as separate actions, one for each case:
    shellcommands:

       "/bin/echo $(listvar)"
is equivalent to
    shellcommands:

       "/bin/echo one"
       "/bin/echo two"
       "/bin/echo three"
       "/bin/echo four"
If multiple iterators are used, these are handled as nested loops:
    cfengine::/bin/echo one 1: one 1
    cfengine::/bin/echo one 2: one 2
    cfengine::/bin/echo one 3: one 3
    cfengine::/bin/echo one 4: one 4
    cfengine::/bin/echo two 1: two 1
    cfengine::/bin/echo two 2: two 2
    cfengine::/bin/echo two 3: two 3
    cfengine::/bin/echo two 4: two 4
    cfengine::/bin/echo three: three 1
    cfengine::/bin/echo three: three 2
    cfengine::/bin/echo three: three 3
    cfengine::/bin/echo three: three 4
    cfengine::/bin/echo four : four 1
    cfengine::/bin/echo four : four 2
    cfengine::/bin/echo four : four 3
    cfengine::/bin/echo four : four 4
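The nested-loop expansion can be modelled in Python with `itertools.product`. This is a sketch of the idea only: the function `expand` is invented, the variable names list1 and list2 and their values are taken from the surrounding examples, and the first variable is assumed to drive the outer loop.

```python
from itertools import product

def expand(template, variables, sep=":"):
    """Expand $(name) references, one command per combination of list items."""
    names = list(variables)
    value_lists = [variables[name].split(sep) for name in names]
    commands = []
    # product() varies the last list fastest, like an innermost loop.
    for combo in product(*value_lists):
        command = template
        for name, value in zip(names, combo):
            command = command.replace("$(" + name + ")", value)
        commands.append(command)
    return commands

commands = expand("/bin/echo $(list1) $(list2)",
                  {"list1": "one:two:three:four", "list2": "1:2:3:4"})
for command in commands:
    print(command)
```

With a single list variable the same function reduces to the simple one-command-per-item expansion shown earlier.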
Where iterators are not allowed, the implied lists are treated as scalars:
    alerts:

       amnexus::

          "do $(list1) $(list2)"
e.g.
cfengine:: do one:two:three:four 1:2:3:4
Iterative expansion is currently restricted to: