Enterprise report collection
How does CFEngine Enterprise collect reports?
cf-hub makes connections from the hub to remote agents currently registered in the lastseen database (viewable with cf-key -s) on the body hub control port. For each collection round, as scheduled by hub_schedule in body hub control, the hub tries to collect from up to the licensed number of hosts.
Note: No ordering is specified, so if the number of entries in the lastseen database is greater than the number of licensed hosts, it is not possible to determine which hosts will be collected from and which will be skipped.
See Also: hostsseen(), hostswithclass()
How are agents not running determined?
Hosts whose last agent execution status is "FAIL" show up under "Agents not running". A host's last agent execution status is set to "FAIL" when the hub notices that there have been no promise results within three times the expected agent run interval. The agent's average run interval is computed as a geometric average based on the 4 most recent agent executions.
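The rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the hub's implementation: the function names are hypothetical, and it assumes the average is taken over the most recent run intervals (in seconds) and that a host fails once the time since its last promise results exceeds three times that average.

```python
import math


def geometric_mean(intervals):
    """Geometric mean of positive run intervals, in seconds."""
    if not intervals or any(i <= 0 for i in intervals):
        raise ValueError("intervals must be non-empty and positive")
    return math.exp(sum(math.log(i) for i in intervals) / len(intervals))


def execution_status(recent_intervals, seconds_since_last_result, factor=3):
    """Return "FAIL" or "OK" from the hub's perspective (hypothetical helper).

    Only the 4 most recent intervals are considered, mirroring the text.
    """
    average = geometric_mean(recent_intervals[-4:])
    return "FAIL" if seconds_since_last_result > factor * average else "OK"


# Runs roughly every 5 minutes; nothing heard for 1000 seconds.
print(execution_status([296, 305, 300, 299], 1000))  # FAIL
print(execution_status([296, 305, 300, 299], 600))   # OK
```

A geometric average is less sensitive to a single unusually long interval than an arithmetic average, so one slow run does not inflate the threshold by much.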
You can inspect each host's last execution time, execution status (from the hub's perspective), and average run interval using the following SQL:
SELECT Hosts.HostName AS "Host name",
       AgentStatus.LastAgentLocalExecutionTimeStamp AS "Last agent local execution time",
       cast(AgentStatus.AgentExecutionInterval AS integer) AS "Agent execution interval",
       AgentStatus.LastAgentExecutionStatus AS "Last agent execution status"
FROM AgentStatus
INNER JOIN Hosts ON Hosts.HostKey = AgentStatus.HostKey
The easiest way to run this query over the API is to place it in a JSON file and then post that file to the query API.
agent_execution_time_interval_status.query.json:
{
"query": "SELECT Hosts.HostName, AgentStatus.LastAgentLocalExecutionTimeStamp, cast(AgentStatus.AgentExecutionInterval AS integer), AgentStatus.LastAgentExecutionStatus FROM AgentStatus INNER JOIN Hosts ON Hosts.HostKey = AgentStatus.HostKey"
}
$ curl -s -u admin:admin http://hub/api/query -X POST -d @agent_execution_time_interval_status.query.json | jq ".data[0].rows"
[
[
"hub",
"2016-07-25 16:53:23+00",
"296",
"OK"
],
[
"host001",
"2016-07-25 16:06:50+00",
"305",
"FAIL"
]
]
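As a rough sketch, the rows returned by the query can be filtered for hosts whose status is "FAIL". The helper name is hypothetical, and the only assumption about the response is the path used by the jq filter above (data[0].rows) and the column order of the query; the sample data is copied from the output shown.

```python
import json


def hosts_with_status(response_text, status="FAIL"):
    """Return host names whose last agent execution status matches `status`.

    Columns follow the query: name, last execution time, interval, status.
    """
    rows = json.loads(response_text)["data"][0]["rows"]
    return [row[0] for row in rows if row[3] == status]


# Sample response body built from the rows shown above.
sample = json.dumps({
    "data": [{
        "rows": [
            ["hub", "2016-07-25 16:53:23+00", "296", "OK"],
            ["host001", "2016-07-25 16:06:50+00", "305", "FAIL"],
        ]
    }]
})

print(hosts_with_status(sample))  # ['host001']
```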
See Also: Enterprise API Reference, Enterprise API Examples
How are hosts not reporting determined?
Hosts that have not been collected from within blueHostHorizon seconds will show up under "Hosts not reporting". blueHostHorizon defaults to 900 seconds (15 minutes). You can inspect the current value of blueHostHorizon from Mission Portal or via the API:
$ curl -s -u admin:admin http://hub/api/settings/ | jq ".data[0].blueHostHorizon"
900
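The check itself amounts to a simple comparison, sketched below. The function name is hypothetical, and treating a host as not reporting only once the elapsed time strictly exceeds the horizon is an assumption about the boundary case.

```python
from datetime import datetime, timedelta, timezone


def is_not_reporting(last_collected, now, blue_host_horizon=900):
    """True if the host was not collected from within the horizon (seconds)."""
    return (now - last_collected) > timedelta(seconds=blue_host_horizon)


now = datetime(2016, 7, 25, 17, 0, 0, tzinfo=timezone.utc)
print(is_not_reporting(now - timedelta(minutes=5), now))   # False
print(is_not_reporting(now - timedelta(minutes=20), now))  # True
```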
See Also: Enterprise API Reference, Enterprise API Examples, Enterprise Settings
Troubleshooting report collection
The following steps can be used to help diagnose and potentially restore reporting for hosts experiencing issues.
Perform manual delta collection for a single host
Performing back-to-back delta collections and comparing the data received can help to expose so-called patching issues. If the same amount of data is collected twice, a rebase may resolve the issue.
[root@hub ~]# cf-hub -q delta -H 192.168.33.2 -v
verbose: ----------------------------------------------------------------
verbose: Initialization preamble
verbose: ----------------------------------------------------------------
# <snipped for brevity>
verbose: Connecting to host 192.168.33.2, port 5308 as address 192.168.33.2
verbose: Waiting to connect...
verbose: Setting socket timeout to 10 seconds.
verbose: Connected to host 192.168.33.2 address 192.168.33.2 port 5308 (socket descriptor 4)
verbose: TLS version negotiated: TLSv1.2; Cipher: AES256-GCM-SHA384,TLSv1/SSLv3
verbose: TLS session established, checking trust...
verbose: Received public key compares equal to the one we have stored
verbose: Server is TRUSTED, received key 'SHA=e77d408e9802e2c549417d5e3379c43050d2ad5928a198855dbb7e9c8af9a6f1' MATCHES stored one.
verbose: Key digest for address '192.168.33.2' is SHA=e77d408e9802e2c549417d5e3379c43050d2ad5928a198855dbb7e9c8af9a6f1
verbose: Will request from host 192.168.33.2 (digest = SHA=e77d408e9802e2c549417d5e3379c43050d2ad5928a198855dbb7e9c8af9a6f1) data later than timestamp 1481901790
verbose: Successfully opened extension plugin 'cfengine-report-collect.so' from '/var/cfengine/lib/cfengine-report-collect.so'
verbose: Successfully loaded extension plugin 'cfengine-report-collect.so'
verbose: Sending query at Fri Dec 16 15:24:23 2016
verbose: h>s QUERY delta 1481901790 1481901863
verbose: Sending query at Fri Dec 16 15:24:23 2016
verbose: Received reply of 5050 bytes at Fri Dec 16 15:24:23 2016 -> Xfer time 0 seconds (processing time 0 seconds)
verbose: Processing report: MOM (items: 44)
verbose: Processing report: MOY (items: 48)
verbose: Processing report: MOH (items: 22)
verbose: Processing report: EXS (items: 1)
verbose: Received 5 kb of report data with 115 individual items
verbose: Connection to 192.168.33.2 is closed
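To compare two back-to-back runs, the summary line of each verbose output can be extracted and compared. The following is a minimal sketch with hypothetical helper names; it assumes the summary line keeps the wording shown in the output above and that the captured output of each run is available as a string.

```python
import re

# Matches the summary line printed at the end of a collection.
SUMMARY = re.compile(
    r"Received (\d+) kb of report data with (\d+) individual items"
)


def collection_summary(verbose_output):
    """Return (kilobytes, items) from cf-hub verbose output, or None."""
    match = SUMMARY.search(verbose_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def same_amount_collected(first_output, second_output):
    """True if both runs report identical sizes, hinting at a patching issue."""
    first = collection_summary(first_output)
    second = collection_summary(second_output)
    return first is not None and first == second


line = " verbose: Received 5 kb of report data with 115 individual items"
print(collection_summary(line))           # (5, 115)
print(same_amount_collected(line, line))  # True
```

Matching sizes are only a hint: two healthy collections could, in principle, report the same totals by coincidence.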
Perform manual rebase collection for a single host
A rebase causes the hub to throw away all reports since the last collection and collect only the output from the most recent run.
[root@hub ~]# cf-hub -q rebase -H 192.168.33.2 -v
verbose: ----------------------------------------------------------------
verbose: Initialization preamble
verbose: ----------------------------------------------------------------
# <snipped for brevity>
verbose: Connecting to host 192.168.33.2, port 5308 as address 192.168.33.2
verbose: Waiting to connect...
verbose: Setting socket timeout to 10 seconds.
verbose: Connected to host 192.168.33.2 address 192.168.33.2 port 5308 (socket descriptor 4)
verbose: TLS version negotiated: TLSv1.2; Cipher: AES256-GCM-SHA384,TLSv1/SSLv3
verbose: TLS session established, checking trust...
verbose: Received public key compares equal to the one we have stored
verbose: Server is TRUSTED, received key 'SHA=e77d408e9802e2c549417d5e3379c43050d2ad5928a198855dbb7e9c8af9a6f1' MATCHES stored one.
verbose: Key digest for address '192.168.33.2' is SHA=e77d408e9802e2c549417d5e3379c43050d2ad5928a198855dbb7e9c8af9a6f1
verbose: Successfully opened extension plugin 'cfengine-report-collect.so' from '/var/cfengine/lib/cfengine-report-collect.so'
verbose: Successfully loaded extension plugin 'cfengine-report-collect.so'
verbose: Sending query at Fri Dec 16 15:35:10 2016
verbose: h>s QUERY rebase 0 1481902510
verbose: Sending query at Fri Dec 16 15:35:10 2016
verbose: Received reply of 128157 bytes at Fri Dec 16 15:35:10 2016 -> Xfer time 0 seconds (processing time 0 seconds)
verbose: Processing report: CLD (items: 46)
verbose: Processing report: VAD (items: 52)
verbose: Processing report: LSD (items: 13)
verbose: Processing report: SDI (items: 327)
verbose: Processing report: SPD (items: 143)
verbose: Processing report: ELD (items: 205)
verbose: ts #0 > 1481902510
verbose: Received 125 kb of report data with 786 individual items
verbose: Connection to 192.168.33.2 is closed
Note: The Enterprise hub automatically schedules rebase queries if it has been unable to collect from a given candidate for client_history_timeout hours.
If a manual rebase collection does not restore reporting functionality for a host, continue on to restarting the report collection components.
Restart report collection components
Sometimes it is necessary to restart the report collection subsystem in order to re-synchronize the caching layer with the database. To restart the report collection subsystem, kill cf-hub, cf-consumer, and redis-server, then run the update policy.
For systemd hosts this can be accomplished by simply restarting the cf-hub
service. The related component restarts are automatically handled via the unit
dependencies:
[root@hub ~]# systemctl restart cf-hub
For non-systemd hosts:
[root@hub ~]# pkill cf-consumer
[root@hub ~]# pkill cf-hub
[root@hub ~]# pkill redis-server
[root@hub ~]# cf-agent -KIf update.cf
info: Executing 'no timeout' ... '/var/cfengine/bin/redis-server /var/cfengine/config/redis.conf'
info: Command related to promiser '/var/cfengine/bin/redis-server /var/cfengine/config/redis.conf' returned code defined as promise kept 0
info: Completed execution of '/var/cfengine/bin/redis-server /var/cfengine/config/redis.conf'
info: Executing 'no timeout' ... '/var/cfengine/bin/cf-consumer'
info: Command related to promiser '/var/cfengine/bin/cf-consumer' returned code defined as promise kept 0
info: Completed execution of '/var/cfengine/bin/cf-consumer'
info: Executing 'no timeout' ... '"/var/cfengine/bin/cf-hub"'
info: Command related to promiser '"/var/cfengine/bin/cf-hub"' returned code defined as promise kept 0
info: Completed execution of '"/var/cfengine/bin/cf-hub"'