Normal operation

Pacemaker inherits most of its logging setting from either CMAN or Corosync - depending on what its running on top of.

In order to avoid spamming syslog, Pacemaker only logs a summary of its actions (NOTICE and above) to syslog.

If the level of detail in syslog is insufficient, you should enable a cluster log file. Normally one is configured by default and it contains everything except debug and trace messages.

To find the location of this file, either examine your CMAN (cluster.conf) or Corosync (corosync.conf) configuration file or look for syslog entries such as:

pacemakerd[1823]:   notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log

If you do not see a line like this, either update the cluster configuration or set PCMK_debugfile in /etc/sysconfig/pacemaker

crm_report also knows how to find all the Pacemaker related logs and blackbox files

If the level of detail in the cluster log file is still insufficient, or you simply wish to go blind, you can turn on debugging in Corosync/CMAN, or set PCMK_debug in /etc/sysconfig/pacemaker.

A minor advantage of setting PCMK_debug is that the value can be a comma-separated list of processes which should produce debug logging instead of a global yes/no.

When an ERROR occurs

Pacemaker includes support for a blackbox.

When enabled, the blackbox contains a rolling buffer of all logs (not just those sent to syslog or a file) and is written to disk after a crash or assertion failure.

The blackbox recorder can be enabled by setting PCMK_blackbox in /etc/sysconfig/pacemaker or at runtime by sending SIGUSR1. Eg.

killall -USR1 crmd

When enabled you’ll see a log such as:

crmd[1811]:   notice: crm_enable_blackbox: Initiated blackbox recorder: /var/lib/pacemaker/blackbox/crmd-1811

If a crash occurs, the blackbox will be available at that location. To extract the contents, pass it to qb-blackbox:

qb-blackbox /var/lib/pacemaker/blackbox/crmd-1811

Which produces output like:

Dumping the contents of /var/lib/pacemaker/blackbox/crmd-1811
[debug] shm size:5242880; real_size:5242880; rb->word_size:1310720
[debug] read total of: 5242892
Ringbuffer:
 ->NORMAL
 ->write_pt [5588]
 ->read_pt [0]
 ->size [1310720 words]
 =>free [5220524 bytes]
 =>used [22352 bytes]
...
trace   May 19 23:20:55 gio_read_socket(368):0: 0x11ab920.5 1 (ref=1)
trace   May 19 23:20:55 pcmk_ipc_accept(458):0: Connection 0x11aee00
info    May 19 23:20:55 crm_client_new(302):0: Connecting 0x11aee00 for uid=0 gid=0 pid=24425 id=0e943a2a-dd64-49bc-b9d5-10fa6c6cb1bd
debug   May 19 23:20:55 handle_new_connection(465):2147483648: IPC credentials authenticated (24414-24425-14)
...
[debug] Free'ing ringbuffer: /dev/shm/qb-create_from_file-header

When an ERROR occurs you’ll also see the function and line number that produced it such as:

crmd[1811]: Problem detected at child_death_dispatch:872 (mainloop.c), please see /var/lib/pacemaker/blackbox/crmd-1811.1 for additional details
crmd[1811]: Problem detected at main:94 (crmd.c), please see /var/lib/pacemaker/blackbox/crmd-1811.2 for additional details

Again, simply pass the files to qb-blackbox to extract and query the contents.

Note the a counter is added to the end so as to avoid name collisions.

Diving into files and functions

In case you have not already guessed, all logs include the name of the function that generated them. So:

crmd[1811]:   notice: crm_update_peer_state: cman_event_callback: Node corosync-host-1[1] - state is now lost (was member)

came from the function crm_update_peer_state().

To obtain more detail from that or any other function, you can set PCMK_trace_functions in /etc/sysconfig/pacemaker to a comma separated list of function names. Eg.

PCMK_trace_functions=crm_update_peer_state,run_graph

For a bigger stick, you may also activate trace logging for all the functions in a particular source file or files by setting PCMK_trace_files as well.

PCMK_trace_files=cluster.c,election.c

These additional logs are sent to the cluster log file. Note that enabling tracing options also alters the output format.

Instead of:

crmd:  notice: crm_cluster_connect: 	Connecting to cluster infrastructure: cman

the output includes file and line information:

crmd: (   cluster.c:215   )  notice: crm_cluster_connect: 	Connecting to cluster infrastructure: cman

But wait there’s still more

Still need more detail? You’re in luck! The blackbox can be dumped at any time, not just when an error occurs.

First, make sure the blackbox is active (we’ll assume its the crmd that needs to be debugged):

killall -USR1 crmd

Next, discard any previous contents by dumping them to disk

killall -TRAP crmd

now cause whatever condition you’re trying to debug, and send -TRAP when you’re ready to see the result.

killall -TRAP crmd

You can now look for the result in syslog:

grep -e crm_write_blackbox: /var/log/messages

This will include a filename containing the trace logging:

crmd[1811]:   notice: crm_write_blackbox: Blackbox dump requested, please see /var/lib/pacemaker/blackbox/crmd-1811.1 for contents
crmd[1811]:   notice: crm_write_blackbox: Blackbox dump requested, please see /var/lib/pacemaker/blackbox/crmd-1811.2 for contents

To extract the trace loging for our test, pass the most recent file to qb-blackbox:

qb-blackbox /var/lib/pacemaker/blackbox/crmd-1811.2

At this point you’ll probably want to use grep :)