Release Candidate: 1.1.12-rc1

As promised, this announcement brings the first release candidate for Pacemaker 1.1.12

https://github.com/ClusterLabs/pacemaker/releases/Pacemaker-1.1.12-rc1

This release primarily focuses on important but mostly invisible changes under-the-hood:

The CIB is now O(2) faster. Thats 100x for those not familiar with Big-O notation :-)
This has massively reduced the cluster’s use of system resources, allowing us to scale further on the same hardware, and dramatically reduced failover times for large clusters.
Support for ACLs are is enabled by default.
The new implementation can restrict cluster access for containers where pacemaker-remoted is used and is also more efficient.
All CIB updates are now serialized and pre-synchronized via the corosync CPG interface. This makes it impossible for updates to be lost, even when the cluster is electing a new DC.
Schema versioning changes
New features are no longer silently added to the schema. Instead the ${Y} in pacemaker-${X}-${Y} will be incremented for simple additions, and ${X} will be bumped for removals or other changes requiring an XSL transformation.
To take advantage of new features, you will need to updates all the nodes and then run the equivalent of cibadmin --upgrade.

Thankyou to everyone that has tested out the new CIB and ACL code already. Please keep those bug reports coming in!

List of known bugs to be investigating during the RC phase:

5206 Fileencoding broken
5194 A resource starts with a standby node. (Latest attrd does not serve as the crmd-transition-delay parameter)
5197 Fail-over is delayed. (State transition is not calculated.)
5139 Each node fenced in its own transition during start-up fencing
5200 target node is over-utilized with allow-migrate=true
5184 Pending probe left in the cib
5165 Add support for transient node utilization attributes

To build rpm packages for testing:

Clone the current sources:
# git clone –depth 0 git://github.com/ClusterLabs/pacemaker.git # cd pacemaker
Install dependancies (if you haven’t already)
[Fedora] # sudo yum install -y yum-utils [ALL] # make rpm-dep
Build Pacemaker
# make rc
Copy and deploy as needed

Details

Changesets: 633 Diff: 184 files changed, 12690 insertions(+), 5843 deletions(-)

Highlights

Features added since Pacemaker-1.1.11

Changes to the ACL schema to support nodes and unix groups
cib: Check ACLs prior to making the update instead of parsing the diff afterwards
cib: Default ACL support to on
cib: Enable the more efficient xml patchset format
cib: Implement zero-copy status update (performance)
cib: Send all r/w operations via the cluster connection and have all nodes process them
crm_mon: Display brief output if “-b/–brief” is supplied or ‘b’ is toggled
crm_ticket: Support multiple modifications for a ticket in an atomic operation
Fencing: Add the ability to call stonith_api_time() from stonith_admin
logging: daemons always get a log file, unless explicitly set to configured ‘none’
PE: Automatically re-unfence a node if the fencing device definition changes
pengine: cl#5174 - Allow resource sets and templates for location constraints
pengine: Support cib object tags
pengine: Support cluster-specific instance attributes based on rules
pengine: Support id-ref in nvpair with optional “name”
pengine: Support per-resource maintenance mode
pengine: Support site-specific instance attributes based on rules
tools: Display pending state in crm_mon/crm_resource/crm_simulate if –pending/-j is supplied (cl#5178)
xml: Add the ability to have lightweight schema revisions
xml: Enable resource sets in location constraints for 1.2 schema
xml: Support resources that require unfencing

Changes since Pacemaker-1.1.11

acl: Authenticate pacemaker-remote requests with the node name as the client
cib: allow setting permanent remote-node attributes
cib: Do not disable cib disk writes if on-disk cib is corrupt
cib: Ensure ‘cibadmin -R/–replace’ commands get replies
cib: Fix remote cib based on TLS
cib: Ingore patch failures if we already have their contents
cib: Resolve memory leaks in query paths
cl#5055: Improved migration support.
cluster: Fix segfault on removing a node
controld: Do not consider the dlm up until the address list is present
controld: handling startup fencing within the controld agent, not the dlm
crmd: Ack pending operations that were cancelled due to rsc deletion
crmd: Actions can only be executed if their pre-requisits completed successfully
crmd: Do not erase the status section for unfenced nodes
crmd: Do not overwrite existing node state when fencing completes
crmd: Do not start timers for already completed operations
crmd: Fenced nodes that return prior to an election do not need to have their status section reset
crmd: make lrm_state hash table not case sensitive
crmd: make node_state erase correctly
crmd: Prevent manual fencing confirmations from attempting to create node entries for unknown nodes
crmd: Prevent memory leak in error paths
crmd: Prevent memory leak when accepting a new DC
crmd: Prevent message relay from attempting to create node entries for unknown nodes
crmd: Prevent SIGPIPE when notifying CMAN about fencing operations
crmd: Report unsuccessful unfencing operations
crm_diff: Allow the generation of xml patchsets without digests
crm_mon: Allow the file created by –as-html to be world readable
crm_mon: Ensure resource attributes have been unpacked before displaying connectivity data
crm_node: Only remove the named resource from the cib
crm_node: Prevent use-after-free in tools_remove_node_cache()
crm_resource: Gracefully handle -EACCESS when querying the cib
fencing: Advertise support for reboot/on/off in the metadata for legacy agents
fencing: Automatically switch from ‘list’ to ‘status’ to ‘static-list’ if those actions are not advertised in the metadata
fencing: Correctly record which peer performed the fencing operation
fencing: default to ‘off’ when agent does not advertise ‘reboot’ in metadata
fencing: Execute all required fencing devices regardless of what topology level they are at
fencing: Pass the correct options when looking up the history by node name
fencing: Update stonith device list only if stonith is enabled
get_cluster_type: failing concurrent tool invocations on heartbeat
iso8601: Different logic is needed when logging and calculating durations
lrmd: Cancel recurring operations before stop action is executed
lrmd: Expose logging variables expected by OCF agents
lrmd: Merge duplicate recurring monitor operations
lrmd: Provide stderr output from agents if available, otherwise fall back to stdout
mainloop: Fixes use after free in process monitor code
make resource ID case sensitive
mcp: Tell systemd not to respawn us if we exit with rc=100
pengine: Allow container nodes to migrate with connection resource
pengine: cl#5186 - Avoid running rsc on two nodes when node is fenced during migration
pengine: cl#5187 - Prevent resources in an anti-colocation from even temporarily running on a same node
pengine: Correctly handle origin offsets in the future
pengine: Correctly search failcount
pengine: Default sequential to TRUE for resource sets for consistency with colocation sets
pengine: Delay unfencing until after we know the state of all resources that require unfencing
pengine: Do not initiate fencing for unclean nodes when fencing is disabled
pengine: Do not unfence nodes that are offline, unclean or shutting down
pengine: Fencing devices default to only requiring quorum in order to start
pengine: fixes invalid transition caused by clones with more than 10 instances
pengine: Force record pending for migrate_to actions
pengine: handles edge case where container order constraints are not honored during migration
pengine: Ignore failure-timeout only if the failed operation has on-fail=”block”
pengine: Log when resources require fencing but fencing is disabled
pengine: Memory leaks
pengine: Unfencing is based on device probes, there is no need to unfence when normal resources are found active
Portability: Use basic types for DBus compatability struct
remote: Allow baremetal remote-node connection resources to migrate
remote: Enable migration support for baremetal connection resources by default
services: Correctly reset the nice value for lrmd’s children
services: Do not allow duplicate recurring op entries
services: Do not block synced service executions
services: Fixes segfault associated with cancelling in-flight recurring operations.
services: Reset the scheduling policy and priority for lrmd’s children without replying on SCHED_RESET_ON_FORK
services_action_cancel: Interpret return code from mainloop_child_kill() correctly
stonith_admin: Ensure pointers passed to sscanf() are properly initialized
stonith_api_time_helper now returns when the most recent fencing operation completed
systemd: Prevent use-of-NULL when determining if an agent exists
upstart: Allow comilation with glib versions older than 2.28
xml: Better move detection logic for xml nodes
xml: Check all available schemas when doing upgrades
xml: Convert misbehaving #define into a more easily understood inline function
xml: If validate-with is missing, we find the most recent schema that accepts it and go from there
xml: Update xml validation to allow ‘’

Share on

Twitter Facebook Google+ LinkedIn

Release Candidate: 1.1.12-rc1

Andrew Beekhof

Details

Highlights

Features added since Pacemaker-1.1.11

Changes since Pacemaker-1.1.11

Share on

You May Also Enjoy

Building a Kubernetes Operator in a Day with AI

Savaged by Softdog, a Cautionary Tale

A New Fencing Mechanism (TBD)

A New Thing

Release Candidate: 1.1.12-rc1

Andrew Beekhof

Tag Cloud

Details

Highlights

Features added since Pacemaker-1.1.11

Changes since Pacemaker-1.1.11

Share on

You May Also Enjoy

Building a Kubernetes Operator in a Day with AI

Savaged by Softdog, a Cautionary Tale

A New Fencing Mechanism (TBD)

A New Thing