The 'glusterfs' process was killed by the kernel OOM killer on mw9 earlier today:
Apr 13 23:03:57 mw9 kernel: Out of memory: Kill process 20373 (glusterfs) score 82 or sacrifice child
Puppet then failed to remount GlusterFS:
Apr 13 23:13:20 mw9 puppet-agent[21626]: (/Stage[main]/Role::Mediawiki/Gluster::Mount[/mnt/mediawiki-static]/Exec[/mnt/mediawiki-static]/returns) /bin/mkdir: cannot create directory ‘/mnt/mediawiki-static’: File exists
Apr 13 23:13:20 mw9 puppet-agent[21626]: '/bin/mkdir -p '/mnt/mediawiki-static'' returned 1 instead of one of [0]
Apr 13 23:13:20 mw9 puppet-agent[21626]: (/Stage[main]/Role::Mediawiki/Gluster::Mount[/mnt/mediawiki-static]/Exec[/mnt/mediawiki-static]/returns) change from 'notrun' to ['0'] failed: '/bin/mkdir -p '/mnt/mediawiki-static'' returned 1 instead of one of [0] (corrective)
Apr 13 23:13:22 mw9 puppet-agent[21626]: (/Stage[main]/Role::Mediawiki/Gluster::Mount[/mnt/mediawiki-static]/Mount[/mnt/mediawiki-static]) Dependency Exec[/mnt/mediawiki-static] has failure
Apr 13 23:13:22 mw9 puppet-agent[21626]: (/Stage[main]/Role::Mediawiki/Gluster::Mount[/mnt/mediawiki-static]/Mount[/mnt/mediawiki-static]) Skipping because of failed dependencies
Apr 13 23:13:22 mw9 puppet-agent[21626]: (Stage[main]) Unscheduling all events on Stage[main]
Running umount -l /mnt/mediawiki-static by hand fixed the situation:
Apr 13 23:15:59 mw9 systemd[15896]: mnt-mediawiki\x2dstatic.mount: Succeeded.
Apr 13 23:15:59 mw9 systemd[1]: mnt-mediawiki\x2dstatic.mount: Succeeded.
Apr 13 23:15:59 mw9 systemd[24940]: mnt-mediawiki\x2dstatic.mount: Succeeded.
Apr 13 23:15:59 mw9 systemd[1]: mnt-mediawiki\x2dstatic.automount: Got automount request for /mnt/mediawiki-static, triggered by 894 (nginx)
Apr 13 23:15:59 mw9 systemd[1]: Mounting /mnt/mediawiki-static...
As long as the OOM is a one-off incident, I am not very concerned, but services must self-heal after failures, and that didn't happen here. The -p flag to mkdir should prevent the 'File exists' error, but in this case it doesn't, presumably because the mount point was left in a stale state once the glusterfs process had been killed. In the puppet tree we currently run mkdir manually through an Exec. Can't we change this to file { '/mnt/mediawiki-static': ensure => directory, <put other parameters here> }?
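To make the proposal concrete, here is a rough sketch of what the change could look like. It is not taken from the current Gluster::Mount define, whose contents are not shown here. The exec title, the onlyif check and the binary search path are all hypothetical. The exec is there because I assume a plain file resource would fail on a stale FUSE mount in the same way mkdir -p did, which I have not verified.

```puppet
# Hypothetical sketch, not the existing Gluster::Mount code.

# Lazily unmount the path if it is listed in /proc/mounts but can no longer
# be stat'ed (the assumed state after the glusterfs client was OOM-killed).
exec { 'unmount-stale-mediawiki-static':
  command  => 'umount -l /mnt/mediawiki-static',
  onlyif   => "grep -qs ' /mnt/mediawiki-static ' /proc/mounts && ! stat /mnt/mediawiki-static > /dev/null 2>&1",
  path     => ['/bin', '/usr/bin', '/sbin', '/usr/sbin'],
  provider => shell,
  before   => File['/mnt/mediawiki-static'],
}

# Replace the manual mkdir Exec with a native file resource.
# Owner, group and mode are deliberately left out here.
file { '/mnt/mediawiki-static':
  ensure => directory,
}
```

The existing Mount[/mnt/mediawiki-static] resource would then need to depend on File['/mnt/mediawiki-static'] instead of Exec['/mnt/mediawiki-static'].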