Skip to content

OSG-SEC-2023-09-26 CRITICAL PMIx race condition vulnerability affecting Slurm

Dear OSG Security Contacts,

A CRITICAL risk vulnerability concerning PMIx has been discovered, which may permit a malicious user to obtain ownership of an arbitrary file and could thus result in privilege escalation to root. [1] [2] [3] This is a critical issue for all sites that are using Slurm built with PMIx support.

IMPACTED VERSIONS:

OpenPMIx PMIx before 4.2.6 and 5.0.x before 5.0.1

WHAT ARE THE VULNERABILITIES:

CVE-2023-41915 A filesystem race condition could permit a malicious user to obtain ownership of an arbitrary file on the filesystem when parts of the PMIx library is called by a process running as uid 0. This may happen under the default configuration of certain workload managers, including Slurm. [1]

WHAT YOU SHOULD DO:

First check if Slurm was built with PMIx support by running:

$ srun --mpi=list

Typical output will be something like

MPI plugin types are...
            cray_shasta
            none
            pmi2
            pmix
specific pmix plugin versions available: pmix_v3,pmix_v4

This will tell you whether Slurm is built with PMIx support and if so which version of PMIx.

If the command returns no pmix option, your Slurm installation can be assumed to be unaffected by this CVE.

Otherwise, upgrade PMIx to the fixed releases v4.2.6 or v5.0.1 [1].

Note that there is at least one ABI-breaking change between versions 4.2.3 and 4.2.6, which means that upgrading to 4.2.6 requires Slurm packages to be rebuilt.

If Slurm upgrade isn't an option, you can disable PMIx support by removing the mpi_pmix*.so libraries on the compute nodes and adjusting the MpiDefault setting in your Slurm configuration.

You can also patch your current PMIx version by replacing the chown function with lchown in the source and rebuilding the PMIx rpm [4]. After installing the patched PMIx, this command shouldn't return any result:

objdump -t /usr/lib64/libpmi*.so* | grep chown

After the installation, slurmd on the compute nodes needs to be restarted.

Note that all versions prior to PMIx 4.2.6 are vulnerable, but some older PMIx versions are no longer supported and will not be patched.

Thanks to the EGI CSIRT for their alert and write up on this issue. [5]

REFERENCES

OSG was alerted to this vulnerability by the EGI CSIRT.

[1] https://github.com/openpmix/openpmix/releases/tag/v4.2.6

[2] https://nvd.nist.gov/vuln/detail/CVE-2023-41915

[3] https://access.redhat.com/security/cve/CVE-2023-41915

[4] https://github.com/openpmix/openpmix/pull/3150

[5] https://confluence.egi.eu/display/EGIBG/SVG+Advisories

Please contact the OSG security team at [email protected] if you have any questions or concerns.

OSG Security Team