Security Context Misconfiguration with vGPU Nodes in NVIDIA Device Plugin Helm Chart #854
Labels
lifecycle/stale
Description
In the NVIDIA Device Plugin Helm chart (v0.16.1), when using a ConfigMap strategy, all nodes are incorrectly assigned elevated privileges, regardless of their MIG strategy configuration. This is due to a flaw in the template logic that prevents the actual content of the ConfigMap from being evaluated.
Current Behavior
- With `migStrategy: none` and no ConfigMap, the correct security context is applied.
- With a ConfigMap, elevated privileges are applied even on nodes configured with `migStrategy: none`.
Expected Behavior
Nodes should receive the appropriate security context based on their actual MIG strategy configuration, especially those using the default `migStrategy: none`.
Impact
This issue unnecessarily elevates privileges on all nodes when using a ConfigMap strategy, potentially compromising security, particularly in mixed GPU environments.
Steps to Reproduce
Code Analysis
The issue begins in the `nvidia-device-plugin.allPossibleMigStrategiesAreNone` template. Here's the relevant part:
The initial issue is in this section:
The `hasConfigMap` function is defined as:
where the `configMapName` function is defined as:
Root Cause
- The `hasConfigMap` function always returns true if there's any content in `config.map` within the values.yaml, without examining its contents.
- As a result, `$result = false` is set whenever a ConfigMap is present, regardless of its content.
- The branch that would inspect each configuration (the `range` loop) is never reached when a ConfigMap is defined.
Additional Context
This code:
The problem is that this logic doesn't distinguish between the default configuration and others. It treats all configurations equally. As a result:
Potential action items
Fundamentally, the issue begins with the logic error described above. However, an underlying issue remains: a single DaemonSet is generated for all configurations, so the most permissive security context is applied even to nodes that contain only vGPUs. One mitigation would be to generate a separate DaemonSet per configuration, each applying the appropriate permissions for its GPU type. Doing so would require a non-trivial overhaul of the Helm chart.
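To make the two ideas above concrete, here is a minimal Python sketch of the branch order described under Root Cause, next to a content-aware check and a per-configuration decision. It is a toy model, not the chart's template code: the function names, the `restricted`/`elevated` labels, and the assumption that each `config.map` entry is already parsed into a dict with an optional `flags.migStrategy` key are all illustrative.

```python
"""Toy model of the MIG-strategy check described in this issue.

Everything here is illustrative: the function names and the parsed-dict
config shape are assumptions, not the chart's actual helpers.
"""


def has_config_map(config_map):
    # As described under Root Cause: true whenever config.map has any entry.
    return bool(config_map)


def mig_strategy_of(config, default="none"):
    # Assumes each config is already parsed and may carry flags.migStrategy.
    flags = config.get("flags") or {}
    return flags.get("migStrategy", default)


def all_none_as_described(config_map):
    if has_config_map(config_map):
        # Contents are never examined on this path.
        return False
    # Only reached with an empty map, so the loop has nothing to inspect.
    return all(mig_strategy_of(c) == "none" for c in config_map.values())


def all_none_content_aware(config_map):
    # Alternative: always inspect every named configuration.
    return all(mig_strategy_of(c) == "none" for c in config_map.values())


def profile_per_config(config_map):
    # Per-configuration decision, as a per-config DaemonSet would need.
    return {
        name: "restricted" if mig_strategy_of(c) == "none" else "elevated"
        for name, c in config_map.items()
    }


# Hypothetical example data: two configs that both resolve to "none".
vgpu_only = {
    "default": {"version": "v1", "flags": {"migStrategy": "none"}},
    "vgpu": {"version": "v1"},  # no flags: falls back to the default "none"
}
# The same fleet plus one MIG-enabled configuration.
mixed_fleet = dict(
    vgpu_only, mig={"version": "v1", "flags": {"migStrategy": "mixed"}}
)
```

With `vgpu_only`, the check as described reports that not all strategies are `none` simply because a map exists, while the content-aware variant reports that they are. With `mixed_fleet`, only a per-configuration decision such as `profile_per_config` keeps the `none` configurations restricted, which is the gap a single DaemonSet cannot close.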