Fix/improve security in the inference server start command #940

Open
wants to merge 2 commits into main from fix/improve-docker-container-security

Conversation

@bigbitbus (Contributor) commented Jan 13, 2025

Description

Harden the inference server start command in several ways for the CPU and GPU containers. Notably:

Container Privilege Restrictions:

Added security_opt=["no-new-privileges"] to prevent the container from gaining new privileges

Added cap_drop=["ALL"] to drop all Linux capabilities by default

Adds back only the minimal required capabilities with cap_add=["NET_BIND_SERVICE"] (plus SYS_ADMIN for GPU containers)
These restrictions are applied only when not running on Jetson devices (if not is_jetson); see the sketch below
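As a rough illustration of how these options map onto the Docker SDK for Python (a sketch, not the exact PR code; the is_jetson and is_gpu flags are assumed names here, derived from the image name in the actual change):

```python
# Sketch of the privilege-restriction options (assumed variable names).
is_jetson = False  # the PR derives this from the image name
is_gpu = True      # likewise derived from the image name

security_kwargs = {}
if not is_jetson:
    security_kwargs = {
        "security_opt": ["no-new-privileges"],  # block privilege escalation inside the container
        "cap_drop": ["ALL"],                    # start with zero Linux capabilities
        # add back only what the server actually needs
        "cap_add": ["NET_BIND_SERVICE"] + (["SYS_ADMIN"] if is_gpu else []),
    }
```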

Read-only Filesystem:

Added read_only=not is_jetson to make the container filesystem read-only
Only the /tmp directory is mounted writable for necessary runtime files
Explicitly defines the cache directories of various components to live under /tmp (see the sketch after this list):
"MODEL_CACHE_DIR=/tmp/model-cache",
"TRANSFORMERS_CACHE=/tmp/huggingface",
"YOLO_CONFIG_DIR=/tmp/yolo",
"MPLCONFIGDIR=/tmp/matplotlib",
"HOME=/tmp/home",

Network Isolation:

Added network_mode="bridge" to ensure the container uses bridge networking
Added ipc_mode="private" to isolate the IPC namespace (except for Jetson devices)
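Putting the pieces together, a hedged sketch of what the hardened containers.run(...) call could look like with the Docker SDK for Python; the image name, port mapping, and detach flag are illustrative assumptions, not the PR's exact code:

```python
import docker

client = docker.from_env()
is_jetson = False  # assumed; derived from the image name in the actual change

# Hypothetical image name; the CLI resolves the real one via get_image().
image = "roboflow/roboflow-inference-server-cpu:latest"

container = client.containers.run(
    image,
    detach=True,
    network_mode="bridge",                      # explicit bridge networking
    ipc_mode=None if is_jetson else "private",  # private IPC namespace except on Jetson
    security_opt=None if is_jetson else ["no-new-privileges"],
    cap_drop=None if is_jetson else ["ALL"],
    cap_add=None if is_jetson else ["NET_BIND_SERVICE"],
    read_only=not is_jetson,                    # read-only root filesystem
    tmpfs={"/tmp": ""},                         # writable /tmp only (assumed tmpfs)
    environment=[
        "MODEL_CACHE_DIR=/tmp/model-cache",
        "TRANSFORMERS_CACHE=/tmp/huggingface",
        "YOLO_CONFIG_DIR=/tmp/yolo",
        "MPLCONFIGDIR=/tmp/matplotlib",
        "HOME=/tmp/home",
    ],
    ports={"9001/tcp": 9001},                   # assumed port mapping
)
```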

Type of change

Security fixes

Tested on CPU (Mac), GPU (T4 NVIDIA GPU VM with Intel architecture), and on Jetson 5.X.

Any specific deployment considerations

For example, documentation changes, usability, usage/costs, secrets, etc.

Docs

  • Docs updated? What were the changes:

is_gpu = "gpu" in image and "jetson" not in image
is_jetson = "jetson" in image

if is_gpu:
device_requests = [
docker.types.DeviceRequest(device_ids=["all"], capabilities=[["gpu"]])
Contributor
A user can try to run the inference-gpu Docker image on a non-GPU host; this will result in:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Probably nothing to worry about - trying to run a GPU image on non-GPU hardware feels like an unintended situation.

Contributor Author
The get_image() function already checks which image is the appropriate one (line 134).

@@ -167,7 +170,25 @@ def start_inference_container(
         labels=labels,
         ports=ports,
         device_requests=device_requests,
-        environment=environment,
+        environment=environment + [
+            "MODEL_CACHE_DIR=/tmp/model-cache",
@grzegorz-roboflow (Contributor) commented Jan 15, 2025
Is MODEL_CACHE_DIR a host path or a Docker path? By default we cache models under /tmp/cache.

Contributor Author
It's the path within the Docker container.
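To illustrate the distinction (a sketch, not part of this PR): MODEL_CACHE_DIR names a path inside the container, so persisting the cache on the host would require an explicit volume mount; the host directory below is hypothetical:

```python
import docker

client = docker.from_env()

container = client.containers.run(
    "roboflow/roboflow-inference-server-cpu:latest",   # assumed image name
    detach=True,
    environment=["MODEL_CACHE_DIR=/tmp/model-cache"],  # container-side path
    # map a hypothetical host directory onto the container-side cache path
    volumes={"/srv/inference-cache": {"bind": "/tmp/model-cache", "mode": "rw"}},
)
```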

@bigbitbus force-pushed the fix/improve-docker-container-security branch from 6363864 to 39ea380 on January 17, 2025 at 17:36