Fix/improve security in the inference server start command #940

Open
wants to merge 2 commits into main from fix/improve-docker-container-security

Conversation

@bigbitbus (Contributor) commented Jan 13, 2025

Description

Harden the inference server start command in several ways for the CPU and GPU containers. Notably:

Container Privilege Restrictions:

Added security_opt=["no-new-privileges"] to prevent the container from gaining new privileges

Added cap_drop=["ALL"] to drop all Linux capabilities by default

Adds back only the minimal required capabilities with cap_add=["NET_BIND_SERVICE"] (plus SYS_ADMIN for GPU containers)
These restrictions are applied only when not running on Jetson devices (if not is_jetson); see the sketch below
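As a rough illustration of how these options map onto the Docker SDK for Python (a sketch, not the exact PR code; the is_jetson and is_gpu flags are assumed names here, derived from the image name in the actual change):

```python
# Sketch of the privilege-restriction options (assumed variable names).
is_jetson = False  # the PR derives this from the image name
is_gpu = True      # likewise derived from the image name

security_kwargs = {}
if not is_jetson:
    security_kwargs = {
        "security_opt": ["no-new-privileges"],  # block privilege escalation inside the container
        "cap_drop": ["ALL"],                    # start with zero Linux capabilities
        # add back only what the server actually needs
        "cap_add": ["NET_BIND_SERVICE"] + (["SYS_ADMIN"] if is_gpu else []),
    }
```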

Read-only Filesystem:

Added read_only=not is_jetson to make the container filesystem read-only
Only the /tmp directory is mounted writable for necessary runtime files
Explicitly defines the cache directories of various components to live under /tmp (see the sketch after this list):
"MODEL_CACHE_DIR=/tmp/model-cache",
"TRANSFORMERS_CACHE=/tmp/huggingface",
"YOLO_CONFIG_DIR=/tmp/yolo",
"MPLCONFIGDIR=/tmp/matplotlib",
"HOME=/tmp/home",

Network Isolation:

Added network_mode="bridge" to ensure the container uses bridge networking
Added ipc_mode="private" to isolate the IPC namespace (except for Jetson devices)
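Putting the pieces together, a hedged sketch of what the hardened containers.run(...) call could look like with the Docker SDK for Python; the image name, port mapping, and detach flag are illustrative assumptions, not the PR's exact code:

```python
import docker

client = docker.from_env()
is_jetson = False  # assumed; derived from the image name in the actual change

# Hypothetical image name; the CLI resolves the real one via get_image().
image = "roboflow/roboflow-inference-server-cpu:latest"

container = client.containers.run(
    image,
    detach=True,
    network_mode="bridge",                      # explicit bridge networking
    ipc_mode=None if is_jetson else "private",  # private IPC namespace except on Jetson
    security_opt=None if is_jetson else ["no-new-privileges"],
    cap_drop=None if is_jetson else ["ALL"],
    cap_add=None if is_jetson else ["NET_BIND_SERVICE"],
    read_only=not is_jetson,                    # read-only root filesystem
    tmpfs={"/tmp": ""},                         # writable /tmp only (assumed tmpfs)
    environment=[
        "MODEL_CACHE_DIR=/tmp/model-cache",
        "TRANSFORMERS_CACHE=/tmp/huggingface",
        "YOLO_CONFIG_DIR=/tmp/yolo",
        "MPLCONFIGDIR=/tmp/matplotlib",
        "HOME=/tmp/home",
    ],
    ports={"9001/tcp": 9001},                   # assumed port mapping
)
```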

Type of change

Security fixes

Tested on CPU (Mac), GPU (T4 NVIDIA GPU VM with Intel architecture), and on Jetson 5.X.

Any specific deployment considerations

For example, documentation changes, usability, usage/costs, secrets, etc.

Docs

  • Docs updated? What were the changes:

is_gpu = "gpu" in image and "jetson" not in image
is_jetson = "jetson" in image

if is_gpu:
device_requests = [
docker.types.DeviceRequest(device_ids=["all"], capabilities=[["gpu"]])
Contributor
A user can try to run the inference-gpu Docker image on a non-GPU host; this will result in:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Probably nothing to worry about - trying to run a GPU image on non-GPU hardware feels like an unintended situation.

Contributor Author
The get_image() function already checks which image is the appropriate one (line 134).

@@ -167,7 +170,25 @@ def start_inference_container(
         labels=labels,
         ports=ports,
         device_requests=device_requests,
-        environment=environment,
+        environment=environment + [
+            "MODEL_CACHE_DIR=/tmp/model-cache",
@grzegorz-roboflow (Contributor) commented Jan 15, 2025
Is MODEL_CACHE_DIR a host path or a Docker path? By default we cache models under /tmp/cache.

Contributor Author
It's the path within the Docker container.
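To illustrate the distinction (a sketch, not part of this PR): MODEL_CACHE_DIR names a path inside the container, so persisting the cache on the host would require an explicit volume mount; the host directory below is hypothetical:

```python
import docker

client = docker.from_env()

container = client.containers.run(
    "roboflow/roboflow-inference-server-cpu:latest",   # assumed image name
    detach=True,
    environment=["MODEL_CACHE_DIR=/tmp/model-cache"],  # container-side path
    # map a hypothetical host directory onto the container-side cache path
    volumes={"/srv/inference-cache": {"bind": "/tmp/model-cache", "mode": "rw"}},
)
```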

@bigbitbus force-pushed the fix/improve-docker-container-security branch from 6363864 to 39ea380 on January 17, 2025 at 17:36