Fix/improve security in the inference server start command #940
base: main
Conversation
Force-pushed from 08489a0 to 6363864
is_gpu = "gpu" in image and "jetson" not in image
is_jetson = "jetson" in image

if is_gpu:
    device_requests = [
        docker.types.DeviceRequest(device_ids=["all"], capabilities=[["gpu"]])
User can try to run the inference-gpu docker image on a non-GPU host; this will result in:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
Probably nothing to worry about - trying to run a GPU image on non-GPU hardware feels like an unintended situation.
The get_image() function already checks which image is the appropriate one (line 134).
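For context, a minimal, self-contained sketch of the GPU-request path being discussed; the image tag is a made-up placeholder and the surrounding names are illustrative, not the exact code from this PR:

import docker

# Hypothetical tag, only to show the string checks; the real value comes from get_image().
image = "example/inference-server-gpu:latest"

is_gpu = "gpu" in image and "jetson" not in image
is_jetson = "jetson" in image

device_requests = []
if is_gpu:
    # Ask the daemon for all GPUs. On a host without a GPU driver, this request is
    # what fails with: could not select device driver "" with capabilities: [[gpu]]
    device_requests = [
        docker.types.DeviceRequest(device_ids=["all"], capabilities=[["gpu"]])
    ]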
@@ -167,7 +170,25 @@ def start_inference_container(
     labels=labels,
     ports=ports,
     device_requests=device_requests,
-    environment=environment,
+    environment=environment + [
+        "MODEL_CACHE_DIR=/tmp/model-cache",
Is MODEL_CACHE_DIR a host path or a docker path? By default we cache models under /tmp/cache.
It's the path within the docker container.
Force-pushed from 6363864 to 39ea380
Description
Harden the inference server start command in several ways for the CPU and GPU containers. Notably:

Container Privilege Restrictions:
Added security_opt=["no-new-privileges"] to prevent the container from gaining new privileges
Added cap_drop=["ALL"] to drop all Linux capabilities by default
Only adds back the minimal required capabilities with cap_add=["NET_BIND_SERVICE"] (plus SYS_ADMIN for GPU containers)
These restrictions are only applied when not running on Jetson devices (if not is_jetson); see the sketch after this list
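A minimal sketch of how the restrictions listed above map onto keyword arguments for docker-py's containers.run(); the is_gpu / is_jetson flags are assumptions standing in for the detection logic shown earlier in the diff:

# Assumed flags; the real command derives these from the selected image name.
is_gpu = True
is_jetson = False

security_kwargs = {}
if not is_jetson:
    cap_add = ["NET_BIND_SERVICE"]      # allow binding privileged ports, nothing else
    if is_gpu:
        cap_add.append("SYS_ADMIN")     # extra capability the GPU container needs
    security_kwargs = {
        "security_opt": ["no-new-privileges"],  # processes cannot gain new privileges
        "cap_drop": ["ALL"],                    # drop every Linux capability by default
        "cap_add": cap_add,                     # add back only the minimal set
    }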
Read-only Filesystem:
Added read_only=not is_jetson to make the container filesystem read-only
Only the /tmp directory is mounted as writable, for necessary runtime files
Explicitly defines cache directories to use /tmp for various components:
"MODEL_CACHE_DIR=/tmp/model-cache",
"TRANSFORMERS_CACHE=/tmp/huggingface",
"YOLO_CONFIG_DIR=/tmp/yolo",
"MPLCONFIGDIR=/tmp/matplotlib",
"HOME=/tmp/home",
Network Isolation:
Added network_mode="bridge" to ensure the container uses bridge networking
Added ipc_mode="private" to isolate the IPC namespace (except for Jetson devices)
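And the network isolation piece, plus a comment showing how all of the pieces could feed a single containers.run() call; again a sketch under the assumptions above, not the exact PR code:

# Assumed, as above.
is_jetson = False

net_kwargs = {"network_mode": "bridge"}  # explicit bridge networking rather than host networking
if not is_jetson:
    net_kwargs["ipc_mode"] = "private"   # isolate the IPC namespace, skipped on Jetson

# The pieces above would all be merged into one call, roughly:
# client = docker.from_env()
# client.containers.run(image, detach=True, device_requests=device_requests,
#                       ports={"9001/tcp": 9001},  # illustrative port mapping
#                       **security_kwargs, **fs_kwargs, **net_kwargs)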
Type of change
Security fixes
Tested on CPU (Mac), GPU (NVIDIA T4 GPU VM with Intel architecture), and on Jetson 5.X
Any specific deployment considerations
For example, documentation changes, usability, usage/costs, secrets, etc.
Docs