Update docs for release 0.14.0 (#420)
* Generate ref docs

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Fix jinja template error

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update developer guide

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Add Helm installation docs

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Fix input is missing in torchserve llm docs

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update kserve installation docs to use server side apply

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update Alibi explainer docs

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* change to comment

Signed-off-by: Dan Sun <[email protected]>

* change to comment

Signed-off-by: Dan Sun <[email protected]>

* Update ingress controller

Signed-off-by: Dan Sun <[email protected]>

* Update installation guide

---------

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
Signed-off-by: Dan Sun <[email protected]>
Co-authored-by: Dan Sun <[email protected]>
sivanantha321 and yuzisun authored Nov 24, 2024
1 parent 71c5049 commit d1ee184
Showing 18 changed files with 1,056 additions and 156 deletions.
54 changes: 35 additions & 19 deletions docs/admin/kubernetes_deployment.md
@@ -1,19 +1,19 @@
# Kubernetes Deployment Installation Guide
KServe supports `RawDeployment` mode to enable `InferenceService` deployment with the Kubernetes resources [`Deployment`](https://kubernetes.io/docs/concepts/workloads/controllers/deployment), [`Service`](https://kubernetes.io/docs/concepts/services-networking/service), [`Ingress`](https://kubernetes.io/docs/concepts/services-networking/ingress) and [`Horizontal Pod Autoscaler`](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale). Compared to serverless deployment, it removes Knative limitations such as the inability to mount multiple volumes; on the other hand, `Scale to and from Zero` is not supported in `RawDeployment` mode.
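As a sketch of what this looks like per service, the deployment mode can also be selected with the standard `serving.kserve.io/deploymentMode` annotation (the model name and storage URI below are illustrative, taken from KServe's public example bucket):

```bash
# Minimal InferenceService pinned to RawDeployment mode via annotation
# (sketch; name and storageUri are illustrative placeholders).
cat <<'EOF'
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
  annotations:
    serving.kserve.io/deploymentMode: RawDeployment
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
EOF
```

Save the printed manifest to a file and `kubectl apply -f` it once the controller is installed.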

Kubernetes 1.22 is the minimally required version; please check the following recommended Istio versions for the corresponding Kubernetes version.
Kubernetes 1.28 is the minimally required version; please check the following recommended Istio versions for the corresponding Kubernetes version.

## Recommended Version Matrix
| Kubernetes Version | Recommended Istio Version |
| :----------------- | :------------------------ |
| 1.27 | 1.18, 1.19 |
| 1.28 | 1.19, 1.20 |
| 1.29 | 1.20, 1.21 |
| 1.28 | 1.22 |
| 1.29 | 1.22, 1.23 |
| 1.30 | 1.22, 1.23 |

## 1. Install Istio
## 1. Install Ingress Controller

The minimally required Istio version is 1.13 and you can refer to the [Istio install guide](https://istio.io/latest/docs/setup/install).
In this guide, we choose to install Istio as the ingress controller. The minimally required Istio version is 1.22; you can refer to the [Istio install guide](https://istio.io/latest/docs/setup/install).

Once Istio is installed, create an `IngressClass` resource for Istio.
```yaml
@@ -25,11 +25,9 @@ spec:
controller: istio.io/ingress-controller
```
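The snippet above is abridged by the diff view; a complete manifest is sketched below (the class name `istio` is an assumption, any name works as long as you reference the same name from the KServe configuration later):

```bash
# Full IngressClass manifest (sketch; metadata.name "istio" is a placeholder).
cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: istio
spec:
  controller: istio.io/ingress-controller
EOF
```

On a running cluster, pipe the output into `kubectl apply -f -`.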
!!! note
Istio ingress is recommended, but you can choose to install other [Ingress controllers](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/) and create an `IngressClass` resource for your ingress option.




## 2. Install Cert Manager
The minimally required Cert Manager version is 1.15.0; you can refer to the [Cert Manager installation guide](https://cert-manager.io/docs/installation/).
@@ -41,31 +39,49 @@ The minimally required Cert Manager version is 1.15.0 and you can refer to [Cert
!!! note
The default KServe deployment mode is `Serverless` which depends on Knative. The following step changes the default deployment mode to `RawDeployment` before installing KServe.

=== "Install using Helm"

I. Install KServe CRDs

```shell
helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd --version v{{ kserve_release_version }}
```

II. Install KServe Resources

Set the `kserve.controller.deploymentMode` to `RawDeployment` and `kserve.controller.gateway.ingressGateway.className` to point to the `IngressClass`
name created in [step 1](#1-install-ingress-controller).

```shell
helm install kserve oci://ghcr.io/kserve/charts/kserve --version v{{ kserve_release_version }} \
--set kserve.controller.deploymentMode=RawDeployment \
--set kserve.controller.gateway.ingressGateway.className=your-ingress-class
```
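The two `--set` flags map one-to-one onto nested chart values, so an equivalent values file looks like the following sketch (the key layout mirrors the flag paths shown above; `your-ingress-class` is a placeholder):

```bash
# Print a values.yaml equivalent to the two --set flags above
# (sketch; key layout mirrors the --set paths, className is a placeholder).
cat <<'EOF'
kserve:
  controller:
    deploymentMode: RawDeployment
    gateway:
      ingressGateway:
        className: your-ingress-class
EOF
```

Save the output as `kserve-values.yaml` and pass `-f kserve-values.yaml` to `helm install` instead of the `--set` flags.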

=== "Install using YAML"

**i. Install KServe**
I. Install KServe:
The `--server-side` option is required because the InferenceService CRD is large; see [this issue](https://github.com/kserve/kserve/issues/3487) for details.

=== "kubectl"
```bash
kubectl apply -f https://github.com/kserve/kserve/releases/download/v{{ kserve_release_version }}/kserve.yaml
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve.yaml
```

Install KServe default serving runtimes:
II. Install KServe default serving runtimes:

=== "kubectl"
```bash
kubectl apply -f https://github.com/kserve/kserve/releases/download/v{{ kserve_release_version }}/kserve-cluster-resources.yaml
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve-cluster-resources.yaml
```

**ii. Change default deployment mode and ingress option**
III. Change default deployment mode and ingress option

First in ConfigMap `inferenceservice-config` modify the `defaultDeploymentMode` in the `deploy` section,
First, in the ConfigMap `inferenceservice-config`, modify the `defaultDeploymentMode` in the `deploy` section to `RawDeployment`,

=== "kubectl"
```bash
kubectl patch configmap/inferenceservice-config -n kserve --type=strategic -p '{"data": {"deploy": "{\"defaultDeploymentMode\": \"RawDeployment\"}"}}'
```
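Note that the `deploy` key stores JSON as a string, so the inner object has to be quoted and escaped inside the outer patch document. The shell sketch below shows how that nesting is built:

```bash
# Build the nested patch payload: the inner JSON object is escaped so it can
# be embedded as a string value under the "deploy" key.
inner='{"defaultDeploymentMode": "RawDeployment"}'
escaped=$(printf '%s' "$inner" | sed 's/"/\\"/g')
printf '{"data": {"deploy": "%s"}}\n' "$escaped"
```

The printed string is exactly the `-p` payload used in the `kubectl patch` command above.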

then modify the `ingressClassName` in `ingress` section to point to `IngressClass` name created in [step 1](#1-install-istio).
then modify the `ingressClassName` in the `ingress` section to the `IngressClass` name created in [step 1](#1-install-ingress-controller).
```yaml
ingress: |-
{
25 changes: 19 additions & 6 deletions docs/admin/serverless/serverless.md
@@ -33,16 +33,29 @@ The minimally required Cert Manager version is 1.15.0 and you can refer to [Cert
Cert Manager is required to provision webhook certs for a production-grade installation; alternatively, you can run the self-signed certs generation script.

## 4. Install KServe
=== "kubectl"

=== "Install using Helm"

Install KServe CRDs
```bash
helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd --version v{{ kserve_release_version }}
```

Install KServe Resources
```bash
kubectl apply -f https://github.com/kserve/kserve/releases/download/v{{ kserve_release_version }}/kserve.yaml
helm install kserve oci://ghcr.io/kserve/charts/kserve --version v{{ kserve_release_version }}
```

=== "Install using YAML"
Install the KServe CRDs and controller. The `--server-side` option is required because the InferenceService CRD is large; see [this issue](https://github.com/kserve/kserve/issues/3487) for details.

## 5. Install KServe Built-in ClusterServingRuntimes
=== "kubectl"
```bash
kubectl apply -f https://github.com/kserve/kserve/releases/download/v{{ kserve_release_version }}/kserve-cluster-resources.yaml
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{kserve_release_version}}/kserve.yaml
```

Install KServe Built-in ClusterServingRuntimes
```bash
kubectl apply --server-side -f https://github.com/kserve/kserve/releases/download/v{{ kserve_release_version }}/kserve-cluster-resources.yaml
```

!!! note
8 changes: 4 additions & 4 deletions docs/admin/serverless/servicemesh/README.md
@@ -65,7 +65,7 @@ Apply the `PeerAuthentication` and `AuthorizationPolicy` rules with [auth.yaml](
kubectl apply -f auth.yaml
```

### Disable Top Level Virtual Service {#disable-top-level-vs}
### Disable Top Level Virtual Service {{ '{' }}#disable-top-level-vs{{ '}' }}
KServe currently creates an Istio top-level virtual service to support routing between InferenceService components like the predictor, transformer and explainer, as well as to support path-based routing as an alternative to routing with service hosts.
In serverless service mesh mode this creates a problem: to route through the underlying virtual service created by the Knative Service, the top-level virtual service must route to the `Istio Gateway` instead of directly to the InferenceService component on the service mesh.

@@ -80,7 +80,7 @@ ingress : |- {
}
```

## Turn on strict mTLS on the entire service mesh {#mesh-wide-mtls}
## Turn on strict mTLS on the entire service mesh {{ '{' }}#mesh-wide-mtls{{ '}' }}
In the previous section, turning on strict mTLS on a namespace is discussed. For users who need to lock down all workloads in the service mesh, Istio can be configured with [strict mTLS on the whole mesh](https://istio.io/latest/docs/tasks/security/authentication/mtls-migration/#lock-down-mutual-tls-for-the-entire-mesh).

Istio's Mutual TLS Migration docs use `PeerAuthentication` resources to lock down the mesh, which act on the server side. This means that Istio sidecars will only reject non-mTLS incoming connections, while non-TLS _outgoing_ connections will still be allowed ([reference](https://istio.io/latest/docs/concepts/security/#authentication-policies)). To further lock down the mesh, you can also create a `DestinationRule` resource to require mTLS on outgoing connections. However, under this very strict mTLS configuration, you may notice that the KServe top level virtual service stops working and inference requests are blocked. You can fix this either by disabling the top level virtual service [as mentioned above](#disable-top-level-vs), or by configuring Knative's local gateway with mTLS on its listening port.
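As a sketch (the resource name and `istio-system` namespace are placeholders; `*.local` matches cluster-local services), such a mesh-wide `DestinationRule` could look like:

```bash
# Mesh-wide DestinationRule requiring mTLS for outgoing connections
# (sketch; name and namespace are placeholders).
cat <<'EOF'
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: require-mtls
  namespace: istio-system
spec:
  host: "*.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
EOF
```

Apply the printed manifest with `kubectl apply -f -` on a cluster where Istio is installed.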
@@ -109,7 +109,7 @@ spec:

After patching Knative's local gateway resource, the KServe top level virtual service will work again.

## Deploy InferenceService with Istio sidecar injection {#isvc-inject-sidecar}
## Deploy InferenceService with Istio sidecar injection {{ '{' }}#isvc-inject-sidecar{{ '}' }}
First label the namespace with `istio-injection=enabled` to turn on the sidecar injection for the namespace.

```bash
@@ -294,7 +294,7 @@ kubectl exec -it sleep-6d6b49d8b8-6ths6 -- curl -v sklearn-iris-burst-predictor-
{"name":"sklearn-iris-burst","ready":true}
```

## Invoking InferenceServices from workloads that are not part of the mesh {#invoking-isvc-non-mesh}
## Invoking InferenceServices from workloads that are not part of the mesh {{ '{' }}#invoking-isvc-non-mesh{{ '}' }}
Ideally, when using service mesh, all involved workloads should belong to the service mesh. This allows enabling strict mTLS in Istio and ensures policies are correctly applied. However, given the diverse requirements of applications, it is not always possible to migrate all workloads to the service mesh.

When using KServe, you may successfully migrate InferenceServices to the service mesh (i.e. by [injecting the Istio sidecar](#isvc-inject-sidecar)), while workloads that invoke inference remain outside the mesh. In this hybrid environment, workloads that are not part of the service mesh need to use an Istio ingress gateway as a port of entry to InferenceServices. In the default setup, KServe integrates with the gateway used by Knative. Both Knative and KServe apply the needed configurations to allow for these hybrid environments; however, this only works transparently if you haven't enabled strict TLS on Istio.
2 changes: 1 addition & 1 deletion docs/blog/articles/2022-02-18-KServe-0.8-release.md
@@ -71,7 +71,7 @@ spec:
- name: kserve-container
image: kserve/sklearnserver:latest
args:
- --model_name={{.Name}}
- --model_name={{ '{{' }}.Name{{ '}}'}}
- --model_dir=/mnt/models
- --http_port=8080
resources:
13 changes: 11 additions & 2 deletions docs/developer/developer.md
@@ -15,7 +15,7 @@ Before submitting a PR, see also [CONTRIBUTING.md](https://github.com/kserve/kse

You must install these tools:

1. [`go`](https://golang.org/doc/install): KServe controller is written in Go and requires Go 1.20.0+.
1. [`go`](https://golang.org/doc/install): KServe controller is written in Go and requires Go 1.22.7+.
1. [`git`](https://help.github.com/articles/set-up-git/): For source control.
1. [`Go Module`](https://blog.golang.org/using-go-modules): Go's dependency management system.
1. [`ko`](https://github.com/google/ko):
@@ -24,6 +24,8 @@ You must install these tools:
managing development environments.
1. [`kustomize`](https://github.com/kubernetes-sigs/kustomize/) To customize YAMLs for different environments; requires v5.0.0+.
1. [`yq`](https://github.com/mikefarah/yq) yq is used in the project makefiles to parse and display YAML output; requires yq `4.*`.
1. [`pre-commit`](https://pre-commit.com/) pre-commit is used to run checks on the codebase before committing changes.
1. [`helm`](https://helm.sh/docs/intro/install/) Helm is used to install KServe.

### Install Knative on a Kubernetes cluster

@@ -33,7 +35,7 @@ KServe currently requires `Knative Serving` for auto-scaling, canary rollout, `I
use the [Knative Operators](https://knative.dev/docs/install/operator/knative-with-operators/) to manage your installation. Observability, tracing and logging are optional but are often very valuable tools for troubleshooting difficult issues,
they can be installed via the [directions here](https://github.com/knative/docs/blob/release-0.15/docs/serving/installing-logging-metrics-traces.md).

* If you start from scratch, KServe requires Kubernetes 1.25+, Knative 1.7+, Istio 1.15+.
* If you start from scratch, KServe requires Kubernetes, Knative, and Istio. You can find the recommended [version matrix](../admin/serverless/serverless.md#recommended-version-matrix) in the installation documentation.

* If you already have `Istio` or `Knative` (e.g. from a Kubeflow install) then you don't need to install them explicitly, as long as version dependencies are satisfied.

@@ -97,6 +99,13 @@ _Adding the `upstream` remote sets you up nicely for regularly
Once you reach this point you are ready to do a full build and deploy as
described below.

### Install pre-commit hooks
Configuring pre-commit hooks runs checks on the codebase before committing changes. This helps you catch lint errors, formatting issues, and other common problems before they reach the repository.

```shell
pre-commit install --install-hooks
```
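If you want to see what a hook configuration looks like before working in the repository, the sketch below prints a minimal one. The hook ids are illustrative; the KServe repository provides its own `.pre-commit-config.yaml` at the repository root, which the install command above picks up automatically.

```bash
# Print a minimal pre-commit configuration (sketch; repo and hook ids are
# illustrative examples, not KServe's actual configuration).
cat <<'EOF'
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
EOF
```

With a file like this saved as `.pre-commit-config.yaml`, `pre-commit run --all-files` runs every hook once over the whole tree.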

## Deploy KServe

### Check Knative Serving installation
11 changes: 3 additions & 8 deletions docs/get_started/README.md
@@ -13,6 +13,8 @@ You can use [`kind`](https://kind.sigs.k8s.io/docs/user/quick-start){target=_bla

The [Kubernetes CLI (`kubectl`)](https://kubernetes.io/docs/tasks/tools/install-kubectl){target=_blank}, allows you to run commands against Kubernetes clusters. You can use `kubectl` to deploy applications, inspect and manage cluster resources, and view logs.

### Install Helm
The [Helm](https://helm.sh/docs/intro/install/){target=_blank} package manager for Kubernetes helps you define, install and upgrade software built for Kubernetes.

## Install the KServe "Quickstart" environment
1. After having kind installed, create a `kind` cluster with:
@@ -31,17 +33,10 @@ The [Kubernetes CLI (`kubectl`)](https://kubernetes.io/docs/tasks/tools/install-
```bash
kubectl config use-context kind-kind
```

to use this context.

3. You can then get started with a local deployment of KServe by using the _KServe Quick installation script on Kind_:

```bash
curl -s "https://raw.githubusercontent.com/kserve/kserve/release-0.13/hack/quick_install.sh" | bash
curl -s "https://raw.githubusercontent.com/kserve/kserve/release-{{ kserve_release_version | replace('.0', '') }}/hack/quick_install.sh" | bash
```

or install via our published Helm Charts:
```bash
helm install kserve-crd oci://ghcr.io/kserve/charts/kserve-crd --version v{{ kserve_release_version }}
helm install kserve oci://ghcr.io/kserve/charts/kserve --version v{{ kserve_release_version }}
```
30 changes: 17 additions & 13 deletions docs/modelserving/explainer/alibi/cifar10/README.md
@@ -16,7 +16,7 @@ spec:
resources:
requests:
cpu: 0.1
memory: 5Gi
memory: 5Gi
limits:
memory: 10Gi
explainer:
@@ -25,18 +25,22 @@ spec:
image: kserve/alibi-explainer:v0.12.1
args:
- --model_name=cifar10
alibi:
type: AnchorImages
storageUri: "gs://kfserving-examples/models/tensorflow/cifar/explainer-0.9.1"
config:
batch_size: "40"
stop_on_first: "True"
resources:
requests:
cpu: 0.1
memory: 5Gi
limits:
memory: 10Gi
- --http_port=8080
- --predictor_host=cifar10-predictor.default
- --storage_uri=/mnt/models
- AnchorImages
- --batch_size=40
- --stop_on_first=True
env:
- name: STORAGE_URI
value: "gs://kfserving-examples/models/tensorflow/cifar/explainer-0.9.1"
resources:
requests:
cpu: 0.1
memory: 5Gi
limits:
cpu: 1
memory: 10Gi
```
!!! Note
The InferenceService resource describes:
35 changes: 22 additions & 13 deletions docs/modelserving/explainer/alibi/cifar10/cifar10.yaml
@@ -9,19 +9,28 @@ spec:
resources:
requests:
cpu: 0.1
memory: 5Gi
memory: 5Gi
limits:
memory: 10Gi
explainer:
alibi:
type: AnchorImages
storageUri: "gs://kfserving-examples/models/tensorflow/cifar/explainer-0.9.1"
config:
batch_size: "40"
stop_on_first: "True"
resources:
requests:
cpu: 0.1
memory: 5Gi
limits:
memory: 10Gi
containers:
- name: kserve-container
image: kserve/alibi-explainer:v0.12.1
args:
- --model_name=cifar10
- --http_port=8080
- --predictor_host=cifar10-predictor.default
- --storage_uri=/mnt/models
- AnchorImages
- --batch_size=40
- --stop_on_first=True
env:
- name: STORAGE_URI
value: "gs://kfserving-examples/models/tensorflow/cifar/explainer-0.9.1"
resources:
requests:
cpu: 0.1
memory: 5Gi
limits:
cpu: 1
memory: 10Gi
29 changes: 19 additions & 10 deletions docs/modelserving/explainer/alibi/income/README.md
@@ -28,16 +28,25 @@ spec:
memory: 1Gi
explainer:
minReplicas: 1
alibi:
type: AnchorTabular
storageUri: "gs://kfserving-examples/models/sklearn/1.3/income/explainer"
resources:
requests:
cpu: 0.1
memory: 1Gi
limits:
cpu: 1
memory: 4Gi
containers:
- name: kserve-container
image: kserve/alibi-explainer:v0.12.1
args:
- --model_name=income
- --http_port=8080
- --predictor_host=income-predictor.default
- --storage_uri=/mnt/models
- AnchorTabular
env:
- name: STORAGE_URI
value: "gs://kfserving-examples/models/sklearn/1.3/income/explainer"
resources:
requests:
cpu: 0.1
memory: 1Gi
limits:
cpu: 1
memory: 4Gi
```
Create the InferenceService with the above yaml: