Commit bbd2a67

Fix Kubeflow orchestrator docs (#2985)
1 parent e2d7f4f commit bbd2a67

1 file changed

+60
-72
lines changed

docs/book/component-guide/orchestrators/kubeflow.md

Lines changed: 60 additions & 72 deletions
@@ -21,9 +21,7 @@ You should use the Kubeflow orchestrator if:
2121

2222
### How to deploy it
2323

24-
The Kubeflow orchestrator supports two different modes: `Local` and `remote`. In case you want to run the orchestrator on a local Kubernetes cluster running on your machine, there is no additional infrastructure setup necessary.
25-
26-
If you want to run your pipelines on a remote cluster instead, you'll need to set up a Kubernetes cluster and deploy Kubeflow Pipelines:
24+
To run ZenML pipelines on Kubeflow, you'll need to set up a Kubernetes cluster and deploy Kubeflow Pipelines on it. This can be done in a variety of ways, depending on whether you want to use a cloud provider or your own infrastructure:
2725

2826
{% tabs %}
2927
{% tab title="AWS" %}
@@ -41,7 +39,7 @@ If you want to run your pipelines on a remote cluster instead, you'll need to se
4139
{% tab title="GCP" %}
4240
* Have an existing GCP [GKE cluster](https://cloud.google.com/kubernetes-engine/docs/quickstart) set up.
4341
* Make sure you have the [Google Cloud CLI](https://cloud.google.com/sdk/docs/install-sdk) set up first.
44-
* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and [configure](https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl) it to talk to your GKE cluster using the following command:
42+
* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and [configure](https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl) it to talk to your GKE cluster using the following command:
4543
4644
```powershell
4745
gcloud container clusters get-credentials CLUSTER_NAME
@@ -53,31 +51,33 @@ If you want to run your pipelines on a remote cluster instead, you'll need to se
5351
{% tab title="Azure" %}
5452
* Have an existing [AKS cluster](https://azure.microsoft.com/en-in/services/kubernetes-service/#documentation) set up.
5553
* Make sure you have the [`az` CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli) set up first.
56-
* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and ensure that it talks to your AKS cluster using the following command:
54+
* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and ensure that it talks to your AKS cluster using the following command:
5755
5856
```powershell
5957
az aks get-credentials --resource-group RESOURCE_GROUP --name CLUSTER_NAME
6058
```
6159
* [Install](https://www.kubeflow.org/docs/components/pipelines/installation/standalone-deployment/#deploying-kubeflow-pipelines) Kubeflow Pipelines onto your cluster.
6260
63-
> Since Kubernetes v1.19, AKS has shifted
64-
65-
to [`containerd`](https://docs.microsoft.com/en-us/azure/aks/cluster-configuration#container-settings)
66-
67-
> . However, the workflow controller installed with the Kubeflow installation has `Docker` set as the default runtime. In order to make your pipelines work, you have to change the value to one of the options
61+
{% hint style="info" %}
62+
Since Kubernetes v1.19, AKS has shifted to [`containerd`](https://docs.microsoft.com/en-us/azure/aks/cluster-configuration#container-settings).
63+
However, the workflow controller installed with the Kubeflow installation has `Docker` set as the default runtime. In order to make your pipelines work, you have to change the value to one of the options listed [here](https://argoproj.github.io/argo-workflows/workflow-executors/#workflow-executors), preferably `k8sapi`.
6864
69-
listed [here](https://argoproj.github.io/argo-workflows/workflow-executors/#workflow-executors)
65+
This change has to be made by editing the `containerRuntimeExecutor` property of the `ConfigMap` corresponding to the workflow controller. Run the following commands to first find out which `ConfigMap` to change and then edit it to reflect your new value:
7066
71-
> , preferably `k8sapi`.
72-
>
73-
> This change has to be made by editing the `containerRuntimeExecutor` property of the `ConfigMap` corresponding to the workflow controller. Run the following commands to first know what config map to change and then to edit it to reflect your new value.
74-
>
75-
> ```
76-
> kubectl get configmap -n kubeflow
77-
> kubectl edit configmap CONFIGMAP_NAME -n kubeflow
78-
> # This opens up an editor that can be used to make the change.
79-
> ```
67+
```
68+
kubectl get configmap -n kubeflow
69+
kubectl edit configmap CONFIGMAP_NAME -n kubeflow
70+
# This opens up an editor that can be used to make the change.
71+
```
72+
{% endhint %}
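
As a non-interactive alternative to `kubectl edit`, the same change can be sketched as a JSON merge patch. This is a minimal sketch under assumptions that are not stated in the text above: the ConfigMap name `workflow-controller-configmap` is a guess and must be confirmed with `kubectl get configmap -n kubeflow`, and `containerRuntimeExecutor` is assumed to be a top-level key under `data`. If your installation nests it inside a larger `config` entry, use the interactive edit instead. The variables `CONFIGMAP_NAME`, `EXECUTOR` and `APPLY` are hypothetical names used only in this example.

```shell
# Assumed ConfigMap name; confirm it with `kubectl get configmap -n kubeflow`.
CONFIGMAP_NAME="${CONFIGMAP_NAME:-workflow-controller-configmap}"
# Executor value recommended in the hint above.
EXECUTOR="${EXECUTOR:-k8sapi}"

# JSON merge patch that sets only the containerRuntimeExecutor key
# (assumes the key lives directly under `data`).
PATCH=$(printf '{"data":{"containerRuntimeExecutor":"%s"}}' "$EXECUTOR")

# Print the command by default; set APPLY=1 to run it against the cluster.
if [ "${APPLY:-0}" = "1" ]; then
  kubectl patch configmap "$CONFIGMAP_NAME" -n kubeflow --type merge -p "$PATCH"
else
  echo "kubectl patch configmap $CONFIGMAP_NAME -n kubeflow --type merge -p '$PATCH'"
fi
```

Printing the command first makes it easy to review the patch before anything on the cluster changes.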
73+
{% endtab %}
74+
{% tab title="Other Kubernetes" %}
75+
* Have an existing Kubernetes cluster set up.
76+
* Download and [install](https://kubernetes.io/docs/tasks/tools/) `kubectl` and configure it to talk to your Kubernetes cluster.
77+
* [Install](https://www.kubeflow.org/docs/components/pipelines/installation/standalone-deployment/#deploying-kubeflow-pipelines) Kubeflow Pipelines onto your cluster.
78+
* (optional) [set up a Kubernetes Service Connector](../../how-to/auth-management/kubernetes-service-connector.md) to grant ZenML Stack Components easy and secure access to the remote Kubernetes cluster. This is especially useful if your Kubernetes cluster is remotely accessible, as this enables other ZenML users to use it to run pipelines without needing to configure and set up `kubectl` on their local machines.
8079
{% endtab %}
80+
8181
{% endtabs %}
8282
8383
{% hint style="info" %}
@@ -102,67 +102,40 @@ You can pass other configurations specific to the stack components as key-value
102102

103103
To use the Kubeflow orchestrator, we need:
104104

105-
* The ZenML `kubeflow` integration installed. If you haven't done so, run
105+
* A Kubernetes cluster with Kubeflow Pipelines installed. See the [deployment section](kubeflow.md#how-to-deploy-it) for more information.
106+
* A ZenML server deployed remotely where it can be accessed from the Kubernetes cluster. See the [deployment guide](../../getting-started/deploying-zenml/README.md) for more information.
107+
* The ZenML `kubeflow` integration installed. If you haven't done so, run
106108

107109
```shell
108110
zenml integration install kubeflow
109111
```
110-
* [Docker](https://www.docker.com) installed and running.
112+
* [Docker](https://www.docker.com) installed and running (unless you are using a remote [Image Builder](../image-builders/image-builders.md) in your ZenML stack).
111113
* [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) installed (optional, see below)
112114

113115
{% hint style="info" %}
114116
If you are using a single-tenant Kubeflow installed in a Kubernetes cluster managed by a cloud provider like AWS, GCP or Azure, it is recommended that you set up [a Service Connector](../../how-to/auth-management/service-connectors-guide.md) and use it to connect ZenML Stack Components to the remote Kubernetes cluster. This guarantees that your Stack is fully portable on other environments and your pipelines are fully reproducible.
115117
{% endhint %}
116118

117-
{% tabs %}
118-
{% tab title="Local" %}
119-
When using the Kubeflow orchestrator locally, you'll additionally need:
120-
121-
* [K3D](https://k3d.io/v5.2.1/#installation) installed to spin up a local Kubernetes cluster.
122-
* [Terraform](https://www.terraform.io/downloads.html) installed to set up the Kubernetes cluster with various deployments.
123-
* [MLStacks](https://mlstacks.zenml.io) installed to handle the deployment
124-
125-
To run the pipeline on a local Kubeflow Pipelines deployment, you can use the ZenML `mlstacks` package to spin up a local Kubernetes cluster and install Kubeflow Pipelines on it.
126-
127-
To deploy the stack, run the following commands:
128-
129-
```shell
130-
# Deploy the stack using the ZenML CLI:
131-
zenml stack deploy k3d-modular -o kubeflow -a minio --provider k3d
132-
zenml stack set k3d-modular
133-
```
134-
135-
```shell
136-
# Get the Kubeflow Pipelines UI endpoint
137-
kubectl get ingress -n kubeflow -o jsonpath='{.items[0].spec.rules[0].host}'
138-
```
139-
140-
You can read more about `mlstacks` on [our dedicated documentation page here](https://mlstacks.zenml.io).
141-
142-
{% hint style="warning" %}
143-
The local Kubeflow Pipelines deployment requires more than 4 GB of RAM, and 30 GB of disk space, so if you are using Docker Desktop make sure to update the resource limits in the preferences.
144-
{% endhint %}
145-
{% endtab %}
146-
147-
{% tab title="Remote" %}
148-
When using the Kubeflow orchestrator with a remote cluster, you'll additionally need:
149-
150-
* A remote ZenML server deployed to the cloud. See the [deployment guide](../../getting-started/deploying-zenml/README.md) for more information.
151-
* Kubeflow pipelines deployed on a remote cluster. See the [deployment section](kubeflow.md#how-to-deploy-it) for more information.
152-
* The name of your Kubernetes context which points to your remote cluster. Run `kubectl config get-contexts` to see a list of available contexts. **NOTE**: this is no longer required if you are using [a Service Connector ](../../how-to/auth-management/service-connectors-guide.md)to connect your Kubeflow Orchestrator Stack Component to the remote Kubernetes cluster.
119+
* The name of your Kubernetes context which points to your remote cluster. Run `kubectl config get-contexts` to see a list of available contexts. **NOTE**: this is no longer required if you are using [a Service Connector](../../how-to/auth-management/service-connectors-guide.md) to connect your Kubeflow Orchestrator Stack Component to the remote Kubernetes cluster.
153120
* A [remote artifact store](../artifact-stores/artifact-stores.md) as part of your stack.
154121
* A [remote container registry](../container-registries/container-registries.md) as part of your stack.
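
The local prerequisites in the list above can be checked with a short script before registering the orchestrator. This is a sketch, not part of ZenML: the helper `check_tool` is a hypothetical name, and the script only tests whether the `zenml`, `docker` and `kubectl` executables are on your `PATH`. It does not verify that Docker is running or that the `kubeflow` integration is installed.

```shell
# Report whether a CLI tool is available on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Count how many of the tools named in the requirements list are absent.
missing=0
for tool in zenml docker kubectl; do
  check_tool "$tool" || missing=$((missing + 1))
done
echo "$missing tool(s) missing"

# kubectl is optional when a Service Connector is used; if it is present,
# list the contexts so you can pick the one pointing at your cluster.
if command -v kubectl >/dev/null 2>&1; then
  kubectl config get-contexts || echo "could not list kubectl contexts"
fi
```

A missing `kubectl` is not an error if you connect the orchestrator through a Service Connector, and a missing `docker` is acceptable if your stack uses a remote Image Builder.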
155122

156123
We can then register the orchestrator and use it in our active stack. This can be done in two ways:
157124

158125
1. If you have [a Service Connector](../../how-to/auth-management/service-connectors-guide.md) configured to access the remote Kubernetes cluster, you no longer need to set the `kubernetes_context` attribute to a local `kubectl` context. In fact, you don't need the local Kubernetes CLI at all. You can [connect the stack component to the Service Connector](../../how-to/auth-management/service-connectors-guide.md#connect-stack-components-to-resources) instead:
159126
127+
```shell
128+
# List all available Kubernetes clusters that can be accessed by service connectors
129+
zenml service-connector list-resources --resource-type kubernetes-cluster -e
130+
# Register the Kubeflow orchestrator and connect it to the remote Kubernetes cluster
131+
zenml orchestrator register <ORCHESTRATOR_NAME> --flavor kubeflow --connector <SERVICE_CONNECTOR_NAME> --resource-id <KUBERNETES_CLUSTER_NAME>
132+
# Register a new stack with the orchestrator
133+
zenml stack register <STACK_NAME> -o <ORCHESTRATOR_NAME> -a <ARTIFACT_STORE_NAME> -c <CONTAINER_REGISTRY_NAME> ... # Add other stack components as needed
160134
```
161-
$ zenml orchestrator register <ORCHESTRATOR_NAME> --flavor kubeflow
162-
Running with active workspace: 'default' (repository)
163-
Running with active stack: 'default' (repository)
164-
Successfully registered orchestrator `<ORCHESTRATOR_NAME>`.
165135
136+
The following example demonstrates how to register the orchestrator and connect it to a remote Kubernetes cluster using a Service Connector:
137+
138+
```shell
166139
$ zenml service-connector list-resources --resource-type kubernetes-cluster -e
167140
The following 'kubernetes-cluster' resources can be accessed by service connectors configured in your workspace:
168141
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┓
@@ -176,34 +149,49 @@ We can then register the orchestrator and use it in our active stack. This can b
176149
┃ 1c54b32a-4889-4417-abbd-42d3ace3d03a │ gcp-sa-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃
177150
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┛
178151
179-
$ zenml orchestrator connect <ORCHESTRATOR_NAME> --connector aws-iam-multi-us
180-
Running with active workspace: 'default' (repository)
181-
Running with active stack: 'default' (repository)
182-
Successfully connected orchestrator `<ORCHESTRATOR_NAME>` to the following resources:
152+
$ zenml orchestrator register aws-kubeflow --flavor kubeflow --connector aws-iam-multi-eu --resource-id zenhacks-cluster
153+
Successfully registered orchestrator `aws-kubeflow`.
154+
Successfully connected orchestrator `aws-kubeflow` to the following resources:
183155
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━┓
184156
┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃
185157
┠──────────────────────────────────────┼──────────────────┼────────────────┼───────────────────────┼──────────────────┨
186158
┃ ed528d5a-d6cb-4fc4-bc52-c3d2d01643e5 │ aws-iam-multi-us │ 🔶 aws │ 🌀 kubernetes-cluster │ zenhacks-cluster ┃
187159
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━┛
188160
189-
# Add the orchestrator to the active stack
190-
$ zenml stack update -o <ORCHESTRATOR_NAME>
161+
# Create a new stack with the orchestrator
162+
$ zenml stack register --set aws-kubeflow -o aws-kubeflow -a aws-s3 -c aws-ecr
163+
Stack 'aws-kubeflow' successfully registered!
164+
Stack Configuration
165+
┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━┓
166+
┃ COMPONENT_TYPE │ COMPONENT_NAME ┃
167+
┠────────────────────┼─────────────────┨
168+
┃ ARTIFACT_STORE │ aws-s3 ┃
169+
┠────────────────────┼─────────────────┨
170+
┃ ORCHESTRATOR │ aws-kubeflow ┃
171+
┠────────────────────┼─────────────────┨
172+
┃ CONTAINER_REGISTRY │ aws-ecr ┃
173+
┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━┛
174+
'aws-kubeflow' stack
175+
No labels are set for this stack.
176+
Stack 'aws-kubeflow' with id 'dab28f94-36ab-467a-863e-8718bbc1f060' is owned by user user.
177+
Active global stack set to:'aws-kubeflow'
191178
```
192-
2. if you don't have a Service Connector on hand and you don't want to [register one](../../how-to/auth-management/service-connectors-guide.md#register-service-connectors) , the local Kubernetes `kubectl` client needs to be configured with a configuration context pointing to the remote cluster. The `kubernetes_context` stack component must also be configured with the value of that context:
179+
180+
2. If you don't have a Service Connector on hand and you don't want to [register one](../../how-to/auth-management/service-connectors-guide.md#register-service-connectors), the local Kubernetes `kubectl` client needs to be configured with a configuration context pointing to the remote cluster. The `kubernetes_context` stack component must also be configured with the value of that context:
193181
194182
```shell
195183
zenml orchestrator register <ORCHESTRATOR_NAME> \
196184
--flavor=kubeflow \
197185
--kubernetes_context=<KUBERNETES_CONTEXT>
198186
199-
# Add the orchestrator to the active stack
200-
zenml stack update -o <ORCHESTRATOR_NAME>
187+
# Register a new stack with the orchestrator
188+
zenml stack register <STACK_NAME> -o <ORCHESTRATOR_NAME> -a <ARTIFACT_STORE_NAME> -c <CONTAINER_REGISTRY_NAME> ... # Add other stack components as needed
201189
```
202190
{% endtab %}
203191
{% endtabs %}
204192
205193
{% hint style="info" %}
206-
ZenML will build a Docker image called `<CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME>` which includes your code and use it to run your pipeline steps in Kubeflow. Check out [this page](../../how-to/customize-docker-builds/README.md) if you want to learn more about how ZenML builds these images and how you can customize them.
194+
ZenML will build a Docker image called `<CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME>` which includes all required software dependencies and use it to run your pipeline steps in Kubeflow. Check out [this page](../../how-to/customize-docker-builds/README.md) if you want to learn more about how ZenML builds these images and how you can customize them.
207195
{% endhint %}
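
To make the naming pattern from the hint above concrete, the snippet below fills in the two placeholders. Both values are made up for illustration; substitute the URI of the container registry in your stack and the name of your pipeline.

```shell
# Hypothetical values for illustration only.
CONTAINER_REGISTRY_URI="registry.example.com/ml-team"
PIPELINE_NAME="training_pipeline"

# Pattern from the hint: <CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME>
IMAGE="${CONTAINER_REGISTRY_URI}/zenml:${PIPELINE_NAME}"
echo "$IMAGE"
```

Note that the pipeline name ends up in the image tag, not in the repository path, so every pipeline shares the `zenml` repository in the registry.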
208196
209197
You can now run any ZenML pipeline using the Kubeflow orchestrator:
