How to install Kubeflow Pipelines v2 on Apple Silicon

Médéric Hurier (Fmind)
5 min read · Sep 24, 2022

Kubeflow Pipelines (KFP) is a powerful platform for building machine learning pipelines at scale with Kubernetes. The platform is well supported on major cloud platforms such as GCP (Vertex AI Pipelines) and AWS (Kubeflow on AWS). However, installing KFP on Apple Silicon (macOS 12.5.1 with an Apple M1 Pro) proved more challenging than I imagined. I therefore want to share my experience and tips for installing KFP as easily as possible on your shiny Mac.

In this article, I present 4 steps to install Kubeflow Pipelines on Apple Silicon, using Rancher Desktop to set up Docker/Kubernetes. At the end, I list the problems I encountered during the installation.

Step 1: Install Rancher Desktop to set up Kubernetes

Rancher Desktop is the friendliest solution I found to install Docker and Kubernetes. While Docker Desktop is also a popular option, it now requires a paid subscription for companies with more than 250 employees or more than $10 million in annual revenue.

Once you install Rancher Desktop, you need to configure it using the startup window. Select the stable version of Kubernetes and let Rancher Desktop configure the path automatically. In this case, we use containerd as this is the default container runtime for Kubernetes.

I also recommend adjusting the number of CPUs and the memory (GB) in Preferences > Virtual Machine to avoid out-of-memory (OOM) errors.

You should now be able to access the kubectl command in your terminal. If not, check that your PATH environment variable includes $HOME/.rd/bin, where Rancher Desktop binaries are installed by default.
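
If kubectl is not found, a small addition to your shell profile can put the Rancher Desktop directory on your PATH. The snippet below is a minimal sketch that assumes the default location mentioned above ($HOME/.rd/bin) and a POSIX-compatible shell such as zsh or bash; the RD_BIN variable name is only illustrative.

```shell
# Assumption: Rancher Desktop installed its binaries in the default location.
RD_BIN="$HOME/.rd/bin"

# Prepend the directory to PATH only if it is not already listed.
case ":$PATH:" in
  *":$RD_BIN:"*) ;;
  *) export PATH="$RD_BIN:$PATH" ;;
esac

# Report whether kubectl can now be resolved.
if command -v kubectl >/dev/null 2>&1; then
  echo "kubectl found at: $(command -v kubectl)"
else
  echo "kubectl not found; check your Rancher Desktop installation"
fi
```

The case statement keeps the snippet safe to source several times, since the directory is added at most once.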

Step 2: Install Kubeflow Pipelines

The official instructions to install Kubeflow Pipelines are available on the documentation website for KFP v1. You can also find the latest instructions on the Kubeflow Pipelines GitHub repository for KFP v2. As you can see, there is no one-click solution available at the time of writing, but we can easily adapt the instructions.

You need to make two choices for installing Kubeflow Pipelines:

  • The version of KFP (KFP_VERSION): We select the latest v2 version (v2.0.0b4) for the installation. This is a beta release at the time of this writing, but I found this version stable enough for my use case.
  • The platform for KFP (KFP_PLATFORM): This setting selects the executor used by Argo Workflows. I select the emissary executor (platform-agnostic-emissary), which is now shipped by default in recent versions of KFP.
# set the variables for the installation
KFP_PLATFORM=platform-agnostic-emissary
KFP_VERSION=2.0.0b4
# start the installation using kubectl apply
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$KFP_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/$KFP_PLATFORM?ref=$KFP_VERSION"

The last command may take several minutes to complete, as Kubernetes needs to download the required container images and create the pods for Kubeflow Pipelines. You can monitor the deployment happening in the kubeflow namespace using this command:

kubectl get pods -n kubeflow --watch

Once all Kubeflow Pipelines pods have the Running status, you can move to the next step.
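
If you prefer a command that blocks until the installation is ready instead of watching the pod list, kubectl wait can do the job. This is only a sketch and not part of the official instructions: the wait_for_kfp function name and the 600-second default timeout are assumptions, and it presumes that the KFP components run as Deployments in the kubeflow namespace.

```shell
# Hypothetical helper: block until every Deployment in the kubeflow
# namespace reports the Available condition, or until the timeout expires.
# $1 = timeout (optional; defaults to 600s, an arbitrary choice)
wait_for_kfp() {
  kubectl wait --for=condition=Available deployment --all \
    --namespace kubeflow --timeout="${1:-600s}"
}
```

Calling wait_for_kfp 900s, for example, waits up to 15 minutes and returns a non-zero exit code if some Deployment is still unavailable by then.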

Step 3: Forward the UI port

We need to forward the port of the KFP UI to your machine to access the main dashboard from your browser. To do so, set up a port forward from the ml-pipeline-ui Service to a port on your machine (e.g., 8443 in the example below).

kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8443:80

You should now be able to access Kubeflow Pipelines at this address: http://localhost:8443/#/pipelines

You can see in the screenshot above the pipelines shipped by default with Kubeflow Pipelines.
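
As a quick check from a second terminal (the port-forward command keeps running in the first one), you can ask the forwarded port for its HTTP status. This is a minimal sketch that assumes curl is installed; kfp_ui_status is a hypothetical helper name.

```shell
# Hypothetical helper: print the HTTP status code returned by the KFP UI.
# $1 = local port used in the port-forward command (defaults to 8443)
kfp_ui_status() {
  curl --silent --output /dev/null --write-out '%{http_code}\n' \
    "http://localhost:${1:-8443}/"
}
```

A status of 200 suggests the dashboard is reachable, while 000 means curl could not connect, which usually indicates that the port forward is not running.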

Step 4: Run a Pipeline

To ensure the system is working properly, we are going to run the pipelines available by default on a new installation. As a reminder, a KFP Pipeline is a Directed Acyclic Graph (DAG) of components that generates outputs from the given inputs and parameters. The screenshot below shows the structure of the [Demo] XGBoost — Iterative model training pipeline.

[Demo] XGBoost — Iterative model training pipeline

I tried all the pipelines available, but some of them had issues that do not seem related to the installation procedure:

  • 🛑 [Demo] XGBoost — Iterative model training: the container that trains the model (Xgboost train) does not include the CMake utility.
  • 🛑 [Demo] TFX — Taxi prediction model training: the pipeline only works on GCP as indicated in the description of the pipeline.

(Optional) What went wrong

This guide was written after many trials and errors. You can find below the issues I encountered during the installation of Kubeflow Pipelines v2.

  • Constant ErrImagePull and ImagePullBackOff during installation: I was not able to install KFP with the docker (moby) engine because of this error. One workaround is to open a shell on the cluster node and pull the images manually:
# open a shell to the node executing your pods
rdctl shell
# install the required container images manually
docker pull ...
  • Slow UI with Rancher Desktop port forwarding: Rancher Desktop provides a GUI to set up port forwarding. However, the KFP UI was not responsive when I followed this method.
  • Solutions not compatible with Apple Silicon: other alternatives to Rancher Desktop like minikube and kind didn’t work on my system (although I did not investigate these issues further).
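
For the first issue in the list above, it helps to know which pods are stuck and which images they need before pulling anything by hand. The sketch below filters the output of kubectl get pods on its STATUS column; both function names are hypothetical, and it assumes the default column layout (NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: print the names of pods in the kubeflow namespace
# whose STATUS column mentions an image pull failure.
pods_with_pull_errors() {
  kubectl get pods --namespace kubeflow --no-headers |
    awk '$3 ~ /ErrImagePull|ImagePullBackOff/ { print $1 }'
}

# Hypothetical helper: print the container images declared by one pod.
# $1 = pod name
pod_images() {
  kubectl get pod "$1" --namespace kubeflow \
    --output jsonpath='{.spec.containers[*].image}'
}

# Example usage (commented out so that sourcing this file runs nothing):
# for pod in $(pods_with_pull_errors); do pod_images "$pod"; echo; done
```

The image names printed this way are the ones to pass to the manual pull command shown in the workaround.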

Conclusions

While Kubeflow Pipelines is a nice system to use out of the box on Cloud platforms, I find its installation cumbersome for local development. I hope this guide will help you set up KFP on your system, and create amazing machine learning models!

Happy Artificial Intelligence ✌️!
