This directory contains code for the components that comprise the Kubeflow Pipelines backend.
To run all unit tests for the backend:
go test -v -cover ./backend/...
The API server itself can be built using:
go build -o /tmp/apiserver backend/src/apiserver/*.go
The backend codebase follows Google's Go Style Guide. Please take the time to familiarize yourself with its best practices. The guide is not intended to be exhaustive, but it helps minimize guesswork among developers and keeps the codebase uniform and consistent.
We use the golangci-lint tool to catch common mistakes locally (see the detailed configuration here). It integrates conveniently with popular IDEs such as VS Code and Vim.
Finally, it is advised to install pre-commit in order to automate linter checks (see the configuration here).
The API server image can be built from the root folder of the repo using:
export API_SERVER_IMAGE=api_server
docker build -f backend/Dockerfile . --tag $API_SERVER_IMAGE
Run
kubectl edit deployment.v1.apps/ml-pipeline -n kubeflow
You'll see a field referencing the API server Docker image. Change it to point to your own build; after you save and close the file, the API server will restart with your change.
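If you prefer a non-interactive route, the image can also be swapped with kubectl set image. A minimal sketch, assuming the container inside the ml-pipeline Deployment is named ml-pipeline-api-server (verify the actual container name in your Deployment before relying on it):

```shell
# Sketch: point the ml-pipeline Deployment at a locally built image.
# Assumption: the container is named ml-pipeline-api-server; check with
#   kubectl -n kubeflow get deployment ml-pipeline \
#     -o jsonpath='{.spec.template.spec.containers[*].name}'
use_local_api_server_image() {
  kubectl -n kubeflow set image deployment/ml-pipeline \
    "ml-pipeline-api-server=$1"
  # Wait for the rollout so you know the new image is actually running.
  kubectl -n kubeflow rollout status deployment/ml-pipeline
}

# Usage: use_local_api_server_image "$API_SERVER_IMAGE"
```

This avoids an interactive editor session, which is handy when scripting repeated rebuild-and-redeploy loops.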
After making changes to proto files, the Go client libraries, Python client libraries, and Swagger files need to be regenerated and checked in. Refer to backend/api for details.
- Install the go-licenses tool and refer to its documentation for how to use it.
- Run the tool to update all licenses:
make all
pip-tools is used to manage Python dependencies. To update dependencies, edit requirements.in and run ./update_requirements.sh to update and pin the transitive dependencies.
Run
docker build . -f backend/Dockerfile.conformance -t <tag>
This deploys a local Kubernetes cluster leveraging kind, with all the components required to run the Kubeflow Pipelines API server. Note that the ml-pipeline Deployment (API server) has its replicas set to 0 so that the API server can be run locally for debugging and faster development. The local API server is available to pods on the cluster using the ml-pipeline Service.
- The kind CLI is installed.
- The following ports are available on your localhost: 3000, 3306, 8080, 9000, and 8889. If these are unavailable, modify kind-config.yaml and configure the API server with alternative ports when running locally.
- If using a Mac, you will need to modify the Endpoints manifest to leverage the bridge network interface through Docker/Podman Desktop. See kind #1200 for an example manifest.
- Optional: VS Code is installed to leverage a sample launch.json file.
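To check the port prerequisite up front, here is a small sketch (it assumes a bash-style shell where the /dev/tcp virtual device is available; on shells without it, the check simply reports every port as free):

```shell
# Return success when something is already listening on the given localhost port.
# Assumption: bash's /dev/tcp redirection is available.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Ports the kind cluster and local API server expect to be free.
for p in 3000 3306 8080 9000 8889; do
  if port_in_use "$p"; then
    echo "port $p is already in use"
  fi
done
```

Any port reported as in use should be freed, or remapped in kind-config.yaml as described above.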
To provision the kind cluster, run the following from the Git repository's root directory:
make -C backend dev-kind-cluster
This may take several minutes since there are many pods. Note that many pods will be in "CrashLoopBackOff" status until all the pods have started.
Run the following to delete the cluster:
kind delete clusters dev-pipelines-api
After the cluster is provisioned, you may leverage the following sample .vscode/launch.json file to run the API server locally:
{
"version": "0.2.0",
"configurations": [
{
"name": "Launch API Server (Kind)",
"type": "go",
"request": "launch",
"mode": "debug",
"program": "${workspaceFolder}/backend/src/apiserver",
"env": {
"POD_NAMESPACE": "kubeflow",
"DBCONFIG_MYSQLCONFIG_HOST": "localhost",
"MINIO_SERVICE_SERVICE_HOST": "localhost",
"MINIO_SERVICE_SERVICE_PORT": "9000",
"METADATA_GRPC_SERVICE_SERVICE_HOST": "localhost",
"METADATA_GRPC_SERVICE_SERVICE_PORT": "8080",
"ML_PIPELINE_VISUALIZATIONSERVER_SERVICE_HOST": "localhost",
"ML_PIPELINE_VISUALIZATIONSERVER_SERVICE_PORT": "8888"
},
"args": [
"--config",
"${workspaceFolder}/backend/src/apiserver/config",
"-logtostderr=true"
]
}
]
}
Once the cluster is provisioned and the API server is running, you can access the API server at http://localhost:8888 (e.g. http://localhost:8888/apis/v2beta1/pipelines).
You can also access the Kubeflow Pipelines web interface at http://localhost:3000.
You can also directly connect to the MariaDB database server with:
mysql -h 127.0.0.1 -u root
These instructions assume you are leveraging the Kind cluster in the Run Locally With a Kind Cluster section.
Run the following to create the backend/Dockerfile.driver-debug file and build the container image tagged as kfp-driver:debug. This container image is based on backend/Dockerfile.driver but installs Delve, builds the binary without compiler optimizations so that the binary matches the source code (via GCFLAGS="all=-N -l"), and copies the source code into the container for the debugger. Any changes to the Driver code will require rebuilding this container image.
make -C backend image_driver_debug
Then load the container image in the Kind cluster.
make -C backend kind-load-driver-debug
Alternatively, you can use this Make target that does both.
make -C backend kind-build-and-load-driver-debug
You may use the following VS Code launch.json file to run the API server, which overrides the Driver command to use Delve and the Driver image to use the debug image built previously.
{
"version": "0.2.0",
"configurations": [
{
"name": "Launch API server (Kind) (Debug Driver)",
"type": "go",
"request": "launch",
"mode": "debug",
"program": "${workspaceFolder}/backend/src/apiserver",
"env": {
"POD_NAMESPACE": "kubeflow",
"DBCONFIG_MYSQLCONFIG_HOST": "localhost",
"MINIO_SERVICE_SERVICE_HOST": "localhost",
"MINIO_SERVICE_SERVICE_PORT": "9000",
"METADATA_GRPC_SERVICE_SERVICE_HOST": "localhost",
"METADATA_GRPC_SERVICE_SERVICE_PORT": "8080",
"ML_PIPELINE_VISUALIZATIONSERVER_SERVICE_HOST": "localhost",
"ML_PIPELINE_VISUALIZATIONSERVER_SERVICE_PORT": "8888",
"V2_DRIVER_IMAGE": "kfp-driver:debug",
"V2_DRIVER_COMMAND": "dlv exec --listen=:2345 --headless=true --api-version=2 --log /bin/driver --"
}
}
]
}
Start by launching a pipeline. This will eventually create a Driver pod that is waiting for a remote debug connection.
You can see the pods with the following command.
kubectl -n kubeflow get pods -w
Once you see a pod with -driver in the name, such as hello-world-clph9-system-dag-driver-10974850, port forward the Delve port in the pod to your localhost (replace <driver pod name> with the actual name).
kubectl -n kubeflow port-forward <driver pod name> 2345:2345
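Copying pod names by hand gets tedious when a pipeline spawns several Driver pods. Here is a small helper sketch (it assumes Driver pods keep the -driver naming convention shown above):

```shell
# Sketch: print the name of the most recently created Pod whose name
# contains "-driver". Assumption: Driver pods follow the naming above
# (e.g. hello-world-clph9-system-dag-driver-10974850).
newest_driver_pod() {
  kubectl -n kubeflow get pods \
    --sort-by=.metadata.creationTimestamp -o name \
    | grep -- '-driver' | tail -n 1 | cut -d/ -f2
}

# Usage: kubectl -n kubeflow port-forward "$(newest_driver_pod)" 2345:2345
```

This pairs well with the repeated port-forward cycle described below, since each new Driver pod becomes the newest match.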
Set a breakpoint on the Driver code in VS Code. Then remotely connect to the Delve debug session with the following VS Code launch.json file:
{
"version": "0.2.0",
"configurations": [
{
"name": "Connect to remote driver",
"type": "go",
"request": "attach",
"mode": "remote",
"remotePath": "/go/src/github.com/kubeflow/pipelines",
"port": 2345,
"host": "127.0.0.1"
}
]
}
Once the Driver pod succeeds, the remote debug session will close. Then repeat the process of forwarding the port of subsequent Driver pods and starting remote debug sessions in VS Code until the pipeline completes.
To debug a specific Driver pod, you'll need to repeatedly port forward and connect to the remote debug session without a breakpoint set, so that Delve continues execution until the Driver pod you are interested in starts up. At that point, you can set a breakpoint, port forward, and connect to the remote debug session to debug that specific Driver pod.