Kubeflow is an open-source platform for machine learning and MLOps on Kubernetes introduced by Google. The different stages in a typical MLOps are represented with different software components in Kubeflow, including model development (Kubeflow Notebooks), model training (Kubeflow Pipelines, Kubeflow Training Operator), model serving (KServe), and automated machine learning (Katib).
Each component of Kubeflow can be deployed separately, and it is not a requirement to deploy every component.
History
The Kubeflow project was first announced at
KubeCon + CloudNativeCon North America 2017 by
Google engineers David Aronchick, Jeremy Lewi, and Vishnu Kannan
to address a perceived lack of flexible options for building production-ready machine learning systems.
The project has also stated it began as a way for Google to open-source how they ran
TensorFlow internally.
The first release of Kubeflow (Kubeflow 0.1) was announced at KubeCon + CloudNativeCon Europe 2018. Kubeflow 1.0 was released in March 2020 via a public blog post announcing that many Kubeflow components were graduating to a "stable status", indicating they were now ready for production usage.
In October 2022, Google announced that the Kubeflow project had applied to join the Cloud Native Computing Foundation. In July 2023, the foundation voted to accept Kubeflow as an incubating stage project.
Components
Kubeflow Notebooks for model development
Machine learning models are developed in the notebooks component called
Kubeflow Notebooks. The component runs web-based development environments inside a Kubernetes cluster, with native support for
Jupyter Notebook, Visual Studio Code, and
RStudio.
Kubeflow Pipelines for model training
Once developed, models are trained in the
Kubeflow Pipelines component. The component acts as a platform for building and deploying portable, scalable machine learning workflows based on Docker containers.
Google Cloud Platform has adopted the
Kubeflow Pipelines DSL within its
Vertex AI Pipelines product.
Kubeflow Training Operator for model training
For certain machine learning models and libraries, the
Kubeflow Training Operator component provides Kubernetes custom resources support. The component runs distributed or non-distributed
TensorFlow,
PyTorch,
Apache MXNet,
XGBoost, and
Open MPI training jobs on Kubernetes.
KServe for model serving
The
KServe component (previously named KFServing
) provides Kubernetes custom resources for serving machine learning models on arbitrary frameworks including
TensorFlow,
XGBoost,
scikit-learn,
PyTorch, and ONNX.
KServe was developed collaboratively by
Google,
IBM, Bloomberg,
NVIDIA, and Seldon.
Publicly disclosed adopters of KServe include Bloomberg,
Gojek,
the Wikimedia Foundation,
and others.
Katib for automated machine learning
Lastly, Kubeflow includes a component for automated training and development of machine learning models, the
Katib component. It is described as a Kubernetes-native project and features hyperparameter tuning,
early stopping, and neural architecture search.
Release timeline
+ Release timeline |
|
Kubeflow 0.1 | 5 April, 2018 | - | https://kubernetes.io/blog/2018/05/04/announcing-kubeflow-0.1/ |
Kubeflow 0.2 | 2 July, 2018 | - | https://medium.com/kubeflow/kubeflow-0-2-offers-new-components-and-simplified-setup-735e4c56988d |
Kubeflow 0.3 | 5 October, 2018 | - | https://medium.com/kubeflow/kubeflow-0-3-simplifies-setup-improves-ml-development-98b8ca10bd69 |
Kubeflow 0.4 | 8 January, 2019 | - | https://medium.com/kubeflow/kubeflow-0-4-release-enhancements-for-machine-learning-productivity-d77c54df07a9 |
Kubeflow 0.5 | 9 April, 2019 | - | https://medium.com/kubeflow/kubeflow-v0-5-simplifies-model-development-with-enhanced-ui-and-fairing-library-78e19cdc9f50 |
Kubeflow 0.6 | 19 July, 2019 | https://www.kubeflow.org/docs/releases/kubeflow-0.6/ | https://medium.com/kubeflow/kubeflow-v0-6-a-robust-foundation-for-artifact-tracking-data-versioning-multi-user-support-9896d329412c |
Kubeflow 0.7 | 17 October, 2019 | https://www.kubeflow.org/docs/releases/kubeflow-0.7/ | https://medium.com/kubeflow/kubeflow-v0-7-delivers-beta-functionality-in-the-leadup-to-v1-0-1e63036c07b8 |
Kubeflow 1.0 | 20 February, 2020 | https://www.kubeflow.org/docs/releases/kubeflow-1.0/ | https://blog.kubeflow.org/releases/2020/03/02/kubeflow-1-0-cloud-native-ml-for-everyone |
Kubeflow 1.1 | 31 July, 2020 | https://www.kubeflow.org/docs/releases/kubeflow-1.1/ | https://blog.kubeflow.org/release/official/2020/07/31/kubeflow-1.1-blog-post |
Kubeflow 1.2 | 18 November, 2020 | https://www.kubeflow.org/docs/releases/kubeflow-1.2/ | https://blog.kubeflow.org/release/official/2020/11/18/kubeflow-1.2-blog-post |
Kubeflow 1.3 | 23 April, 2021 | https://www.kubeflow.org/docs/releases/kubeflow-1.3/ | https://blog.kubeflow.org/kubeflow-1.3-release/ |
Kubeflow 1.4 | 12 October, 2021 | https://www.kubeflow.org/docs/releases/kubeflow-1.4/ | https://blog.kubeflow.org/kubeflow-1.4-release/ |
Kubeflow 1.5 | 10 March, 2022 | https://www.kubeflow.org/docs/releases/kubeflow-1.5/ | https://blog.kubeflow.org/kubeflow-1.5-release/ |
Kubeflow 1.6 | 7 September, 2022 | https://www.kubeflow.org/docs/releases/kubeflow-1.6/ | https://blog.kubeflow.org/kubeflow-1.6-release/ |
Kubeflow 1.7 | 29 March, 2023 | https://www.kubeflow.org/docs/releases/kubeflow-1.7/ | https://blog.kubeflow.org/kubeflow-1.7-release/ |
Kubeflow 1.8 | 1 November, 2023 | https://www.kubeflow.org/docs/releases/kubeflow-1.8/ | https://blog.kubeflow.org/kubeflow-1.8-release/ |
Kubeflow 1.9 | 22 July, 2024 | https://www.kubeflow.org/docs/releases/kubeflow-1.9/ | https://blog.kubeflow.org/kubeflow-1.9-release/ |
Kubeflow 1.10 | 1 April, 2025 | https://www.kubeflow.org/docs/releases/kubeflow-1.10/ | https://blog.kubeflow.org/kubeflow-1.10-release/ |
Notes
External links