Add more AI/ML Training Examples #2040

Open
@andreyvelich

As we discussed previously in #2021 (comment), we want to add more AI/ML examples to the Kubeflow Training Operator. Right now, most of our examples train a very basic CNN on MNIST. Since the Training Operator is capable of training large-scale ML models, we would like to contribute more AI/ML use cases.

We can make these examples data-scientist friendly by reusing our Python SDK within Jupyter Notebooks to simplify job submission.
I like the example structure of HF Transformers, so I propose the following path: examples/<framework>/<ml-use-case>
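
To illustrate the notebook-based submission idea, here is a minimal sketch of what a cell in such an example might look like. It assumes a recent `kubeflow-training` SDK release in which `TrainingClient.create_job` accepts `train_func` and `num_workers` (older releases used different parameter names), and it needs access to a cluster with the Training Operator installed. The job name and the body of `train_func` are placeholders, not part of any existing example.

```python
def train_func():
    # Imports live inside the function on the assumption that the SDK ships
    # only this function's source code to the training workers.
    import os

    # RANK and WORLD_SIZE are the usual torch.distributed environment
    # variables; the defaults let the function run locally as well.
    rank = int(os.environ.get("RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))
    print(f"worker {rank} of {world_size} starting")
    # A real example would build the model and run the training loop here.
    return rank, world_size


def submit(name="example-job", num_workers=2):
    # Imported lazily so this cell can be defined without the SDK installed.
    # Assumption: `pip install kubeflow-training` and a reachable cluster.
    from kubeflow.training import TrainingClient

    client = TrainingClient()
    client.create_job(
        name=name,                # hypothetical job name
        job_kind="PyTorchJob",
        train_func=train_func,
        num_workers=num_workers,  # assumed parameter name, see note above
    )
    return client
```

Calling `train_func()` locally is a quick way to check the training code before calling `submit()` against a cluster.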

We can start with these examples (feel free to add more ML use-cases in this issue):

  • Language Modeling
  • Image Classification
  • Text Classification
  • Audio Classification
  • Question Answering
  • Speech Recognition
  • Text Generation
  • FSDP Example with PyTorch
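
As a rough sketch of how the proposed `examples/<framework>/<ml-use-case>` layout could map onto the use cases listed above, the snippet below builds one directory path per use case. The lowercase, hyphenated naming and the `pytorch` framework directory are assumptions for illustration; the issue does not fix a naming scheme.

```python
from pathlib import PurePosixPath

# Use cases taken from the list above. The FSDP item is left out because it
# already names its framework.
USE_CASES = [
    "Language Modeling",
    "Image Classification",
    "Text Classification",
    "Audio Classification",
    "Question Answering",
    "Speech Recognition",
    "Text Generation",
]


def example_path(framework, use_case):
    """Build examples/<framework>/<ml-use-case> with a hyphenated slug."""
    slug = use_case.lower().replace(" ", "-")
    return PurePosixPath("examples") / framework.lower() / slug


for use_case in USE_CASES:
    print(example_path("pytorch", use_case))
```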

We should investigate how to configure our CI/CD to make sure that these examples are functional.
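
One possible starting point for such a check, sketched below, is a cheap smoke test that only verifies that every notebook under `examples/` is valid JSON and that its code cells parse as Python. This is an assumption about what a first CI step could look like, not an existing workflow; it does not execute the notebooks or submit any jobs, which would require a cluster in CI. The function names are hypothetical.

```python
import ast
import json
from pathlib import Path


def check_notebook(path):
    """Return error strings for code cells in one notebook that fail to parse."""
    errors = []
    notebook = json.loads(Path(path).read_text(encoding="utf-8"))
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        # Cell magics such as %%bash turn the whole cell into non-Python.
        if source.lstrip().startswith("%%"):
            continue
        # Drop line magics and shell escapes, which are not valid Python.
        lines = [
            line for line in source.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        try:
            ast.parse("\n".join(lines))
        except SyntaxError as exc:
            errors.append(f"{path}: cell {index}: {exc.msg}")
    return errors


def check_examples(root="examples"):
    """Check every *.ipynb file below root and collect all errors."""
    errors = []
    for notebook_path in sorted(Path(root).rglob("*.ipynb")):
        errors.extend(check_notebook(notebook_path))
    return errors
```

A CI job could call `check_examples()` and fail the build when the returned list is non-empty. Running the examples end to end would be a separate, heavier step.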

cc @kuizhiqing @johnugeorge @tenzen-y @kubeflow/wg-training-leads

/help
/good-first-issue
/area example
