[feature] Support different PS/worker types

In some customer cases, users want to schedule one PS on each GPU machine and place the other PSes on CPU machines. This needs separate PS groups with different scheduling constraints, like this:

```yaml
  tfReplicaSpecs:
    # Proposed: several PS groups, each with its own scheduling constraints.
    PS-1:
      replicas: 3
      template:
        spec:
          affinity:
            podAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: gpu-type
                        operator: In
                        values:
                          - "true"   # label values must be strings
                  topologyKey: topology.kubernetes.io/zone
          containers:
            - name: tensorflow
              image: xxx
    PS-2:
      replicas: 5
      template:
        spec:
          affinity:
            podAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: gpu-type
                        operator: In
                        values:
                          - "false"  # label values must be strings
                  topologyKey: topology.kubernetes.io/zone
          containers:
            - name: tensorflow
              image: xxx
```
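
For reference, if `gpu-type` is a label on the nodes rather than on pods (an assumption; nothing above says which), the same per-group constraint could probably be written with `nodeAffinity` instead of `podAffinity`. A minimal sketch of the pod template `spec` for the GPU group only, reusing the label key and placeholder image from the example above:

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: gpu-type        # assumed to be a node label
                operator: In
                values:
                  - "true"           # label values must be strings
  containers:
    - name: tensorflow
      image: xxx
```

The CPU group would use the same block with `"false"` as the value.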

/cc @zw0610 

[feature] Support different PS/worker types #1369
