Closed
Description
In some customer cases, users want to schedule one set of PS replicas on GPU machines and place the other PS replicas on CPU machines, like this:
tfReplicaSpecs:
  PS-1:
    replicas: 3
    template:
      spec:
        affinity:
          podAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: gpu-type
                      operator: In
                      values:
                        - "true"  # label values must be strings, so quote them
                topologyKey: topology.kubernetes.io/zone
        containers:
          - name: tensorflow
            image: xxx
  PS-2:
    replicas: 5
    template:
      spec:
        affinity:
          podAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchExpressions:
                    - key: gpu-type
                      operator: In
                      values:
                        - "false"
                topologyKey: topology.kubernetes.io/zone
        containers:
          - name: tensorflow
            image: xxx
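Note that `podAffinity` matches labels on other pods, not on nodes. If `gpu-type` is meant to be a node label (an assumption; the spec above does not say where the label lives), a node-level rule may be closer to the intent. A minimal sketch of one replica group using `nodeAffinity`, with the same placeholder image and the replica name `PS-1` taken from the example above:

```yaml
# Sketch only: assumes GPU nodes carry a hypothetical label gpu-type="true".
tfReplicaSpecs:
  PS-1:
    replicas: 3
    template:
      spec:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: gpu-type
                      operator: In
                      values:
                        - "true"
        containers:
          - name: tensorflow
            image: xxx
```

The CPU group would use the same shape with `"false"` as the value, or `operator: NotIn` with `"true"` if CPU nodes are simply unlabeled.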
/cc @zw0610