Flaky e2e tests #1779

While running the [e2e integration tests](https://github.com/kubeflow/training-operator/tree/master/sdk/python/test/e2e) on my local machine, I hit a few scenarios in which these tests fail intermittently.

* Sometimes the tests fail on the following check:
```
  conditions = client.get_job_conditions(name, namespace, job_kind)
  if len(conditions) != 1:
      raise Exception(f"{job_kind} conditions are invalid: {conditions}")
```
https://github.com/kubeflow/training-operator/blob/master/sdk/python/test/e2e/utils.py#L22

I think the cause is a timing issue: if the container created by a test starts running almost immediately, the job already has two conditions by the time this check runs, so the test fails.
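One possible way to make this check tolerant of a fast-starting container is to look for the expected condition by type instead of asserting an exact count. The snippet below is only a sketch under assumptions: `Condition` is a hypothetical stand-in for whatever `client.get_job_conditions` returns, and the `"Created"`/`"Running"` type names and the `"True"` status string are assumed rather than taken from this issue.

```python
from dataclasses import dataclass


@dataclass
class Condition:
    """Hypothetical stand-in for a job condition (assumed fields: type, status)."""
    type: str
    status: str


def has_condition(conditions, expected_type):
    """Return True if a condition of the given type is present with status "True"."""
    return any(c.type == expected_type and c.status == "True" for c in conditions)


def verify_job_created(conditions, job_kind):
    # Require "Created" but do not fail if other conditions are already present.
    if not has_condition(conditions, "Created"):
        raise Exception(f"{job_kind} conditions are invalid: {conditions}")


# Slow start: only the Created condition is present.
verify_job_created([Condition("Created", "True")], "TFJob")

# Fast start: the container is already running, so there are two conditions.
verify_job_created(
    [Condition("Created", "True"), Condition("Running", "True")], "TFJob"
)
```

With a check like this, both orderings pass, and a job with no `Created` condition at all still raises.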

* Sometimes the tests fail on the following check:
```
  conditions = client.get_job_conditions(name, namespace, job_kind)
  if len(conditions) != 3:
      raise Exception(f"{job_kind} conditions are invalid: {conditions}")
```
https://github.com/kubeflow/training-operator/blob/master/sdk/python/test/e2e/utils.py#L40-L42

In these failures, I found that the Running condition is missing from the training job's conditions.
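If a missing Running condition should be acceptable for a job that has already finished, the completion check could validate the set of condition types instead of their count. This is a rough sketch, not the project's actual fix: the function takes plain type strings (for example, extracted from the objects returned by `client.get_job_conditions`), and the split into required and optional types is an assumption.

```python
def verify_job_succeeded(condition_types, job_kind):
    """Validate a finished job by condition types rather than by their count."""
    required = {"Created", "Succeeded"}  # assumed to always be present on success
    optional = {"Running"}               # assumed to be skippable for very short jobs

    seen = set(condition_types)
    missing = required - seen
    unexpected = seen - required - optional

    if missing or unexpected:
        raise Exception(
            f"{job_kind} conditions are invalid: {sorted(seen)} "
            f"(missing: {sorted(missing)}, unexpected: {sorted(unexpected)})"
        )


# The usual case: all three conditions are present.
verify_job_succeeded(["Created", "Running", "Succeeded"], "TFJob")

# The flaky case from this issue: Running is missing.
verify_job_succeeded(["Created", "Succeeded"], "TFJob")
```

A job that reports `Failed`, or that never reaches `Succeeded`, still raises, so the check stays strict about the outcome while ignoring the timing-dependent condition.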
