
Flaky e2e tests #1779

Closed
@nagar-ajay

Description


While running the e2e integration tests locally, I hit a few scenarios in which the tests fail intermittently.

  • Sometimes a test fails on the following check, which expects exactly one condition:
  conditions = client.get_job_conditions(name, namespace, job_kind)
  if len(conditions) != 1:
      raise Exception(f"{job_kind} conditions are invalid: {conditions}")

https://github.com/kubeflow/training-operator/blob/master/sdk/python/test/e2e/utils.py#L22

I think the cause is a race: if the container created by the test starts running almost immediately, the job already has two conditions by the time this check runs, so the test fails.
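One possible way to make this check tolerant of the race is to test for the presence of the expected condition instead of an exact count. The sketch below is only an illustration: `Condition` and `verify_job_created` are hypothetical names, and it assumes each condition exposes `type` and `status` fields and that the two conditions in the failing case are `Created` and `Running`.

```python
from dataclasses import dataclass


@dataclass
class Condition:
    # Hypothetical stand-in for a job condition; only the fields used here.
    type: str
    status: str


def verify_job_created(conditions, job_kind):
    """Pass if a true Created condition is present, however many others exist."""
    true_types = {c.type for c in conditions if c.status == "True"}
    if "Created" not in true_types:
        raise Exception(f"{job_kind} conditions are invalid: {conditions}")


# A job whose container started immediately may already report two conditions.
verify_job_created(
    [Condition("Created", "True"), Condition("Running", "True")],
    "PyTorchJob",
)
```

With this shape, a job that has only `Created` and a job that has already moved on to `Running` both pass, while a job with no `Created` condition still fails.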

  • Sometimes a test fails on the following check, which expects exactly three conditions:
  conditions = client.get_job_conditions(name, namespace, job_kind)
  if len(conditions) != 3:
      raise Exception(f"{job_kind} conditions are invalid: {conditions}")

https://github.com/kubeflow/training-operator/blob/master/sdk/python/test/e2e/utils.py#L40-L42

In these failures, I found that the Running condition is missing from the training job's conditions.
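A similar idea could apply to the final check: require the conditions that must be there and treat Running as optional. This is a rough sketch under assumptions, not the actual test code: `verify_job_finished` and `REQUIRED_TYPES` are hypothetical, and the condition type names (`Created`, `Succeeded`, `Failed`) and the `type`/`status` fields are assumed.

```python
from types import SimpleNamespace

# Assumed condition types that a successfully finished job must report as true.
REQUIRED_TYPES = {"Created", "Succeeded"}


def verify_job_finished(conditions, job_kind):
    """Require Created and Succeeded, reject Failed, and ignore whether Running was recorded."""
    seen = {c.type for c in conditions if c.status == "True"}
    missing = REQUIRED_TYPES - seen
    if missing or "Failed" in seen:
        raise Exception(f"{job_kind} conditions are invalid: {conditions}")


# Hypothetical example: a job that finished so quickly that Running was never recorded.
fast_job = [
    SimpleNamespace(type="Created", status="True"),
    SimpleNamespace(type="Succeeded", status="True"),
]
verify_job_finished(fast_job, "TFJob")
```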
