[perf] MCAD takes a very long time to delete a large number of AppWrappers #477

Open
kpouget opened this issue Jul 13, 2023 · 1 comment

kpouget commented Jul 13, 2023

As part of the MCAD load test, I created 1000 AppWrappers that do not fit in the cluster (they request a large amount of CPU).
Once all of these AppWrappers are in one of these states: [Queueing, HeadOfLine, Pending, Failed], the main test ends, and the cleanup starts.

All the AppWrappers are then deleted with oc delete AppWrappers --all -n <namespace>.
The timing of this call is shown in blue in the figure below.

Once this call returns, I create a canary AppWrapper, and wait for it to be executed.
This step is shown in red in the figure below.
The Ansible logs of this command confirm that most of the 23 minutes is spent before the .status.controllerfirsttimestamp even gets filled.

image
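
For anyone who wants to reproduce the two timings outside of the Ansible test harness, here is a minimal Python sketch of the measurement described above. It is not the code used in the load test. The function names, the polling interval, and the timeout are hypothetical. It only assumes a logged-in oc CLI, the delete command quoted above, and that the canary is considered picked up once .status.controllerfirsttimestamp is non-empty.

```python
import subprocess
import time
from typing import Callable, Sequence

# A runner takes the oc arguments and returns the command's stdout.
Runner = Callable[[Sequence[str]], str]


def oc(args: Sequence[str]) -> str:
    """Run an oc command and return its stdout (needs a logged-in oc CLI)."""
    result = subprocess.run(
        ["oc", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def timed_delete_all(namespace: str, run: Runner = oc,
                     clock: Callable[[], float] = time.monotonic) -> float:
    """Return how many seconds the blocking bulk delete takes."""
    start = clock()
    run(["delete", "AppWrappers", "--all", "-n", namespace])
    return clock() - start


def wait_for_controller_timestamp(
    name: str,
    namespace: str,
    run: Runner = oc,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
    poll_seconds: float = 5.0,       # hypothetical polling interval
    timeout_seconds: float = 3600.0,  # hypothetical upper bound
) -> float:
    """Poll the canary until .status.controllerfirsttimestamp is filled.

    Returns the number of seconds waited, or raises TimeoutError.
    """
    start = clock()
    while clock() - start < timeout_seconds:
        value = run([
            "get", "AppWrappers", name, "-n", namespace,
            "-o", "jsonpath={.status.controllerfirsttimestamp}",
        ]).strip()
        if value:  # an empty string means the field is not set yet
            return clock() - start
        sleep(poll_seconds)
    raise TimeoutError(f"{name}: controllerfirsttimestamp never set")
```

The run, clock, and sleep parameters are injectable so that the logic can be exercised without a cluster. With the defaults, timed_delete_all("my-namespace") gives the blue duration, and wait_for_controller_timestamp("canary", "my-namespace"), called after the canary has been created, gives the time spent before the timestamp is set.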

All the details of the scale test are at this address (files here). Note that there was a typo in the code (the wrong file was read during the visualizer parsing), which makes the cleanup phase appear to be 5 minutes long (this was the test length :D).

This other plot (from this test) confirms that none of the 1000 AppWrappers created in the first 5 minutes of the test are discovered in the first 25 minutes of the test:
image

Member

asm582 commented Sep 7, 2023

Update: with the recent PR merges, we believe this issue has been fixed. Re-running the experiment would confirm it.
