Skip to content

Extended test for registry garbage collector #14509

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

miminar
Copy link

@miminar miminar commented Jun 7, 2017

Waiting for #14585
Resolves #15452

@miminar
Copy link
Author

miminar commented Jun 7, 2017

@dmage FYI

@miminar miminar force-pushed the hard-prune-extended-test branch from b0518d4 to b03fec2 Compare June 7, 2017 15:51
cleanUp.addImage(baseImg4, imageId, "")
//baseImg4Spec := fmt.Sprintf("%s/%s/A:new", registryURL, oc.Namespace())

childImg1, imageId, err := BuildAndPushChildImage(oc, dClient, baseImg1Spec, "A", "latest", 1, outSink, true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/"A"/"C"/? In the next table childImg1 is mentioned in the image stream C.


})

func expectDeletions(deletedPaths []string, namespace string, expect expectedDeletions) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are you going to get these paths?

Copy link
Author

@miminar miminar Jun 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to read them from the standard output of dockerregistry -prune command. Will something like this be available? Or will it be more like:

Deleted blob sha256:XXXX
Deleted layer link sha256:XXXX from repository NAMESPACE/REPO

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can see it there: master...dmage:prune

}

return "", nil
return "", "", nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No result and no error? Looks like there is something wrong.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it could be better structured. Nevertheless, the returned values are not wrong. This line is reached when a push failure is expected and the push really failed - so there's no usable output for the caller. Well, maybe imageID is of some use unless removeAfterPush is true.

@miminar miminar force-pushed the hard-prune-extended-test branch 3 times, most recently from d84e614 to f8563b6 Compare June 16, 2017 15:33
out, err := oc.SetNamespace(metav1.NamespaceDefault).AsAdmin().
Run("exec").Args("--stdin", podName, "--", "/bin/sh", "-s").
InputString(fmt.Sprintf(registryGCLauncherScript, dockerRegistryBinary)).Output()
if exitError, ok := err.(*exutil.ExitError); ok && strings.Contains(exitError.StdErr, "unable to upgrade connection") {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevekuznetsov have you seen this before?

Jun 16 12:27:17.529: INFO: Running 'oc exec --config=/tmp/openshift/core/openshift.local.config/master/admin.kubeconfig --namespace=default --stdin docker-registry-1-z3hj1 -- /bin/sh -s'
Jun 16 12:27:17.771: INFO: Error running &{/data/src/github.com/openshift/origin/_output/local/bin/linux/amd64/oc [oc exec --config=/tmp/openshift/core/openshift.local.config/master/admin.kubeconfig --namespace=default --stdin docker-registry-1-z3hj1 -- /bin/sh -s] []   error: unable to upgrade connection: container not found ("registry")
 error: unable to upgrade connection: container not found ("registry")
 [] <nil> 0xc420d582d0 exit status 1 <nil> <nil> true [0xc420030000 0xc420030040 0xc420030040] [0xc420030000 0xc420030040] [0xc420030020 0xc420030038] [0xde31c0 0xde32c0] 0xc421682240 <nil>}:
error: unable to upgrade connection: container not found ("registry")
failed to execute into registry pod docker-registry-1-z3hj1: exit status 1
Jun 16 12:27:17.822: INFO: Running 'oc exec --config=/tmp/openshift/core/openshift.local.config/master/admin.kubeconfig --namespace=default --stdin docker-registry-1-z3hj1 -- /bin/sh -s'
Jun 16 12:27:18.065: INFO: Error running &{/data/src/github.com/openshift/origin/_output/local/bin/linux/amd64/oc [oc exec --config=/tmp/openshift/core/openshift.local.config/master/admin.kubeconfig --namespace=default --stdin docker-registry-1-z3hj1 -- /bin/sh -s] []   error: unable to upgrade connection: container not found ("registry")
 error: unable to upgrade connection: container not found ("registry")
 [] <nil> 0xc4212150b0 exit status 1 <nil> <nil> true [0xc4202b76d8 0xc4202b7780 0xc4202b7780] [0xc4202b76d8 0xc4202b7780] [0xc4202b76f8 0xc4202b7760] [0xde31c0 0xde32c0] 0xc42168ef00 <nil>}:
error: unable to upgrade connection: container not found ("registry")

Do I use the exec wrong? The interesting part is that this fails roughly in one out of three runs - the first time this code is executed.
If this succeeds once, it will succeed next time as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there isn't a container registry in the docker-registry-1-z3hj1 pod? I haven't seen this before but it may mean the registry failed to start.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevekuznetsov my bad. The command attempts to execute into an old pod just after a redeployment. I just need to make sure to choose the most recent one.

// this shouldn't delete anything
o.Expect(deleted.Len()).To(o.Equal(0))

/* TODO: use a persistent storage for the registry to preserve data across re-deployments
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevekuznetsov Would it be possible to use persistent volume for the registry for the extended tests at least?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the context here? PV for test smells bad -- are there cross-test dependencies?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During a single test, I'd like to:

  1. upload some images
  2. redeploy the registry into a different configuration
  3. check whether the registry is usable with the images uploaded before

In this case the modified configuration is read-only, therefore I cannot upload any other images to test with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable. You'll want to figure out how to get claimable PVs on the CI environment - what OpenShift Ansible options need to be set? Do we need to set up the filesystem in a special way? We've got logic to set up a VG for EmptyDir mounts already on XFS so we've got a place in origin-ci-tool to add more set up in case we need to. OpenShift Ansible inventory changes would land in aos-cd-jobs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointers I'll look into it.

@miminar miminar force-pushed the hard-prune-extended-test branch 4 times, most recently from 2db7dd1 to b0a12a4 Compare June 20, 2017 18:24
@miminar miminar changed the title [WIP] Extended test for registry garbage collector [DO_NOT_MERGE] Extended test for registry garbage collector Jun 20, 2017
@miminar miminar force-pushed the hard-prune-extended-test branch 2 times, most recently from 72d2ac8 to 3ef2fc1 Compare June 26, 2017 15:39
@miminar miminar changed the title [DO_NOT_MERGE] Extended test for registry garbage collector Extended test for registry garbage collector Jun 26, 2017
@miminar
Copy link
Author

miminar commented Jun 26, 2017

@dmage this is ready for a cherry-pick on top of #14585 if you want.

@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 27, 2017
@miminar miminar force-pushed the hard-prune-extended-test branch from 3ef2fc1 to 749bd88 Compare June 27, 2017 08:26
@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 27, 2017
@miminar miminar force-pushed the hard-prune-extended-test branch 2 times, most recently from 4b475f4 to 7d916b7 Compare June 27, 2017 15:00
@miminar
Copy link
Author

miminar commented Jun 27, 2017

Hopefully resolved possible flake caused by #13428.

@miminar
Copy link
Author

miminar commented Jun 28, 2017

@stevekuznetsov is something wrong with selinux policy:

+ make test-extended SUITE=core 'FOCUS=registry|imageapi|ImagePrune|ImageQuota'
test/extended/core.sh --ginkgo.focus="registry|imageapi|ImagePrune|ImageQuota"
[WARNING] No compiled `ginkgo` binary was found. Attempting to build one using:
[WARNING]   $ hack/build-go.sh vendor/github.com/onsi/ginkgo/ginkgo
++ Building go targets for linux/amd64: vendor/github.com/onsi/ginkgo/ginkgo
/data/src/github.com/openshift/origin/hack/build-go.sh took 71 seconds
[WARNING] No compiled `junitmerge` binary was found. Attempting to build one using:
[WARNING]   $ hack/build-go.sh tools/junitmerge
++ Building go targets for linux/amd64: tools/junitmerge
/data/src/github.com/openshift/origin/hack/build-go.sh took 3 seconds
[INFO] [CLEANUP] Cleaning up temporary directories
[INFO] Starting server
chcon: invalid context: /var/lib/openshift/openshift.local.volumes	system_u:object_r:openshift_var_lib_t:s0
[ERROR] PID 6461: test/extended/setup.sh:88: `${sudo} chcon "${label}" ${VOLUME_DIR}` exited with status 1.
[INFO] 		Stack Trace: 
[INFO] 		  1: test/extended/setup.sh:88: `${sudo} chcon "${label}" ${VOLUME_DIR}`
[INFO] 		  2: test/extended/core.sh:8: os::test::extended::setup
[INFO]   Exiting with code 1.
/data/src/github.com/openshift/origin/hack/lib/log/system.sh: line 31:  9141 Terminated              sar -A -o "${binary_logfile}" 1 86400 > /dev/null 2> "${stderr_logfile}"
[INFO] [CLEANUP] Beginning cleanup routines...
[INFO] [CLEANUP] Dumping cluster events to _output/scripts/core/artifacts/events.txt
error: A server URL must be specified
[INFO] [CLEANUP] Dumping etcd contents to _output/scripts/core/artifacts/etcd

?

@stevekuznetsov
Copy link
Contributor

@miminar #14937

@eparis
Copy link
Member

eparis commented Jun 28, 2017

[test]

@miminar miminar force-pushed the hard-prune-extended-test branch 2 times, most recently from 8a50257 to be2cc4d Compare June 29, 2017 13:24
Oleg Bulatov and others added 2 commits July 3, 2017 12:10
@miminar miminar force-pushed the hard-prune-extended-test branch from be2cc4d to b1c1518 Compare July 3, 2017 10:10
@openshift-bot
Copy link
Contributor

Evaluated for origin testextended up to b1c1518

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/testextended SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended/809/) (Base Commit: fc34104) (PR Branch Commit: b1c1518) (Extended Tests: core(registry|imageapi|ImagePrune|ImageQuota))

@miminar
Copy link
Author

miminar commented Jul 3, 2017

Flake #15015 re-[test]

@miminar
Copy link
Author

miminar commented Jul 3, 2017

Flake #12072. re-[test]

@openshift-bot
Copy link
Contributor

Evaluated for origin test up to b1c1518

@openshift openshift deleted a comment from openshift-bot Jul 3, 2017
@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/2937/) (Base Commit: 9c5e7da) (PR Branch Commit: b1c1518)

@openshift openshift deleted a comment from eparis Jul 4, 2017
@openshift-merge-robot openshift-merge-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jul 24, 2017
@openshift-merge-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: miminar
We suggest the following additional approver: mfojtik

Assign the PR to them by writing /assign @mfojtik in a comment when ready.

Associated issue: 14585

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 28, 2017
@openshift-merge-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: miminar
We suggest the following additional approver: mfojtik

Assign the PR to them by writing /assign @mfojtik in a comment when ready.

Associated issue: 14585

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 28, 2017
@openshift-merge-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: miminar
We suggest the following additional approver: mfojtik

Assign the PR to them by writing /assign @mfojtik in a comment when ready.

Associated issue: 14585

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-merge-robot
Copy link
Contributor

@miminar PR needs rebase

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 1, 2017
@miminar
Copy link
Author

miminar commented Aug 1, 2017

Closing in favor of #14585 which now embeds the extended test.

@miminar miminar closed this Aug 1, 2017
@miminar miminar deleted the hard-prune-extended-test branch October 10, 2017 09:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants