
Cluster provision fails - registry deployment #14597


Closed
bparees opened this issue Jun 12, 2017 · 15 comments
Assignees
Labels
component/install kind/test-flake Categorizes issue or PR as related to test flakes. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. priority/P1 sig/cluster-lifecycle

Comments

@bparees
Contributor

bparees commented Jun 12, 2017

TASK [openshift-registry : Deploy latest configuration of registry DC] *********
Monday 12 June 2017  11:59:39 +0000 (0:00:00.577)       0:18:41.437 *********** 
fatal: [ci-prtest-5a37c28-3006-ig-m-5g51]: FAILED! => {"changed": true, "cmd": ["oc", "deploy", "docker-registry", "--latest"], "delta": "0:00:00.271225", "end": "2017-06-12 07:59:39.671711", "failed": true, "rc": 1, "start": "2017-06-12 07:59:39.400486", "stderr": "Flag --latest has been deprecated, use 'oc rollout latest' instead\nerror: #1 is already in progress (Running).\nOptionally, you can cancel this deployment using 'oc rollout cancel dc/docker-registry'.", "stderr_lines": ["Flag --latest has been deprecated, use 'oc rollout latest' instead", "error: #1 is already in progress (Running).", "Optionally, you can cancel this deployment using 'oc rollout cancel dc/docker-registry'."], "stdout": "", "stdout_lines": []}
Failure summary:

  1. Host:     ci-prtest-5a37c28-3006-ig-m-5g51
     Play:     primary_master
     Task:     openshift-registry : Deploy latest configuration of registry DC
     Message:  ???

https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin_extended_conformance_gce/3006/consoleFull
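For triage, the relevant part of the failure is the second stderr line, not the deprecation warning. A minimal sketch, assuming the task result is available as JSON with the same keys as in the log above (the helper name `rollout_in_progress` and the regular expression are illustrative, not part of any existing tooling):

```python
import json
import re

# Pattern derived from the stderr in the log above:
#   "error: #1 is already in progress (Running)."
IN_PROGRESS = re.compile(r"#(\d+) is already in progress \((\w+)\)")


def rollout_in_progress(task_result_json):
    """Return (deployment_number, phase) if the task failed because a
    rollout was already running, otherwise None."""
    result = json.loads(task_result_json)
    match = IN_PROGRESS.search(result.get("stderr", ""))
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


# Trimmed copy of the failed task result shown above.
sample = json.dumps({
    "cmd": ["oc", "deploy", "docker-registry", "--latest"],
    "rc": 1,
    "stderr": "Flag --latest has been deprecated, use 'oc rollout latest' instead\n"
              "error: #1 is already in progress (Running).",
})
print(rollout_in_progress(sample))
```

A result of `None` would mean the task failed for some other reason and needs a closer look.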

@stevekuznetsov
Contributor

Could this be #13995?

/cc @sdodson

@0xmichalis
Contributor

> Could this be #13995?

Or maybe #14589

@0xmichalis
Contributor

> Or maybe #14589

Sorry, wrong issue. The present issue seems similar to #14603

@sdodson
Member

sdodson commented Jun 13, 2017

Is this a change in how deployments are managed? I believe we've always been able to modify the deployment before it finished, but we've never observed an error before.

We should definitely fix this regardless.
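One possible shape for a fix, sketched in Python rather than as a playbook change: wait for the running rollout before asking for a new one. This assumes `oc rollout latest` reports the same "already in progress" error as `oc deploy --latest` does in the log, and that `oc rollout status` blocks until the current rollout ends. The helper names `run_oc` and `roll_out_latest` are hypothetical.

```python
import subprocess

DC = "dc/docker-registry"


def run_oc(args):
    """Run an oc subcommand and return (exit_code, stderr)."""
    proc = subprocess.run(["oc"] + args, capture_output=True, text=True)
    return proc.returncode, proc.stderr


def roll_out_latest(run=run_oc):
    """Trigger a new rollout, waiting first if one is already running.

    Returns the exit code of the last `oc rollout latest` call.
    """
    rc, stderr = run(["rollout", "latest", DC])
    if rc != 0 and "already in progress" in stderr:
        # Assumption: this call blocks until the in-flight rollout finishes.
        run(["rollout", "status", DC])
        # Retry once now that nothing should be in progress.
        rc, stderr = run(["rollout", "latest", DC])
    return rc
```

The `run` parameter exists only so the logic can be exercised without a cluster. Whether a second rollout is needed at all after the first one completes depends on how the deployment config's triggers are set up.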

@sdodson sdodson assigned abutcher and smarterclayton and unassigned abutcher and mtnbikenc Jun 13, 2017
@sdodson
Member

sdodson commented Jun 13, 2017

This code is actually in origin-gce. @smarterclayton Can we not just use openshift-ansible's registry deployment?

@smarterclayton
Contributor

It might be possible. The ref arch was doing something extra that made it work on GCE before that was supported in openshift-ansible.

@0xmichalis
Contributor

@0xmichalis
Contributor

The API server never seems to become ready.

@0xmichalis
Contributor

@openshift/sig-master

@mfojtik
Contributor

mfojtik commented Feb 8, 2018

I inspected the API logs and there is no error there. From the error in the Ansible log, it looks like Ansible can't resolve the DNS name in https://internal-api.prtest-5a37c28-15849.origin-ci-int-gce.dev.rhcloud.com:8443/healthz/ready, because the "curl" command exits with code 6, which means CURLE_COULDNT_RESOLVE_HOST.
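To make the distinction concrete, here is a small sketch that maps a curl exit code to its meaning and names the host involved. The three codes come from curl's documented exit codes; the function name `explain_probe_failure` and the message wording are made up for illustration.

```python
from urllib.parse import urlsplit

# A few of curl's documented exit codes.
CURL_EXIT_CODES = {
    6: "CURLE_COULDNT_RESOLVE_HOST",
    7: "CURLE_COULDNT_CONNECT",
    28: "CURLE_OPERATION_TIMEDOUT",
}

READY_URL = (
    "https://internal-api.prtest-5a37c28-15849.origin-ci-int-gce"
    ".dev.rhcloud.com:8443/healthz/ready"
)


def explain_probe_failure(url, exit_code):
    """Describe a failed curl readiness probe in terms of its exit code."""
    host = urlsplit(url).hostname
    name = CURL_EXIT_CODES.get(exit_code, "unknown")
    if exit_code == 6:
        # Name resolution failed, so no request ever reached the server.
        return f"{name}: DNS lookup for {host} failed; the API server was never contacted"
    return f"{name}: probe of {host} failed with curl exit code {exit_code}"


print(explain_probe_failure(READY_URL, 6))
```

With exit code 6 the request never leaves the client, which is consistent with the API logs showing no error.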

@mfojtik mfojtik removed the sig/master label Feb 8, 2018
@mfojtik
Contributor

mfojtik commented Feb 8, 2018

Moving back to @sdodson as this is unlikely to be an API server issue.

@0xmichalis
Contributor

Opened a separate issue: #18525

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 9, 2018
@stevekuznetsov
Contributor

/close


10 participants