Skip to content

Use KubeletServer.NodeIP instead of KubeletServer.HostnameOverride to set node IP #6310

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 14, 2016

Conversation

pravisankar
Copy link

No description provided.

@pravisankar
Copy link
Author

Depends on kubernetes/kubernetes#18541

@liggitt
Copy link
Contributor

liggitt commented Dec 15, 2015

Changing back could trigger more attempts by the node to delete and recreate its API object (which prevents the node from starting)

@liggitt
Copy link
Contributor

liggitt commented Dec 15, 2015

Need to resolve kubernetes/kubernetes#17731 before this can merge

@pravisankar
Copy link
Author

[test]

@pravisankar pravisankar changed the title [DO_NOT_MERGE] Use KubeletServer.NodeIP instead of KubeletServer.HostnameOverride to set node IP Use KubeletServer.NodeIP instead of KubeletServer.HostnameOverride to set node IP Jan 13, 2016
@pravisankar
Copy link
Author

@liggitt @DirectXMan12 @ncdc @eparis PTAL

@liggitt
Copy link
Contributor

liggitt commented Jan 13, 2016

change UPSTREAM: <drop>: ... to UPSTREAM: 18541: ... if there's an existing PR

@liggitt
Copy link
Contributor

liggitt commented Jan 13, 2016

can you outline the ways you tested this? Off the top of my head, I'd expect:

  • upgrade path with nodeIP set
    • launch 1.0, with nodeIP set in config
    • relaunch 1.1, with nodeIP set in config, against the same etcd (this may actually prevent the node from coming up)
    • relaunch with this PR, with nodeIP set in config, against the same etcd
    • expect node to come up, with warning, and pods to be able to be deployed
  • upgrade path without nodeIP set
    • launch 1.0, without nodeIP set in config
    • relaunch 1.1, without nodeIP set in config, against the same etcd
    • relaunch with this PR, without nodeIP set in config, against the same etcd
    • expect node to come up, with no warning
    • relaunch with this PR, with nodeIP set in config, against the same etcd
    • expect node to come up, with no warning, and pods to be able to be deployed

@DirectXMan12
Copy link
Contributor

For the part that I wrote:

Before my patch:

  1. launch node with nodeIP set (giving externalID == nodeIP)
  2. relaunch node without nodeIP set (giving externalID == hostname)
  3. Fail

After my patch:

  1. Try to launch node again
  2. It works, gives warning about externalID mismatch
  3. Delete node
  4. Launch node with same config (externalID == hostname)
  5. Relaunch node with nodeIP set (externalID == nodeIP)
  6. It works, gives warning about externalID mismatch

Ravi Sankar Penta and others added 4 commits January 13, 2016 07:57
…vider

Previously, if the kubelet tried to register itself with the API server,
and was rejected due to the external ID changing, it would delete the
node object and recreate it.  This commit causes it to tolerate
a change in ExternalID when the ExternalID is not being provided by a
cloud provider, assuming the new ExternalID is either the node's
(metadata) name, or one of node's addresses.
@pravisankar
Copy link
Author

Updated the PR:
Changed commit message: UPSTREAM: : ... to UPSTREAM: 18541: ..

For tolerating node externalID to have hostname or node IP in case of non-cloud provider, we just need to look at the currentNode.Spec.ExternalID (instead of node.Spec.ExternalID).
@DirectXMan12 please review?

Tested various combinations from v1.0.8 to v1.1.1 without fix and with fix. I think, I covered all the test cases mentioned by @liggitt and @DirectXMan12

@liggitt @ncdc @eparis please review/merge

@DirectXMan12
Copy link
Contributor

For tolerating node externalID to have hostname or node IP in case of non-cloud provider, we just need to look at the currentNode.Spec.ExternalID (instead of node.Spec.ExternalID).

@pravisankar LGTM. The logic you removed was for the "hostname -> nodeIP" case, but that should be solved by your other changes, correct?

@liggitt
Copy link
Contributor

liggitt commented Jan 13, 2016

LGTM

@eparis
Copy link
Member

eparis commented Jan 13, 2016

LGTM who is going to click the button?


if kl.cloud == nil {
// In the case where we're not using a cloud provider, the ExternalID could either have
// been the NodeIP, or the actual hostname. If we switch between the two, consider it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want a link back to the PR that introduced the nodeIP issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not in the code, fine to link from this PR, which is reachable from the commit breadcrumb

@eparis
Copy link
Member

eparis commented Jan 13, 2016

[merge]

@ncdc
Copy link
Contributor

ncdc commented Jan 13, 2016

LGTM

1 similar comment
@smarterclayton
Copy link
Contributor

LGTM

@ncdc
Copy link
Contributor

ncdc commented Jan 13, 2016

https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin/8529/

1)  unauthenticated user should be able to log in
  - UnknownError: unknown error: [$injector:modulerr] Failed to instantiate module openshiftConsole due to:
ReferenceError: URI is not defined

@liggitt
Copy link
Contributor

liggitt commented Jan 13, 2016

@jwforres ^

@liggitt
Copy link
Contributor

liggitt commented Jan 13, 2016

[test]

@jwforres
Copy link
Member

Opened an issue for the test problem
#6651
On Jan 13, 2016 6:30 PM, "OpenShift Bot" [email protected] wrote:

continuous-integration/openshift-jenkins/test FAILURE (
https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin/8536/
)


Reply to this email directly or view it on GitHub
#6310 (comment).

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_requests_origin/4628/) (Image: devenv-rhel7_3144)

@liggitt
Copy link
Contributor

liggitt commented Jan 14, 2016

URI.js issue resolved in https://ci.openshift.redhat.com/jenkins/job/devenv_ami/3143/
[test][merge]

@openshift-bot
Copy link
Contributor

Evaluated for origin test up to 7da6989

@openshift-bot
Copy link
Contributor

Evaluated for origin merge up to 7da6989

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_origin/8543/)

openshift-bot pushed a commit that referenced this pull request Jan 14, 2016
@openshift-bot openshift-bot merged commit ad0da3a into openshift:master Jan 14, 2016
@smarterclayton
Copy link
Contributor

Did this ever get moved upstream?

@liggitt
Copy link
Contributor

liggitt commented Sep 4, 2016

the toleration to avoid deleting/recreating? no, though I doubt upstream cares... the node just deletes itself there

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants