openshift/origin rebuilding default router from router/haproxy fails #14473

Closed
zjagust opened this issue Jun 5, 2017 · 10 comments

Comments

zjagust commented Jun 5, 2017

I tried to build an image for the OpenShift router by pulling the files from https://github.com/openshift/origin/tree/master/images/router/haproxy, but the router built from it fails to start.

Version

oc version
oc v1.5.0
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Steps To Reproduce
  1. Remove a default haproxy router:
    oc delete pods $(oc get pods | grep router | awk '{print $1}')
    oc delete svc router
    oc delete serviceaccount router
    oc delete dc router

  2. Pull files from here: https://github.com/openshift/origin/tree/master/images/router/haproxy and create a router Docker image from Dockerfile:
    docker build -t=haproxy .

  3. Try and build a default haproxy router from image created in step 2:
    oadm router router --replicas=1 --selector='region=infra' --images=haproxy --service-account=router

Current Result

Creation of the default router fails; "oc describe po router" gives the following output:
Back-off restarting failed docker container
Error syncing pod, skipping: failed to "StartContainer" for "router" with CrashLoopBackOff: "Back-off 10s restarting failed container=router pod=router

The command "oc logs router" gives the following output:
<7>haproxy-systemd-wrapper: executing /usr/local/sbin/haproxy -p /run/haproxy.pid -f /usr/local/etc/haproxy/haproxy.cfg -Ds
[ALERT] 151/175956 (6) : Cannot open configuration file/directory /usr/local/etc/haproxy/haproxy.cfg : No such file or directory
<5>haproxy-systemd-wrapper: exit, haproxy RC=1

Expected Result

The operation above should produce the same result as running:
oadm router router --replicas=1 --selector='region=infra' --images=docker.io/openshift/origin-haproxy-router --service-account=router

pweil- commented Jun 5, 2017

@zjagust the Dockerfile you're using might help troubleshoot this, if you can share it.

The oadm router command is not quite the same as just building and running an image with docker build, docker run. For an idea of what it produces you can use oadm router -o json.

I don't think that's the issue here though, it looks more like maybe the entrypoint is incorrect. In our file we use ENTRYPOINT ["/usr/bin/openshift-router"] which is a binary that we produce. This is to handle the processing of a template file in order to generate the haproxy config (from haproxy-config.template) based on the api objects.
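A quick way to verify which entrypoint the built image actually carries (my suggestion, not part of the original thread) is docker inspect; the image name haproxy is the tag used earlier in this issue:

```shell
# Show the entrypoint baked into the locally built image.
# For the origin router image this should print ["/usr/bin/openshift-router"].
docker inspect --format '{{json .Config.Entrypoint}}' haproxy
```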

zjagust commented Jun 6, 2017

@pweil- I have used Dockerfile from here: https://raw.githubusercontent.com/openshift/origin/master/images/router/haproxy/Dockerfile

The only modification I made is related to issue #14393 so I changed the original line:

yum install -y $INSTALL_PKGS && \

to this:

yum --disablerepo=origin-local-release install -y $INSTALL_PKGS && \

Everything else is the same; I made no other modifications. Now, if I run the image with docker run, it starts up successfully:

docker run -d -p 80:80 haproxy
docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                                           NAMES
7b4b994bb2a4        haproxy             "/usr/bin/openshift-r"   7 seconds ago       Up 3 seconds        53/tcp, 443/tcp, 8443/tcp, 0.0.0.0:80->80/tcp   jolly_easley

If I execute the reload-haproxy script, HAProxy also starts successfully:

sh reload-haproxy 
 - Proxy protocol 'FALSE'.  Checking HAProxy /healthz on port 1936 ...
 - HAProxy port 1936 health check ok : 0 retry attempt(s).
ps axu
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
1001          1  1.6  0.2 625204 52280 ?        Ssl  08:37   0:02 /usr/bin/openshift-router
1001         16  0.2  0.0  15196  2000 ?        Ss   08:39   0:00 bash
1001         55  0.0  0.0  52188  5560 ?        Ss   08:39   0:00 /usr/sbin/haproxy -f /var/lib/haproxy/conf/haproxy.config -p /var/lib/haproxy/run/haproxy.pid

This is the output of the oadm command with the "-o json" option:

{
    "kind": "List",
    "apiVersion": "v1",
    "metadata": {},
    "items": [
        {
            "kind": "ServiceAccount",
            "apiVersion": "v1",
            "metadata": {
                "name": "router",
                "creationTimestamp": null
            }
        },
        {
            "kind": "ClusterRoleBinding",
            "apiVersion": "v1",
            "metadata": {
                "name": "router-router-role",
                "creationTimestamp": null
            },
            "userNames": [
                "system:serviceaccount:default:router"
            ],
            "groupNames": null,
            "subjects": [
                {
                    "kind": "ServiceAccount",
                    "namespace": "default",
                    "name": "router"
                }
            ],
            "roleRef": {
                "kind": "ClusterRole",
                "name": "system:router"
            }
        },
        {
            "kind": "DeploymentConfig",
            "apiVersion": "v1",
            "metadata": {
                "name": "router",
                "creationTimestamp": null,
                "labels": {
                    "router": "router"
                }
            },
            "spec": {
                "strategy": {
                    "type": "Rolling",
                    "rollingParams": {
                        "maxUnavailable": "25%",
                        "maxSurge": 0
                    },
                    "resources": {}
                },
                "triggers": [
                    {
                        "type": "ConfigChange"
                    }
                ],
                "replicas": 1,
                "test": false,
                "selector": {
                    "router": "router"
                },
                "template": {
                    "metadata": {
                        "creationTimestamp": null,
                        "labels": {
                            "router": "router"
                        }
                    },
                    "spec": {
                        "volumes": [
                            {
                                "name": "server-certificate",
                                "secret": {
                                    "secretName": "router-certs"
                                }
                            }
                        ],
                        "containers": [
                            {
                                "name": "router",
                                "image": "haproxy",
                                "ports": [
                                    {
                                        "containerPort": 80
                                    },
                                    {
                                        "containerPort": 443
                                    },
                                    {
                                        "name": "stats",
                                        "containerPort": 1936,
                                        "protocol": "TCP"
                                    }
                                ],
                                "env": [
                                    {
                                        "name": "DEFAULT_CERTIFICATE_DIR",
                                        "value": "/etc/pki/tls/private"
                                    },
                                    {
                                        "name": "ROUTER_EXTERNAL_HOST_HOSTNAME"
                                    },
                                    {
                                        "name": "ROUTER_EXTERNAL_HOST_HTTPS_VSERVER"
                                    },
                                    {
                                        "name": "ROUTER_EXTERNAL_HOST_HTTP_VSERVER"
                                    },
                                    {
                                        "name": "ROUTER_EXTERNAL_HOST_INSECURE",
                                        "value": "false"
                                    },
                                    {
                                        "name": "ROUTER_EXTERNAL_HOST_INTERNAL_ADDRESS"
                                    },
                                    {
                                        "name": "ROUTER_EXTERNAL_HOST_PARTITION_PATH"
                                    },
                                    {
                                        "name": "ROUTER_EXTERNAL_HOST_PASSWORD"
                                    },
                                    {
                                        "name": "ROUTER_EXTERNAL_HOST_PRIVKEY",
                                        "value": "/etc/secret-volume/router.pem"
                                    },
                                    {
                                        "name": "ROUTER_EXTERNAL_HOST_USERNAME"
                                    },
                                    {
                                        "name": "ROUTER_EXTERNAL_HOST_VXLAN_GW_CIDR"
                                    },
                                    {
                                        "name": "ROUTER_SERVICE_HTTPS_PORT",
                                        "value": "443"
                                    },
                                    {
                                        "name": "ROUTER_SERVICE_HTTP_PORT",
                                        "value": "80"
                                    },
                                    {
                                        "name": "ROUTER_SERVICE_NAME",
                                        "value": "router"
                                    },
                                    {
                                        "name": "ROUTER_SERVICE_NAMESPACE",
                                        "value": "default"
                                    },
                                    {
                                        "name": "ROUTER_SUBDOMAIN"
                                    },
                                    {
                                        "name": "STATS_PASSWORD",
                                        "value": "eDMBWUFSAY"
                                    },
                                    {
                                        "name": "STATS_PORT",
                                        "value": "1936"
                                    },
                                    {
                                        "name": "STATS_USERNAME",
                                        "value": "admin"
                                    }
                                ],
                                "resources": {
                                    "requests": {
                                        "cpu": "100m",
                                        "memory": "256Mi"
                                    }
                                },
                                "volumeMounts": [
                                    {
                                        "name": "server-certificate",
                                        "readOnly": true,
                                        "mountPath": "/etc/pki/tls/private"
                                    }
                                ],
                                "livenessProbe": {
                                    "httpGet": {
                                        "path": "/healthz",
                                        "port": 1936,
                                        "host": "localhost"
                                    },
                                    "initialDelaySeconds": 10
                                },
                                "readinessProbe": {
                                    "httpGet": {
                                        "path": "/healthz",
                                        "port": 1936,
                                        "host": "localhost"
                                    },
                                    "initialDelaySeconds": 10
                                },
                                "imagePullPolicy": "IfNotPresent"
                            }
                        ],
                        "nodeSelector": {
                            "region": "infra"
                        },
                        "serviceAccountName": "router",
                        "serviceAccount": "router",
                        "hostNetwork": true,
                        "securityContext": {}
                    }
                }
            },
            "status": {
                "latestVersion": 0,
                "observedGeneration": 0,
                "replicas": 0,
                "updatedReplicas": 0,
                "availableReplicas": 0,
                "unavailableReplicas": 0
            }
        },
        {
            "kind": "Service",
            "apiVersion": "v1",
            "metadata": {
                "name": "router",
                "creationTimestamp": null,
                "labels": {
                    "router": "router"
                },
                "annotations": {
                    "service.alpha.openshift.io/serving-cert-secret-name": "router-certs"
                }
            },
            "spec": {
                "ports": [
                    {
                        "name": "80-tcp",
                        "port": 80,
                        "targetPort": 80
                    },
                    {
                        "name": "443-tcp",
                        "port": 443,
                        "targetPort": 443
                    },
                    {
                        "name": "1936-tcp",
                        "protocol": "TCP",
                        "port": 1936,
                        "targetPort": 1936
                    }
                ],
                "selector": {
                    "router": "router"
                },
                "clusterIP": "172.30.178.91"
            },
            "status": {
                "loadBalancer": {}
            }
        }
    ]
}

When I try to build a default OpenShift router out of the same image using the oadm command, the issue described above happens. Please tell me if something is missing or if you need any additional info. I just hope I'm doing something terribly wrong :)

knobunc commented Jun 6, 2017

@zjagust can you pull the pod yaml for the broken router please?

zjagust commented Jun 6, 2017

Is this it, or do you need something else?

apiVersion: v1
items:
- apiVersion: v1
  kind: ServiceAccount
  metadata:
    creationTimestamp: null
    name: router
- apiVersion: v1
  groupNames: null
  kind: ClusterRoleBinding
  metadata:
    creationTimestamp: null
    name: router-router-role
  roleRef:
    kind: ClusterRole
    name: system:router
  subjects:
  - kind: ServiceAccount
    name: router
    namespace: default
  userNames:
  - system:serviceaccount:default:router
- apiVersion: v1
  kind: DeploymentConfig
  metadata:
    creationTimestamp: null
    labels:
      router: router
    name: router
  spec:
    replicas: 1
    selector:
      router: router
    strategy:
      resources: {}
      rollingParams:
        maxSurge: 0
        maxUnavailable: 25%
      type: Rolling
    template:
      metadata:
        creationTimestamp: null
        labels:
          router: router
      spec:
        containers:
        - env:
          - name: DEFAULT_CERTIFICATE_DIR
            value: /etc/pki/tls/private
          - name: ROUTER_EXTERNAL_HOST_HOSTNAME
          - name: ROUTER_EXTERNAL_HOST_HTTPS_VSERVER
          - name: ROUTER_EXTERNAL_HOST_HTTP_VSERVER
          - name: ROUTER_EXTERNAL_HOST_INSECURE
            value: "false"
          - name: ROUTER_EXTERNAL_HOST_INTERNAL_ADDRESS
          - name: ROUTER_EXTERNAL_HOST_PARTITION_PATH
          - name: ROUTER_EXTERNAL_HOST_PASSWORD
          - name: ROUTER_EXTERNAL_HOST_PRIVKEY
            value: /etc/secret-volume/router.pem
          - name: ROUTER_EXTERNAL_HOST_USERNAME
          - name: ROUTER_EXTERNAL_HOST_VXLAN_GW_CIDR
          - name: ROUTER_SERVICE_HTTPS_PORT
            value: "443"
          - name: ROUTER_SERVICE_HTTP_PORT
            value: "80"
          - name: ROUTER_SERVICE_NAME
            value: router
          - name: ROUTER_SERVICE_NAMESPACE
            value: default
          - name: ROUTER_SUBDOMAIN
          - name: STATS_PASSWORD
            value: Vz0emVy3Vc
          - name: STATS_PORT
            value: "1936"
          - name: STATS_USERNAME
            value: admin
          image: haproxy
          imagePullPolicy: IfNotPresent
          livenessProbe:
            httpGet:
              host: localhost
              path: /healthz
              port: 1936
            initialDelaySeconds: 10
          name: router
          ports:
          - containerPort: 80
          - containerPort: 443
          - containerPort: 1936
            name: stats
            protocol: TCP
          readinessProbe:
            httpGet:
              host: localhost
              path: /healthz
              port: 1936
            initialDelaySeconds: 10
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
          volumeMounts:
          - mountPath: /etc/pki/tls/private
            name: server-certificate
            readOnly: true
        hostNetwork: true
        nodeSelector:
          region: infra
        securityContext: {}
        serviceAccount: router
        serviceAccountName: router
        volumes:
        - name: server-certificate
          secret:
            secretName: router-certs
    test: false
    triggers:
    - type: ConfigChange
  status:
    availableReplicas: 0
    latestVersion: 0
    observedGeneration: 0
    replicas: 0
    unavailableReplicas: 0
    updatedReplicas: 0
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      service.alpha.openshift.io/serving-cert-secret-name: router-certs
    creationTimestamp: null
    labels:
      router: router
    name: router
  spec:
    clusterIP: 172.30.178.91
    ports:
    - name: 80-tcp
      port: 80
      targetPort: 80
    - name: 443-tcp
      port: 443
      targetPort: 443
    - name: 1936-tcp
      port: 1936
      protocol: TCP
      targetPort: 1936
    selector:
      router: router
  status:
    loadBalancer: {}
kind: List
metadata: {}

knobunc commented Jun 6, 2017

Sorry, I'm looking for what's actually running. Run 'oc get pods', get the name of the router pod, and then run 'oc get pod router-... -o yaml'.

Thanks.

zjagust commented Jun 6, 2017

Here it is:

oc get po router-1-lq3rn -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/created-by: |
      {"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"default","name":"router-1","uid":"bb24518c-4aaf-11e7-8168-5254003f1cbf","apiVersion":"v1","resourceVersion":"414425"}}
    openshift.io/deployment-config.latest-version: "1"
    openshift.io/deployment-config.name: router
    openshift.io/deployment.name: router-1
    openshift.io/scc: hostnetwork
  creationTimestamp: 2017-06-06T12:00:35Z
  generateName: router-1-
  labels:
    deployment: router-1
    deploymentconfig: router
    router: router
  name: router-1-lq3rn
  namespace: default
  resourceVersion: "414492"
  selfLink: /api/v1/namespaces/default/pods/router-1-lq3rn
  uid: c0025793-4aaf-11e7-8168-5254003f1cbf
spec:
  containers:
  - env:
    - name: DEFAULT_CERTIFICATE_DIR
      value: /etc/pki/tls/private
    - name: ROUTER_EXTERNAL_HOST_HOSTNAME
    - name: ROUTER_EXTERNAL_HOST_HTTPS_VSERVER
    - name: ROUTER_EXTERNAL_HOST_HTTP_VSERVER
    - name: ROUTER_EXTERNAL_HOST_INSECURE
      value: "false"
    - name: ROUTER_EXTERNAL_HOST_INTERNAL_ADDRESS
    - name: ROUTER_EXTERNAL_HOST_PARTITION_PATH
    - name: ROUTER_EXTERNAL_HOST_PASSWORD
    - name: ROUTER_EXTERNAL_HOST_PRIVKEY
      value: /etc/secret-volume/router.pem
    - name: ROUTER_EXTERNAL_HOST_USERNAME
    - name: ROUTER_EXTERNAL_HOST_VXLAN_GW_CIDR
    - name: ROUTER_SERVICE_HTTPS_PORT
      value: "443"
    - name: ROUTER_SERVICE_HTTP_PORT
      value: "80"
    - name: ROUTER_SERVICE_NAME
      value: router
    - name: ROUTER_SERVICE_NAMESPACE
      value: default
    - name: ROUTER_SUBDOMAIN
    - name: STATS_PASSWORD
      value: Rrc4nkNmWd
    - name: STATS_PORT
      value: "1936"
    - name: STATS_USERNAME
      value: admin
    image: haproxy
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        host: localhost
        path: /healthz
        port: 1936
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    name: router
    ports:
    - containerPort: 80
      hostPort: 80
      protocol: TCP
    - containerPort: 443
      hostPort: 443
      protocol: TCP
    - containerPort: 1936
      hostPort: 1936
      name: stats
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        host: localhost
        path: /healthz
        port: 1936
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
    securityContext:
      capabilities:
        drop:
        - KILL
        - MKNOD
        - SETGID
        - SETUID
        - SYS_CHROOT
      privileged: false
      runAsUser: 1000000000
      seLinuxOptions:
        level: s0:c1,c0
    terminationMessagePath: /dev/termination-log
    volumeMounts:
    - mountPath: /etc/pki/tls/private
      name: server-certificate
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: router-token-3lgr4
      readOnly: true
  dnsPolicy: ClusterFirst
  hostNetwork: true
  imagePullSecrets:
  - name: router-dockercfg-vjn7m
  nodeName: node04
  nodeSelector:
    region: infra
  restartPolicy: Always
  securityContext:
    fsGroup: 1000000000
    seLinuxOptions:
      level: s0:c1,c0
    supplementalGroups:
    - 1000000000
  serviceAccount: router
  serviceAccountName: router
  terminationGracePeriodSeconds: 30
  volumes:
  - name: server-certificate
    secret:
      defaultMode: 420
      secretName: router-certs
  - name: router-token-3lgr4
    secret:
      defaultMode: 420
      secretName: router-token-3lgr4
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2017-06-06T12:00:37Z
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: 2017-06-06T12:00:37Z
    message: 'containers with unready status: [router]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: 2017-06-06T12:00:36Z
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://3ab86ecd0ad0d81898c2f06e93faae3deb723c3396536a3c8770c0ed7d919c7f
    image: haproxy
    imageID: docker-pullable://docker.io/haproxy@sha256:e1982cae1c82d648164e86190af1b20a8c22235edd16edd4f7f8b0437abd7692
    lastState:
      terminated:
        containerID: docker://4d44e61bf15ef71a5365ba35a34d257e27eb8c3abcba6a4e38a9538500eeec64
        exitCode: 1
        finishedAt: 2017-06-06T12:01:00Z
        reason: Error
        startedAt: 2017-06-06T12:01:00Z
    name: router
    ready: false
    restartCount: 2
    state:
      terminated:
        containerID: docker://3ab86ecd0ad0d81898c2f06e93faae3deb723c3396536a3c8770c0ed7d919c7f
        exitCode: 1
        finishedAt: 2017-06-06T12:01:21Z
        reason: Error
        startedAt: 2017-06-06T12:01:20Z
  hostIP: 172.16.1.14
  phase: Running
  podIP: 172.16.1.14
  startTime: 2017-06-06T12:00:37Z

zjagust commented Jun 8, 2017

Guys, any news on this? Thanks in advance.

zjagust commented Jun 12, 2017

Hi guys. A little update from my side.

I named my image "haproxy", and I realized that when I use the following command:

oadm router router --replicas=1 --selector='region=infra' --images=haproxy --service-account=router

it does not use the local image I just created and named "haproxy"; instead, it pulls the image from "docker.io/haproxy", whose default config really is located at "/usr/local/etc/haproxy/haproxy.cfg" (which explains the error above).

Please tell me if I'm wrong, but I'm pretty sure I'm not. And if I'm right, this will need a whole new approach.
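This can be checked directly (a hedged sketch, not from the thread): a bare image name like haproxy is shorthand for docker.io/library/haproxy, and the pod status earlier already shows an imageID pointing at docker.io/haproxy@sha256:... Comparing image IDs on the node confirms the mix-up; <router-container-id> is a placeholder for the ID shown by docker ps:

```shell
# ID of the image built locally from the origin Dockerfile
docker inspect --format '{{.Id}}' haproxy
# ID of the image the crashing router container was actually started from
docker inspect --format '{{.Image}}' <router-container-id>
```

If the two IDs differ, the node pulled the upstream docker.io/haproxy image instead of using the local build.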

pweil- commented Jun 12, 2017

@zjagust that could be it. You can try giving it a specific tag like haproxy:my-test and passing that tag to --images to confirm.
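Sketched as commands (the tag name my-test is just an example; any tag that cannot collide with an upstream docker.io image works):

```shell
# Build the router image under a tag that is unambiguously local
docker build -t haproxy:my-test .

# Point oadm router at that exact tag instead of the bare name
oadm router router --replicas=1 --selector='region=infra' \
    --images=haproxy:my-test --service-account=router
```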

zjagust commented Jun 12, 2017

@pweil- My assumption was right. Everything works; I made a small modification to the Dockerfile itself, and now I have a default OpenShift router running HAProxy 1.7. As far as I'm concerned, this issue is resolved.

@pweil- pweil- closed this as completed Jun 12, 2017