postgres-operator pod crashed when delete instance which enable or disable connect pooler #2108

Closed
@godzilla-s

Description

Please, answer some short questions which should help us to understand your problem / question better?

  • Which image of the operator are you using? e.g. registry.opensource.zalan.do/acid/postgres-operator:v1.7.1
  • Where do you run it - cloud, k3s
  • Are you running Postgres Operator in production? [yes]
  • Type of issue? [Bug report]

Hello,
I found an intermittent issue: the operator pod crashes when I delete an instance whose connection pooler was enabled and then disabled. Steps to reproduce:

  1. Create an instance with the connection pooler enabled:
enableConnectionPooler: true
enableReplicaConnectionPooler: true

The instance works fine after it is created.

  2. Update the instance to disable the connection pooler:
enableConnectionPooler: false
enableReplicaConnectionPooler: false

The instance still works fine after the update.

  3. Delete the instance. A few seconds later, the postgres-operator pod crashes with logs like:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1a76754]

github.com/zalando/postgres-operator/pkg/cluster.(*Cluster).deleteSecrets
	/workspace/pkg/cluster/resources.go:502 +0x154
github.com/zalando/postgres-operator/pkg/cluster.(*Cluster).Delete
	/workspace/pkg/cluster/cluster.go:903 +0x256

I think the problem is in the following code:

func (c *Cluster) deleteSecrets() error {
	c.setProcessName("deleting secrets")
	var errors []string
	errorCount := 0
	for uid, secret := range c.Secrets {
		// Should secret be checked for nil here?
		// Without a check, dereferencing a nil secret on the next line panics.
		err := c.deleteSecret(uid, *secret)
		if err != nil {
			errors = append(errors, fmt.Sprintf("%v", err))
			errorCount++
		}
	}

	if errorCount > 0 {
		return fmt.Errorf("could not delete all secrets: %v", errors)
	}

	return nil
}

func (c *Cluster) deleteSecret(uid types.UID, secret v1.Secret) error {
	c.setProcessName("deleting secret")
	secretName := util.NameFromMeta(secret.ObjectMeta)
	c.logger.Debugf("deleting secret %q", secretName)
	err := c.KubeClient.Secrets(secret.Namespace).Delete(context.TODO(), secret.Name, c.deleteOptions)
	if err != nil {
		return fmt.Errorf("could not delete secret %q: %v", secretName, err)
	}
	c.logger.Infof("secret %q has been deleted", secretName)
	c.Secrets[uid] = nil // keeps the key with a nil value; should this be delete(c.Secrets, uid) instead?

	return nil
}
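
A minimal sketch of one possible fix, assuming the nil entries really come from `c.Secrets[uid] = nil` as suspected above. `Secret`, `Cluster`, and `deleteFn` here are hypothetical, stripped-down stand-ins (not the operator's real types or the Kubernetes client), and UIDs are plain strings. The sketch skips nil entries in the loop and removes the key with `delete` instead of leaving a nil value behind:

```go
package main

import (
	"fmt"
	"strings"
)

// Secret is a hypothetical stand-in for v1.Secret.
type Secret struct {
	Namespace string
	Name      string
}

// Cluster is a hypothetical, stripped-down stand-in for the operator's Cluster.
type Cluster struct {
	Secrets map[string]*Secret
	// deleteFn stands in for the Kubernetes API delete call.
	deleteFn func(namespace, name string) error
}

func (c *Cluster) deleteSecrets() error {
	var errs []string
	for uid, secret := range c.Secrets {
		if secret == nil {
			// Stale entry: drop the key instead of dereferencing a nil pointer.
			// Deleting from a map while ranging over it is safe in Go.
			delete(c.Secrets, uid)
			continue
		}
		if err := c.deleteSecret(uid, *secret); err != nil {
			errs = append(errs, err.Error())
		}
	}
	if len(errs) > 0 {
		return fmt.Errorf("could not delete all secrets: %s", strings.Join(errs, "; "))
	}
	return nil
}

func (c *Cluster) deleteSecret(uid string, secret Secret) error {
	if err := c.deleteFn(secret.Namespace, secret.Name); err != nil {
		return fmt.Errorf("could not delete secret %s/%s: %v", secret.Namespace, secret.Name, err)
	}
	// Remove the key entirely so no nil value is left in the map.
	delete(c.Secrets, uid)
	return nil
}

func main() {
	c := &Cluster{
		Secrets: map[string]*Secret{
			"uid-1": {Namespace: "default", Name: "pooler-secret"},
			"uid-2": nil, // simulates an entry that was set to nil earlier
		},
		deleteFn: func(namespace, name string) error { return nil },
	}
	err := c.deleteSecrets()
	fmt.Println(err, len(c.Secrets)) // prints "<nil> 0"
}
```

Either change alone would likely avoid the panic; doing both also keeps the map from accumulating dead keys.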

Could someone help confirm whether this is the cause of the crash?
