
Panic if instance was deleted in openstack manually #2404

Open
71g3pf4c3 opened this issue Jan 29, 2025 · 1 comment
Labels
kind/bug Categorizes issue or PR as related to a bug.


@71g3pf4c3

/kind bug

What steps did you take and what happened:
In any setup, create a MachineDeployment, wait for the Machines and OpenStackMachines to reach the Ready state, then delete the corresponding instance inside the OpenStack project (via the CLI or the Horizon UI). The CAPO controller crashes with a nil pointer dereference.
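
For example, deleting the server out of band with the OpenStack CLI (the instance name below is a placeholder):

	openstack server delete <instance-name>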

Traced it to reconcileNormal() in openstackmachine_controller.go:

	var instanceStatus *compute.InstanceStatus
	if instanceStatus, err = computeService.GetInstanceStatus(*machineServer.Status.InstanceID); err != nil {
		return ctrl.Result{}, err
	}

	instanceNS, err := instanceStatus.NetworkStatus() // <- nil pointer dereference here when instanceStatus is nil

Which leads to GetInstanceStatus() in instance.go:

func (s *Service) GetInstanceStatus(resourceID string) (instance *InstanceStatus, err error) {
	if resourceID == "" {
		return nil, fmt.Errorf("resourceId should be specified to get detail")
	}

	server, err := s.getComputeClient().GetServer(resourceID)
	if err != nil {
		if capoerrors.IsNotFound(err) {
			return nil, nil // <- returns nil, nil here, so the caller's instanceStatus.NetworkStatus() call dereferences a nil pointer
		}
		return nil, fmt.Errorf("get server %q detail failed: %v", resourceID, err)
	}

	return &InstanceStatus{server, s.scope.Logger()}, nil
}
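
So when the server has been deleted out of band, GetInstanceStatus returns nil, nil and reconcileNormal dereferences the nil *InstanceStatus. A minimal guard at the call site would avoid the panic; this is only a sketch against the snippet above (the 30-second requeue interval is an arbitrary placeholder and it assumes the usual time import), not the actual fix planned upstream:

	var instanceStatus *compute.InstanceStatus
	if instanceStatus, err = computeService.GetInstanceStatus(*machineServer.Status.InstanceID); err != nil {
		return ctrl.Result{}, err
	}
	// GetInstanceStatus returns nil, nil when the server no longer exists,
	// so the nil case has to be handled before touching instanceStatus.
	if instanceStatus == nil {
		// Instance was deleted out of band: requeue and let a later reconcile
		// converge on the new state instead of panicking here.
		return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
	}

	instanceNS, err := instanceStatus.NetworkStatus()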

Logs:

2025-01-29T18:06:39.883385751+03:00 panic: runtime error: invalid memory address or nil pointer dereference [recovered]
2025-01-29T18:06:39.883477120+03:00     panic: runtime error: invalid memory address or nil pointer dereference
2025-01-29T18:06:39.883490834+03:00 [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1b8f83a]
2025-01-29T18:06:39.883499159+03:00
2025-01-29T18:06:39.883505997+03:00 goroutine 350 [running]:
2025-01-29T18:06:39.883516576+03:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
2025-01-29T18:06:39.883523684+03:00     /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:111 +0x1e5
2025-01-29T18:06:39.883570207+03:00 panic({0x1dccbe0?, 0x362a670?})
2025-01-29T18:06:39.883599681+03:00     /usr/local/go/src/runtime/panic.go:770 +0x132
2025-01-29T18:06:39.883625108+03:00 sigs.k8s.io/cluster-api-provider-openstack/pkg/cloud/services/compute.(*InstanceStatus).NetworkStatus(0x0)
2025-01-29T18:06:39.883630041+03:00     /workspace/pkg/cloud/services/compute/instance_types.go:138 +0x3a
2025-01-29T18:06:39.883634741+03:00 sigs.k8s.io/cluster-api-provider-openstack/controllers.(*OpenStackMachineReconciler).reconcileNormal(0xc000482300, {0x2440550, 0xc000a27860}, 0xc000a530b0, {0xc0002c4d08, 0x18}, 0xc000a04b08, 0xc000a86f08, 0xc000a86a08)
2025-01-29T18:06:39.883638961+03:00     /workspace/controllers/openstackmachine_controller.go:380 +0x1fc
2025-01-29T18:06:39.883643634+03:00 sigs.k8s.io/cluster-api-provider-openstack/controllers.(*OpenStackMachineReconciler).Reconcile(0xc000482300, {0x2440550, 0xc000a27860}, {{{0xc00053f710?, 0x0?}, {0xc00034d680?, 0xc000909d10?}}})
2025-01-29T18:06:39.883647757+03:00     /workspace/controllers/openstackmachine_controller.go:161 +0xbd8
2025-01-29T18:06:39.883661273+03:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x24467c8?, {0x2440550?, 0xc000a27860?}, {{{0xc00053f710?, 0xb?}, {0xc00034d680?, 0x0?}}})
2025-01-29T18:06:39.883665768+03:00     /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114 +0xb7
2025-01-29T18:06:39.883670720+03:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0001242c0, {0x2440588, 0xc0006ddef0}, {0x1e96420, 0xc00045f980})
2025-01-29T18:06:39.883674642+03:00     /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311 +0x3bc
2025-01-29T18:06:39.883682546+03:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0001242c0, {0x2440588, 0xc0006ddef0})
2025-01-29T18:06:39.883690454+03:00     /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:261 +0x1be
2025-01-29T18:06:39.883695090+03:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
2025-01-29T18:06:39.883699069+03:00     /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:222 +0x79
2025-01-29T18:06:39.883925257+03:00 created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 207
2025-01-29T18:06:39.883933278+03:00     /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:218 +0x486

What did you expect to happen:
That CAPO would handle the absence of the instance and update the OpenStackMachine's status by setting a condition or requeueing, instead of panicking.
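
For what it's worth, here is a sketch of what the condition-based handling could look like using cluster-api's conditions helpers. The helper name markInstanceMissing and the reason string "InstanceNotFound" are made up for illustration; the real fix in #2379 may look different:

import (
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/cluster-api/util/conditions"

	infrav1 "sigs.k8s.io/cluster-api-provider-openstack/api/v1beta1"
)

// markInstanceMissing is a hypothetical helper: it records on the OpenStackMachine
// that the backing server is gone instead of letting the reconciler panic.
func markInstanceMissing(openStackMachine *infrav1.OpenStackMachine, instanceID string) {
	conditions.MarkFalse(openStackMachine,
		infrav1.InstanceReadyCondition,   // condition type CAPO already defines
		"InstanceNotFound",               // illustrative reason string
		clusterv1.ConditionSeverityError,
		"server %s no longer exists in OpenStack", instanceID)
}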


Environment:

  • Cluster API Provider OpenStack version (Or git rev-parse HEAD if manually built): v0.11.3
  • Cluster-API version: v1.8.4
  • OpenStack version: Antelope 2023.1
  • Kubernetes version (use kubectl version): v1.30.2
  • OS (e.g. from /etc/os-release): rocky-9
@k8s-ci-robot added the kind/bug label on Jan 29, 2025
@EmilienM
Contributor

This issue is valid and will be addressed in #2379.
However, I suspect no backport will be possible.
