Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce HealthyPanicThreshold for MeshCircuitBreaker policy #11704

Open
Icarus9913 opened this issue Oct 8, 2024 · 7 comments · May be fixed by #12860
Open

Introduce HealthyPanicThreshold for MeshCircuitBreaker policy #11704

Icarus9913 opened this issue Oct 8, 2024 · 7 comments · May be fixed by #12860
Assignees
Labels
kind/feature New feature triage/accepted The issue was reviewed and is complete enough to start working on it
Milestone

Comments

@Icarus9913
Copy link
Contributor

Description

I noticed that there's an implementation difference between our MeshCircuitBreaker and Istio's (https://github.com/istio/istio/blob/7b2d8b9dacb7989d75b3b3d860988cadb9f5a848/pilot/pkg/networking/core/cluster_traffic_policy.go#L416-L423),
maybe we could add one flag to disable panic threshold and it depends on the user's choice.

@Icarus9913 Icarus9913 added triage/pending This issue will be looked at on the next triage meeting kind/feature New feature labels Oct 8, 2024
@Icarus9913 Icarus9913 added this to the 2.10.x milestone Oct 8, 2024
@lahabana
Copy link
Contributor

lahabana commented Oct 10, 2024

Would that be something like:

https://kuma.io/docs/dev/policies/meshcircuitbreaker/

      outlierDetection:
        panicDisabled: true
        interval: 5s
        baseEjectionTime: 30s
        maxEjectionPercent: 20
        splitExternalAndLocalErrors: true

?

Also from looking at the docs I see no mention of panic, we should talk about it (kumahq/kuma-website#1953).

Is there any possible workaround to set configuration in a way that will avoid entering panic?

@Icarus9913
Copy link
Contributor Author

Is there any possible workaround to set configuration in a way that will avoid entering panic?

During load balancing, Envoy will generally only consider available (healthy or degraded) hosts in an upstream cluster. However, if the percentage of available hosts in the cluster becomes too low, Envoy will disregard health status and balance either amongst all hosts or no hosts.

If you want to avoid entering panic mode, which means the upstream clusters keep in the healthy state and do not turn into unhealth. I guess maybe we just need to disable the Outlier detection ? And the upstream cluster would keep in health even they returned lots of failures.

@Icarus9913
Copy link
Contributor Author

For now, I can just only use the MeshProxyPatch to disable the enovy-panic-threshold

apiVersion: kuma.io/v1alpha1
kind: MeshProxyPatch
metadata:
  name: circuit-breaker
  namespace: kuma-system
spec:
  default:
    appendModifications:
    - cluster:
        jsonPatches:
        - op: add
          path: /common_lb_config
          value:
            healthy_panic_threshold: {}
        match:
          name: httpbin-1_kuma-demo_svc_8000
        operation: Patch
  targetRef:
    kind: Mesh

@jakubdyszkiewicz
Copy link
Contributor

Do we add envoy panic threshold by default even when you don't use MeshCircuitBreaker? @Icarus9913

@lobkovilya
Copy link
Contributor

I think we have it in MeshHealthCheck policy

HealthyPanicThreshold *intstr.IntOrString `json:"healthyPanicThreshold,omitempty"`

@Icarus9913
Copy link
Contributor Author

@jakubdyszkiewicz I believe it's enabled by default in LoadBalancing of Envoy. And I was wondering if we could implement something like what Ilya put above.

https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/panic_threshold

@Icarus9913
Copy link
Contributor Author

Icarus9913 commented Oct 30, 2024

Maybe it's more proper to add a property HealthyPanicThresholdPercent *intstr.IntOrString like we did for the MeshHealthCheck policy rather than just use a bool type to enable/disable it, which lifts the restriction for the user to edit this property and write some explanation for the panic threshold. (If the user doesn't set it, the envoy would treat it with the default value 50%)


We should also check the Istio's CircuitBreaker policy properties to check the differences and its source codes implementations.

@Icarus9913 Icarus9913 changed the title Add one option to disable envoy panic threshold with MeshCircuitBreaker policy Introduce HealthyPanicThreshold for MeshCircuitBreaker policy Oct 31, 2024
@bartsmykla bartsmykla added triage/accepted The issue was reviewed and is complete enough to start working on it and removed triage/pending This issue will be looked at on the next triage meeting labels Nov 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature New feature triage/accepted The issue was reviewed and is complete enough to start working on it
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants