sync etcd endpoints immediately after initializing the client #1573

Open
wants to merge 1 commit into
base: master

@vldmit vldmit commented Feb 13, 2025

Problem Statement

The etcd client runs its first Sync only after AutoSyncInterval (30 seconds) has elapsed, which leaves the tikv client vulnerable if the endpoints provided in addrs fail before that first sync happens. A specific failure scenario:

  1. tidb is initialized with --path=endpoint
  2. tidb successfully establishes a connection to the endpoint
  3. Less than 30 seconds later, the endpoint fails (e.g. the k8s control plane is upgrading the pod)
  4. The etcd client is no longer connected, because the failed endpoint is the only one it knows about
  5. The safe checkpoint expires and CheckVisibility in KVStore starts to error out
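
The scenario above can be illustrated with a toy model. This is only a sketch: the `client` type, its `sync` and `reachable` methods, and the `pd-N:2379` addresses are hypothetical stand-ins, not the real etcd client or anything in tikv/client-go. It compares a client that still knows only its seed endpoint with one that has already learned the full membership.

```go
package main

import "fmt"

// client is a toy model of an etcd client's endpoint list (hypothetical,
// not the real clientv3.Client).
type client struct {
	endpoints []string
}

// sync replaces the known endpoints with the cluster membership,
// which is what a periodic endpoint sync would do.
func (c *client) sync(members []string) {
	c.endpoints = append([]string(nil), members...)
}

// reachable reports whether at least one known endpoint is still alive.
func (c *client) reachable(alive map[string]bool) bool {
	for _, ep := range c.endpoints {
		if alive[ep] {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical three-member cluster; the seed endpoint pd-0 has failed.
	members := []string{"pd-0:2379", "pd-1:2379", "pd-2:2379"}
	alive := map[string]bool{"pd-1:2379": true, "pd-2:2379": true}

	// Client that has not synced yet: it only knows the seed endpoint.
	seedOnly := &client{endpoints: []string{"pd-0:2379"}}

	// Client that synced before the seed endpoint failed.
	synced := &client{endpoints: []string{"pd-0:2379"}}
	synced.sync(members)

	fmt.Println("seed-only client reachable:", seedOnly.reachable(alive))
	fmt.Println("synced client reachable:", synced.reachable(alive))
}
```

In this model the only difference between the two clients is whether a sync happened before the seed endpoint went away, which is the window the PR is concerned with.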

Fix

We explicitly synchronize the client endpoints with the endpoints from the etcd membership during the client initialization phase. The etcd client still performs its periodic sync; we only force the first sync to happen during initialization.

@ti-chi-bot ti-chi-bot bot added the dco-signoff: yes Indicates the PR's author has signed the dco. label Feb 13, 2025

ti-chi-bot bot commented Feb 13, 2025

Welcome @vldmit!

It looks like this is your first PR to tikv/client-go 🎉.

I'm the bot that helps you request reviewers, add labels, and more. See available commands.

We want to make sure your contribution gets all the attention it needs!



Thank you, and welcome to tikv/client-go. 😃

@ti-chi-bot ti-chi-bot bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Feb 13, 2025

ti-chi-bot bot commented Feb 14, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: zyguan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Feb 14, 2025

ti-chi-bot bot commented Feb 14, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-02-14 01:27:25.878162368 +0000 UTC m=+579088.274384427: ☑️ agreed by zyguan.

@ti-chi-bot ti-chi-bot bot added the approved label Feb 14, 2025
Labels
approved dco-signoff: yes Indicates the PR's author has signed the dco. needs-1-more-lgtm Indicates a PR needs 1 more LGTM. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.