Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing "FATAL No shared unique key can be found after ALTER! Bailing out" #1452

Open
htmsousa opened this issue Sep 20, 2024 · 0 comments
Open

Comments

@htmsousa
Copy link

htmsousa commented Sep 20, 2024

While using gh-ost, I came across the error "FATAL No shared unique key can be found after ALTER! Bailing out" already reported in issue #647.

Consider the following table:

CREATE TABLE `txn` (
   `txnId` int(10) unsigned NOT NULL AUTO_INCREMENT,
   `submittedDate` date NOT NULL DEFAULT '0000-00-00',
   `merchantId` int(10) unsigned NOT NULL DEFAULT '0',
   `amt` decimal(19,6) DEFAULT NULL,
   PRIMARY KEY (`txnId`,`submittedDate`)
 ) 

If one tries to do the alter table to replace PRIMARY KEY (`txnId`) with PRIMARY KEY (`txnId`,`submittedDate`) with --alter="DROP PRIMARY KEY, ADD PRIMARY KEY (txnId, submittedDate)", gh-ost will abort the operation with the error "FATAL No shared unique key can be found after ALTER! Bailing out"

On the page https://github.com/github/gh-ost/blob/master/doc/shared-key.md it is defined what is allowed and what is not and if this error happens, it is mentioned that a 2-step migration is required.

The example above is one of the "not allowed" cases. Looking at the code, the logic mandated that both original and ghost tables must share at least one "unique key". This makes sense to a certain point, otherwise, inserts in ghost table could fail with duplicate unique keys.

However, if we look at the alter table above, it is not possible to have duplicate unique key errors because the original unique key is still part of the new key. If the data in original table was unique according to the original unique key, it will still be unique if the new unique key in ghost table has the columns of the original unique key, even if the new column in the new unique key can have duplicate values.

In my view, the current code should be changed. Instead of trying to find pairs of same key in both tables, it should check if each unique key in original table exists as a subset of a unique key in the list of unique keys in ghost table.

I created a pull request with the code change to fix this problem.

Note 1:
The proposed 2-step migration for the case above could be ok, but our table has half billion records, consuming 1.5 TB of space. When we run gh-ost on this table, it runs for almost 30 hours.. doing alter table in 2 steps would mean 60 hours migration.

Note 2:
At first glance, the above alter table doesn't look like a real world use case, but we need to do this so we can create partitions on the table using the column submittedDate and that is not possible without it being part of the primary key. The real alter table that we are running with ghost is something like this and with the code fix in the PR it works really well.

--alter="DROP PRIMARY KEY, ADD PRIMARY KEY (txnId, submittedDate), ADD COLUMN(newColumn1 varchar(255) DEFAULT NULL, newColumn2 varchar(255) DEFAULT NULL, ...) PARTITION BY RANGE (TO_DAYS(submittedDate)) (PARTITION p_20220101 VALUES LESS THAN (TO_DAYS('2022-01-01')), PARTITION p_20220102 VALUES LESS THAN (TO_DAYS('2022-01-02')), (...), PARTITION p_20241230 VALUES LESS THAN (TO_DAYS('2024-12-30')), PARTITION p_20241231 VALUES LESS THAN (TO_DAYS('2024-12-31')), PARTITION p_max VALUES LESS THAN MAXVALUE)"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants
@htmsousa and others