-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Bail during start-up if there are old background updates. #16397
base: develop
Are you sure you want to change the base?
Conversation
Putting up for review as I'd be interested in feedback on this. Is this approach worth pursuing? |
One thought I had was that we need to think a bit about how this interacts with security releases, i.e. if we put out a high severity fix would this potentially block large servers from upgrading (if they were still churning through a large BG update)? |
Similarly: suppose we write a bad background update that takes aaaaaaaaaaages to run. Months later we discover this, after we've bumped the schema version. Do we have the ability to patch the background update itself? (See e.g. the rebuild user directory background update, for example) |
I think the response to both those is the same -- you want the minimum background update version to trail ages behind your current release. So it really becomes a process thing. You could still have an issue where a deploy of Synapse is also very far behind and then needs to jump due to security releases though, I don't have a good answer to that.
I think the (annoying) solution you do is a point release before you bump the required version. People can then update to that version with the improved performance. |
|
BACKGROUND_UPDATES_COMPAT_VERSION = ( | ||
# The replace_stream_ordering_column from 6001 must have run. | ||
61 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking on it more, I wonder if this should not be just background updates, but also refuse to start if your database is older than this? Effectively making a minimum supported version to upgrade from?
# The replace_stream_ordering_column from 6001 must have run. | ||
61 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably needs upgrade notes to let folks know to double check this has ran before upgrading?
On startup, if there exists any background update (via the | ||
`background_updates.ordering` column) older than `BACKGROUND_UPDATES_COMPAT_VERSION`, | ||
Synpase will refuse to start. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Annoyingly many updates (especially older ones) are inserted without an ordering
, which defaults to 0
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we might need to somehow enforce that new updates at least have a non-zero ordering, if we're taking this path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we might need to somehow enforce that new updates at least have a non-zero ordering, if we're taking this path.
I agree -- I'm unsure if that's a process thing or a database change. I guess we can update the column to not have a default so people at least need to provide one?
This is useful for adding delta files which assume background updates have | ||
finished; overall maintenance of Synapse (by allowing for removal of code | ||
supporting old background updates); among other things. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the realization today that this also lets us delete the handlers for older background updates, which convinces me more this is a reasonable approach.
I think this is ready for an initial review for real. |
* The Synapse codebase defines a constant `synapse.storage.schema.SCHEMA_COMPAT_VERSION` | ||
which represents the minimum database versions the current code supports. | ||
Whenever the Synapse code is updated to assume backwards-incompatible changes | ||
made to the database format, `synapse.storage.schema.SCHEMA_COMPAT_VERSION` is also updated | ||
so that administrators can not accidentally roll back to a too-old version of Synapse. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that defining this before db.schema_compat_version
is a bit backwards.
# ordering is an int based on when the background update was added: | ||
# | ||
# (schema version when added * 100) + (schema delta when added). | ||
update_schema_version = ordering // 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit concerned that the fact that the ordering is (schema version when added * 100) + N
is a bit of a loose, unenforced, convention.
In many ways I'd find it less confusing if, instead of using a COMPAT_VERSION
following the same system as SCHEMA_VERSION
etc, we instead had a MIN_PENDING_BACKGROUND_UPDATE_ORDER
which we compared with the raw ordering
of the pending updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is definitely a bit loose, using the raw minimum might make more sense.
This is a basic implementation to enforce that some old background updates have run.
My hope is that this could be used to avoid future situations like #16192. This would fix #16047 in a simplistic manner and doesn't attempt to differentiate between different types of background updates.
This still would require us to do thinking about what old updates are now required, but I think it would fix the issue where we have to continually create background jobs due to dependencies.
TODO
ordering
is filled in (non-nullable).Maybe try to fix thatBACKGROUND_UPDATES_COMPAT_VERSION < last full schema dump
?