Skip to content

Commit

Permalink
Handle transitions differently for MySQL (#399)
Browse files Browse the repository at this point in the history
* Inverse the order of writes on statesman transitions

StatesMan will update the old transitions to `most_recent:
nil` and then insert the new transition as `most_recent: true`.
However that can cause deadlocks on MySQL when running concurrent state
transitions.
On the update, MySQL will hold a gap lock to the unique indexes there are, so
other transactions cannot insert. After two updates on the most_recent
and two inserts, the second insert will fail with a deadlock.
For more explanation see the PR description that includes this commit.

* Make sure the initial most_recent state is false/nil

* Lock-lesser transitions without additional queries

#350

@arthurnn opened #350 to reduce the impact of Statesman taking gap locks
when using MySQL. The crux of the issue is that MySQL's implementation
of REPEATABLE READ can take wide locks when an update touches no rows,
which happens frequently on the first transition of Statesman.

By first creating the new transition, we can avoid issuing an update
that will take the large gap lock. This order of queries meant we added
an additional query to the transition step which could impact people who
rely on low-latency Statesman transitions.

This commit is another take on the same approach that collapses two
queries into one, by taking the update of the old and new transition's
most_recent column and updating them together. It's slightly janky but
if robust, would be a good alternative to avoid additional latency.

* fmt

* Replace Arel#or with raw SQL

* Attempt to fix typecasting

* Fix MySQL issue with ordered updates

This commit first refactors the construction of transition SQL
statements to use Arel. This is safer and hopefully more readable than
constructing large SQL statements using strings.

It also fixes a bug with transition updates where MySQL would throw
index violations. This was caused by MySQL validating index constraints
across a partially applied update, where Statesman would set the latest
transition's most_recent = 't' before unsetting the previous.

* Smooth-over breaking API change in Arel

rails/rails@7508284

When merged with Rails core, Arel removed an initialisation parameter
that was unused in all but a few of the Arel features. For now, provide
a helper local to the ActiveRecord adapter that can construct Manager
classes independent of the API change.

* Handle transitions differently for MySQL
To avoid a gap lock with MySQL, first insert the new transition with most_recent=false and then flip the values for the latest two transitions in order.

* MySQL specific gaplock protection is configurable
Instead of depending on the name of the ActiveRecord adapter to identify if the database is MySQL, gaplock protection is configurable, as suggested by @jurre on #399

* Loosen dependency on pg gem
This reflects a recent change in Rails (rails/rails@592358e)

Co-authored-by: Arthur Neves <[email protected]>
Co-authored-by: Lawrence Jones <[email protected]>
Co-authored-by: Grey Baker <[email protected]>
  • Loading branch information
4 people authored May 12, 2020
1 parent f07b491 commit 77c363c
Show file tree
Hide file tree
Showing 6 changed files with 208 additions and 36 deletions.
5 changes: 5 additions & 0 deletions lib/statesman.rb
Original file line number Diff line number Diff line change
Expand Up @@ -25,9 +25,14 @@ module Adapters
def self.configure(&block)
config = Config.new(block)
@storage_adapter = config.adapter_class
@mysql_gaplock_protection = config.mysql_gaplock_protection
end

def self.storage_adapter
@storage_adapter || Adapters::Memory
end

def self.mysql_gaplock_protection?
@mysql_gaplock_protection
end
end
198 changes: 166 additions & 32 deletions lib/statesman/adapters/active_record.rb
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,6 @@
module Statesman
module Adapters
class ActiveRecord
attr_reader :transition_class
attr_reader :parent_model

JSON_COLUMN_TYPES = %w[json jsonb].freeze

def self.database_supports_partial_indexes?
Expand All @@ -19,6 +16,10 @@ def self.database_supports_partial_indexes?
end
end

def self.adapter_name
::ActiveRecord::Base.connection.adapter_name.downcase
end

def initialize(transition_class, parent_model, observer, options = {})
serialized = serialized?(transition_class)
column_type = transition_class.columns_hash["metadata"].sql_type
Expand All @@ -29,12 +30,15 @@ def initialize(transition_class, parent_model, observer, options = {})
end

@transition_class = transition_class
@transition_table = transition_class.arel_table
@parent_model = parent_model
@observer = observer
@association_name =
options[:association_name] || @transition_class.table_name
end

attr_reader :transition_class, :transition_table, :parent_model

def create(from, to, metadata = {})
create_transition(from.to_s, to.to_s, metadata)
rescue ::ActiveRecord::RecordNotUnique => e
Expand Down Expand Up @@ -65,26 +69,53 @@ def last(force_reload: false)

private

# rubocop:disable Metrics/MethodLength
def create_transition(from, to, metadata)
transition_attributes = { to_state: to,
sort_key: next_sort_key,
metadata: metadata }

transition_attributes[:most_recent] = true

transition = transitions_for_parent.build(transition_attributes)
transition = transitions_for_parent.build(
default_transition_attributes(to, metadata),
)

::ActiveRecord::Base.transaction(requires_new: true) do
@observer.execute(:before, from, to, transition)
unset_old_most_recent
transition.save!

if mysql_gaplock_protection?
# We save the transition first with most_recent falsy, then mark most_recent
# true after to avoid letting MySQL acquire a next-key lock which can cause
# deadlocks.
#
# To avoid an additional query, we manually adjust the most_recent attribute
# on our transition assuming that update_most_recents will have set it to true

transition.save!

unless update_most_recents(transition.id).positive?
raise ActiveRecord::Rollback, "failed to update most_recent"
end

transition.assign_attributes(most_recent: true)
else
update_most_recents
transition.assign_attributes(most_recent: true)
transition.save!
end

@last_transition = transition
@observer.execute(:after, from, to, transition)
add_after_commit_callback(from, to, transition)
end

transition
end
# rubocop:enable Metrics/MethodLength

def default_transition_attributes(to, metadata)
{
to_state: to,
sort_key: next_sort_key,
metadata: metadata,
most_recent: not_most_recent_value,
}
end

def add_after_commit_callback(from, to, transition)
::ActiveRecord::Base.connection.add_transaction_record(
Expand All @@ -98,20 +129,92 @@ def transitions_for_parent
parent_model.send(@association_name)
end

def unset_old_most_recent
most_recent = transitions_for_parent.where(most_recent: true)
# Sets the given transition most_recent = t while unsetting the most_recent of any
# previous transitions.
def update_most_recents(most_recent_id = nil)
update = build_arel_manager(::Arel::UpdateManager)
update.table(transition_table)
update.where(most_recent_transitions(most_recent_id))
update.set(build_most_recents_update_all_values(most_recent_id))

# Check whether the `most_recent` column allows null values. If it
# doesn't, set old records to `false`, otherwise, set them to `NULL`.
#
# Some conditioning here is required to support databases that don't
# support partial indexes. By doing the conditioning on the column,
# rather than Rails' opinion of whether the database supports partial
# indexes, we're robust to DBs later adding support for partial indexes.
if transition_class.columns_hash["most_recent"].null == false
most_recent.update_all(with_updated_timestamp(most_recent: false))
# MySQL will validate index constraints across the intermediate result of an
# update. This means we must order our update to deactivate the previous
# most_recent before setting the new row to be true.
update.order(transition_table[:most_recent].desc) if mysql_gaplock_protection?

::ActiveRecord::Base.connection.update(update.to_sql)
end

def most_recent_transitions(most_recent_id = nil)
if most_recent_id
transitions_of_parent.and(
transition_table[:id].eq(most_recent_id).or(
transition_table[:most_recent].eq(true),
),
)
else
transitions_of_parent.and(transition_table[:most_recent].eq(true))
end
end

def transitions_of_parent
transition_table[parent_join_foreign_key.to_sym].eq(parent_model.id)
end

# Generates update_all Arel values that will touch the updated timestamp (if valid
# for this model) and set most_recent to true only for the transition with a
# matching most_recent ID.
#
# This is quite nasty, but combines two updates (set all most_recent = f, set
# current most_recent = t) into one, which helps improve transition performance
# especially when database latency is significant.
#
# The SQL this can help produce looks like:
#
# update transitions
# set most_recent = (case when id = 'PA123' then TRUE else FALSE end)
# , updated_at = '...'
# ...
#
def build_most_recents_update_all_values(most_recent_id = nil)
[
[
transition_table[:most_recent],
Arel::Nodes::SqlLiteral.new(most_recent_value(most_recent_id)),
],
].tap do |values|
# Only if we support the updated at timestamps should we add this column to the
# update
updated_column, updated_at = updated_column_and_timestamp

if updated_column
values << [
transition_table[updated_column.to_sym],
updated_at,
]
end
end
end

def most_recent_value(most_recent_id)
if most_recent_id
Arel::Nodes::Case.new.
when(transition_table[:id].eq(most_recent_id)).then(db_true).
else(not_most_recent_value).to_sql
else
Arel::Nodes::SqlLiteral.new(not_most_recent_value)
end
end

# Provide a wrapper for constructing an update manager which handles a breaking API
# change in Arel as we move into Rails >6.0.
#
# https://github.com/rails/rails/commit/7508284800f67b4611c767bff9eae7045674b66f
def build_arel_manager(manager)
if manager.instance_method(:initialize).arity.zero?
manager.new
else
most_recent.update_all(with_updated_timestamp(most_recent: nil))
manager.new(::ActiveRecord::Base)
end
end

Expand Down Expand Up @@ -170,7 +273,8 @@ def association_join_primary_key(association)
end
end

def with_updated_timestamp(params)
# updated_column_and_timestamp should return [column_name, value]
def updated_column_and_timestamp
# TODO: Once we've set expectations that transition classes should conform to
# the interface of Adapters::ActiveRecordTransition as a breaking change in the
# next major version, we can stop calling `#respond_to?` first and instead
Expand All @@ -184,15 +288,45 @@ def with_updated_timestamp(params)
ActiveRecordTransition::DEFAULT_UPDATED_TIMESTAMP_COLUMN
end

return params if column.nil?
# No updated timestamp column, don't return anything
return nil if column.nil?

timestamp = if ::ActiveRecord::Base.default_timezone == :utc
Time.now.utc
else
Time.now
end
[
column, ::ActiveRecord::Base.default_timezone == :utc ? Time.now.utc : Time.now
]
end

def mysql_gaplock_protection?
Statesman.mysql_gaplock_protection?
end

def db_true
value = ::ActiveRecord::Base.connection.type_cast(
true,
transition_class.columns_hash["most_recent"],
)
::ActiveRecord::Base.connection.quote(value)
end

def db_false
value = ::ActiveRecord::Base.connection.type_cast(
false,
transition_class.columns_hash["most_recent"],
)
::ActiveRecord::Base.connection.quote(value)
end

params.merge(column => timestamp)
# Check whether the `most_recent` column allows null values. If it doesn't, set old
# records to `false`, otherwise, set them to `NULL`.
#
# Some conditioning here is required to support databases that don't support partial
# indexes. By doing the conditioning on the column, rather than Rails' opinion of
# whether the database supports partial indexes, we're robust to DBs later adding
# support for partial indexes.
def not_most_recent_value
return db_false if transition_class.columns_hash["most_recent"].null == false

nil
end
end

Expand Down
23 changes: 22 additions & 1 deletion lib/statesman/config.rb
Original file line number Diff line number Diff line change
Expand Up @@ -5,14 +5,35 @@

module Statesman
class Config
attr_reader :adapter_class
attr_reader :adapter_class, :mysql_gaplock_protection

def initialize(block = nil)
instance_eval(&block) unless block.nil?
end

def storage_adapter(adapter_class)
# If our adapter class suggests we're using mysql, enable gaplock protection by
# default.
enable_mysql_gaplock_protection if mysql_adapter?(adapter_class)

@adapter_class = adapter_class
end

def enable_mysql_gaplock_protection
@mysql_gaplock_protection = true
end

private

def mysql_adapter?(adapter_class)
adapter_name = adapter_name(adapter_class)
return false unless adapter_name

adapter_name.start_with?("mysql")
end

def adapter_name(adapter_class)
adapter_class.respond_to?(:adapter_name) && adapter_class&.adapter_name
end
end
end
4 changes: 3 additions & 1 deletion spec/statesman/adapters/active_record_queries_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,9 @@ def configure_new(klass, transition_class)
prepare_other_model_table
prepare_other_transitions_table

Statesman.configure { storage_adapter(Statesman::Adapters::ActiveRecord) }
Statesman.configure do
storage_adapter(Statesman::Adapters::ActiveRecord)
end
end

after { Statesman.configure { storage_adapter(Statesman::Adapters::Memory) } }
Expand Down
12 changes: 11 additions & 1 deletion spec/statesman/adapters/active_record_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -9,9 +9,19 @@
before do
prepare_model_table
prepare_transitions_table

MyActiveRecordModelTransition.serialize(:metadata, JSON)

Statesman.configure do
# Rubocop requires described_class to be used, but this block
# is instance_eval'd and described_class won't be defined
# rubocop:disable RSpec/DescribedClass
storage_adapter(Statesman::Adapters::ActiveRecord)
# rubocop:enable RSpec/DescribedClass
end
end

before { MyActiveRecordModelTransition.serialize(:metadata, JSON) }
after { Statesman.configure { storage_adapter(Statesman::Adapters::Memory) } }

let(:observer) { double(Statesman::Machine, execute: nil) }
let(:model) { MyActiveRecordModel.create(current_state: :pending) }
Expand Down
2 changes: 1 addition & 1 deletion statesman.gemspec
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ Gem::Specification.new do |spec|
spec.add_development_dependency "bundler", "~> 2.1.4"
spec.add_development_dependency "gc_ruboconfig", "~> 2.3.9"
spec.add_development_dependency "mysql2", ">= 0.4", "< 0.6"
spec.add_development_dependency "pg", "~> 0.18"
spec.add_development_dependency "pg", ">= 0.18", "<= 1.1"
spec.add_development_dependency "pry"
spec.add_development_dependency "rails", ">= 5.2"
spec.add_development_dependency "rake", "~> 13.0.0"
Expand Down

0 comments on commit 77c363c

Please sign in to comment.