update entire initial_size matrix and/or migration rate matrix at once #2196
Replies: 6 comments
-
I'm not sure that we're going to speed things up much by doing that @connor-french, because we're still going to be storing separate objects representing the low-level events (e.g. here). However, I'm sure there must be some low-hanging fruit here perf-wise. Perhaps you could construct a standalone example with similar properties that takes (say) 10 seconds to run. If you post the code (and ideally a timed profile) we can see how to start speeding things up. There's no consideration at all given to performance at the moment, so I'm sure we can improve things.
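For reference, a timed profile of the kind requested above can be collected with the standard library alone. This is only a sketch: `build_events` is a hypothetical stand-in for the slow setup loop (it does not call msprime), and `profile_call` is an illustrative helper name, not part of any library.

```python
import cProfile
import io
import pstats


def build_events(num_demes):
    """Hypothetical stand-in for the slow model-setup loop: one small
    object per (deme, neighbour) pair, mimicking per-event bookkeeping."""
    events = []
    for source in range(num_demes):
        for dest in (source - 1, source + 1):
            if 0 <= dest < num_demes:
                events.append(
                    {"time": 100.0, "source": source, "dest": dest, "rate": 0.01}
                )
    return events


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


events, report = profile_call(build_events, 10_000)
print(report)
```

Swapping `build_events` for the real setup code gives a report that can be pasted into the thread. Calling `dump_stats()` on the profiler object writes a file that snakeviz can open.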
-
Ah, that makes sense. Here is a standalone example that takes ~15 seconds to run, along with a timed profile generated by cProfile. I have a screenshot of my snakeviz visualization below for quick reference. It looks like
[Screenshot: snakeviz visualization of the timed profile]
-
To add a bit more information here: I think this is the use case where there's a fixed sparse pattern of nonzero entries in the migration matrix, and we want to update many of those nonzero entries often.
-
This is very helpful, thanks @connor-french! I think the most useful option for you (which would allow us to optimise quite a bit) would be to make a new method Ideally, this would be passed directly through to the simulator, but this would require some more low-level plumbing. In general, there's a bunch of things we could do to make this application of a large 2D model that changes over time work better - I'm happy to work with you to get those changes into msprime, if you're interested!
-
I think that in cases where it's worth having this method (instead of just updating a bunch of rates individually), the migration matrix will be very sparse. Note that in this case the migration matrix is 1e4 x 1e4, but only ~4e4 entries are nonzero. So - you may be right that this is the way to start, but I'm trying to avoid putting in a bunch of work that doesn't solve the issue.
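To make those numbers concrete, here is a small sketch that counts the nonzero entries for a 100 x 100 grid. It assumes nearest-neighbour (4-neighbour) migration in both directions and row-by-row deme numbering; neither is stated in the thread, and `stepping_stone_edges` is a hypothetical helper name.

```python
def stepping_stone_edges(rows, cols):
    """Return directed (source, dest) pairs between 4-neighbour demes
    on a rows x cols grid, with demes numbered row by row."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            source = r * cols + c
            if c + 1 < cols:  # right-hand neighbour
                dest = source + 1
                edges += [(source, dest), (dest, source)]
            if r + 1 < rows:  # neighbour in the next row
                dest = source + cols
                edges += [(source, dest), (dest, source)]
    return edges


rows = cols = 100
edges = stepping_stone_edges(rows, cols)
num_demes = rows * cols
# Fraction of the full num_demes x num_demes matrix that is nonzero.
density = len(edges) / num_demes**2
print(num_demes, len(edges), density)
```

Under that assumption the grid has 1e4 demes and 39,600 directed neighbour pairs, so roughly 0.04% of the 1e8 matrix entries are nonzero, in line with the ~4e4 figure above.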
-
Agreed that it might still be worth checking which particular rates change, given how sparse the matrices typically are. I can think of a simple way to obtain the indices for rates that need to be updated, and update the migration matrix based on these indices. Maybe the original donor/recipient populations could be backtracked from these indices?
I'm definitely down to work on making these large 2D stepping stone models work better!
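A minimal sketch of the index idea, assuming the rates at two time points are held as sparse maps keyed by (source, dest) deme index and that demes are numbered row by row on the grid. `changed_rates` and `grid_coords` are hypothetical helper names, not msprime functions.

```python
def changed_rates(old, new, tol=0.0):
    """Compare two sparse rate maps {(source, dest): rate} and return the
    entries of `new` that differ from `old`. Missing entries count as 0."""
    changes = {}
    for key in old.keys() | new.keys():
        rate = new.get(key, 0.0)
        if abs(rate - old.get(key, 0.0)) > tol:
            changes[key] = rate
    return changes


def grid_coords(deme, cols):
    """Map a flat deme index back to (row, col) on a row-major grid."""
    return divmod(deme, cols)


cols = 3
old = {(0, 1): 0.01, (1, 0): 0.01, (1, 2): 0.02}
new = {(0, 1): 0.01, (1, 0): 0.05, (4, 1): 0.03}

# Only the pairs whose rate actually changed need an update event.
for (source, dest), rate in sorted(changed_rates(old, new).items()):
    print(grid_coords(source, cols), "->", grid_coords(dest, cols), rate)
```

Each pair returned by `changed_rates` corresponds to one add_migration_rate_change() call, so unchanged entries are skipped entirely. A pair that disappears from the new map is reported with rate 0.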
-
Hello all!
I'm running large 2D stepping stone models (deme arrays ~100 x 100 or larger) that change through time and have come across a time-limiting step when setting up the Demography models. After initializing the Demography object with an initial_size matrix and directly assigning a matrix of migration rates to the model.migration_matrix slot, I need to update the values at different time points in the past.
The current way I'm doing this is to use add_population_parameters_change() to update the initial_sizes and add_migration_rate_change() for the migration rates at discrete time points. However, this takes a very long time when updating many demes and migration rates (~2 minutes for a sparse initial_size array of size 110 x 117, and I stopped tracking time after 20 minutes for a fully filled initial_size array of the same size).
Would it be possible to implement a feature where add_population_parameters_change() and add_migration_rate_change() are able to take arrays as input, so multiple populations and multiple migration rates can be updated at a specific time point? This could drastically speed up model setup for these unreasonably large models I and possibly others are trying to run.
Oh yeah, I meant to loop @petrelharp in on this.