Processing
To provide any benefit, processing must eventually generate results. These can be held in memory (with size bounded by system memory) or written to disc (effectively unbounded).
To allow the writing of unbounded results we can create the "mirror" of an ArrayAdapter:
deferred_save = biggus.deferred_save(my_big_biggus_array, my_hdf5_variable)
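The shape of such a "mirror" might look like the following. This is a hypothetical sketch, not the actual biggus implementation: `DeferredSave` and its `realise` method are illustrative names, and the target is assumed to be anything supporting numpy-style slice assignment (an HDF5/netCDF variable, a memory-mapped file, or a plain ndarray).

```python
import numpy as np

class DeferredSave:
    """Illustrative sketch: pairs a lazy source array with a write target.

    Nothing is computed or written until `realise` is called, and the
    write is streamed in chunks so peak memory is bounded by the chunk
    size, not by the (potentially unbounded) result size.
    """
    def __init__(self, source, target):
        assert source.shape == target.shape
        self.source = source
        self.target = target

    def realise(self, chunk_len=2):
        # Stream the source into the target one leading-axis chunk at a time.
        for start in range(0, self.source.shape[0], chunk_len):
            sl = slice(start, start + chunk_len)
            self.target[sl] = np.asarray(self.source[sl])
```

Because the target only needs `__setitem__` with slices, the same sketch works for on-disc variables and in-memory arrays alike.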
deferred_save2 = biggus.deferred_save(my_other_big_biggus_array, my_other_hdf5_variable)
Having declared multiple save bindings we can then realise them:
biggus.evaluate([deferred_save, deferred_save2])
NB. We need to be able to mix unbounded results with bounded results. For example:
# Derive a big "speed" dataset from two big U and V datasets.
speed = biggus.sqrt(biggus.add(biggus.square(u), biggus.square(v)))
# Save the big "speed" dataset.
speed_save = biggus.deferred_save(speed, hdf5_speed_variable)
# Calculate the small time-mean of the speed.
mean_speed = biggus.mean(speed, axis=0)
mean = biggus.evaluate([speed_save, mean_speed])
# NB. At this point `hdf5_speed_variable` has been filled out, and `mean` is a numpy array.
Open questions:
- How to control the return of a masked/non-masked array?
- Can/should creating a result as a numpy array be regarded as a special case of an unbounded write to a numpy array?
deferred_save3 = biggus.deferred_save(a_small_biggus_array, my_ndarray)
biggus.evaluate([deferred_save3])
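One way to read the second question: a pre-allocated ndarray already supports the same slice-assignment protocol as an HDF5 variable, so an in-memory result could reuse the unbounded write path unchanged. A minimal sketch (`chunked_write` is an illustrative helper, not biggus API):

```python
import numpy as np

# The same chunked-write loop works unchanged whether `target` is an
# on-disc variable or a pre-allocated ndarray, because both support
# numpy-style slice assignment.
def chunked_write(source, target, chunk_len=2):
    for start in range(0, source.shape[0], chunk_len):
        sl = slice(start, start + chunk_len)
        target[sl] = np.asarray(source[sl])

small = np.linspace(0.0, 1.0, 6).reshape(3, 2)
result = np.empty_like(small)   # an "unbounded write" into bounded memory
chunked_write(small, result)
```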
In general, not all evaluations are compatible. For example, two aggregations derived from a common input may require different scan patterns over the source data in order to keep their working state to a manageable size. So the high-level process is:
- Divide the requested expression tree(s) into maximal sub-trees where all the expressions have compatible scan orders. (i.e. It's not just spotting common sources.)
- Execute each expression sub-tree.
- Each source node should provide a preferred dimension order. (e.g. For a simple wrapper around a tzyx netCDF variable the preferred dimension order would just be tzyx.)
- Each aggregation node should combine the input data shape with the aggregation axis and a scratch-space memory threshold to determine any constraints on the input scan pattern. (e.g. A mean-over-E of ETZYX might result in a scan pattern chunk size of just ZYX if TZYX is deemed too large.)
  - Determine the chunk size from the natural dimension order (e.g. ETZYX).
  - Determine the scan order from the remaining dimensions (incl. any partially covered by a chunk), but with the aggregation axis as highest priority.
  - e.g. Aggregating over E of ETZYX => chunk size T[:2]ZYX, scan order TE.
- Element-wise expression nodes are stateless, so they have no effect on the preferred dimension order or the scanning constraints.
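The chunk-size/scan-order rule above can be sketched as follows. This is a hypothetical interpretation, not biggus code: `plan_scan` is an invented name, the scratch threshold is taken as a count of array elements, and the scan order is returned with the aggregation axis varied fastest (last), matching the "scan order TE" example above.

```python
# Hypothetical sketch of the chunking rule described above: given a source
# shape, the aggregation axis, and a scratch-space threshold (in elements),
# choose the largest trailing chunk that fits, then scan the remaining
# dimensions with the aggregation axis varied fastest.
def plan_scan(shape, agg_axis, max_elements):
    other_axes = [axis for axis in range(len(shape)) if axis != agg_axis]
    # Grow the chunk from the trailing dimension inwards while it still
    # fits within the scratch-space threshold.
    chunk = {}
    elements = 1
    for axis in reversed(other_axes):
        if elements * shape[axis] <= max_elements:
            chunk[axis] = shape[axis]          # whole dimension fits
            elements *= shape[axis]
        else:
            partial = max_elements // elements
            if partial:
                chunk[axis] = partial          # dimension partially covered
            break
    # Scan the dimensions not wholly covered by a chunk (including any
    # partially covered one), with the aggregation axis last.
    scan_order = [axis for axis in other_axes if chunk.get(axis) != shape[axis]]
    scan_order.append(agg_axis)
    return chunk, scan_order

# Aggregate over E (axis 0) of ETZYX = (10, 8, 4, 3, 2) with a scratch
# limit of 50 elements: TZYX (192 elements) is too big, so the chunk
# becomes T[:2]ZYX and the scan order is T then E.
chunk, scan_order = plan_scan((10, 8, 4, 3, 2), 0, 50)
```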
Conceptually it might be simplest to create a thread per source node, and a thread for the output of each node, with limited-length queues in between. But this would result in non-deterministic execution.
A deterministic alternative would be to execute an expression tree with a single recursive function. But this would lose the performance benefits of concurrent I/O and calculation.
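The thread-per-node idea can be illustrated with the standard library alone. This is a toy sketch, not biggus internals: one thread plays the source node, another the calculation, and a bounded `queue.Queue` between them stops either side running arbitrarily far ahead.

```python
import queue
import threading

import numpy as np

# A sentinel object marks end-of-stream on the queue.
SENTINEL = None

def producer(chunks, out_q):
    # Source-node thread: push chunks; blocks when the queue is full.
    for chunk in chunks:
        out_q.put(chunk)
    out_q.put(SENTINEL)

def consumer(in_q, results):
    # Calculation thread: consume chunks until the sentinel arrives.
    while True:
        chunk = in_q.get()
        if chunk is SENTINEL:
            break
        results.append(np.square(chunk).sum())  # stand-in calculation

chunks = [np.arange(4) + i for i in range(3)]
q = queue.Queue(maxsize=2)      # limited-length queue between the threads
results = []
t1 = threading.Thread(target=producer, args=(chunks, q))
t2 = threading.Thread(target=consumer, args=(q, results))
t1.start(); t2.start()
t1.join(); t2.join()
```

With a single producer and single consumer the FIFO queue keeps the results in source order; the non-determinism mentioned above arises once multiple worker threads feed or drain shared queues.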