In this notebook, we demonstrate the experimental dask-yt particle loader in ./dask_chunking/gadget_da.py. The dask approach here attempts to wrap the loading and filtering of individual chunks with the dask.delayed operator (to various degrees of success...), resulting in a lazy load of a gadget particle dataset that is automatically parallelized when running a dask.distributed.Client.
from dask_chunking import gadget_da as gda
from dask import compute, visualize
import yt
import numpy as np
First, let's spin up a dask Client:
from dask.distributed import Client
c = Client(threads_per_worker=2,n_workers=4)
c
and import a gadget dataset:
ds = yt.load_sample("snapshot_033")
yt : [WARNING ] 2020-09-29 16:17:10,398 tqdm is not installed, progress bar can not be displayed. yt : [INFO ] 2020-09-29 16:17:10,930 Files located at /home/chavlin/hdd/data/yt_data/yt_sample_sets/snapshot_033.tar.gz.untar/snapshot_033/snap_033. yt : [INFO ] 2020-09-29 16:17:10,931 Default to loading snap_033.0.hdf5 for snapshot_033 dataset yt : [INFO ] 2020-09-29 16:17:10,992 Parameters: current_time = 4.343952725460923e+17 s
At present, we're going to use yt's current chunking methods to distribute each base chunk (composed of a single hdf file with start and end indices within it) to a dask-delayed read function. We'll be using the experimental delayed_gadget class within dask_chunking/gadget_da.py, which we initialize with the ds object, a dictionary of particle types and fields to import (ptf), and an optional subchunk_size (explained later):
ptf = {'PartType0': ['Mass']}
delayed_reader = gda.delayed_gadget(ds, ptf, subchunk_size = None)
yt : [INFO ] 2020-09-29 16:17:11,050 Allocating for 4.194e+06 particles Loading particle index: 100%|██████████| 12/12 [00:00<00:00, 190.07it/s]
Within the initialization, this class will call stage_chunks(), which is where the dask magic happens:
self.delayed_chunks = []
for df in self.data_files:
    # dask.delayed will need to serialize all objects passed. It can't handle the df object, so
    # let's just pull out what we need for this chunk:
    df_dict = {key: getattr(df, key) for key in ['filename', 'start', 'end', 'total_particles']}
    df_dict['var_mass'] = self.var_mass
    df_dict['_element_names'] = self._element_names
    self.delayed_chunks.append(delayed_chunk_read(self.ptf, df_dict, self.subchunk_size))
Here, we're assembling a list of delayed objects. A single delayed_chunk_read will read in a single chunk and is decorated with the @dask.delayed decorator to signal to dask that we want to delay this function:
@dask.delayed
def delayed_chunk_read(ptf, df_dict, subchunk_size):
    return chunk_reader(ptf, df_dict, subchunk_size).read()
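For readers new to dask.delayed, the laziness can be seen in a toy version of this pattern. This is only a sketch: fake_chunk_read is a hypothetical stand-in for delayed_chunk_read and returns made-up values instead of reading an hdf file.

```python
import dask


@dask.delayed
def fake_chunk_read(chunk_id):
    # Hypothetical stand-in for reading one chunk from disk.
    return [chunk_id * 10 + i for i in range(3)]


# Calling the decorated function only builds Delayed objects; nothing runs yet.
delayed_chunks = [fake_chunk_read(i) for i in range(4)]

# The work happens here. Without a distributed Client, dask falls back to a
# local scheduler, so this sketch also runs outside a cluster.
results = dask.compute(*delayed_chunks)
```

compute returns a tuple with one entry per delayed object, in the same order as the input list.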
The chunk_reader object is a class for loading a single chunk, composed of code from the gadget frontend hdf loader.
Note one very important point: all arguments to the delayed function must be pickle-serializable, as dask will pickle the arguments to send to the different processors on execution. This is why the second argument to delayed_chunk_read is df_dict: self.data_files is a list of data file objects such as ds.index.data_files[0] (each is a yt.data_objects.static_output.ParticleFile), and var_mass and _element_names are attributes of ds.index.io. So far, the built-in dask pickling of yt objects has failed (due to cython issues; we'll touch on this more below when we get to filtering), but at this point we just need some pretty basic data (e.g., the filename and the start and end indices in each file), so we store that info in a dict to pass to delayed_chunk_read.
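The same idea can be tried without yt or dask. In the sketch below, FakeDataFile is a hypothetical class that holds a thread lock so that pickling it fails; it is not meant to reproduce why the real yt object fails, only to show the workaround of copying plain attributes into a dict.

```python
import pickle
import threading


class FakeDataFile:
    """Hypothetical stand-in for an object that cannot be pickled."""

    def __init__(self, filename, start, end, total_particles):
        self.filename = filename
        self.start = start
        self.end = end
        self.total_particles = total_particles
        self._lock = threading.Lock()  # lock objects cannot be pickled


df = FakeDataFile("snap_033.0.hdf5", 0, 419, 419)

try:
    pickle.dumps(df)
    df_is_picklable = True
except (TypeError, pickle.PicklingError):
    df_is_picklable = False

# Copy only the plain attributes that the reader needs into a dict.
keys = ["filename", "start", "end", "total_particles"]
df_dict = {key: getattr(df, key) for key in keys}

# A dict of strings and ints round-trips through pickle without trouble.
restored = pickle.loads(pickle.dumps(df_dict))
```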
OK, so after initializing our delayed_reader object, we'll have a list of delayed_chunks:
delayed_reader.delayed_chunks
[Delayed('delayed_chunk_read-e80b8a5d-1f8f-4076-a36c-31e32767b910'), Delayed('delayed_chunk_read-330489c6-6ce8-4469-b3b5-82023af5cfd4'), Delayed('delayed_chunk_read-0d7fc985-65f0-4807-9b34-c654e9fe80a0'), Delayed('delayed_chunk_read-ef7075a8-c830-45e1-976f-8b6eb3ae67b4'), Delayed('delayed_chunk_read-9dbaba38-5522-49fd-bec1-cddb671fe25d'), Delayed('delayed_chunk_read-6f2f4385-b14c-4988-9ef0-335659a60e40'), Delayed('delayed_chunk_read-788e2d05-e7ca-4a1a-ae54-366a634710bf'), Delayed('delayed_chunk_read-6c6f217e-2cbc-4417-9743-e03a46f17df0'), Delayed('delayed_chunk_read-bbd70b46-8cb5-47de-ab01-bc6d6228988d'), Delayed('delayed_chunk_read-38554e82-f6a0-49ce-a80f-879fedd8bc1f'), Delayed('delayed_chunk_read-387e6bb1-b39b-4ebe-b0e1-5f9eb8956adf'), Delayed('delayed_chunk_read-db32222e-dfcf-442e-9493-4fde741b085a')]
A single delayed task can be visualized:
delayed_reader.delayed_chunks[0].visualize()
and then computed:
data = delayed_reader.delayed_chunks[0].compute()
data
({'PartType0': array([[ 9.0948925 , 18.5401268 , 13.50576115], [ 9.09940147, 18.55133247, 13.51190567], [ 9.08276558, 18.54066086, 13.49641895], ..., [ 9.94860458, 8.4767704 , 14.56663513], [ 9.94866085, 8.47825813, 14.56705093], [ 9.94791031, 8.47807693, 14.56690121]])}, {('PartType0', 'Mass'): array([0.01576188, 0.01663664, 0.01871505, 0.00970176, 0.01931598, 0.01710295, 0.01869999, 0.00868698, 0.01191492, 0.01526247, 0.00870004, 0.00864736, 0.00866873, 0.01210038, 0.00866515, 0.00865903, 0.00893035, 0.00864386, 0.00864437, 0.0095512 , 0.009749 , 0.00864466, 0.00952867, 0.00864396, 0.00865185, 0.00864417, 0.00865002, 0.00864616, 0.00864787, 0.0086467 , 0.00864758, 0.00865183, 0.00864775, 0.00939687, 0.0086449 , 0.00871795, 0.01162997, 0.01007061, 0.00868322, 0.01283415, 0.00864652, 0.00864493, 0.00864921, 0.00864537, 0.00867223, 0.00864423, 0.00864423, 0.00864427, 0.00864402, 0.00864405, 0.00864469, 0.00864398, 0.00864398, 0.00864617, 0.00865819, 0.00864479, 0.00865467, 0.00865023, 0.00864672, 0.00864549, 0.0086541 , 0.00864614, 0.0086621 , 0.00864856, 0.00866107, 0.00864497, 0.00864437, 0.00872628, 0.00865319, 0.00864782, 0.00867574, 0.00865113, 0.00864541, 0.00864583, 0.00865423, 0.00864672, 0.0094079 , 0.00865236, 0.00865042, 0.00864537, 0.0095771 , 0.00864497, 0.0086466 , 0.00864951, 0.00864564, 0.00864719, 0.00973641, 0.01123166, 0.0086561 , 0.00865673, 0.00867016, 0.00866977, 0.00865894, 0.00864647, 0.00864547, 0.00864431, 0.0111323 , 0.00864911, 0.00864662, 0.00868544, 0.00871773, 0.00865634, 0.0086528 , 0.0086698 , 0.00867322, 0.01034854, 0.00867792, 0.00864386, 0.0086446 , 0.0086485 , 0.00867851, 0.00864389, 0.00864397, 0.00864393, 0.00865264, 0.0086439 , 0.00864573, 0.00864641, 0.00865118, 0.00884463, 0.00864407, 0.00871034, 0.00865471, 0.00867828, 0.008648 , 0.00864427, 0.00868855, 0.00864828, 0.00867095, 0.00866964, 0.00866721, 0.00865462, 0.0086743 , 0.00866969, 0.00864895, 0.00865275, 0.00864571, 0.00865658, 0.00865294, 0.00867992, 0.00867557, 
0.00868229, 0.00865193, 0.00865912, 0.00886692, 0.00868004, 0.0086862 , 0.019488 , 0.01122443, 0.02183115, 0.00866625, 0.00871245, 0.00867322, 0.00865103, 0.00864984, 0.00868173, 0.00872087, 0.02139982, 0.00966472, 0.01720018, 0.0188228 , 0.00870653, 0.00874153, 0.00868849, 0.02225389, 0.01747124, 0.01951511, 0.010462 , 0.01086188, 0.00889738, 0.02430551, 0.01698648, 0.00949531, 0.00874803, 0.00882819, 0.01417806, 0.0155667 , 0.01420717, 0.00888111, 0.00911644, 0.01714752, 0.01734492, 0.01464341, 0.00908168, 0.02144803, 0.00971754, 0.0148535 , 0.0099078 , 0.01139312, 0.01358491, 0.01857252, 0.01024381, 0.00927314, 0.01057635, 0.00979624, 0.01658888, 0.01408297, 0.00916107, 0.01534646, 0.00989786, 0.01286644, 0.00963002, 0.00885178, 0.00875167, 0.01189861, 0.01009631, 0.00870869, 0.00879007, 0.01150377, 0.00997297, 0.0087672 , 0.00869129, 0.00868801, 0.00867148, 0.01129076, 0.00868363, 0.00864965, 0.01027128, 0.00874708, 0.00876408, 0.0093245 , 0.00882076, 0.00898408, 0.00865777, 0.01579425, 0.00865272, 0.00906334, 0.00910297, 0.00948077, 0.02122764, 0.01258226, 0.02637874, 0.01447314, 0.00951241, 0.0102029 , 0.00874167, 0.00866766, 0.00865206, 0.00881413, 0.00890438, 0.00881805, 0.01753351, 0.01460347, 0.01022284, 0.01129906, 0.00892293, 0.01131135, 0.01149893, 0.00896504, 0.00892386, 0.0088651 , 0.00881195, 0.00879243, 0.00865685, 0.01000845, 0.00872331, 0.00981471, 0.00867905, 0.00896335, 0.00958879, 0.00891158, 0.0094134 , 0.01036824, 0.00924235, 0.00927313, 0.01221033, 0.01330525, 0.02035448, 0.00941199, 0.0100465 , 0.01363682, 0.01297417, 0.01663146, 0.01457284, 0.00903581, 0.0101647 , 0.0157372 , 0.00878596, 0.00913168, 0.0107145 , 0.00977161, 0.00956479, 0.00882804, 0.00884613, 0.01691047, 0.00893549, 0.00903166, 0.00894935, 0.00877558, 0.01156324, 0.00924913, 0.01358276, 0.01894272, 0.00955947, 0.00942666, 0.00875606, 0.00881323, 0.00956746, 0.00883421, 0.01629753, 0.0086669 , 0.00876433, 0.0096875 , 0.01188321, 0.00955884, 0.00873123, 0.00865462, 
0.00867205, 0.00865692, 0.0086477 , 0.00879077, 0.00901418, 0.00893167, 0.00881205, 0.00865754, 0.00935456, 0.00976939, 0.00866351, 0.00867221, 0.0143476 , 0.00867235, 0.00870647, 0.01151083, 0.01523547, 0.00873427, 0.02577072, 0.02243926, 0.00878209, 0.00868718, 0.0086521 , 0.01103147, 0.02126098, 0.00865098, 0.01706288, 0.00923983, 0.01008528, 0.0086549 , 0.0086628 , 0.00865432, 0.00866331, 0.00864874, 0.00865209, 0.02152279, 0.00866388, 0.00865532, 0.00864429, 0.00864659, 0.00868481, 0.01024063, 0.00865658, 0.00865499, 0.00864694, 0.00867339, 0.01285803, 0.00866809, 0.00866094, 0.0086523 , 0.00864416, 0.00865817, 0.00864684, 0.00865339, 0.00864393, 0.00866137, 0.0108283 , 0.00865872, 0.00865904, 0.01000292, 0.00865042, 0.01252308, 0.0156426 , 0.00864781, 0.00865091, 0.00864583, 0.00895159, 0.00867303, 0.01230804, 0.01055389, 0.01355155, 0.01290767, 0.01301 , 0.00865055, 0.00864436, 0.00867914, 0.00864389, 0.00864386, 0.00864974, 0.00864396, 0.0086496 , 0.00864521, 0.00864442, 0.00864768, 0.00864387, 0.00864543, 0.00865471, 0.00864542, 0.00885514, 0.02166013, 0.01593382, 0.03830643, 0.02097851, 0.02344084, 0.02224258, 0.01855106, 0.01791457, 0.01292829, 0.01000241, 0.01100658, 0.01275772, 0.0119765 , 0.00996338, 0.02194469, 0.01048275, 0.01669842, 0.01753752, 0.02722569, 0.01266365, 0.01107106, 0.01257765, 0.01240593], dtype=float32)}, {'PartType0': 419}, {'PartType0': array([0.0362274 , 0.04072027, 0.03312379, 0.02896558, 0.02205147, 0.00877114, 0.02944615, 0.05218949, 0.04479058, 0.04834923, 0.05498597, 0.04909486, 0.06056419, 0.05916185, 0.06536201, 0.0655411 , 0.07402892, 0.08249536, 0.06964301, 0.07268751, 0.07055246, 0.07013194, 0.06199848, 0.0656049 , 0.05608417, 0.06475606, 0.06629061, 0.07440273, 0.06142577, 0.06010155, 0.06914758, 0.07620735, 0.07339235, 0.07072945, 0.06843527, 0.0668426 , 0.07417636, 0.06568062, 0.06325129, 0.05857513, 0.05703314, 0.05622455, 0.06062964, 0.0561012 , 0.06240107, 0.05941654, 0.05907911, 0.06473345, 0.0650272 , 
0.06583549, 0.08296753, 0.07280328, 0.07204369, 0.08683529, 0.08404635, 0.0888662 , 0.08750486, 0.07622379, 0.07694837, 0.08925033, 0.0933044 , 0.09438443, 0.09164637, 0.08713006, 0.08212744, 0.08015949, 0.09371071, 0.08351221, 0.07779962, 0.07042104, 0.07774556, 0.07544592, 0.06854992, 0.07128523, 0.0756572 , 0.07094415, 0.06158028, 0.07236401, 0.07635707, 0.07759096, 0.08299293, 0.07545715, 0.09372611, 0.08793347, 0.08378398, 0.07942198, 0.09281966, 0.08754937, 0.08050711, 0.08336712, 0.09307314, 0.0723382 , 0.07306948, 0.07770362, 0.07114083, 0.07283483, 0.05471307, 0.07389059, 0.06519151, 0.06373846, 0.06376646, 0.07601401, 0.07296657, 0.07870615, 0.08051294, 0.08569593, 0.08913077, 0.07907289, 0.08016539, 0.08965013, 0.07939333, 0.08652394, 0.08368821, 0.09370465, 0.08823942, 0.09525933, 0.08731857, 0.07737144, 0.07954582, 0.07683637, 0.09273514, 0.08429405, 0.07646301, 0.07249814, 0.08083472, 0.06946927, 0.06482527, 0.06914237, 0.04874399, 0.06462259, 0.07467827, 0.07808919, 0.07417724, 0.07948452, 0.07750621, 0.07887023, 0.07469898, 0.0635175 , 0.07534111, 0.06475607, 0.06666034, 0.05508586, 0.07060304, 0.0638115 , 0.05223786, 0.06709495, 0.06123501, 0.06320112, 0.06081865, 0.06781058, 0.076626 , 0.08280164, 0.06663618, 0.07467308, 0.07235598, 0.06975453, 0.06586163, 0.03802168, 0.04314238, 0.05293893, 0.04910813, 0.03643692, 0.03174896, 0.01563721, 0.01464684, 0.00705728, 0.00844126, 0.003637 , 0.00700946, 0.00468495, 0.00468506, 0.00480814, 0.00876328, 0.01127337, 0.01188874, 0.00671419, 0.00639295, 0.0064003 , 0.00590129, 0.00600383, 0.00633316, 0.00669372, 0.00609435, 0.00547613, 0.00497977, 0.00465166, 0.00378585, 0.00572562, 0.00619436, 0.01667102, 0.00620233, 0.00542352, 0.00491 , 0.0037913 , 0.00595162, 0.00387002, 0.00459971, 0.00545381, 0.00708293, 0.00773352, 0.00602002, 0.00624634, 0.00616982, 0.00828608, 0.00708508, 0.00467211, 0.0057542 , 0.00875203, 0.00923868, 0.00805847, 0.0077056 , 0.00748745, 0.00805753, 0.01527414, 0.01525134, 0.0167823 , 
0.01676001, 0.00828551, 0.00676108, 0.00768243, 0.00647049, 0.00640104, 0.0313032 , 0.03243444, 0.03455401, 0.03763789, 0.01178556, 0.00573116, 0.00720229, 0.00811889, 0.01672093, 0.01295587, 0.01542056, 0.01643852, 0.01129668, 0.00957788, 0.01151714, 0.03219613, 0.03451392, 0.01773368, 0.01375548, 0.01400819, 0.00457728, 0.00552251, 0.00548933, 0.00564196, 0.00541974, 0.00687447, 0.00584219, 0.01065555, 0.00739247, 0.00890108, 0.00761655, 0.01243902, 0.0103886 , 0.00894922, 0.00662579, 0.01072801, 0.01050928, 0.00722813, 0.00763576, 0.00589884, 0.00546007, 0.00556586, 0.00624686, 0.00556092, 0.00500055, 0.00548144, 0.00415254, 0.00414063, 0.00410838, 0.00407228, 0.00462903, 0.00576377, 0.00424026, 0.00496545, 0.00462801, 0.00410257, 0.00458128, 0.00471121, 0.00435966, 0.00414989, 0.00427084, 0.00470036, 0.00563183, 0.00603742, 0.0070658 , 0.01065387, 0.00656062, 0.00464363, 0.00543656, 0.00482806, 0.006045 , 0.0047382 , 0.00456503, 0.00455985, 0.00497938, 0.00526886, 0.0059534 , 0.00542501, 0.00593828, 0.00441934, 0.00679043, 0.00768412, 0.00478962, 0.00497975, 0.0096066 , 0.00931533, 0.00647827, 0.01122897, 0.00814056, 0.00886422, 0.0085073 , 0.00892586, 0.01026801, 0.00671302, 0.01081401, 0.01183196, 0.0130468 , 0.01662675, 0.02419041, 0.01684287, 0.0181704 , 0.01506921, 0.01469795, 0.01510171, 0.01064657, 0.00912229, 0.01318832, 0.02469332, 0.02548047, 0.02698232, 0.03244451, 0.01577784, 0.03815224, 0.03508397, 0.04572237, 0.04062009, 0.03922674, 0.04328152, 0.04045059, 0.03814987, 0.03539773, 0.03825895, 0.05366944, 0.05259725, 0.04932721, 0.04996442, 0.05926653, 0.05825708, 0.04762925, 0.05079945, 0.05384446, 0.0520142 , 0.05647924, 0.06862377, 0.082898 , 0.06551883, 0.06457354, 0.06285927, 0.05532062, 0.06260139, 0.06386063, 0.05833791, 0.04712606, 0.05040953, 0.03517448, 0.03794625, 0.02638085, 0.04859691, 0.05473974, 0.05040754, 0.05122105, 0.0418124 , 0.0412233 , 0.02709608, 0.03020647, 0.03555377, 0.05101901, 0.04354959, 0.04748858, 0.05974634, 
0.04787937, 0.0653081 , 0.07386712, 0.05966709, 0.06532861, 0.07046203, 0.0742046 , 0.07327932, 0.07168894, 0.07655903, 0.08079273, 0.08997273, 0.08147647, 0.00367353, 0.00323596, 0.00556057, 0.00267067, 0.00220054, 0.00235112, 0.00239005, 0.00257686, 0.00271898, 0.00263264, 0.00296338, 0.00251245, 0.00264074, 0.00287286, 0.00268097, 0.00183029, 0.00230579, 0.00183512, 0.00167413, 0.00162726, 0.00196856, 0.00309351, 0.00218892, 0.00213684])})
data[0]
{'PartType0': array([[ 9.0948925 , 18.5401268 , 13.50576115], [ 9.09940147, 18.55133247, 13.51190567], [ 9.08276558, 18.54066086, 13.49641895], ..., [ 9.94860458, 8.4767704 , 14.56663513], [ 9.94866085, 8.47825813, 14.56705093], [ 9.94791031, 8.47807693, 14.56690121]])}
The returned object is a tuple of dicts containing the coordinates (data[0]), field values (data[1]), total particles (data[2]) and smoothing array (data[3]), organized by particle type.
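As a small illustration of that layout, the sketch below builds a tuple with made-up values and unpacks it. The names coords, fields, counts and hsml are chosen here for readability and are not taken from gadget_da.py. Note that, as in the output above, the field dict is keyed by (particle type, field) tuples while the other three dicts are keyed by particle type alone.

```python
import numpy as np

# Made-up values laid out like the tuple described above.
chunk = (
    {"PartType0": np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]])},      # coordinates
    {("PartType0", "Mass"): np.array([0.5, 1.5], dtype=np.float32)},  # field values
    {"PartType0": 2},                                                 # particle count
    {"PartType0": np.array([0.1, 0.2])},                              # smoothing array
)

coords, fields, counts, hsml = chunk
mass = fields[("PartType0", "Mass")]
n_particles = counts["PartType0"]
```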
Now let's visualize the full list of delayed objects using dask.visualize:
visualize(*delayed_reader.delayed_chunks)
which shows that each process is independent (good for parallelizing!) and we can bring the entire dataset into memory with:
%%time
all_data = compute(*delayed_reader.delayed_chunks)
CPU times: user 76.2 ms, sys: 68.6 ms, total: 145 ms Wall time: 1.97 s
all_data[:2]
(({'PartType0': array([[ 9.0948925 , 18.5401268 , 13.50576115], [ 9.09940147, 18.55133247, 13.51190567], [ 9.08276558, 18.54066086, 13.49641895], ..., [ 9.94860458, 8.4767704 , 14.56663513], [ 9.94866085, 8.47825813, 14.56705093], [ 9.94791031, 8.47807693, 14.56690121]])}, {('PartType0', 'Mass'): array([0.01576188, 0.01663664, 0.01871505, 0.00970176, 0.01931598, 0.01710295, 0.01869999, 0.00868698, 0.01191492, 0.01526247, 0.00870004, 0.00864736, 0.00866873, 0.01210038, 0.00866515, 0.00865903, 0.00893035, 0.00864386, 0.00864437, 0.0095512 , 0.009749 , 0.00864466, 0.00952867, 0.00864396, 0.00865185, 0.00864417, 0.00865002, 0.00864616, 0.00864787, 0.0086467 , 0.00864758, 0.00865183, 0.00864775, 0.00939687, 0.0086449 , 0.00871795, 0.01162997, 0.01007061, 0.00868322, 0.01283415, 0.00864652, 0.00864493, 0.00864921, 0.00864537, 0.00867223, 0.00864423, 0.00864423, 0.00864427, 0.00864402, 0.00864405, 0.00864469, 0.00864398, 0.00864398, 0.00864617, 0.00865819, 0.00864479, 0.00865467, 0.00865023, 0.00864672, 0.00864549, 0.0086541 , 0.00864614, 0.0086621 , 0.00864856, 0.00866107, 0.00864497, 0.00864437, 0.00872628, 0.00865319, 0.00864782, 0.00867574, 0.00865113, 0.00864541, 0.00864583, 0.00865423, 0.00864672, 0.0094079 , 0.00865236, 0.00865042, 0.00864537, 0.0095771 , 0.00864497, 0.0086466 , 0.00864951, 0.00864564, 0.00864719, 0.00973641, 0.01123166, 0.0086561 , 0.00865673, 0.00867016, 0.00866977, 0.00865894, 0.00864647, 0.00864547, 0.00864431, 0.0111323 , 0.00864911, 0.00864662, 0.00868544, 0.00871773, 0.00865634, 0.0086528 , 0.0086698 , 0.00867322, 0.01034854, 0.00867792, 0.00864386, 0.0086446 , 0.0086485 , 0.00867851, 0.00864389, 0.00864397, 0.00864393, 0.00865264, 0.0086439 , 0.00864573, 0.00864641, 0.00865118, 0.00884463, 0.00864407, 0.00871034, 0.00865471, 0.00867828, 0.008648 , 0.00864427, 0.00868855, 0.00864828, 0.00867095, 0.00866964, 0.00866721, 0.00865462, 0.0086743 , 0.00866969, 0.00864895, 0.00865275, 0.00864571, 0.00865658, 0.00865294, 0.00867992, 0.00867557, 
0.00868229, 0.00865193, 0.00865912, 0.00886692, 0.00868004, 0.0086862 , 0.019488 , 0.01122443, 0.02183115, 0.00866625, 0.00871245, 0.00867322, 0.00865103, 0.00864984, 0.00868173, 0.00872087, 0.02139982, 0.00966472, 0.01720018, 0.0188228 , 0.00870653, 0.00874153, 0.00868849, 0.02225389, 0.01747124, 0.01951511, 0.010462 , 0.01086188, 0.00889738, 0.02430551, 0.01698648, 0.00949531, 0.00874803, 0.00882819, 0.01417806, 0.0155667 , 0.01420717, 0.00888111, 0.00911644, 0.01714752, 0.01734492, 0.01464341, 0.00908168, 0.02144803, 0.00971754, 0.0148535 , 0.0099078 , 0.01139312, 0.01358491, 0.01857252, 0.01024381, 0.00927314, 0.01057635, 0.00979624, 0.01658888, 0.01408297, 0.00916107, 0.01534646, 0.00989786, 0.01286644, 0.00963002, 0.00885178, 0.00875167, 0.01189861, 0.01009631, 0.00870869, 0.00879007, 0.01150377, 0.00997297, 0.0087672 , 0.00869129, 0.00868801, 0.00867148, 0.01129076, 0.00868363, 0.00864965, 0.01027128, 0.00874708, 0.00876408, 0.0093245 , 0.00882076, 0.00898408, 0.00865777, 0.01579425, 0.00865272, 0.00906334, 0.00910297, 0.00948077, 0.02122764, 0.01258226, 0.02637874, 0.01447314, 0.00951241, 0.0102029 , 0.00874167, 0.00866766, 0.00865206, 0.00881413, 0.00890438, 0.00881805, 0.01753351, 0.01460347, 0.01022284, 0.01129906, 0.00892293, 0.01131135, 0.01149893, 0.00896504, 0.00892386, 0.0088651 , 0.00881195, 0.00879243, 0.00865685, 0.01000845, 0.00872331, 0.00981471, 0.00867905, 0.00896335, 0.00958879, 0.00891158, 0.0094134 , 0.01036824, 0.00924235, 0.00927313, 0.01221033, 0.01330525, 0.02035448, 0.00941199, 0.0100465 , 0.01363682, 0.01297417, 0.01663146, 0.01457284, 0.00903581, 0.0101647 , 0.0157372 , 0.00878596, 0.00913168, 0.0107145 , 0.00977161, 0.00956479, 0.00882804, 0.00884613, 0.01691047, 0.00893549, 0.00903166, 0.00894935, 0.00877558, 0.01156324, 0.00924913, 0.01358276, 0.01894272, 0.00955947, 0.00942666, 0.00875606, 0.00881323, 0.00956746, 0.00883421, 0.01629753, 0.0086669 , 0.00876433, 0.0096875 , 0.01188321, 0.00955884, 0.00873123, 0.00865462, 
0.00867205, 0.00865692, 0.0086477 , 0.00879077, 0.00901418, 0.00893167, 0.00881205, 0.00865754, 0.00935456, 0.00976939, 0.00866351, 0.00867221, 0.0143476 , 0.00867235, 0.00870647, 0.01151083, 0.01523547, 0.00873427, 0.02577072, 0.02243926, 0.00878209, 0.00868718, 0.0086521 , 0.01103147, 0.02126098, 0.00865098, 0.01706288, 0.00923983, 0.01008528, 0.0086549 , 0.0086628 , 0.00865432, 0.00866331, 0.00864874, 0.00865209, 0.02152279, 0.00866388, 0.00865532, 0.00864429, 0.00864659, 0.00868481, 0.01024063, 0.00865658, 0.00865499, 0.00864694, 0.00867339, 0.01285803, 0.00866809, 0.00866094, 0.0086523 , 0.00864416, 0.00865817, 0.00864684, 0.00865339, 0.00864393, 0.00866137, 0.0108283 , 0.00865872, 0.00865904, 0.01000292, 0.00865042, 0.01252308, 0.0156426 , 0.00864781, 0.00865091, 0.00864583, 0.00895159, 0.00867303, 0.01230804, 0.01055389, 0.01355155, 0.01290767, 0.01301 , 0.00865055, 0.00864436, 0.00867914, 0.00864389, 0.00864386, 0.00864974, 0.00864396, 0.0086496 , 0.00864521, 0.00864442, 0.00864768, 0.00864387, 0.00864543, 0.00865471, 0.00864542, 0.00885514, 0.02166013, 0.01593382, 0.03830643, 0.02097851, 0.02344084, 0.02224258, 0.01855106, 0.01791457, 0.01292829, 0.01000241, 0.01100658, 0.01275772, 0.0119765 , 0.00996338, 0.02194469, 0.01048275, 0.01669842, 0.01753752, 0.02722569, 0.01266365, 0.01107106, 0.01257765, 0.01240593], dtype=float32)}, {'PartType0': 419}, {'PartType0': array([0.0362274 , 0.04072027, 0.03312379, 0.02896558, 0.02205147, 0.00877114, 0.02944615, 0.05218949, 0.04479058, 0.04834923, 0.05498597, 0.04909486, 0.06056419, 0.05916185, 0.06536201, 0.0655411 , 0.07402892, 0.08249536, 0.06964301, 0.07268751, 0.07055246, 0.07013194, 0.06199848, 0.0656049 , 0.05608417, 0.06475606, 0.06629061, 0.07440273, 0.06142577, 0.06010155, 0.06914758, 0.07620735, 0.07339235, 0.07072945, 0.06843527, 0.0668426 , 0.07417636, 0.06568062, 0.06325129, 0.05857513, 0.05703314, 0.05622455, 0.06062964, 0.0561012 , 0.06240107, 0.05941654, 0.05907911, 0.06473345, 0.0650272 , 
0.06583549, 0.08296753, 0.07280328, 0.07204369, 0.08683529, 0.08404635, 0.0888662 , 0.08750486, 0.07622379, 0.07694837, 0.08925033, 0.0933044 , 0.09438443, 0.09164637, 0.08713006, 0.08212744, 0.08015949, 0.09371071, 0.08351221, 0.07779962, 0.07042104, 0.07774556, 0.07544592, 0.06854992, 0.07128523, 0.0756572 , 0.07094415, 0.06158028, 0.07236401, 0.07635707, 0.07759096, 0.08299293, 0.07545715, 0.09372611, 0.08793347, 0.08378398, 0.07942198, 0.09281966, 0.08754937, 0.08050711, 0.08336712, 0.09307314, 0.0723382 , 0.07306948, 0.07770362, 0.07114083, 0.07283483, 0.05471307, 0.07389059, 0.06519151, 0.06373846, 0.06376646, 0.07601401, 0.07296657, 0.07870615, 0.08051294, 0.08569593, 0.08913077, 0.07907289, 0.08016539, 0.08965013, 0.07939333, 0.08652394, 0.08368821, 0.09370465, 0.08823942, 0.09525933, 0.08731857, 0.07737144, 0.07954582, 0.07683637, 0.09273514, 0.08429405, 0.07646301, 0.07249814, 0.08083472, 0.06946927, 0.06482527, 0.06914237, 0.04874399, 0.06462259, 0.07467827, 0.07808919, 0.07417724, 0.07948452, 0.07750621, 0.07887023, 0.07469898, 0.0635175 , 0.07534111, 0.06475607, 0.06666034, 0.05508586, 0.07060304, 0.0638115 , 0.05223786, 0.06709495, 0.06123501, 0.06320112, 0.06081865, 0.06781058, 0.076626 , 0.08280164, 0.06663618, 0.07467308, 0.07235598, 0.06975453, 0.06586163, 0.03802168, 0.04314238, 0.05293893, 0.04910813, 0.03643692, 0.03174896, 0.01563721, 0.01464684, 0.00705728, 0.00844126, 0.003637 , 0.00700946, 0.00468495, 0.00468506, 0.00480814, 0.00876328, 0.01127337, 0.01188874, 0.00671419, 0.00639295, 0.0064003 , 0.00590129, 0.00600383, 0.00633316, 0.00669372, 0.00609435, 0.00547613, 0.00497977, 0.00465166, 0.00378585, 0.00572562, 0.00619436, 0.01667102, 0.00620233, 0.00542352, 0.00491 , 0.0037913 , 0.00595162, 0.00387002, 0.00459971, 0.00545381, 0.00708293, 0.00773352, 0.00602002, 0.00624634, 0.00616982, 0.00828608, 0.00708508, 0.00467211, 0.0057542 , 0.00875203, 0.00923868, 0.00805847, 0.0077056 , 0.00748745, 0.00805753, 0.01527414, 0.01525134, 0.0167823 , 
0.01676001, 0.00828551, 0.00676108, 0.00768243, 0.00647049, 0.00640104, 0.0313032 , 0.03243444, 0.03455401, 0.03763789, 0.01178556, 0.00573116, 0.00720229, 0.00811889, 0.01672093, 0.01295587, 0.01542056, 0.01643852, 0.01129668, 0.00957788, 0.01151714, 0.03219613, 0.03451392, 0.01773368, 0.01375548, 0.01400819, 0.00457728, 0.00552251, 0.00548933, 0.00564196, 0.00541974, 0.00687447, 0.00584219, 0.01065555, 0.00739247, 0.00890108, 0.00761655, 0.01243902, 0.0103886 , 0.00894922, 0.00662579, 0.01072801, 0.01050928, 0.00722813, 0.00763576, 0.00589884, 0.00546007, 0.00556586, 0.00624686, 0.00556092, 0.00500055, 0.00548144, 0.00415254, 0.00414063, 0.00410838, 0.00407228, 0.00462903, 0.00576377, 0.00424026, 0.00496545, 0.00462801, 0.00410257, 0.00458128, 0.00471121, 0.00435966, 0.00414989, 0.00427084, 0.00470036, 0.00563183, 0.00603742, 0.0070658 , 0.01065387, 0.00656062, 0.00464363, 0.00543656, 0.00482806, 0.006045 , 0.0047382 , 0.00456503, 0.00455985, 0.00497938, 0.00526886, 0.0059534 , 0.00542501, 0.00593828, 0.00441934, 0.00679043, 0.00768412, 0.00478962, 0.00497975, 0.0096066 , 0.00931533, 0.00647827, 0.01122897, 0.00814056, 0.00886422, 0.0085073 , 0.00892586, 0.01026801, 0.00671302, 0.01081401, 0.01183196, 0.0130468 , 0.01662675, 0.02419041, 0.01684287, 0.0181704 , 0.01506921, 0.01469795, 0.01510171, 0.01064657, 0.00912229, 0.01318832, 0.02469332, 0.02548047, 0.02698232, 0.03244451, 0.01577784, 0.03815224, 0.03508397, 0.04572237, 0.04062009, 0.03922674, 0.04328152, 0.04045059, 0.03814987, 0.03539773, 0.03825895, 0.05366944, 0.05259725, 0.04932721, 0.04996442, 0.05926653, 0.05825708, 0.04762925, 0.05079945, 0.05384446, 0.0520142 , 0.05647924, 0.06862377, 0.082898 , 0.06551883, 0.06457354, 0.06285927, 0.05532062, 0.06260139, 0.06386063, 0.05833791, 0.04712606, 0.05040953, 0.03517448, 0.03794625, 0.02638085, 0.04859691, 0.05473974, 0.05040754, 0.05122105, 0.0418124 , 0.0412233 , 0.02709608, 0.03020647, 0.03555377, 0.05101901, 0.04354959, 0.04748858, 0.05974634, 
0.04787937, 0.0653081 , 0.07386712, 0.05966709, 0.06532861, 0.07046203, 0.0742046 , 0.07327932, 0.07168894, 0.07655903, 0.08079273, 0.08997273, 0.08147647, 0.00367353, 0.00323596, 0.00556057, 0.00267067, 0.00220054, 0.00235112, 0.00239005, 0.00257686, 0.00271898, 0.00263264, 0.00296338, 0.00251245, 0.00264074, 0.00287286, 0.00268097, 0.00183029, 0.00230579, 0.00183512, 0.00167413, 0.00162726, 0.00196856, 0.00309351, 0.00218892, 0.00213684])}), ({'PartType0': array([[ 7.66010618, 11.80109692, 0.5202446 ], [ 7.65904951, 11.80893993, 0.50201857], [ 7.65729189, 11.80535507, 0.53346169], ..., [ 2.39935374, 4.34765959, 18.09743118], [ 2.38080192, 4.43609762, 18.03917885], [ 2.71342826, 4.45228958, 17.99451065]])}, {('PartType0', 'Mass'): array([0.009135 , 0.00873734, 0.0098292 , ..., 0.00864386, 0.00864386, 0.00864386], dtype=float32)}, {'PartType0': 244445}, {'PartType0': array([0.03090909, 0.02949233, 0.03074851, ..., 0.52712107, 0.57937199, 0.4726522 ])}))
Here's a screenshot of the Task Stream graph on the dask client dashboard from this execution (see the Dask Dashboard walkthrough tutorial for details):
which shows the processor and thread activity throughout the compute(): each row is a different thread (our client is using 4 workers with 2 threads per worker here, so 8 rows) and each chunk is a different delayed_chunk_read.
OK, so now we have a delayed workflow to load in chunks, but we want to be able to do at least two additional processing steps: computing derived quantities and filtering by a selection.
We also want to do the bulk of the computation in parallel without loading the full dataset into memory, meaning we want to string together delayed computations to be executed across the processors. Let's start with the simpler problem: computing derived quantities.
Calculating derived quantities on each chunk is fairly straightforward. Let's write a delayed function that looks up array attributes and simple methods by name, so that we can return values like min(), max(), and size for a given particle type and field:
import dask

@dask.delayed
def npmeth(chunk, ptype_field, meths=['min']):
    results = []
    if len(chunk) > 0:
        if ptype_field in chunk[1].keys():
            x = chunk[1][ptype_field]
            for meth in meths:
                this_meth = getattr(x, meth)
                if callable(this_meth):
                    results.append(this_meth())
                else:
                    results.append(this_meth)
    return results
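The callable check is there because some of the requested names are methods on a numpy array (min, max, sum) while others are plain attributes (size). A minimal sketch of the same lookup on a small made-up array, without dask:

```python
import numpy as np

x = np.array([3.0, 1.0, 2.0])

results = []
for name in ["min", "max", "size"]:
    attr = getattr(x, name)
    # min and max are bound methods and must be called; size is already a value.
    results.append(attr() if callable(attr) else attr)
```

Here results holds the minimum, the maximum and the number of elements, in that order.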
Now we call this method using a single delayed chunk for a given particle type and field:
one_chunk_quantities = npmeth(delayed_reader.delayed_chunks[0],('PartType0','Mass'),meths=['min','max','sum'])
one_chunk_quantities
Delayed('npmeth-5d8b6122-b024-45f6-839b-1fe9e00a890d')
and when we call compute, we'll get the min, max, and sum of ('PartType0','Mass') on a single chunk:
one_chunk_quantities.compute()
[0.008643857, 0.038306426, 4.4755173]
So to compute a derived quantity in parallel, we simply construct a list of delayed objects to calculate the quantity on each chunk and then aggregate over the chunks:
meths = ['min','max','sum']
ptypefield = ('PartType0','Mass')
derived_qs = [npmeth(chunk,ptypefield,meths=meths) for chunk in delayed_reader.delayed_chunks]
derived_qs
[Delayed('npmeth-19dd50f9-17e7-449e-8ca2-acf718a4d2ec'), Delayed('npmeth-4586596f-79f5-4281-b5dd-5e9b1111076a'), Delayed('npmeth-cf78ce48-bd58-4d6c-8a71-842c1aeaa6ed'), Delayed('npmeth-22366299-9e71-4962-b684-74c2c4cf2874'), Delayed('npmeth-c7dfa8f8-9c81-493f-b835-73d065b6b508'), Delayed('npmeth-dcca79f8-3606-4593-9cb9-f2a353349798'), Delayed('npmeth-f6f64f02-8529-43a4-9889-9041dacd03af'), Delayed('npmeth-f75f00c3-0e92-4a6d-bd9f-9d5f69a2d443'), Delayed('npmeth-b830366c-e7e7-48da-9d52-b12fbf7e437f'), Delayed('npmeth-55ab6dba-aab9-4444-8b8a-a3b81952aca3'), Delayed('npmeth-b6b4796b-3b42-4c83-a784-3de0032ffa63'), Delayed('npmeth-6eb54c6b-28cd-49b2-af7e-2860312a8726')]
visualize(derived_qs)
Now we see that after each delayed_chunk_read we'll be applying this npmeth function. So when we call compute:
%%time
computed_derived = compute(*derived_qs)
CPU times: user 30.5 ms, sys: 212 µs, total: 30.8 ms Wall time: 44.8 ms
And we see the additional npmeth calls in the Task Stream:
Some of the loads do not have accompanying reads because not all the chunks contain data for the given particle type, which we can see by checking the full dataset that we've already loaded:
[ad[2]['PartType0'] for ad in all_data]
[419, 244445, 262144, 233206, 239908, 0, 227868, 225819, 255819, 0, 0, 251598]
That, along with the 5-10 ms downtime between some of the tasks, indicates that there's certainly some optimization to focus on, but we'll leave that for later.
So let's return to the derived quantities question: how do we calculate global derived quantities across chunks? In the case of the simple quantities we're calculating here, we can manually aggregate across chunks easily. To calculate a mean across chunks, for example, we can return the sum and count for each chunk and then aggregate:
%%time
meths = ['size','sum']
ptypefield = ('PartType0','Mass')
derived_qs = [npmeth(chunk,ptypefield,meths=meths) for chunk in delayed_reader.delayed_chunks]
derived_qs = compute(*derived_qs)
# collect and compute mean
derived_qs = np.array([l for l in derived_qs if len(l)>0]) # remove empty chunks
global_mean = derived_qs[:,1].sum() / derived_qs[:,0].sum()
global_mean
CPU times: user 25.6 ms, sys: 3.24 ms, total: 28.8 ms Wall time: 45 ms
0.008771999763740388
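The reason for carrying the count along with the sum is that chunks differ in size, so averaging per-chunk means would weight small chunks too heavily. The sketch below uses made-up per-chunk results (not values from this dataset) in the hypothetical order [size, min, max, sum] to show how each global quantity combines:

```python
# Made-up per-chunk results in the order [size, min, max, sum]; an empty list
# stands for a chunk with no particles of the requested type.
per_chunk = [
    [4, 1.0, 1.0, 4.0],
    [],
    [1, 5.0, 5.0, 5.0],
]
non_empty = [r for r in per_chunk if r]

global_count = sum(r[0] for r in non_empty)
global_min = min(r[1] for r in non_empty)   # min of the per-chunk mins
global_max = max(r[2] for r in non_empty)   # max of the per-chunk maxes
global_sum = sum(r[3] for r in non_empty)
global_mean = global_sum / global_count     # 9.0 / 5

# Averaging the per-chunk means weights every chunk equally, which gives a
# different answer when the chunk sizes differ.
mean_of_means = sum(r[3] / r[0] for r in non_empty) / len(non_empty)
```

With these numbers the global mean is 1.8, while the mean of the per-chunk means is 3.0.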
The derived quantities could be more involved, and we'll need to think more carefully about more complex quantities that require data from other chunks during computation, but let's move on to filtering....
What we want to be able to do is to use a yt selection data container, for example:
sp = ds.sphere(ds.domain_center,(2,'code_length'))
and build a delayed process that will return only the subset of coordinates within each chunk falling within the selection. The conceptual approach is simple enough: we'd just write another delayed function to string onto the chunk reading. In practice, though, we run into some difficulty. The object that we really need is sp.selector:
sp.selector
<yt.geometry.selection_routines.SphereSelector at 0x7f94ac423f10>
Now, if we were to do something like:
@dask.delayed
def get_chunk_masks(ptf,chunk,selector):
    ...  # body omitted here
dask will try to serialize the selector object with pickle, which fails:
import pickle
pickle.dumps(sp.selector)
returns the following traceback:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-85-5c86acc10898> in <module>
1 import pickle
----> 2 pickle.dumps(sp.selector)
~/src/yt/yt/geometry/selection_routines.cpython-37m-x86_64-linux-gnu.so in yt.geometry.selection_routines.SphereSelector.__reduce_cython__()
TypeError: no default __reduce__ due to non-trivial __cinit__
which points to an issue in serializing the cython selection routine. When trying to implement the get_chunk_masks function above, dask returns an error message stating that the selector object cannot be serialized (distributed.protocol.core - CRITICAL - Failed to Serialize).
As a first approach, we'll look at what is required to initialize the sphere selector:
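A quick way to find out ahead of time whether an object will hit this problem is to try pickling it before handing it to a delayed function. The sketch below uses only the standard library. `UnpicklableSelector` is a hypothetical stand-in that mimics the error above, and plain `pickle` is only a proxy for what dask's serialization does.

```python
import pickle

class UnpicklableSelector:
    """Hypothetical stand-in for a compiled selector that refuses to pickle."""
    def __reduce__(self):
        raise TypeError("no default __reduce__ due to non-trivial __cinit__")

def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except (TypeError, AttributeError, pickle.PicklingError):
        return False
    return True

results = {
    "selector stand-in": is_picklable(UnpicklableSelector()),
    "plain dict": is_picklable({"center": [12.5, 12.5, 12.5], "radius": 2.0}),
}
print(results)
```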
sel = sp.selector
sel
<yt.geometry.selection_routines.SphereSelector at 0x7f94ac423f10>
ad = ds.all_data()
n0 = 100000
n1 = 500000
hsmls = 0
mask = sel.select_points(
    ad['x'][n0:n1], ad['y'][n0:n1], ad['z'][n0:n1], hsmls
)
print(mask[mask].shape)
(5109,)
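For intuition about the boolean mask that `select_points` returns, here is a NumPy-only sketch of a sphere point selection. It is not yt's implementation: it assumes a non-periodic domain, ignores smoothing lengths, and treats points exactly on the surface as inside. `sphere_mask` is a hypothetical helper, and the positions are synthetic.

```python
import numpy as np

def sphere_mask(x, y, z, center, radius):
    """Boolean mask of points inside a sphere (hypothetical helper, no periodicity)."""
    d2 = (x - center[0]) ** 2 + (y - center[1]) ** 2 + (z - center[2]) ** 2
    return d2 <= radius ** 2

# synthetic positions in a box of side 25, so that the centre sits at 12.5
rng = np.random.default_rng(42)
x, y, z = rng.uniform(0.0, 25.0, size=(3, 1000))

mask = sphere_mask(x, y, z, center=(12.5, 12.5, 12.5), radius=2.0)
n_selected = int(mask.sum())  # same count as mask[mask].shape[0]
```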
The selector attribute is set in YTSelectionContainer, in yt/data_objects/selection_objects/data_selection_objects.py:
@property
def selector(self):
    if self._selector is not None:
        return self._selector
    s_module = getattr(self, "_selector_module", yt.geometry.selection_routines)
    sclass = getattr(s_module, f"{self._type_name}_selector", None)
    if sclass is None:
        raise YTDataSelectorNotImplemented(self._type_name)
    if self._data_source is not None:
        self._selector = compose_selector(
            self, self._data_source.selector, sclass(self)
        )
    else:
        self._selector = sclass(self)
    return self._selector
s_module = yt.geometry.selection_routines
s_module
<module 'yt.geometry.selection_routines' from '/home/chavlin/src/yt/yt/geometry/selection_routines.cpython-37m-x86_64-linux-gnu.so'>
sp._type_name
'sphere'
sclass = s_module.sphere_selector
sclass
yt.geometry.selection_routines.SphereSelector
sp_sel = sclass(sp)
Let's create a MockSphere class that pulls out only the minimal attributes needed to initialize the sphere selector:
class MockDs(object):
    def __init__(self,ds):
        self.domain_left_edge = ds.domain_left_edge
        self.domain_right_edge = ds.domain_right_edge
        self.periodicity = ds.periodicity

class MockSphere(object):
    # a stripped-down sphere that records only the attributes required to initialize the sphere selector object
    def __init__(self,sp):
        self.center = sp.center
        self.radius = sp.radius
        self.ds = MockDs(sp.ds)
sp_M = gda.MockSphere(sp)
sp_M.center
unyt_array([12.5, 12.5, 12.5], 'code_length')
Now let's initialize our sphere selector with the mock class:
sel_M = sclass(sp_M)
n0 = 100000
n1 = 500000
ad = ds.all_data()
mask_2 = sel_M.select_points(
    ad['x'][n0:n1], ad['y'][n0:n1], ad['z'][n0:n1], hsmls
)
print(mask_2[mask_2].shape)
(5109,)
np.all(mask_2==mask)
True
So it looks like our sphere selector built from the mock class is behaving as expected, and our sp_M object should be easily pickleable:
import pickle
sp_M_pi = pickle.dumps(sp_M)
which can be loaded back in (which dask does behind the scenes for each processor):
sp_M_unpi = pickle.loads(sp_M_pi) # dask would do this in the backend
So within the delayed function for calculating masks, we can instantiate our selector using the mock class:
sel_M = yt.geometry.selection_routines.SphereSelector(sp_M_unpi)
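The general pattern here is to ship a small picklable description to each worker and rebuild the non-picklable object inside the task. Below is a standard-library sketch of that pattern. `SphereSpec`, `HeavySelector` and `mask_chunk` are hypothetical names for illustration and are not part of yt or `gadget_da`.

```python
import pickle

class SphereSpec:
    """Picklable description of a sphere (hypothetical analogue of MockSphere)."""
    def __init__(self, center, radius):
        self.center = tuple(center)
        self.radius = float(radius)

class HeavySelector:
    """Hypothetical stand-in for the compiled selector; built from a spec."""
    def __init__(self, spec):
        self.center = spec.center
        self.r2 = spec.radius ** 2

    def select_point(self, point):
        return sum((p - c) ** 2 for p, c in zip(point, self.center)) <= self.r2

def mask_chunk(spec_bytes, points):
    """What a worker would do: unpickle the spec, then build the selector locally."""
    selector = HeavySelector(pickle.loads(spec_bytes))
    return [selector.select_point(p) for p in points]

# only the serialized spec crosses the process boundary
spec_bytes = pickle.dumps(SphereSpec((12.5, 12.5, 12.5), 2.0))
mask = mask_chunk(spec_bytes, [(12.5, 12.5, 12.5), (0.0, 0.0, 0.0), (13.0, 12.0, 12.5)])
print(mask)
```

The cost of this design is that the selector is rebuilt for every task, which is presumably cheap for a sphere but may matter for selectors that are expensive to initialize.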
At present, our exploratory delayed_gadget class implements this approach just for the sphere selector (different selector objects require different attributes for initialization). But let's build a new delayed_reader, supplying our mock sphere:
delayed_reader = gda.delayed_gadget(ds, ptf, mock_selector = sp_M, subchunk_size = None)
Now we have two more lists of delayed objects: masks and masked_chunks. The masks are delayed objects that return the boolean masks:
delayed_reader.masks
[Delayed('get_chunk_masks-9ff69aba-e485-45bc-8159-595ef57436c1'), Delayed('get_chunk_masks-561607e7-4c20-4381-90fd-9481053a45b4'), Delayed('get_chunk_masks-bf11afeb-f947-4d7d-8574-c0d886345b75'), Delayed('get_chunk_masks-58f9e8ea-c177-4d07-9a46-49d49da1c0c9'), Delayed('get_chunk_masks-20d22784-944c-40eb-b986-0f7d14e4c77e'), Delayed('get_chunk_masks-c6e13077-e40c-4c32-b985-7230442a05cd'), Delayed('get_chunk_masks-becb6a22-ac27-4b82-8d57-8a5a61071253'), Delayed('get_chunk_masks-563d578f-406d-4f2d-9ab4-47566581f7c4'), Delayed('get_chunk_masks-e4537e80-bbed-40be-94ab-c926f302f2f4'), Delayed('get_chunk_masks-00872ba7-9d46-4f80-91d3-68766b5cf037'), Delayed('get_chunk_masks-1849634a-c79b-4a22-909b-96c07bf098d3'), Delayed('get_chunk_masks-6c2dbfe8-ccd8-4c8f-99a0-3ad575a2b03b')]
chunk_1_mask = delayed_reader.masks[0].compute()
chunk_1_mask
{'PartType0': (None, None)}
So we could compute those masks and apply them to the delayed chunks ourselves, but the masked_chunks list contains delayed functions that load the chunks, compute the masks and then return just the masked values:
masked_chunk_1 = delayed_reader.masked_chunks[0].compute()
masked_chunk_1
({'PartType0': array([[[ 9.0948925 , 18.5401268 , 13.50576115], [ 9.09940147, 18.55133247, 13.51190567], [ 9.08276558, 18.54066086, 13.49641895], ..., [ 9.94860458, 8.4767704 , 14.56663513], [ 9.94866085, 8.47825813, 14.56705093], [ 9.94791031, 8.47807693, 14.56690121]]])}, {('PartType0', 'Mass'): array([[0.01576188, 0.01663664, 0.01871505, 0.00970176, 0.01931598, 0.01710295, 0.01869999, 0.00868698, 0.01191492, 0.01526247, 0.00870004, 0.00864736, 0.00866873, 0.01210038, 0.00866515, 0.00865903, 0.00893035, 0.00864386, 0.00864437, 0.0095512 , 0.009749 , 0.00864466, 0.00952867, 0.00864396, 0.00865185, 0.00864417, 0.00865002, 0.00864616, 0.00864787, 0.0086467 , 0.00864758, 0.00865183, 0.00864775, 0.00939687, 0.0086449 , 0.00871795, 0.01162997, 0.01007061, 0.00868322, 0.01283415, 0.00864652, 0.00864493, 0.00864921, 0.00864537, 0.00867223, 0.00864423, 0.00864423, 0.00864427, 0.00864402, 0.00864405, 0.00864469, 0.00864398, 0.00864398, 0.00864617, 0.00865819, 0.00864479, 0.00865467, 0.00865023, 0.00864672, 0.00864549, 0.0086541 , 0.00864614, 0.0086621 , 0.00864856, 0.00866107, 0.00864497, 0.00864437, 0.00872628, 0.00865319, 0.00864782, 0.00867574, 0.00865113, 0.00864541, 0.00864583, 0.00865423, 0.00864672, 0.0094079 , 0.00865236, 0.00865042, 0.00864537, 0.0095771 , 0.00864497, 0.0086466 , 0.00864951, 0.00864564, 0.00864719, 0.00973641, 0.01123166, 0.0086561 , 0.00865673, 0.00867016, 0.00866977, 0.00865894, 0.00864647, 0.00864547, 0.00864431, 0.0111323 , 0.00864911, 0.00864662, 0.00868544, 0.00871773, 0.00865634, 0.0086528 , 0.0086698 , 0.00867322, 0.01034854, 0.00867792, 0.00864386, 0.0086446 , 0.0086485 , 0.00867851, 0.00864389, 0.00864397, 0.00864393, 0.00865264, 0.0086439 , 0.00864573, 0.00864641, 0.00865118, 0.00884463, 0.00864407, 0.00871034, 0.00865471, 0.00867828, 0.008648 , 0.00864427, 0.00868855, 0.00864828, 0.00867095, 0.00866964, 0.00866721, 0.00865462, 0.0086743 , 0.00866969, 0.00864895, 0.00865275, 0.00864571, 0.00865658, 0.00865294, 0.00867992, 0.00867557, 
0.00868229, 0.00865193, 0.00865912, 0.00886692, 0.00868004, 0.0086862 , 0.019488 , 0.01122443, 0.02183115, 0.00866625, 0.00871245, 0.00867322, 0.00865103, 0.00864984, 0.00868173, 0.00872087, 0.02139982, 0.00966472, 0.01720018, 0.0188228 , 0.00870653, 0.00874153, 0.00868849, 0.02225389, 0.01747124, 0.01951511, 0.010462 , 0.01086188, 0.00889738, 0.02430551, 0.01698648, 0.00949531, 0.00874803, 0.00882819, 0.01417806, 0.0155667 , 0.01420717, 0.00888111, 0.00911644, 0.01714752, 0.01734492, 0.01464341, 0.00908168, 0.02144803, 0.00971754, 0.0148535 , 0.0099078 , 0.01139312, 0.01358491, 0.01857252, 0.01024381, 0.00927314, 0.01057635, 0.00979624, 0.01658888, 0.01408297, 0.00916107, 0.01534646, 0.00989786, 0.01286644, 0.00963002, 0.00885178, 0.00875167, 0.01189861, 0.01009631, 0.00870869, 0.00879007, 0.01150377, 0.00997297, 0.0087672 , 0.00869129, 0.00868801, 0.00867148, 0.01129076, 0.00868363, 0.00864965, 0.01027128, 0.00874708, 0.00876408, 0.0093245 , 0.00882076, 0.00898408, 0.00865777, 0.01579425, 0.00865272, 0.00906334, 0.00910297, 0.00948077, 0.02122764, 0.01258226, 0.02637874, 0.01447314, 0.00951241, 0.0102029 , 0.00874167, 0.00866766, 0.00865206, 0.00881413, 0.00890438, 0.00881805, 0.01753351, 0.01460347, 0.01022284, 0.01129906, 0.00892293, 0.01131135, 0.01149893, 0.00896504, 0.00892386, 0.0088651 , 0.00881195, 0.00879243, 0.00865685, 0.01000845, 0.00872331, 0.00981471, 0.00867905, 0.00896335, 0.00958879, 0.00891158, 0.0094134 , 0.01036824, 0.00924235, 0.00927313, 0.01221033, 0.01330525, 0.02035448, 0.00941199, 0.0100465 , 0.01363682, 0.01297417, 0.01663146, 0.01457284, 0.00903581, 0.0101647 , 0.0157372 , 0.00878596, 0.00913168, 0.0107145 , 0.00977161, 0.00956479, 0.00882804, 0.00884613, 0.01691047, 0.00893549, 0.00903166, 0.00894935, 0.00877558, 0.01156324, 0.00924913, 0.01358276, 0.01894272, 0.00955947, 0.00942666, 0.00875606, 0.00881323, 0.00956746, 0.00883421, 0.01629753, 0.0086669 , 0.00876433, 0.0096875 , 0.01188321, 0.00955884, 0.00873123, 0.00865462, 
0.00867205, 0.00865692, 0.0086477 , 0.00879077, 0.00901418, 0.00893167, 0.00881205, 0.00865754, 0.00935456, 0.00976939, 0.00866351, 0.00867221, 0.0143476 , 0.00867235, 0.00870647, 0.01151083, 0.01523547, 0.00873427, 0.02577072, 0.02243926, 0.00878209, 0.00868718, 0.0086521 , 0.01103147, 0.02126098, 0.00865098, 0.01706288, 0.00923983, 0.01008528, 0.0086549 , 0.0086628 , 0.00865432, 0.00866331, 0.00864874, 0.00865209, 0.02152279, 0.00866388, 0.00865532, 0.00864429, 0.00864659, 0.00868481, 0.01024063, 0.00865658, 0.00865499, 0.00864694, 0.00867339, 0.01285803, 0.00866809, 0.00866094, 0.0086523 , 0.00864416, 0.00865817, 0.00864684, 0.00865339, 0.00864393, 0.00866137, 0.0108283 , 0.00865872, 0.00865904, 0.01000292, 0.00865042, 0.01252308, 0.0156426 , 0.00864781, 0.00865091, 0.00864583, 0.00895159, 0.00867303, 0.01230804, 0.01055389, 0.01355155, 0.01290767, 0.01301 , 0.00865055, 0.00864436, 0.00867914, 0.00864389, 0.00864386, 0.00864974, 0.00864396, 0.0086496 , 0.00864521, 0.00864442, 0.00864768, 0.00864387, 0.00864543, 0.00865471, 0.00864542, 0.00885514, 0.02166013, 0.01593382, 0.03830643, 0.02097851, 0.02344084, 0.02224258, 0.01855106, 0.01791457, 0.01292829, 0.01000241, 0.01100658, 0.01275772, 0.0119765 , 0.00996338, 0.02194469, 0.01048275, 0.01669842, 0.01753752, 0.02722569, 0.01266365, 0.01107106, 0.01257765, 0.01240593]], dtype=float32)})
Now when we compute the full list, dask will distribute the chunk reading and masking to individual processes:
visualize(delayed_reader.masked_chunks)
visualize(delayed_reader.masked_chunks[0])
data_subset = compute(*delayed_reader.masked_chunks)
We now see the reading (purple), the retrieval of the masks (green) and the application of the masks (yellow) in the task stream.
Our data_subset contains only the masked data:
data_subset
(({'PartType0': array([[[ 9.0948925 , 18.5401268 , 13.50576115], [ 9.09940147, 18.55133247, 13.51190567], [ 9.08276558, 18.54066086, 13.49641895], ..., [ 9.94860458, 8.4767704 , 14.56663513], [ 9.94866085, 8.47825813, 14.56705093], [ 9.94791031, 8.47807693, 14.56690121]]])}, {('PartType0', 'Mass'): array([[0.01576188, 0.01663664, 0.01871505, 0.00970176, 0.01931598, 0.01710295, 0.01869999, 0.00868698, 0.01191492, 0.01526247, 0.00870004, 0.00864736, 0.00866873, 0.01210038, 0.00866515, 0.00865903, 0.00893035, 0.00864386, 0.00864437, 0.0095512 , 0.009749 , 0.00864466, 0.00952867, 0.00864396, 0.00865185, 0.00864417, 0.00865002, 0.00864616, 0.00864787, 0.0086467 , 0.00864758, 0.00865183, 0.00864775, 0.00939687, 0.0086449 , 0.00871795, 0.01162997, 0.01007061, 0.00868322, 0.01283415, 0.00864652, 0.00864493, 0.00864921, 0.00864537, 0.00867223, 0.00864423, 0.00864423, 0.00864427, 0.00864402, 0.00864405, 0.00864469, 0.00864398, 0.00864398, 0.00864617, 0.00865819, 0.00864479, 0.00865467, 0.00865023, 0.00864672, 0.00864549, 0.0086541 , 0.00864614, 0.0086621 , 0.00864856, 0.00866107, 0.00864497, 0.00864437, 0.00872628, 0.00865319, 0.00864782, 0.00867574, 0.00865113, 0.00864541, 0.00864583, 0.00865423, 0.00864672, 0.0094079 , 0.00865236, 0.00865042, 0.00864537, 0.0095771 , 0.00864497, 0.0086466 , 0.00864951, 0.00864564, 0.00864719, 0.00973641, 0.01123166, 0.0086561 , 0.00865673, 0.00867016, 0.00866977, 0.00865894, 0.00864647, 0.00864547, 0.00864431, 0.0111323 , 0.00864911, 0.00864662, 0.00868544, 0.00871773, 0.00865634, 0.0086528 , 0.0086698 , 0.00867322, 0.01034854, 0.00867792, 0.00864386, 0.0086446 , 0.0086485 , 0.00867851, 0.00864389, 0.00864397, 0.00864393, 0.00865264, 0.0086439 , 0.00864573, 0.00864641, 0.00865118, 0.00884463, 0.00864407, 0.00871034, 0.00865471, 0.00867828, 0.008648 , 0.00864427, 0.00868855, 0.00864828, 0.00867095, 0.00866964, 0.00866721, 0.00865462, 0.0086743 , 0.00866969, 0.00864895, 0.00865275, 0.00864571, 0.00865658, 0.00865294, 0.00867992, 0.00867557, 
0.00868229, 0.00865193, 0.00865912, 0.00886692, 0.00868004, 0.0086862 , 0.019488 , 0.01122443, 0.02183115, 0.00866625, 0.00871245, 0.00867322, 0.00865103, 0.00864984, 0.00868173, 0.00872087, 0.02139982, 0.00966472, 0.01720018, 0.0188228 , 0.00870653, 0.00874153, 0.00868849, 0.02225389, 0.01747124, 0.01951511, 0.010462 , 0.01086188, 0.00889738, 0.02430551, 0.01698648, 0.00949531, 0.00874803, 0.00882819, 0.01417806, 0.0155667 , 0.01420717, 0.00888111, 0.00911644, 0.01714752, 0.01734492, 0.01464341, 0.00908168, 0.02144803, 0.00971754, 0.0148535 , 0.0099078 , 0.01139312, 0.01358491, 0.01857252, 0.01024381, 0.00927314, 0.01057635, 0.00979624, 0.01658888, 0.01408297, 0.00916107, 0.01534646, 0.00989786, 0.01286644, 0.00963002, 0.00885178, 0.00875167, 0.01189861, 0.01009631, 0.00870869, 0.00879007, 0.01150377, 0.00997297, 0.0087672 , 0.00869129, 0.00868801, 0.00867148, 0.01129076, 0.00868363, 0.00864965, 0.01027128, 0.00874708, 0.00876408, 0.0093245 , 0.00882076, 0.00898408, 0.00865777, 0.01579425, 0.00865272, 0.00906334, 0.00910297, 0.00948077, 0.02122764, 0.01258226, 0.02637874, 0.01447314, 0.00951241, 0.0102029 , 0.00874167, 0.00866766, 0.00865206, 0.00881413, 0.00890438, 0.00881805, 0.01753351, 0.01460347, 0.01022284, 0.01129906, 0.00892293, 0.01131135, 0.01149893, 0.00896504, 0.00892386, 0.0088651 , 0.00881195, 0.00879243, 0.00865685, 0.01000845, 0.00872331, 0.00981471, 0.00867905, 0.00896335, 0.00958879, 0.00891158, 0.0094134 , 0.01036824, 0.00924235, 0.00927313, 0.01221033, 0.01330525, 0.02035448, 0.00941199, 0.0100465 , 0.01363682, 0.01297417, 0.01663146, 0.01457284, 0.00903581, 0.0101647 , 0.0157372 , 0.00878596, 0.00913168, 0.0107145 , 0.00977161, 0.00956479, 0.00882804, 0.00884613, 0.01691047, 0.00893549, 0.00903166, 0.00894935, 0.00877558, 0.01156324, 0.00924913, 0.01358276, 0.01894272, 0.00955947, 0.00942666, 0.00875606, 0.00881323, 0.00956746, 0.00883421, 0.01629753, 0.0086669 , 0.00876433, 0.0096875 , 0.01188321, 0.00955884, 0.00873123, 0.00865462, 
0.00867205, 0.00865692, 0.0086477 , 0.00879077, 0.00901418, 0.00893167, 0.00881205, 0.00865754, 0.00935456, 0.00976939, 0.00866351, 0.00867221, 0.0143476 , 0.00867235, 0.00870647, 0.01151083, 0.01523547, 0.00873427, 0.02577072, 0.02243926, 0.00878209, 0.00868718, 0.0086521 , 0.01103147, 0.02126098, 0.00865098, 0.01706288, 0.00923983, 0.01008528, 0.0086549 , 0.0086628 , 0.00865432, 0.00866331, 0.00864874, 0.00865209, 0.02152279, 0.00866388, 0.00865532, 0.00864429, 0.00864659, 0.00868481, 0.01024063, 0.00865658, 0.00865499, 0.00864694, 0.00867339, 0.01285803, 0.00866809, 0.00866094, 0.0086523 , 0.00864416, 0.00865817, 0.00864684, 0.00865339, 0.00864393, 0.00866137, 0.0108283 , 0.00865872, 0.00865904, 0.01000292, 0.00865042, 0.01252308, 0.0156426 , 0.00864781, 0.00865091, 0.00864583, 0.00895159, 0.00867303, 0.01230804, 0.01055389, 0.01355155, 0.01290767, 0.01301 , 0.00865055, 0.00864436, 0.00867914, 0.00864389, 0.00864386, 0.00864974, 0.00864396, 0.0086496 , 0.00864521, 0.00864442, 0.00864768, 0.00864387, 0.00864543, 0.00865471, 0.00864542, 0.00885514, 0.02166013, 0.01593382, 0.03830643, 0.02097851, 0.02344084, 0.02224258, 0.01855106, 0.01791457, 0.01292829, 0.01000241, 0.01100658, 0.01275772, 0.0119765 , 0.00996338, 0.02194469, 0.01048275, 0.01669842, 0.01753752, 0.02722569, 0.01266365, 0.01107106, 0.01257765, 0.01240593]], dtype=float32)}), ({'PartType0': array([[13.43093872, 11.19056606, 12.9906044 ], [13.43668842, 11.18202591, 13.00140095], [13.44595337, 11.1965723 , 13.00923061], ..., [11.85178089, 10.12987709, 12.46292496], [13.38823414, 11.19161415, 12.92356396], [13.38854218, 11.19038105, 12.92326355]])}, {('PartType0', 'Mass'): array([0.00884863, 0.00882724, 0.00864468, ..., 0.00945342, 0.02861722, 0.01246947], dtype=float32)}), ({'PartType0': array([[13.28926563, 11.24862957, 12.7988348 ], [13.30125618, 11.25296021, 12.79409981], [13.2896452 , 11.23968029, 12.84126377], ..., [13.47682667, 11.07727623, 12.63766289], [13.45073509, 11.04586411, 12.60851288], 
[13.46500587, 11.05492592, 12.64425945]])}, {('PartType0', 'Mass'): array([0.00896836, 0.00868008, 0.00865726, ..., 0.00876143, 0.00889264, 0.0086753 ], dtype=float32)}), ({'PartType0': array([[12.40897083, 11.28258801, 11.81107712], [12.41020012, 11.28173542, 11.81221294], [12.41038513, 11.28312969, 11.81461811], ..., [13.11116409, 11.26555443, 12.98612309], [13.38894558, 11.18954372, 12.92433167], [13.38833237, 11.18933868, 12.92260551]])}, {('PartType0', 'Mass'): array([0.01105031, 0.01259088, 0.01301601, ..., 0.00866676, 0.03573753, 0.0319961 ], dtype=float32)}), ({'PartType0': array([[13.43947315, 11.1077652 , 12.9115057 ], [13.43997383, 11.1078558 , 12.90948868], [13.43338394, 11.11088848, 12.91359043], ..., [13.42452526, 11.17721558, 12.9738245 ], [13.40465546, 11.17043495, 12.97396851], [13.40785408, 11.17961884, 12.97327042]])}, {('PartType0', 'Mass'): array([0.01265707, 0.03210822, 0.0094103 , ..., 0.00870406, 0.00879111, 0.00881443], dtype=float32)}), ({'PartType0': array([], shape=(1, 0, 3), dtype=float64)}, {}), ({'PartType0': array([[13.39521313, 11.19988251, 12.92331028], [13.39372635, 11.20076847, 12.91552067], [13.38780212, 11.19334316, 12.9240427 ], ..., [13.42351151, 11.19343662, 13.0007782 ], [13.40781403, 11.18992615, 12.98962116], [13.39047337, 11.1887331 , 12.92683125]])}, {('PartType0', 'Mass'): array([0.01203093, 0.00882288, 0.01812755, 0.09704798, 0.01313368, 0.03285612, 0.01957441, 0.02674876, 0.01098251, 0.00923867, 0.00899523, 0.00899033, 0.00893849, 0.00891161, 0.00882139, 0.00944025, 0.00870398, 0.00877806, 0.00871626, 0.00870836, 0.00880212, 0.00930818, 0.00886443, 0.00928235, 0.00875006, 0.00870678, 0.00884044, 0.01033679, 0.0088478 , 0.00866768, 0.00874628, 0.0089016 , 0.01014718, 0.00944654, 0.0086832 , 0.00875447, 0.00893992, 0.00884991, 0.0089059 , 0.00875688, 0.00883372, 0.01142054, 0.00956721, 0.01628112, 0.0091363 , 0.00897317, 0.01035234, 0.00872913, 0.00967621, 0.00966599, 0.00869202, 0.01048123, 0.00889965, 0.00958413, 
0.00875794, 0.00909611, 0.00890253, 0.008756 , 0.00880695, 0.00869773, 0.0088725 , 0.00873408, 0.01006394, 0.00963769, 0.01023244, 0.00903017, 0.00868421, 0.00877307, 0.00871271, 0.00869814, 0.00938541, 0.00870607, 0.00874158, 0.00987726, 0.00868072, 0.00889448, 0.00887321, 0.00867826, 0.00907872, 0.00866725, 0.00874164, 0.00882037, 0.00869153, 0.01020411, 0.00866115, 0.00867086, 0.00871024, 0.00880474, 0.00886343, 0.00890287, 0.00865637, 0.00866516, 0.00865121, 0.00864927, 0.00871735, 0.00884125, 0.00926972, 0.0088592 , 0.00873864, 0.00964638, 0.00908021, 0.00883721, 0.00866603, 0.00870935, 0.00867811, 0.00864657, 0.00865668, 0.00878043, 0.00865264, 0.00868016, 0.00866103, 0.00866239, 0.00870312, 0.00865389, 0.0086508 , 0.00989787, 0.00865483, 0.00864632, 0.00867351, 0.00865269, 0.00864494, 0.00864656, 0.00865772, 0.00866503, 0.0086741 , 0.0086517 , 0.00864655, 0.00866203, 0.00864669, 0.00890101, 0.00864809, 0.00961686, 0.00864914, 0.008694 , 0.00868662, 0.00896892, 0.00870718, 0.008682 , 0.00870205, 0.00874101, 0.00873744, 0.00916183, 0.00955473, 0.00873588, 0.00901161, 0.008939 , 0.00880357, 0.00968182, 0.01290366, 0.00881046, 0.00911532, 0.00901558, 0.00882433, 0.00883096, 0.00876668, 0.00905134, 0.00864699, 0.00882495, 0.01073011, 0.00936707, 0.00874935, 0.00907302, 0.00879584, 0.00892107, 0.00866073, 0.00868277, 0.00874908, 0.00869675, 0.00878325, 0.00865738, 0.00933214, 0.00868655, 0.00867728, 0.00879737, 0.00882823, 0.00873398, 0.0087144 , 0.00879571, 0.00914419, 0.00887893, 0.00867258, 0.0091319 , 0.008701 , 0.00893433, 0.00873664, 0.00886017, 0.00876832, 0.00876004, 0.00876043, 0.00874676, 0.00864896, 0.00865531, 0.00876568, 0.00866495, 0.00866952, 0.00869993, 0.00867675, 0.00916402, 0.00905003, 0.00866342, 0.00868235, 0.00896684, 0.00876822, 0.00924851, 0.00867193, 0.00982375, 0.00870061, 0.00865871, 0.00876008, 0.00866068, 0.00864587, 0.00864438, 0.00865156, 0.00864443, 0.00866326, 0.00866363, 0.00866585, 0.00869578, 0.00864615, 0.00865266, 0.00910017, 
0.0087096 , 0.00865182, 0.00866087, 0.00866448, 0.00864516, 0.00865118, 0.00864631, 0.00864779, 0.00864549, 0.00865328, 0.00864611, 0.00864808, 0.00865296, 0.00875413, 0.00869133, 0.00866995, 0.00865028, 0.00901158, 0.0087286 , 0.00864713, 0.00868909, 0.00874365, 0.00864559, 0.00877111, 0.0086518 , 0.00871893, 0.00869103, 0.00871341, 0.00867477, 0.00878077, 0.00876209, 0.00864886, 0.00882232, 0.00868635, 0.0087137 , 0.00869438, 0.00867599, 0.00876006, 0.00867494, 0.00864447, 0.00869653, 0.00864598, 0.00892699, 0.00888338, 0.00959347, 0.00864403, 0.00872051, 0.00868824, 0.00865858, 0.00864397, 0.00864481, 0.00865195, 0.00864596, 0.00876382, 0.00879695, 0.00864803, 0.00868434, 0.00864664, 0.00870971, 0.0086554 , 0.00864534, 0.00864753, 0.00888581, 0.00967348, 0.00883506, 0.00868367, 0.00866216, 0.00871572, 0.00869022, 0.00871313, 0.00874807, 0.00876788, 0.00923704, 0.00871691, 0.00899592, 0.00869392, 0.00875113, 0.00882208, 0.00878247, 0.0087584 , 0.00871955, 0.0086804 , 0.00876417, 0.00869908, 0.00877622, 0.0086981 , 0.00883226, 0.00867931, 0.00883125, 0.00866077, 0.00885404, 0.00871046, 0.00877254, 0.00873536, 0.00877649, 0.00868185, 0.00918181, 0.0086527 , 0.0087715 , 0.00891042, 0.00870223, 0.00868787, 0.00887731, 0.00886218, 0.00880006, 0.00867987, 0.00884889, 0.0088745 , 0.00864824, 0.00865565, 0.0086617 , 0.00867107, 0.00876228, 0.00865519, 0.00864736, 0.00869823, 0.0086486 , 0.00864438, 0.00867763, 0.00865534, 0.00864471, 0.0086469 , 0.00876905, 0.00866138, 0.00864426, 0.0086503 , 0.00866874, 0.00870558, 0.00868498, 0.0092262 , 0.00868068, 0.00867059, 0.00864974, 0.00869882, 0.00885761, 0.00872128, 0.00882509, 0.00865995, 0.0086457 , 0.00865564, 0.00868505, 0.00959161, 0.00892541, 0.00866263, 0.00864892, 0.00878631, 0.00976356, 0.00870429, 0.00881173, 0.00869607, 0.00883245, 0.00866649, 0.00866251, 0.0086637 , 0.00879855, 0.00864934, 0.00866011, 0.00872501, 0.00867285, 0.00868581, 0.00879146, 0.00868083, 0.00894334, 0.00865406, 0.00869168, 0.00867943, 
0.00865867, 0.00873563, 0.00870161, 0.00871691, 0.00940083, 0.00867154, 0.00876287, 0.00882982, 0.00873961, 0.00895397, 0.00866214, 0.00873045, 0.00905387, 0.00866149, 0.00873944, 0.00876095, 0.00864876, 0.00869037, 0.00950852, 0.00865531, 0.00881079, 0.00869398, 0.00887109, 0.00870613, 0.00879226, 0.00902033, 0.01184652, 0.01101478, 0.0150399 , 0.01014263, 0.01091627, 0.00873162, 0.00950557, 0.00880134, 0.00865875, 0.008763 , 0.0091825 , 0.01004407, 0.00892587, 0.00865865, 0.00870995, 0.00865768, 0.00878465, 0.00870174, 0.00873989, 0.0086711 , 0.00895921, 0.00868311, 0.00867799, 0.00867407, 0.00869631, 0.00925512, 0.0090023 , 0.00873555, 0.00866125, 0.00867573, 0.00865635, 0.00870393, 0.00866756, 0.01000759, 0.00868898, 0.0096409 , 0.00864618, 0.00867007, 0.00865202, 0.00866651, 0.00868688, 0.00869466, 0.00896448, 0.00866669, 0.00872678, 0.00866407, 0.00866447, 0.00898271, 0.00904266, 0.00866892, 0.00894086, 0.00874062, 0.00866783, 0.00904381, 0.0087747 , 0.00866502, 0.00907539, 0.00879642, 0.00865868, 0.00889166, 0.00915072, 0.00865272, 0.00864481, 0.00867476, 0.00872816, 0.00878476, 0.00912634, 0.00894622, 0.00882892, 0.00864542, 0.00865445, 0.00864524, 0.00864627, 0.00864562, 0.00868555, 0.00876571, 0.00871714, 0.00872101, 0.00864781, 0.00867894, 0.00864451, 0.00864505, 0.00874473, 0.00865246, 0.00864729, 0.00865292, 0.00874277, 0.00869719, 0.00866853, 0.00869315, 0.01085332, 0.00876114, 0.00968153, 0.00886534, 0.00869854, 0.00873785, 0.00883327, 0.00866658, 0.00897269, 0.01576388, 0.00874511, 0.00878712, 0.00914866, 0.00874413, 0.0086578 , 0.00874343, 0.00882145, 0.00888411, 0.00946762, 0.00874286, 0.00866455, 0.00903582, 0.00865258, 0.00866445, 0.00874822, 0.00884121, 0.0095798 , 0.00866635, 0.00882196, 0.00864599, 0.00864687, 0.008701 , 0.01287949, 0.00886486, 0.00873135, 0.00880416, 0.00864825, 0.00864657, 0.00880771, 0.00895446, 0.00881167, 0.00871066, 0.00869427, 0.00864491, 0.0088043 , 0.00864845, 0.00864613, 0.0086487 , 0.00865456, 0.0086541 , 
0.00865372, 0.00872631, 0.0087514 , 0.06022028], dtype=float32)}), ({'PartType0': array([[12.776124 , 11.70685959, 12.95094872], [12.77933407, 11.65993309, 12.96052647], [12.74176502, 11.59843731, 12.9240551 ], ..., [12.50014687, 11.03889656, 10.83100605], [12.63121414, 11.09275723, 10.91062832], [12.58743954, 10.9895792 , 10.86112881]])}, {('PartType0', 'Mass'): array([0.00864398, 0.00865749, 0.00864481, ..., 0.0088496 , 0.00895973, 0.00864386], dtype=float32)}), ({'PartType0': array([[13.4385376 , 11.04317665, 12.57679367], [13.4385128 , 11.05965042, 12.59449577], [13.47732162, 11.07425117, 12.56700706], ..., [13.38940811, 11.19235039, 12.9252739 ], [13.38981056, 11.19226837, 12.92547035], [13.3897171 , 11.19199467, 12.92544365]])}, {('PartType0', 'Mass'): array([0.0089119 , 0.00896248, 0.00876433, ..., 0.01497461, 0.01153182, 0.02685248], dtype=float32)}), ({'PartType0': array([], shape=(1, 0, 3), dtype=float64)}, {}), ({'PartType0': array([], shape=(1, 0, 3), dtype=float64)}, {}), ({'PartType0': array([[12.57719803, 10.98426628, 10.982687 ], [12.50411797, 11.02719498, 11.00050163], [12.57489491, 10.9623003 , 11.12458515], ..., [13.41440487, 11.1604948 , 12.90496349], [13.42183781, 11.17457771, 12.8966856 ], [13.4048357 , 11.17023563, 12.89961433]])}, {('PartType0', 'Mass'): array([0.00892981, 0.00864386, 0.00864386, ..., 0.00867115, 0.00939662, 0.00985094], dtype=float32)}))
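Since the result comes back as one `(coordinates, fields)` pair per chunk, with an empty field dictionary for chunks that have no selected data, a small helper is handy for stitching a field back together. The sketch below uses synthetic data laid out like the output above. `gather_field` is a hypothetical helper and is not part of `gadget_da`.

```python
import numpy as np

def gather_field(per_chunk_results, key):
    """Concatenate one field across chunks, skipping chunks that returned nothing."""
    parts = [np.ravel(fields[key]) for _, fields in per_chunk_results if key in fields]
    if not parts:
        return np.array([], dtype=np.float64)
    return np.concatenate(parts)

# synthetic results with the same layout as data_subset: (coords dict, fields dict)
key = ("PartType0", "Mass")
fake_subset = (
    ({"PartType0": np.zeros((1, 2, 3))}, {key: np.array([[0.1, 0.2]], dtype=np.float32)}),
    ({"PartType0": np.zeros((1, 0, 3))}, {}),  # chunk with no selected particles
    ({"PartType0": np.zeros((3, 3))}, {key: np.array([0.3, 0.4, 0.5], dtype=np.float32)}),
)
masses = gather_field(fake_subset, key)
```

`np.ravel` is used because the first chunk in the output above carries an extra leading dimension while the others are one-dimensional.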
The [pickleAbleSelects] branch contains a draft of how to make the cython selector objects pickleable. At present, it uses the pickle hooks __setstate__ and __getstate__ to record and apply the selector object hashes. The implementation currently requires all hash attributes to be public cython objects, so right now it only works with the sphere selector. After building that branch, however, we no longer need the MockSphere object and can pass the full selector object:
sp = ds.sphere(ds.domain_center,(2,'code_length'))
delayed_reader = gda.delayed_gadget(ds, ptf, mock_selector = sp.selector, subchunk_size = None)
data_subset = compute(*delayed_reader.masked_chunks)
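For readers unfamiliar with the pickle hooks, here is a pure-Python sketch of how `__getstate__` and `__setstate__` let an object with non-picklable internals survive a round trip. It is an analogue only, not the branch's cython code: `SphereSelectorSketch` is hypothetical, and it records plain attributes rather than the selector hashes the branch uses.

```python
import pickle

class SphereSelectorSketch:
    """Hypothetical pure-Python analogue of a selector with non-picklable internals."""
    def __init__(self, center, radius):
        self.center = tuple(center)
        self.radius = float(radius)
        # stand-in for compiled state that pickle cannot handle (lambdas do not pickle)
        self._inside = lambda d2: d2 <= self.radius ** 2

    def __getstate__(self):
        # record only the attributes needed to rebuild the object
        return {"center": self.center, "radius": self.radius}

    def __setstate__(self, state):
        # rebuild from the recorded attributes, recreating the internals
        self.__init__(state["center"], state["radius"])

    def select_point(self, point):
        d2 = sum((p - c) ** 2 for p, c in zip(point, self.center))
        return self._inside(d2)

original = SphereSelectorSketch((12.5, 12.5, 12.5), 2.0)
restored = pickle.loads(pickle.dumps(original))
```

Without the two hooks, `pickle.dumps(original)` would fail on the lambda stored in `_inside`.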
The base SelectorObject is also pickleable, but the other selector objects will require adjustments depending on decisions for h
What's this subchunk_size? When reading a single chunk, we're reading an index range in a single hdf file. In the delayed_gadget class, if subchunk_size is None, we just read the full index range into a numpy array. If it is not None, the read returns dask arrays split by subchunk_size. So far, this generally slows things down:
delayed_reader = gda.delayed_gadget(ds, ptf, subchunk_size = 10000)
%%time
all_data = compute(*delayed_reader.delayed_chunks)
CPU times: user 56.8 ms, sys: 89.2 ms, total: 146 ms Wall time: 181 ms
which is slower than without the subchunking by about 30 ms. The derived quantity calculation also slows down, as dask has a bit more communication overhead to manage the different chunks:
%%time
meths = ['size','sum']
ptypefield = ('PartType0','Mass')
derived_qs = [npmeth(chunk,ptypefield,meths=meths) for chunk in delayed_reader.delayed_chunks]
derived_qs = compute(*derived_qs)
CPU times: user 53.6 ms, sys: 7.55 ms, total: 61.1 ms Wall time: 82.3 ms
So for the present test gadget dataset and the current exploratory loader, we don't get any speedup (though if we had memory issues when loading single chunks, the subchunking should help with that).
As an additional aside, I made one failed attempt to use dask's hdf dataframe reader to import the gadget hdf files. The underlying pandas hdf reader expects a limited structure for hdf files and can't handle the gadget hdf files, so we need h5py to load and access the data.
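To get a feel for how many extra pieces a given subchunk_size creates, the index range of one chunk can be split as sketched below. This is an assumption about how such a split could work, not the code in `delayed_gadget`, and `subchunk_ranges` is a hypothetical helper.

```python
def subchunk_ranges(start, end, subchunk_size=None):
    """Split the half-open index range [start, end) into pieces of at most subchunk_size."""
    if subchunk_size is None:
        return [(start, end)]  # read the full range at once
    if subchunk_size <= 0:
        raise ValueError("subchunk_size must be positive")
    return [(s, min(s + subchunk_size, end)) for s in range(start, end, subchunk_size)]

# a chunk with 244445 particles, like the second entry in the per-chunk counts listed earlier
pieces = subchunk_ranges(0, 244445, subchunk_size=10000)
print(len(pieces), pieces[0], pieces[-1])
```

With subchunk_size = 10000, that single chunk turns into 25 pieces, each of which the scheduler has to track, which is consistent with the extra overhead seen in the timings.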