feat: read and write zarr methods to replace read_pickle #1908
Adds:

- `read_zarr` and `write_zarr` methods which fall back to pickle if no zarr store is available.
- Warnings that `read_pickle` methods will be deprecated for future GlacierDirectories.
- Zarr as a core dependency.

Refs: OGGM#1903
Thanks! Looking promising. A few quick thoughts:
Notes based on internal meeting:
Next on agenda.
Agreed.

    model_flowlines: Centerline = Centerline(...)
    write_store(model_flowlines, "model_flowlines")
    model_flowlines: Centerline = read_store("model_flowlines")

where internally

    def write_store(self, obj, filename, filesuffix, **kwargs):
        # converts shapely objects etc. into zarr-compatible types
        data: xr.DataTree = _validate_store(obj)
        # data is an xr.DataTree; group is a str naming the group within the store
        write_zarr(data=data, group=filename, **kwargs)
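As a rough illustration of the validation step described above, here is a minimal sketch of a tagged round-trip conversion using only the standard library. `Line`, `encode` and `decode` are hypothetical stand-ins (for a shapely geometry, for `_validate_store`, and for its inverse on read). This is not the oggm-zarr implementation, and a real version would target `xr.DataTree` rather than plain dicts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """Hypothetical stand-in for a geometry object the store cannot hold directly."""
    coords: tuple  # tuple of (x, y) pairs


def encode(obj):
    """Recursively replace unsupported objects with tagged plain structures."""
    if isinstance(obj, Line):
        # The tag records the original type so that decode() can restore it.
        return {"__type__": "Line", "coords": [list(xy) for xy in obj.coords]}
    if isinstance(obj, dict):
        return {key: encode(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Note: plain tuples are stored as lists and come back as lists.
        return [encode(value) for value in obj]
    return obj  # numbers, strings and None pass through unchanged


def decode(data):
    """Inverse of encode(): rebuild tagged structures into their original type."""
    if isinstance(data, dict):
        if data.get("__type__") == "Line":
            return Line(tuple(tuple(xy) for xy in data["coords"]))
        return {key: decode(value) for key, value in data.items()}
    if isinstance(data, list):
        return [decode(value) for value in data]
    return data


flowline = {"line": Line(((0.0, 0.0), (1.0, 2.0))), "dx": 100.0}
stored = encode(flowline)    # contains only dicts, lists and floats
restored = decode(stored)    # the Line object is rebuilt from its tag
```

Keeping the conversion in one pair of functions means callers keep working with the types they already expect, which is the point of doing it inside `write_store` and `read_store`.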
Zarr-related code will be placed under |
This PR currently adds:

- `read_store`, `read_zarr` and `write_zarr` methods which fall back to pickle if no zarr store is available

Refs: #1903
Points for discussion

- All pickles are now stored as `xr.DataTree`s in a single zarr file, which I'm currently naming "data_store". The filename in `read_zarr(filename)` therefore reads the group, rather than the entire zarr file. This avoids having multiple small zarr files, and we can maintain cross-compatibility with DTCG.
- The default replacement for `read_pickle` is now `read_store`, which handles both pickle (`read_pickle`) and zarr (`read_zarr`). `read_store` falls back to `read_pickle` if no zarr store exists. This preserves backwards compatibility with existing gdirs if a user selects an older URL.
- Zarr does not support all the data structures used by certain pickles (e.g. shapely objects). oggm-zarr converts these to a compatible type. When using `read_store`, these objects are converted back into the type expected by OGGM. This should minimise rewrites across the rest of the codebase.
- I'd like to upload a small sample zarr to oggm-sample-data, so this can be included in `init_hef`, and I can then add tests for reading a zarr from a gdir directly.
- I'm currently keeping code for converting from pickle to zarr as a dual-licensed package, as some of this includes code from DTCG which is not compatible with OGGM's license. I'll see if I can rework this, but it may be simpler to handle conversion via `dtcg`, since this is already set up and we can build directly on the existing `GeoZarrHandler`.

You can view a conversion workflow here.
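The group lookup and pickle fallback described in the first two points can be sketched roughly as follows. This is a stdlib-only illustration under stated assumptions: a directory of JSON files stands in for the single zarr store and its groups, and the `GDir` class, the `.pkl`/`.json` file layout and the method bodies are all hypothetical, not OGGM's actual implementation.

```python
import json
import pickle
import tempfile
from pathlib import Path


class GDir:
    """Hypothetical stand-in for a glacier directory."""

    def __init__(self, path):
        self.dir = Path(path)
        # One store per directory; each "group" is a file inside it here.
        self.store = self.dir / "data_store"

    def write_pickle(self, obj, filename):
        with open(self.dir / f"{filename}.pkl", "wb") as f:
            pickle.dump(obj, f)

    def read_pickle(self, filename):
        with open(self.dir / f"{filename}.pkl", "rb") as f:
            return pickle.load(f)

    def write_store(self, obj, filename):
        self.store.mkdir(exist_ok=True)
        (self.store / f"{filename}.json").write_text(json.dumps(obj))

    def read_store(self, filename):
        """Read a group from the store, or fall back to the legacy pickle."""
        group = self.store / f"{filename}.json"
        if group.exists():
            return json.loads(group.read_text())
        return self.read_pickle(filename)


with tempfile.TemporaryDirectory() as tmp:
    gdir = GDir(tmp)
    # Older directory: only a pickle exists, so read_store falls back to it.
    gdir.write_pickle({"source": "pickle"}, "model_flowlines")
    from_legacy = gdir.read_store("model_flowlines")
    # Once the group is written to the store, it takes precedence.
    gdir.write_store({"source": "store"}, "model_flowlines")
    from_store = gdir.read_store("model_flowlines")
```

With this layout, callers only ever use `read_store`; whether the data came from the store or from a legacy pickle is invisible to them.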
Closes #1903
whats-new.rst