API: Design questions for HDFStore.append #60920
Comments
Thanks for the report! I think what you're doing is raising the question of whether or not there are certain deficiencies that should be corrected or merely documented. That's the right approach, but pandas uses
Can you detail these?
This goes back to I don't understand why the name
Can you detail why you find this misleading?
For each parameter, what do you believe the default should be?
I think this is a convenience to do e.g.
Ah, I confused I think this also clears up my confusion on unused parameters. I haven't checked, but could the signature include unused parameters so that it stays consistent across many functions, others of which do use the parameters?
I have put together the following notes regarding the parameters for the append function.
There are three functions for writing data to an HDF5 file, all of which use the same underlying function. Below is a rough diagram of the function calls for these methods (note: only for type "table"):
flowchart TB
subgraph s2["HDFStore"]
n3["append_to_multiple"] --> n1["append"]
n2["put"] --> n4["_write_to_group"]
n1 --> n4
end
n5["df.to_hdf"] --> n1 & n2
n4 --> n6["write"]
subgraph s1["AppendableTable"]
n6 --> n7["_create_axes"] & n8["create_description"] & n9["write_data"]
end
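The delegation in the diagram can be sketched with a toy class. This is a simplified stand-in rather than pandas code: `ToyStore`, its `groups` and `calls` attributes, and the list-based storage are all hypothetical, and only the method names `put`, `append`, `append_to_multiple` and `_write_to_group` are taken from the diagram. It assumes that the main difference between the two wrappers is the default of an `append` flag.

```python
from typing import Any, Dict, List, Tuple


class ToyStore:
    """Toy stand-in for the call structure in the diagram; not pandas code."""

    def __init__(self) -> None:
        self.groups: Dict[str, List[Any]] = {}
        self.calls: List[Tuple[str, bool]] = []  # record of (key, append) calls

    def put(self, key: str, value: Any, append: bool = False) -> None:
        # Assumed behaviour: put() replaces existing data by default.
        self._write_to_group(key, value, append=append)

    def append(self, key: str, value: Any, append: bool = True) -> None:
        # Assumed behaviour: append() adds to existing data by default,
        # but otherwise takes the same path as put().
        self._write_to_group(key, value, append=append)

    def append_to_multiple(self, mapping: Dict[str, Any]) -> None:
        # Fans out to append(), one call per key.
        for key, value in mapping.items():
            self.append(key, value)

    def _write_to_group(self, key: str, value: Any, append: bool) -> None:
        # Single shared implementation behind both public wrappers.
        self.calls.append((key, append))
        if append and key in self.groups:
            self.groups[key].append(value)
        else:
            self.groups[key] = [value]


store = ToyStore()
store.put("df", [1, 2])      # creates the group
store.append("df", [3])      # adds a second block to the same group
after_append = list(store.groups["df"])
store.put("df", [9])         # replaces the group contents
after_put = list(store.groups["df"])
```

In this sketch the two wrappers differ only in one default value, which is the situation the question about near-identical wrappers is describing.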
Take these opinions with a grain of salt, as I have little experience in large, community-driven projects :)
I'm new and am trying to put together a PR for documentation improvements surrounding `HDFStore`. I've been looking at `HDFStore.append()` and have encountered several instances where I am unsure of what the best practice is:
- `min_itemsize` that defines the max size of string columns and `axes` that change the axis of appending (though technically correct can be misleading). Do we keep them as is and document them as normal or modify the names throughout?
- `None` which are later overwritten by sub-function default values (e.g. `nan_rep` and `chunksize`). Do we back-propagate the default values into all sub-functions such that we can document correct default values?
- `append()` and `put()` which act as near identical wrappers for the same function `_write_to_group()`?

Any opinions or resources regarding this will be helpful.
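To make the question about `None` defaults concrete, here is a minimal sketch of the two options. It does not use pandas: the function names `write_data_inner`, `append_none_defaults` and `append_explicit_defaults` are hypothetical, and the constants `DEFAULT_NAN_REP` and `DEFAULT_CHUNKSIZE` are placeholder values chosen for illustration, not values confirmed from the pandas source.

```python
from typing import Any, Dict, Optional, Sequence

# Placeholder defaults for illustration only (assumed, not taken from pandas).
DEFAULT_NAN_REP = "nan"
DEFAULT_CHUNKSIZE = 100_000


def write_data_inner(
    rows: Sequence[Any],
    nan_rep: Optional[str] = None,
    chunksize: Optional[int] = None,
) -> Dict[str, Any]:
    """Inner function that resolves None to the effective default."""
    if nan_rep is None:
        nan_rep = DEFAULT_NAN_REP
    if chunksize is None:
        chunksize = DEFAULT_CHUNKSIZE
    return {"rows": len(rows), "nan_rep": nan_rep, "chunksize": chunksize}


def append_none_defaults(rows, nan_rep=None, chunksize=None):
    """Option 1: keep None in the public signature and document
    the effective default in the docstring."""
    return write_data_inner(rows, nan_rep=nan_rep, chunksize=chunksize)


def append_explicit_defaults(
    rows, nan_rep=DEFAULT_NAN_REP, chunksize=DEFAULT_CHUNKSIZE
):
    """Option 2: back-propagate the defaults so the signature shows them."""
    return write_data_inner(rows, nan_rep=nan_rep, chunksize=chunksize)


result_none = append_none_defaults([1, 2, 3])
result_explicit = append_explicit_defaults([1, 2, 3])
```

Both wrappers produce the same result here; the difference is only in what the signature (and therefore generated documentation) shows. Option 2 needs the shared constants to stay in sync across every function that repeats them, which is the maintenance cost to weigh against clearer docs.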