-
Notifications
You must be signed in to change notification settings - Fork 379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FR: Improve/expand the 1MB file auto-snaphot heuristic #5278
Comments
Even better idea - to just measure the sum of all new files to be snapshotted, although idk about changes in tracked files - but this will cover the case where "we add 1mb but also delete 1mb, so the heuristic did not trip" |
It no longer fails since 168c797 but I agree that the heuristic can be improved. |
Huh, in context of 168c797 my idea makes a bit less sense - since I was thinking about aborting the snapshot completely if the sum of changes is >1MB, and I'm not sure how it can be molded into the new "print a warning and don't track the offending file" behavior - which btw makes my example with "pfew a file in target/ was large enough for the entire target/ to not get snapshotted" moot Could be a config option too?. |
In #323 (comment), I was obliquely proposing that we can treat untracked files similarly to Git LFS files. In that case, we could start tracking the files without tracking their contents (and leave the contents in the working copy). You'd have to eventually either ignore such files or start tracking their contents (perhaps as a part of trying to push?), but the content store wouldn't be bloated until such time as you explicitly start tracking their contents. Do you think a workflow like that would work for you? |
Yeah I wanted to say that your idea there sounds incredible as a replacement (adjustment?) of the current 1mb heuristic. This issue is about if a dir with a ton of small files was added - I don't think your proposal covers it? Yes, this does sound very interesting then. |
Yeah, my thought was that we probably don't want to descend into untracked directories and instead want to leave an opaque 'untracked tree' in the commit in all circumstances, just from a practical perspective.
|
Is your feature request related to a problem? Please describe.
Related to auto-tracking (#323).
As far as I understand, currently the snapshotting process get aborted with an error if it encounters a file that's >1mb (or configurable) in size.
However, that heuristic will miss a situation when there's a folder full of small files that get added - in my head when the heuristic is tripped I always think "pfew, good that the target/ folder had large enough files".
Describe the solution you'd like
Instead of checking for individual file sizes, the snapshotting process could measure the total logical size of all files in working copy and compare that to the total logical size of @ (not sure how well that'll work with things like watchman though, this is just an idea).
Now if that differs by >1mb (or configurable) we sound the alarm, "are you sure" type of deal.
It may also catch a large deletion, although not like that's of any concern with the oplog :)
I think I had a
.direnv
folder snapshotted once or twice where it would probably have been caught by this more generic heuristic.The most important part - for me, anyway, as I always cared about the size of the content store, not so much about secrets or whatever other auto-tracking concerns there are, is that with this algorithm no snapshot can add more than 1MB to the store
Describe alternatives you've considered
Keep it as-is 🤷
The text was updated successfully, but these errors were encountered: