A probably-better approach is to avoid additional special cases and store everything as commits instead. If we stored the state of the repository during a conflict as a commit (with the data about which files are in conflict stored out-of-band), it would be a lot easier to reconcile with the current event log system. It would also naturally provide an interface for users to defer a conflict resolution for later, or to go back in time to a previous conflict resolution. Storing conflicts in commits directly is one of the design goals of https://github.com/martinvonz/jj, and I've now come to appreciate the benefits of that approach. There are some complications, like the question of how we actually re-enter a conflict resolution state when checking out a commit (using the
-
According to Google Trends, it seems like undoing the operation of
-
Motivation
git undo can't undo everything, but we can get close. As of c3f4f3c, we can run arbitrary code after a git command (assuming that the user has set up an alias). So it should be possible to track staged changes after each git command, and allow them to be undone as well.
The end goal is that we should be able to drop back into a merge conflict resolution (as part of a rebase or merge) and continue from there, if we decided that we did it wrong previously. That is, you should be able to run git undo, and then run git status and see a file marked as "both modified" which needs to be resolved. Then you should be able to run e.g. git rebase --continue to resolve the conflict and continue rebasing.
Background
Git's object database is a content-addressed key-value store. The key is the hash of the content, and the value is an object of one of Git's types: blob, tree, commit, or annotated tag.
The index is an on-disk structure which is used to liaise between the working copy and the repository contents. It contains a sorted list of files (but no directories!). For our purposes, the interesting attributes of each entry include its path, the object ID of its contents, and its merge stage number; see man git-merge for more details on stages. It's possible for the same file path to exist multiple times in the index with different stage values.
Since the index stores the merge stage, it contains more information than can be represented straightforwardly in a tree object, which means that storing snapshots of the index will have to be more clever.
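To make the mismatch between the index and a tree concrete, here is a minimal Python sketch that models the index as a list of entries keyed by path and stage. IndexEntry and to_tree are hypothetical names for illustration only, not Git's or this project's API, and the object IDs are placeholders. The stage numbers follow Git's convention: 0 for a normal entry, 1 for the common ancestor, 2 for our side, 3 for their side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    path: str
    stage: int  # 0 = merged; 1 = common ancestor; 2 = ours; 3 = theirs
    oid: str    # object ID of the file contents (placeholder values below)


def to_tree(entries):
    """Flatten index entries into a path -> oid mapping, as a tree would hold.

    Raises ValueError for conflicted paths, since a tree has room for only
    one entry per path.
    """
    tree = {}
    for entry in entries:
        if entry.stage != 0:
            raise ValueError(f"{entry.path} is in conflict (stage {entry.stage})")
        tree[entry.path] = entry.oid
    return tree


clean = [IndexEntry("README.md", 0, "aaa")]
conflicted = clean + [
    IndexEntry("src/main.rs", 1, "bbb"),
    IndexEntry("src/main.rs", 2, "ccc"),
    IndexEntry("src/main.rs", 3, "ddd"),
]
```

The clean index converts to a tree without trouble, while the conflicted one cannot, which is why a snapshot of a conflicted index needs some extra, out-of-band representation.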
Design
Data
The idea is to store staging events in the event-log database. The important information is Git's index file, plus any auxiliary status files under the .git directory (like the rebase plan) that may be present. Collectively, let us call these files a "stage".
We can reuse Git's object database to store snapshots of the relevant files in the stage. None of these files are tracked by the repository normally.
However, the index file may be too big to copy into the object database every time we run a command (it can be tens of megabytes in practice). So instead of storing the entire index, we'll store the parent commit ID, plus the differences from that commit's tree to the current index. Experiments indicate that calculating the diff doesn't take too long (<1s on my maxed-out 2019 MacBook), and we can also optimize by skipping the diff entirely if the index doesn't appear to have been modified since the most recent stage.
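The diff-based storage could look roughly like the following sketch. It models the parent commit's tree and the current index as plain path-to-object-ID dictionaries and records only the differences. diff_index and apply_diff are hypothetical helpers, the dictionaries stand in for real Git data structures, and conflict stages are ignored here for brevity.

```python
def diff_index(parent_tree, index):
    """Return {path: new_oid or None} for paths that differ from the parent tree.

    None marks a path that exists in the parent tree but is absent from the index.
    """
    diff = {}
    for path in parent_tree.keys() | index.keys():
        old, new = parent_tree.get(path), index.get(path)
        if old != new:
            diff[path] = new
    return diff


def apply_diff(parent_tree, diff):
    """Recreate the index contents from the parent tree plus a stored diff."""
    index = dict(parent_tree)
    for path, oid in diff.items():
        if oid is None:
            index.pop(path, None)
        else:
            index[path] = oid
    return index


parent = {"a.txt": "oid1", "b.txt": "oid2"}
staged = {"a.txt": "oid1-edited", "c.txt": "oid3"}
delta = diff_index(parent, staged)
```

Applying delta back onto parent reproduces staged, and an unmodified index yields an empty diff, so the stored data grows with the number of staged changes rather than with the size of the repository.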
There can only be one stage active for a given worktree at a time.
Logically, a stage which we're persisting has these fields:
There is technically no requirement that the tree be associated with a commit of its own, but the easiest implementation would create a commit for each stage with the logical base commit as its parent. It would then automatically render in the correct place in the smartlog, and we could ensure the stage is kept live by the existing garbage collection integration.
Events
In the event log, we'll store stage events which contain the above fields. The active stage for a given point in time is determined by the most recent stage event. That event contains the parent commit plus the diff against it, which is enough to recreate the stage.
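As a rough illustration of "the most recent stage event wins", the lookup might be sketched as below. The StageEvent type and its field names are assumptions made for this example, not the actual event-log schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StageEvent:
    timestamp: float
    parent_commit: str  # commit the diff is relative to
    diff: dict          # path -> object ID (None means removed)


def active_stage(events, at_time) -> Optional[StageEvent]:
    """Return the most recent stage event at or before `at_time`, if any."""
    candidates = [e for e in events if e.timestamp <= at_time]
    return max(candidates, key=lambda e: e.timestamp, default=None)


events = [
    StageEvent(10.0, "abc", {"a.txt": "oid1"}),
    StageEvent(20.0, "def", {}),
]
```

Querying a time between the two events returns the first one, and querying a time before any event returns None, meaning no stage had been recorded yet.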
There is no Git hook which triggers when the index changes. For now, the best we can do is wrap the git command and check the stage after each command invocation, and store an event if it appears to have changed. Note that since the stage contains a reference to the parent commit, if the checked-out commit changes, we will need to emit another stage event (or derive the changed parent commit after the fact).
It would also be reasonable to use the same kind of event to track both commits and stages.
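One way the wrapper could detect a changed stage is sketched below. It fingerprints the raw index file by hashing it and records an event only when the fingerprint differs from the previous one. index_fingerprint, run_and_record and the record_event callback are hypothetical names; hashing the whole file is just one possible change check (comparing size and modification time would be cheaper), and a fuller version would also fold the checked-out commit into the fingerprint.

```python
import hashlib
import os
import subprocess


def index_fingerprint(git_dir):
    """Hash the raw index file; returns None if there is no index yet."""
    path = os.path.join(git_dir, "index")
    try:
        with open(path, "rb") as f:
            return hashlib.sha1(f.read()).hexdigest()
    except FileNotFoundError:
        return None


def run_and_record(argv, fingerprint_fn, last_fingerprint, record_event,
                   run=subprocess.call):
    """Run a command, then record a stage event if the fingerprint changed.

    Returns (exit_code, new_fingerprint). `run` is injectable so the logic
    can be exercised without invoking a real git process.
    """
    exit_code = run(argv)
    fingerprint = fingerprint_fn()
    if fingerprint != last_fingerprint:
        record_event(fingerprint)
    return exit_code, fingerprint
```

In real use, fingerprint_fn would be something like lambda: index_fingerprint(".git"), and argv would be the git command line that the user's alias forwarded to the wrapper.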
Undo
The inverse of a stage event is the stage event which preceded it. Unlike all other events at the time of this writing, this inverse event is not determined entirely by a given event's contents; it is sensitive to the events which have happened before it.
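The history-sensitivity of the inverse can be shown with a small sketch. Events are modeled here as dictionaries with a "kind" key, which is an assumption for illustration rather than the real event representation, and inverse_of is a hypothetical helper.

```python
def inverse_of(events, index):
    """Return the stage event that undoes events[index], or None if there is none.

    The inverse is the nearest earlier stage event, so it depends on the
    history before events[index], not just on that event's own contents.
    """
    for earlier in reversed(events[:index]):
        if earlier["kind"] == "stage":
            return earlier
    return None


history = [
    {"kind": "stage", "id": 1},
    {"kind": "commit", "id": 2},
    {"kind": "stage", "id": 3},
]
```

Undoing the last event in history means restoring the stage with id 1, skipping over the intervening commit event; the first stage event has no earlier stage to return to.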
To undo to a previous stage, we need to check out the corresponding commit, and then apply the diff. This means overwriting files in the .git directory. After doing this, git status will show that the working copy files are different from their staged versions. Unless we commit to also tracking unstaged changes (!), we can't restore the working copy files to their old versions. We should either leave them untouched, or update them to their staged versions, or possibly attempt to merge them together.