Using Git's rerere feature to escape recurring conflict hell

gist.github.com

94 points by ankitg12 1 day ago

dools 1 day ago

I never get conflicts during a merge because I only ever merge in one direction. I get all my conflicts on branches because I rebase before merging. I started doing this years and years ago because I kept coming across these mysterious silent regressions with my team. I searched something like "git merge silent regressions" and came across this stackoverflow answer:

https://stackoverflow.com/a/28510260

That completely fixed the problem. Now I only ever get conflicts on my feature branches. The rule is always: rebase away from main, and merge towards main. All conflicts are then on your branches, never on main, and always from rebase, never from merge. Then I set the pull behaviour to rebase, too.

I've never had a silent regression since, and never had a problematic conflict scenario.

I did recently learn about ORIG_HEAD though which was very cool, because I had accidentally rebased to main instead of to a dev branch from which I had created a bunch of worktrees, then when I merged back to the dev branch all hell broke loose, and I learned that I could revert a merge by checking out ORIG_HEAD:

https://icinga.com/blog/undo-git-reset-hard/
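
A minimal sketch of the ORIG_HEAD trick in a throwaway repository. The branch names, file names, and identity settings are made up for illustration. It uses `git reset --hard ORIG_HEAD` rather than a checkout, which is an assumption about the exact command; note that it discards uncommitted changes and only works before another command moves ORIG_HEAD.

```shell
#!/bin/sh
# Sketch: undo a merge right after it happened by resetting to ORIG_HEAD.
# All names below (dev, feature, a.txt, b.txt) are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/dev
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo one > a.txt
git add a.txt
git commit -q -m "base"

git checkout -q -b feature
echo two > b.txt
git add b.txt
git commit -q -m "feature work"

git checkout -q dev
before=$(git rev-parse HEAD)

# git merge saves the pre-merge tip of dev in ORIG_HEAD.
git merge -q --no-ff -m "merge feature" feature

# Move dev back to where it was before the merge.
git reset -q --hard ORIG_HEAD
after=$(git rev-parse HEAD)

echo "before=$before"
echo "after=$after"
```

If something else has already overwritten ORIG_HEAD, `git reflog` still lists the earlier positions of the branch.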

ubercore 1 day ago

I've never even seen someone suggest a rebase master onto feature workflow! TIL.
- dools 1 day ago
  
  I think the terminology would be the other way around, like you're rebasing the feature onto the main:
  git checkout feature
  git rebase main
  git checkout main
  git merge feature
  that way you get all your conflicts on the feature branch during rebase and your merge is always clean.
- barbazoo 1 day ago
  
  > I get all my conflicts on branches because I rebase before merging
  Pretty sure it's the other way around. You're on the branch and rebase it atop current master. If you merge after that, you won't have merge conflicts.
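The "rebase away from main, merge towards main" rule can be sketched in a scratch repository. The file and branch names are hypothetical, and `--ff-only` is added here only to make the point visible: after the rebase, the merge into main is a plain fast-forward.

```shell
#!/bin/sh
# Sketch: rebase feature onto main, then merge feature into main.
# Any conflicts would surface during the rebase, on the feature branch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"

git checkout -q main
echo main > main.txt
git add main.txt
git commit -q -m "main moves on"

# Rebase away from main...
git checkout -q feature
git rebase -q main

# ...and merge towards main. --ff-only refuses anything but a fast-forward.
git checkout -q main
git merge -q --ff-only feature

if [ "$(git rev-parse main)" = "$(git rev-parse feature)" ]; then
  merged="fast-forward"
else
  merged="diverged"
fi
echo "$merged"
```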
barbazoo 1 day ago

rerere is still useful here to handle merge conflicts after repeated rebases.
- mqus 1 day ago
  
  As someone who tried rerere and didn't see the point:
  How? Usually I rebase the same branch multiple times onto different, but successive commits of the master branch. But after I solved a bunch of conflicts of the first rebase, I shouldn't have the same conflicts again in a second one, since the rebased branch contains the merged conflict. Rebasing again could only turn up new conflicts (with newer, other commits on the master branch).
  How can I have the same conflict again for repeated rebases?
  
  barbazoo 1 day ago
  
  I know what you mean but doesn't that require squashing as well? If I have a branch with 5 commits, I think rerere helps me by only having to fix the conflict once, not potentially multiple times. I might be wrong here though.
  
  kazinator 1 day ago
  
  The point is that some organizations have a chaotic development process consisting of numerous similar branches. Often there is a main trunk, and then branches that were made for particular product variants (like piece of hardware or whatever) and cut at a particular point in time, in order to isolate from the trunk.
  What then happens is that when a bug is found that affects all branches, it must be cherry picked into all of them. If that cherry pick runs into conflicts, it is often the same conflicts, over and over again on each branch.
  Of course, the fix is not to do that, but it's easier to say that than to get away from that kind of workflow once you are steeped in it up to the chin.
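A rough sketch of that cherry-pick scenario, assuming two hypothetical product branches that carry the same local tweak to a made-up `settings.conf`. The first cherry-pick conflicts and is resolved by hand; with `rerere.enabled` set, the second one hits the same conflict and the recorded resolution is written back into the working tree.

```shell
#!/bin/sh
# Sketch: resolve a cherry-pick conflict once, let rerere replay it elsewhere.
# Branch names, file name, and contents are invented for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false
git config rerere.enabled true

echo "timeout=10" > settings.conf
git add settings.conf
git commit -q -m "base"

# Two product branches make the same change to the same line.
for b in product-a product-b; do
  git checkout -q -b "$b" trunk
  echo "timeout=20" > settings.conf
  git commit -q -am "$b: raise timeout"
done

# The fix lands on trunk and touches that line too.
git checkout -q trunk
echo "timeout=10 retries=3" > settings.conf
git commit -q -am "fix: add retries"
fix=$(git rev-parse HEAD)

# product-a: the cherry-pick conflicts; resolve by hand and commit.
git checkout -q product-a
git cherry-pick "$fix" >/dev/null 2>&1 || true
echo "timeout=20 retries=3" > settings.conf
git add settings.conf
git commit -q --no-edit   # rerere records the resolution here

# product-b: same conflict; rerere rewrites the file with the recorded result.
git checkout -q product-b
git cherry-pick "$fix" >/dev/null 2>&1 || true
replayed=$(cat settings.conf)
echo "$replayed"
```

Without `rerere.autoUpdate`, the replayed file still has to be staged with `git add` before the cherry-pick can be concluded.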
  
  Sohcahtoa82 22 hours ago
  
  > Often there is a main trunk, and then branches that were made for particular product variants (like piece of hardware or whatever)
  I worked at a place that did this.
  The code was written in C, and I always thought the better solution would have been to use #define/#ifdef to flag certain blocks of code out of the compilation.
  A branch for each product was a nightmare when there were 10+ products, some with multiple variations, each on its own branch. Backporting a bug fix meant cherry-picking into 20+ branches. What made it especially stupid was office politics from each product having its own PM, and then the PM for one of the products would decide the bug wasn't significant enough to spend the time doing the cherry-pick and testing. This happened too often when it came to security fixes when a PM didn't understand the issue.
  
  kazinator 20 hours ago
  
  Yes; #ifdef and #endif is basically branching, but it's in one branch of the CM system.
  The benefit is that everything is integrated, so there are no games with having to cherry pick things this way and that and losing fixes.
  The apparent downside is that there is no isolation. Inside some of those #ifdefs is code that you are not building. But changes you are making can break that code for someone else.
  While this may seem risky, it's actually better. Breaking something now is better than someone cherry picking your fix 8 months later into their branch and then dealing with the breakage.
  Fixes to common code never get left behind; everyone working off the trunk instantly gets them.
  The #ifdefs are immediately and constantly visible, telling you where the code is that is or is not part of what you are doing, and reminding you of its existence. They greatly discourage refactoring parts that you cannot test.
  There is pressure to keep those #ifdefs clean, whereas people go hog wild when they have their own branch, thinking they can rewrite whatever they want to suit what they are doing.
  
  chuckadams 21 hours ago
  
  Happens all the time with long term dev branches, which tend to arise when the PR review process is such a bottleneck that work piles up. Anything cherry-picked out of those branches tends to run into similar conflicts. rerere is pretty handy in those circumstances.
techwizrd 1 day ago

This is what I've been doing for years. It's remarkably stress-free!
acallaghan 1 day ago

I'm also like this, rebasing feature branches onto main - I however have one suggestion when it comes to the push back up to origin
Instead of
`git push --force`
always use
`git push --force-with-lease`
https://git-scm.com/docs/git-push
This probably should be the default in git (as in, there should be a `git push --force-without-lease` instead). It asks git to check that your local remote-tracking ref still matches the branch on the remote, and fails if you would overwrite commits that you haven't seen. It has saved me a few times when working between computers on the same project, when I could have lost history on the remote that I had failed to fetch.
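The two-computers case can be sketched with a local bare repository standing in for the remote. The clone names `laptop` and `desktop`, the branch, and the file are all hypothetical. The laptop rewrites history without fetching the desktop's push, and `--force-with-lease` refuses to overwrite it.

```shell
#!/bin/sh
# Sketch: --force-with-lease rejects a push when the remote moved unseen.
# remote.git, laptop, desktop, feature, and file.txt are made-up names.
set -e
work=$(mktemp -d)
cd "$work"

setup() {
  git -C "$1" config user.name "Demo"
  git -C "$1" config user.email "demo@example.com"
  git -C "$1" config commit.gpgsign false
}

git init -q --bare remote.git

# Laptop publishes the first version of the branch.
git clone -q remote.git laptop 2>/dev/null
setup laptop
cd laptop
git symbolic-ref HEAD refs/heads/feature
echo v1 > file.txt
git add file.txt
git commit -q -m "v1"
git push -q origin feature
cd ..

# Desktop adds a commit and pushes it; the laptop never fetches this.
git clone -q -b feature remote.git desktop 2>/dev/null
setup desktop
(cd desktop && echo v2 >> file.txt && git commit -q -am "v2 from desktop" && git push -q origin feature)

# Laptop rewrites its commit and tries to push over the branch.
cd laptop
git commit -q --amend -m "v1 reworded"
if git push -q --force-with-lease origin feature 2>/dev/null; then
  result="overwrote"
else
  result="rejected"
fi
echo "$result"

# The desktop's commit is still the tip on the remote.
remote_tip=$(git -C ../remote.git log -1 --format=%s feature)
echo "$remote_tip"
```

A plain `git push --force` in the same spot would have replaced the desktop's commit on the remote.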
- kazinator 1 day ago
  
  --force-with-lease serves no purpose.
  If you are sure that the repo you are pushing to is a stable target (nobody else is accessing it), you just use --force.
  If the repo you are pushing to is a moving target, you ... don't force push to it. Or else you warn all the repo users that you are about to rewrite history. Which means they not only should refrain from pushing, but have to be prepared for a second announcement which informs them that the rewrite is done; they must then fetch the rewritten head and fix up their unpublished work against the non-fast-forward change.
  Now it may be that --force-with-lease allows you to sneak in non-fast-forward changes without losing newly introduced upstream changes: but that assumes it's a good idea to be doing that sort of thing without communicating with your team. I.e. as long as we can sneak in a non-fast-forward change without accidentally/unknowingly deleting anyone's work, we are peachy; no need to coordinate.
  
  twodave 20 hours ago
  
  I wouldn’t say it serves no purpose. It is useful when rewrites are tolerable and loss of history is not. It’s the default when using tools like jj, because the expected workflow wraps git in a way that force pushes are frequent and expected, but blowing away someone else’s work by mistake is not.
  
  nlawalker 18 hours ago
  
  >If you are sure
  --force-with-lease exists for the scenario where you are sure, but wrong.
  
  kazinator 17 hours ago
  
  I don't see how you could be wrong in knowing whether you're the only user with push access to the remote repo, or there are others.
  
  bathtub365 16 hours ago
  
  You could have multiple computers, checkouts, or worktrees pointing at the same branch
  
  kazinator 9 hours ago
  
  You'd have to interrupt your own activity of synchronizing one of your downstream repos with your upstream, and force-publishing something back upstream, by switching to another one of the downstreams and publishing something into the upstream, which is then clobbered when you resume the original activity. Basically, split personality disorder where some of the personalities are not aware of the others.
  All of this still overlooks the fact that the changes are not lost. Say someone (like one of the personalities in your head you don't know about) publishes a change which you unknowingly clobber with a "git push --force". That someone will notice when they fetch the repo: it has diverged from their clone and when they look at the history of master vs origin/master, they will see that their commit which they are sure they pushed does not appear in origin/master.
  If you have multiple downstream checkouts and manage to clobber something with force pushes, you can recover. Then have a word with yourself and work smarter going forward.
  
  robertlagrant 7 hours ago
  
  All force-with-lease does is stop you from clobbering, rather than you having to realise somehow that you did that. It seems like a no-brainer. What's the problem with it?
  
  acallaghan 8 hours ago
  
  My point is really that the default --force is dangerous for new or sleep-deprived users, or ones that kinda understand but don't, whereas --force-with-lease is not and is always safe. --force-with-lease should just be the default.
lelandfe 1 day ago

If you squash merge PRs, this is equivalent to merging master back into your feature branch before merging to master.
I do that a lot to avoid commits mutating mid-review, so you avoid having to force push over reviewed commits (which is a sin)
- dolni 16 hours ago
  
  Squash merging PRs makes your commit log far less useful.
  The PR reviewer isn't the only person who will ever review your code, you know.
  
  lelandfe 5 hours ago
  
  I felt similarly. But I'm also usually the only person Tim Pope'ing my commits.
  If you police atomic feature branch PRs instead of atomic feature branch commits, though, things actually work out OK.
  (By the way, the even more compelling next step to your argument is: pull requests aren't artifacts! I've worked on projects that have emigrated from GitHub, and was left with just the commit log)
overtomanu 1 day ago

I follow this approach and still get the same merge conflicts coming repeatedly while doing rebase.
Let's say in my feature branch, in the first commit, I change a line in a file which also gets changed in the main branch. Then I make 3-4 more commits changing the same file. Now while rebasing, I have to resolve this conflict 3-4 times, again and again, while git re-applies the commits one by one.
I think I get this sometimes even if rerere is enabled. I am rebasing using IntelliJ though, so maybe rerere doesn't get used there somehow, or maybe the diff context changes so rerere is not applicable.
- alexsmirnov 23 hours ago
  
  We usually squash feature branches before merge. To squash before rebase, I use `git reset --soft $(git merge-base develop HEAD) && git commit && git rebase develop`. You then only have to resolve the final conflicts.
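The squash-then-rebase sequence above can be tried in a scratch repository. This is a sketch with invented file names and commit messages, and with `-m` added to `git commit` so no editor opens. Three feature commits collapse into one before the rebase, so the rebase replays a single commit.

```shell
#!/bin/sh
# Sketch: squash a feature branch down to one commit, then rebase it.
# develop/feature follow the comment above; file names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo base > app.txt
git add app.txt
git commit -q -m "base"

git checkout -q -b feature
for i in 1 2 3; do
  echo "feature step $i" >> app.txt
  git commit -q -am "feature step $i"
done

git checkout -q develop
echo unrelated > other.txt
git add other.txt
git commit -q -m "develop moves on"

git checkout -q feature
# Move the branch pointer back to the fork point, keeping all changes staged.
git reset --soft "$(git merge-base develop HEAD)"
git commit -q -m "feature (squashed)"
git rebase -q develop

count=$(git rev-list --count develop..HEAD)
last_line=$(tail -n 1 app.txt)
echo "commits ahead of develop: $count"
```

The trade-off is that the individual feature commits are gone from the branch afterwards.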
ruszki 23 hours ago

That’s a funny Stackoverflow answer. That explanation cannot cause code loss. At least not with plain Git.
What I would check is hooks, or any other customizations. Especially on Windows, data loss is absolutely possible with misconfigured hooks, but it has nothing to do with when a commit was made.

0123456789ABCDE 1 day ago

  #~/.config/git/config
  [rerere]
      enabled = true
      autoUpdate = true

while you're editing git config, consider these:

  [pull]
      rebase = true

  [rebase]
      autoSquash = true
      autoStash = true

  [merge]
      # zdiff3 adds original text markers and removes matching lines from conflict regions
      # https://git-scm.com/docs/git-config#Documentation/git-config.txt-mergeconflictStyle
      conflictStyle = zdiff3
      autoStash = true

  [push]
      autoSetupRemote = true
      default = simple

  [init]
      defaultBranch = main

0123456789ABCDE 1 day ago

addendum
these are changes tacked onto my git config over time. they're the result of working in a constant crunch state, where moving forward smoothly was more important than the state of the git history you left behind. these options remove small annoying road bumps. you should avoid some of them if you're working under different constraints, or the commit process differs such that these options don't apply.
it all starts with `pull.rebase=true`, which makes `git pull --rebase` the default behavior. then comes `rebase.autoStash`, which just wraps the rebase in a stash push/pop envelope. if rebase is not your thing, and you prefer merge, `merge.autoStash=true` works the same. finally, `push.autoSetupRemote=true` will skip asking you to set a remote tracking branch; it makes `git push` default to `git push --set-upstream`.
Izkata 1 day ago
```
    [pull]
      rebase = true
```
Don't use this one unless you really know what you're doing. Multiple co-workers have gotten into really bizarre rebases because of it (like rebasing 70+ commits from master on top of their branch instead of the other way around), it seems to cause more problems than it solves.
The man page for "git pull" even has a warning about using this flag.
- 0123456789ABCDE 20 hours ago
  
  trying to parse your comment, and the story is only getting more convoluted, the more i reread it.
  "rebasing 70+ commits from master on top of their branch" is not a real thing
  like, i want to believe you. it just doesn't work the way you're describing it.
  
  Izkata 18 hours ago
  
  I didn't catch the exact commands they were doing, but their branch was made that far back, that's how it was that many commits.
  Support was putting that setting in ~/.gitconfig for all new laptops, and we didn't know it was there. I saw one of them going through this huge rebase and was suspicious about how a pull was causing a rebase at all since this wasn't the default (and this particular co-worker wasn't the kind to explore these settings), so we went looking for it. After removing the setting, the command they were running worked exactly as expected without any surprises. They'd also said they thought they were using git wrong somehow because that wasn't the first time it happened.
  
  0123456789ABCDE 17 hours ago
  
  so, your colleagues merge features into `master` locally?
  no pull-requests, no branch protections for `master`. i mean this works, but you probably need those rails in place.
  
  Izkata 16 hours ago
  
  No, master is protected. I think they were trying to update their local master so they could resolve a merge conflict.
beart 1 day ago

I believe push default simple is the default in git now and does not need to be explicitly set.
- 0123456789ABCDE 1 day ago
  
  > This mode is the default since Git 2.0, and is the safest option suited for beginners.
  the docs check out, ty

adithyassekhar 1 day ago

Do people really merge left and right between branches? Tell me if I got this wrong, this is how I work.

You got 4 devs. Each branched off from master. And we never merge from each other. Suppose 3 other people merged to master I pull it from master and fix only those conflicts. I’m not bringing your code into my branch unless it’s already finished and on master. If I need something you’re working on and it’s not on master when I need it, it’s a much larger planning issue.

If you have multiple environments before a stable master, do it only in one direction: feature > dev > staging > master. Don’t merge branches straight into staging or master.

I thought this was how everyone worked.

powersjcb 1 day ago

Yeah, I absolutely never run into this problem.
Sometimes we will have a huge stack of changes where one of us is "finished but not clear to merge".
Either:
- We just swap ownership of the branches and eng 2 now commits directly in branch 1. We review the final content together and typically pull in a 3rd person to review our combined work. Eng 1 either pairs with eng 2 until its finished or starts on a task that is decoupled from those changes.
- We use an integration branch that gets treated like the temporary master branch until the feature is ready to merge.
convolvatron 1 day ago

when you're not just doing small patches (like Greenfield work), and you have multiple developers, it can be unreasonable and wasteful to do what you describe. say we're redoing the way memory management is being done, or changing a foundational api. A and B could go through master, but they'd have to redo a whole bunch of work. Or we could make an integration branch where A and B hash things out together and only push to master when their work is done. you can also see how this allows a lot more back and forth about the places where the designs interact rather than 'I got in master before you, so you get to eat a multi-day merge and I'm outta here'
this kind of interaction shows up more in a 'release' based model than CI environment, and one where the cost of testing is non-zero (because we're actually doing testing). I know that's not a very popular model, but not all software development is CI websites.
- skydhash 1 day ago
  
  Those big refactorings should be discussed with the whole team so everyone understands how they’re going to be affected. And rework like this should probably be hidden behind an abstraction (to do it gradually).
  
  convolvatron 1 day ago
  
  yes of course they should, that doesn't mean that integration branches aren't a good way to manage the tactical process. and often this kind of situation comes up when you're trying to do exactly that, introduce that abstraction that solves a whole bunch of issues, but requires cross-cutting changes.
  the problem comes up the other way too. the person making the refactor can get stuck in a perpetual hell of constantly reapplying the changes to smaller deltas that are being committed daily to master.
  
  skydhash 1 day ago
  
  I don’t remember which book exactly, but it was about legacy code. The author was talking about seams. If you’re trying to complete a refactor in one go, it will usually be arduous and may fail. What you want is to find seams in the codebase, and maybe create some, where you can gradually decouple the architecture. After that you’ll have contracts and independent modules that can be reworked.
  It’s not easy though and you need to have a very good understanding of the codebase. But the nice thing is that you can do it without having to maintain a long lived branch (which is an antipattern).
legorobot 1 day ago

> I thought this was how everyone worked.
I wish it were :)
This is the right way to do it. Whether using trunk-based development, git-flow, etc -- you're controlling the flow of merges in a particular direction.
However, I think the "larger planning issue" is harder to easily avoid when you have more devs, more changes, or the AI-boosted output we have today. If feature B requires feature A, and feature A isn't up-to-date with main, I could rebase feature A to main, then feature B to feature A. When feature A is merged and we're ready to PR feature B, I can rebase to main again then make my PR.
- sheept 1 day ago
  
  That sounds like a lot of work. If you rebase A on main then B on main, you end up having to resolve A's conflicts on main twice. But if you stick with merge commits, A's conflicts are handled once in the merge from main.

jstimpfle 7 hours ago

Rebasing all the time will most likely result in intermediate commits that don't build anymore (because who builds all the intermediate commits after a rebase), and which don't make sense anymore as a "history" of changes.

So my simple resolution has become: don't rebase anything beyond trivial. I use rebase -i to merge, delete, or reorder individual commits on an isolated feature branch. Anything else? Not worth the effort in my opinion. Just merge, in any direction. I even merge between feature branches sometimes, though one must be aware that this essentially glues the two feature branches together, meaning either both or neither can get merged to master.

Something worth stressing: The most important commit is always the one HEAD points to. Older commits matter so little that cleaning them up later is wasted time. Most of the value of the git commit graph comes from its role as a support structure for git merge. In most cases, nobody gives a shit whether all intermediate commits are perfect for some definition of "perfect". Most commits will never be read again.

mamcx 1 day ago

I used rerere when "forced" by my team to use rebase. It didn't even help that much in the end (you can't control workflows other than your own, which is why git's UX is wrong: it desperately needs total discipline).

THEN, I moved to jujutsu. I only had a few problems at the start from trying to use it like git, but after getting the idea, all was fine.

BTW, this was with the same team and they never knew, so JJ is in fact better: it survives OTHERS' workflows.

nchmy 1 day ago

No idea why this was downvoted. This is one of many ways in which jj is far better than raw dogging git
kazinator 1 day ago

You should not be seeing the same conflicts over and over again in a situation in which you are on a single branch, just rebasing your unpublished commits.
You might get conflicts regularly, if people are touching the same area of the code that you are touching, but not the same ones.
rerere mainly comes into play when people have to backport commits into multiple similar branches, due to having a heavily forked git landscape.
saghm 21 hours ago

Agreed; making conflict resolution separate for each commit rather than requiring a stateful flow where interruptions need to abort the entire process is one of the many killer features of jj.

davidelettieri 1 day ago

I'm using these options https://blog.gitbutler.com/how-git-core-devs-configure-git and I'm happy with it

barchar 23 hours ago

You can use the rerere-train script in the git repo to populate rerere from existing merges. I use this when something in a merge has regressed a big feature branch and I need to bisect. I can train rerere then rebase on the 2nd to latest merge-base, all while still doing no extra work if there isn't a regression.

cautiouscat 1 day ago

Every time I think I am adept in git, something like this is shown to me. I really should read into it more lol.