Hi, I’m Erika Rowland (a.k.a. erikareads). I’m an Ops-shaped Software Engineer, Toolmaker, and Resilience Engineering fan. I like Elixir and Gleam, Reading, and Design. She/Her.

Git Spelunking with Bisect

Today, I continued on the git spelunking that prompted the post from yesterday. I’ve been diving deeper in the Erlang codebase today, and so I needed some new tools.

git bisect

git-bisect lets you search for a commit using binary search. I wanted to find the last commit that contained a particular Erlang function in the otp codebase. (Thank you to Yannick for helping me learn how to use this. You can read more about git-bisect in the git reference manual.) My process looked something like this:

First, I start the bisect:

git bisect start

Then I specify a commit that didn’t have the function, in this case HEAD:

git bisect bad

Then I specify a commit that does have the function. In this case I knew that the function was in OTP_R14A. git-bisect lets me specify a tag here:

git bisect good OTP_R14A

At this point git-bisect took me automatically to a commit that was roughly halfway between those two commits. While I’m here, I can run whatever checks I want to on the command line:

rg '<function-to-find>'

If the function is there, I mark git bisect good. If it’s not, then I mark git bisect bad. Either way, git-bisect moves me to the next pivot commit in the search. otp has tens of thousands of commits, but binary search meant that it took around 15 steps to get to the commit I wanted.
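The step count follows from halving the candidate range at every step. As a rough sketch (the helper name estimate_steps is my own, and git's reported estimate can differ by a step), the arithmetic looks like this:

```shell
# Rough number of bisect steps for n candidate commits: ceil(log2(n)).
# estimate_steps is a hypothetical helper, not part of git.
estimate_steps() {
  awk -v n="$1" 'BEGIN {
    steps = 0
    while (n > 1) {          # each step discards about half the candidates
      n = int((n + 1) / 2)   # round up: the larger half may still remain
      steps++
    }
    print steps
  }'
}

estimate_steps 32768   # prints 15

# Inside a repository, the candidate count can come from git itself:
#   estimate_steps "$(git rev-list --count OTP_R14A..HEAD)"
```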

Once I was done, I could run:

git bisect reset

To get back to where I started.

It’s great to know that I can run whatever arbitrary commands I need at each manual step, but in this case I was running the same command each time.

That’s where git bisect run comes in:

git bisect run

git bisect run lets you run a script to check for good/bad at each step of the binary search, instead of having to do it manually. (Thank you to Mikkel for suggesting this to me. You can read more about git bisect run in the git-bisect documentation.)
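git bisect run decides good or bad from the exit code of the command it runs. Here is a small sketch of that contract as I understand it from the git-bisect documentation. The function name bisect_verdict is made up for illustration:

```shell
# Map an exit code to the verdict git bisect run would draw from it.
# bisect_verdict is a hypothetical helper, not a git command.
bisect_verdict() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo good      # exit 0: the commit is good
  elif [ "$code" -eq 125 ]; then
    echo skip      # exit 125: this commit cannot be tested, skip it
  elif [ "$code" -ge 1 ] && [ "$code" -le 127 ]; then
    echo bad       # exit 1..127 (except 125): the commit is bad
  else
    echo abort     # anything else stops the bisect run
  fi
}
```

This is why a plain search command can serve as the check: it exits 0 when it finds a match and 1 when it finds none.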

The initial suggestion I got was to use git grep as in:

git bisect start
git bisect bad
git bisect good "[tag]"
git bisect run git grep "[function name]"

But I knew that there were two versions of is_system_process/1 at various times in the Erlang codebase. I only wanted to know about the older one, so I used this modified version instead:

git bisect start
git bisect bad
git bisect good OTP_R14A
git bisect run git grep "[function name]" "path/to/folder/*"

Where path/to/folder/* pointed to the folder I knew the function was in. (I initially tried to use the exact filename, but in failing commits the file didn’t exist, which threw an error that git bisect run didn’t know how to deal with. The glob seemed to avoid this problem.)
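Another way around the missing-file problem, sketched here as an assumption rather than something I ran on otp, is a tiny check that treats a missing file as "function absent" so the exit code stays in the range git bisect run understands. The name has_function is hypothetical:

```shell
# has_function PATTERN FILE
# Exits 0 if FILE exists and contains PATTERN, 1 otherwise.
has_function() {
  pattern=$1
  file=$2
  # A missing file means the function is absent in this commit: report "bad".
  [ -f "$file" ] || return 1
  # grep -q exits 0 on a match and 1 when there is no match.
  grep -q -e "$pattern" -- "$file"
}

# Saved as an untracked script that ends with: has_function "$@"
# it could be used as:
#   git bisect run ./has_function.sh "[function name]" "path/to/file"
```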

With this, the binary search took a few seconds and quickly returned the same commit that I found manually.
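One detail worth spelling out: git bisect reports the first bad commit, which here is the commit that removed the function. The last commit that still contains it is that commit's parent. A sketch that wraps the whole search, assuming a mostly linear history and a clean working tree (last_commit_with is a name I made up):

```shell
# last_commit_with PATTERN GOOD_REV [PATHSPEC...]
# Prints the hash of the last commit that still contains PATTERN,
# searching between GOOD_REV (has it) and HEAD (does not).
last_commit_with() {
  pattern=$1
  good=$2
  shift 2
  git bisect start HEAD "$good" >/dev/null || return
  # git grep -q exits 0 on a match (good) and 1 on no match (bad).
  git bisect run git grep -q -e "$pattern" -- "$@" >/dev/null
  # While a bisect is active, refs/bisect/bad points at the first bad commit.
  first_bad=$(git rev-parse --verify refs/bisect/bad)
  git bisect reset >/dev/null 2>&1
  # Its (first) parent is the last commit that passed the check.
  git rev-parse --verify "$first_bad^"
}

# Example:
#   last_commit_with "[function name]" OTP_R14A "path/to/folder/*"
```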

Now that I had the commit in hand, I wanted to know what the next tagged release that contained that commit was. Enter git-describe:

git describe

git-describe is purpose-built for finding tags from commits. By default, it finds the tag that immediately predates the commit. But if you use the --contains option, it will find the tag that “contains” the commit, which is the tag I want to find. (Credit to Stack Overflow for helping me find this one.)

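If the tag is exactly the commit, then it will only return the tag. Otherwise, it will have a suffix. For the default mode, the documentation describes that suffix as showing:

the number of additional commits on top of the tagged object and the abbreviated object name of the most recent commit.

(Source, where you can read more about git-describe.) With --contains, the suffix is instead written relative to the tag, as in <tag>~<n>.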

I can use sed to strip out the suffix, since I only want the tag name:

git describe --contains "<commit>" | sed 's/~.*//'
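The relative suffix can contain ^ as well as ~, for example when the path from the tag to the commit goes through a merge's second parent. A slightly more defensive variant of the sed step strips from whichever comes first. The sample inputs below are invented for illustration, not real output from otp:

```shell
# Strip a relative suffix such as ~12 or ^2~3 from a describe --contains name.
# Neither ~ nor ^ is allowed in a tag name, so the first one starts the suffix.
strip_describe_suffix() {
  sed 's/[~^].*//'
}

printf '%s\n' 'OTP_17.0-rc1~12'  | strip_describe_suffix   # prints OTP_17.0-rc1
printf '%s\n' 'OTP_17.0-rc1^2~3' | strip_describe_suffix   # prints OTP_17.0-rc1

# In a repository:
#   git describe --contains "<commit>" | strip_describe_suffix
```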

Thanks to Miccah for suggesting another way to accomplish this, with git tag --contains (read more in the git-tag documentation):

git tag --contains "<commit>" --sort=creatordate

Which will return all of the tags that contain the commit, sorted by their creation date.
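Since the list is sorted oldest first, the first line is the earliest tag that contains the commit. A minimal sketch, with first_tag_containing as a made-up helper name. Note that for lightweight tags the creation date falls back to the date of the tagged commit:

```shell
# first_tag_containing COMMIT
# Prints the oldest tag (by creation date) whose history includes COMMIT.
first_tag_containing() {
  git tag --contains "$1" --sort=creatordate | head -n 1
}

# Example:
#   first_tag_containing "<commit>"
```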

Now I have the tag I want, but when was it created? I found a couple of techniques that work:

Technique 1: git log -1

git log displays a given commit and its ancestors. But if you use the -1 option, it will limit the output to 1 commit, only the one we pass to it. (Credit to Stack Overflow for this one too. Read more in the git-log documentation.) And we can pass a tag to it:

git log -1 "<tag>"

This will show the default information about the commit behind the tag.

If I only wanted the date, I could use:

git log -1 --format=%ai "<tag>"

Technique 2: git for-each-ref

git for-each-ref will iterate over all refs that match a given pattern. It also lets you format information from that ref. (I found information on this one from two Stack Overflow answers, the latter from a comment on the accepted answer. Read more in the git for-each-ref documentation.)

That pattern can be as specific as a single tag:

git for-each-ref \
  --format="%(refname:short) | %(creatordate)" \
  "refs/tags/OTP_17.0-rc1"

I like this one because I can easily generalize a solution to ask other questions. When was every tag released?

git for-each-ref \
  --format="%(refname:short) | %(creatordate)" \
  "refs/tags"

By default, it seems to sort alphabetically by refname. If I want them in chronological order, I can add --sort=creatordate on the end.
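Putting the sort and the format together, here is a sketch of a chronological tag listing. The helper name tags_by_date is my own, and I use the :short date modifier to keep the output compact:

```shell
# List every tag with its creation date, oldest first.
tags_by_date() {
  git for-each-ref \
    --sort=creatordate \
    --format='%(creatordate:short) | %(refname:short)' \
    "refs/tags"
}

# Example: the five oldest tags
#   tags_by_date | head -n 5
```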

Takeaways

I came out of this expedition with a rich collection of tools that I can use in future spelunking. All of these tools are built into git and have excellent documentation in the reference manual.

I often read source code in order to get a better understanding of the tools, libraries, and software that I use. These git tools allow me to explore that source code in specific historical context, and understand how codebases evolve and change over time.

Bonus: A Simpler Search Method

When I asked about how to find the commit I needed in a Recursers chat, I got more answers after I had already found the commit I was looking for. (Thanks to Nathan and Benjamin for this suggestion.) Here is one that was simpler than git-bisect:

git log -S

You can use git log -S "<name of function>" to find commits that change the number of occurrences of the string <name of function>, that is, commits that add or remove it. I found this didn’t take much longer than git bisect run on the Erlang codebase. And, I didn’t have to find an early “good” commit before searching.
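Because git log lists newest commits first, limiting the output to one commit gives the most recent change to the string. If that change was the removal, its parent is the last commit that still had the function. A sketch under that assumption (last_change_of is a hypothetical helper):

```shell
# last_change_of STRING [PATHSPEC...]
# Prints the hash of the most recent commit that added or removed STRING.
last_change_of() {
  needle=$1
  shift
  git log -1 --format=%H -S"$needle" -- "$@"
}

# Example: the last commit that still contained the function,
# assuming the most recent change was its removal:
#   git rev-parse "$(last_change_of "<name of function>")^"
```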

