A note on commit messages

April 25, 2020 · 16 min read

Max Mulatz

Whitespace wrestler on a functional fieldtrip

This post is based on the talk “My Message on Commit Messages” I gave at the Ruby User Group Berlin.

Git 🌳

Peeking around the world of software development, one may sooner or later come across the term git. If you havenʼt yet, you have done so now and are prepared to continue this journey here ⛵️.

Software development is a collaborative process where we create, edit and delete text files that constitute our project. With multiple people doing this in parallel in different corners of the project, things can get confusing. We need to keep track of what is happening and inform the others about changes which may impact their work and the changes they are planning. Put into technical terms, we need a so-called Version Control System (VCS), also known as a “Revision Control System”, if we want to build software together. Git is one flavour of these:

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.¹

Git was developed by Linus Torvalds to facilitate working on the Linux kernel with other developers. They initially managed changes to their project by passing around patches and archived files before introducing a proprietary version control system named “BitKeeper”. Git was born in 2005 when BitKeeper stopped being available free of charge to the community.² Git is open source, continuously improved and nowadays the de facto industry standard when it comes to version control systems.

Git in 2020 📠

Why are version control systems like Git still relevant in our fast-moving and hyper-modern times of 2020? Building and maintaining software is still a collaborative effort. People need to know about the changes others made to the project and why they made them. Without this, itʼs hard to make good decisions for oneʼs own changes and the project may already be doomed to end up as a digital junkyard. Access to this information at any time, without having to ask everyone in the team individually, is a key building block for developing software together.

Without version control systems, we would send around tarballs of source code or mob-access the same files on a shared server. The first is tedious and uninclusive and the second turns developing software into an “open heart surgery” with people yelling “donʼt touch file xzy, I am working on it” through offices, living in constant fear of overwriting and breaking each otherʼs work and somewhere in this struggle naming files algorithm_final_final_2.js. Version control systems give teams structures and processes to “control” the “version” of their project.

Git with its ecosystem of tools and established workflows to document, decide on and reason about code changes has proven to be a good idea. To the point where there is hardly a way around it in todayʼs world of programming.³

A Commit 📸

Before diving into commit “messages”, we need to know about commits:

A “commit” is a snapshot of your files.⁴

A commit is a uniquely identifiable snapshot of the project at a certain moment. It records the projectʼs state at that moment together with a reference to the previous state it builds on, along with a log message from the author describing the changes.⁵ The latter part is referred to as the commit “message”.

Commit messages 💌

Many developers treat commit messages like an annoying “are you sure” confirmation dialogue on the otherwise interruption-free autobahn to get their code out into the world, and they can easily get by with that forever. Commit messages arenʼt absolutely necessary for building software, but they can make life so much easier.

Communication cut ✂️

Software projects often have people working on different tasks in parallel. These tasks are rarely fully independent and can affect each other. When nobody cares about commit messages, important information about a decision or change in the codebase may be missing or not communicated yet. This knowledge gap can have a negative impact on subsequent tasks.

Imagine backend and frontend developers collaborating on a feature and neither can read code from the othersʼ domain. The backend people frequently update the codebase with changes labelled “fix”, “polish”, “make work” or “f*ck linters”. Now imagine yourself in the group of frontend developers, waiting for an API specification. How can you know there is something ready to base your work on? Is the API ready, broken or work in progress? Unable to reason about the raw diff, you constantly have to nag the backend developers about these missing pieces of information. This makes collaboration unnecessarily hard, cuts down possibilities for parallelism and introduces hard dependencies on personal availability, mood and memory. Coming in after the weekend, the backend team doesnʼt know about the current state from looking at their commit history either. They have to dig into the code to find out where they left things.

Useful commit messages can help to prevent situations of missing important information. Unlike a Slack message or a coffee chat in the kitchen, they are persisted with the code change as an always available way to retrieve contextual information about it - for your teammates or for yourself, coming back after the weekend.

Information hunt 🔍

Or imagine you need to adapt a piece of code in a legacy codebase. Looking at it for hours, you still donʼt understand the code and why it is as weird as it is. You donʼt dare to touch it as it could impact other areas of the project. With no one around to ask, you look up the commit that introduced the change to see whether it can enlighten you. Depending on the grade of “legacy-ness”, the message of the commit may read:

commit e3aa57ee56e8d9aed89560d1ab4702068ce65d12
Author: Hannah Operator <hannah@dev.internetz>
Date:   Mon May 4 22:44:05 2005 +0200

    make things finally wrk also linter and change UI, fix logging 🍻

This does not help. There is no useful contextual information in this commit message. The author of the code may either be long gone or - surprise - it was you a couple of years ago. Either way, you have to continue your search for information elsewhere and eventually just leave the piece of code untouched (and rotting). A good commit message would have been helpful here.

Commit messages can be a useful tool to provide contextual information about a change for others or your future self reading the code. Ignoring them when committing changes is like throwing a boomerang - it will come back at you:

Through its lifetime, code is read far more often than it is written…⁶

Mindset 🧠

Writing good commit messages is more a question of awareness than skill. It needs the right mindset to understand the value and why itʼs worth the effort.

With collaboration no longer tied to physical presence and teams distributed across the globe, good habits of communication - especially asynchronous communication - are more important than ever. Version control systems act as a medium for asynchronous communication among developers working on the same codebase. A Git commit links changes in the code to the people behind and the plans and intentions they had. It stays with the change forever, even if the authors may have left long ago. It can act as a powerful communication channel, even between generations of developers on a codebase.

The diff of a commit states “what” changed, the commit message provides the context, the “how” and “why”. Both are important for anyone (including the authorʼs future self) reasoning about the code. A clean commit history with clearly scoped commits and meaningful commit messages acts as automated documentation for the life cycle of the project. It allows people to git blame a line of code to find out when, why and by whom it was introduced or changed.

Unlike API docs or architecture diagrams which usually document a desired or planned state of the project, commit messages reflect the current state of the project and how it was achieved. They can be read as a “process log” for the project. A project manager may ask: “What did you actually ship in the last two weeks?”. A well maintained Git history is able to answer this question within seconds without having to dig through wikis, issue trackers or Slack conversations.

Read vs. Write ↩️

Developing awareness for the value of commit messages means empathizing with readers of the code (including yourself). It relates to how high we value the maintainability and liveliness of software: Is it supposed to last for a while or is it a throw-away piece of digital junk? This of course depends on the project, but all in all, code is read far more often than written:

Indeed, the ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code. …[Therefore,] making it easy to read makes it easier to write.⁶

We should therefore aim to cater to the needs of developers reading code, not only the single author writing it. A clean commit history with meaningful commit messages facilitates reading, understanding and maintaining code. To read a piece of code, people, including the author themself, often need to re-establish the context around a change at a later point in time. Reasonably scoped commits with meaningful commit messages give them a chance to do so.

Software we build today usually has an intended lifespan of anywhere from a couple of years to decades. This inevitably requires people to read and understand the existing code. In solidarity with future generations of developers, we should aim to make this as easy as possible. Writing meaningful commit messages has a significant impact on this and helps prevent frustration. Without this explicit documentation of the commit history, the codebase may easily slide towards big-L legacy, where code is doing things, but itʼs impossible to tell what and why.

🎥 Also watch Tekin Süleymanʼs awesome talk A Branch in Time (a story about revision histories) from Ruby Australia for an illustrative example of the benefits of good commit messages for other developers and your future self. Writing good commit messages makes lives easier.

Writing commit messages ✍️

Commit messages document “why” a change was made. Keeping in mind that this information stays with the change forever, the people at Thoughtbot propose three questions as an orientation for what to actually put into a commit message:⁷

1. Why is the change necessary?

A commit message informs about the purpose of the change and summarizes what the commit is about on a conceptual level beyond what is already visible in the raw diff.

2. How does this address the issue?

A commit message may also give a high level overview of what has been done. Technical detail should be left out as it is better visualized in the diff. It can also be useful to mention alternative approaches one considered. This helps to focus later discussions and makes the solution more transparent to future readers.

3. What are side effects of this change?

A commit message should also list side effects of a change if there are any. Having this close to the change helps when hunting down possible regression bugs later. Discussing side effects in the code review or inside the team isnʼt sufficient as itʼs rather difficult to dig up this information again later. Noted in the commit message, however, the information stays persisted close to the change where people can easily find it.

The questions indicate: commit messages are a powerful communication tool. All developers on the team, including yourself in two weeks, will thank you for caring and using them. Therefore - and this applies to a lot of things in life - if you can make the lives of others a bit better, just do it right away!

The Looks 💅

A good commit message documents the “why” and enables others to understand the context of a change. But how should it look?

Here is an example from the Rails codebase:

Convert configs_for to kwargs, add include_replicas

Changes the `configs_for` method from using traditional arguments to
using kwargs. This is so I can add the `include_replicas` kwarg without
having to always include `env_name` and `spec_name` in the method call.

`include_replicas` defaults to false because everywhere internally in
Rails we donʼt want replicas. `configs_for` is for iterating over
configurations to create / run rake tasks, so we really don't ever need
replicas in that case.

The first line, Convert configs_for to…, is referred to as the “subject”, the rest as the “body” of the message. Tim Pope, of general open source fame, is the author of the Note about Git Commit Messages, which is legendary (at least among commit-message-lovers) for its sensible and concise guidelines on formulating and formatting commit messages. He states:

The subject/body distinction may seem unimportant but itʼs one of many subtle factors that makes Git history so much more pleasant to work with than Subversion.

Being able to distinguish between subject and body allows for more control over how much detail about a change is displayed in which context.
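To make the distinction concrete, here is a minimal Python sketch of how a tool might separate the two parts. It assumes the simplified rule described above (first line is the subject, everything after the blank line is the body). The function name split_commit_message is made up for this illustration and is not part of Git.

```python
from typing import Tuple


def split_commit_message(message: str) -> Tuple[str, str]:
    """Split a commit message into (subject, body).

    Simplifying assumption: the subject is the first line and the body
    is whatever follows it, minus the separating blank line.
    """
    subject, _, rest = message.strip("\n").partition("\n")
    return subject.strip(), rest.strip("\n")


message = """Convert configs_for to kwargs, add include_replicas

Changes the `configs_for` method from using traditional arguments to
using kwargs."""

subject, body = split_commit_message(message)
print(subject)  # short summary, e.g. for a list of commits
print(body)     # the details, shown only when someone asks for them
```

A commit without a body simply yields an empty string as its second part.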

Subject 📬

The subject is supposed to give a short summary of what the commit is about. You see it in various places in GitHubʼs UI or in command line tools like git rebase or git reflog, where users need a summary of a list of commits, not the individual details.

According to Tim Popeʼs de facto standard guidelines, the subject should be a concise but meaningful summary of the change, around 50 characters long. Given the size limit, this usually wonʼt make a complete, well-formed sentence, and trying to formulate it as one may even get in the way. So itʼs recommended to not treat it as a sentence in the first place and also not end it with a dot. If the subject gets too long, most tools truncate it with ….

Tim Pope also suggests using the imperative form, such as “fix bug” instead of “fixing bug”, and formulating the subject so that it completes the sentence: “If applied, this commit will …”. This makes it align nicely with the commit messages Git generates automatically for merges and reverts, and produces a consistent and easy to read commit history.
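These rules of thumb are simple enough to check mechanically. The following Python sketch is only an illustration: check_subject and as_sentence are hypothetical helpers, not part of Git, and the 50 character limit is the soft guideline from above rather than anything Git enforces. Whether a subject is really in the imperative is left to a human reading the rendered sentence.

```python
from typing import List

SUBJECT_LIMIT = 50  # soft guideline, not a hard limit in Git


def check_subject(subject: str) -> List[str]:
    """Return style warnings for a subject line (empty list means fine)."""
    warnings = []
    if len(subject) > SUBJECT_LIMIT:
        warnings.append(
            f"subject is {len(subject)} characters, aim for about {SUBJECT_LIMIT}"
        )
    if subject.endswith("."):
        warnings.append("subject should not end with a dot")
    return warnings


def as_sentence(subject: str) -> str:
    """Render the subject inside the test sentence for a quick sanity check."""
    return "If applied, this commit will " + subject[:1].lower() + subject[1:]


print(check_subject("Fixed the bug."))
print(as_sentence("Fix bug"))
```

If the rendered sentence reads awkwardly (“If applied, this commit will fixed the bug.”), the subject is probably not in the imperative.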

The subject line is followed by a blank line separating it from the body. This is absolutely crucial: Without this line the message canʼt be parsed correctly and may generate weird cut offs, line breaks and indentation. This brings us to the most important advice in this post. You may forget everything you read so far, but remembering the following will make you a better human being and spark joy in your and other peopleʼs lives:

🚨 Let go of the git commit -m shortcut forever and for good! 🚨

git commit -m <msg> is a shortcut to inline the commit message with the git commit command in one go. This is normally a two-step process where git commit opens your favorite text editor - at least the one configured via the GIT_EDITOR or EDITOR environment variable - to let you enter and save the log message before generating the commit.

It may be challenging to overcome the muscle memory, but youʼll immediately see drastic improvements in your commit message game. A text editor is by far more convenient and inviting for formulating and formatting prose. It gives you spell checking, indentation, line breaks, syntax highlighting and other goodies from the modern world. Multi line commit messages written on the command line will most likely not have the right format and wrapping and end up as a gibberish mess in the Git history.

TL;DR Use a text editor for writing commit messages! 💡

Body 🎒

For a commit small enough that a subject line is already enough to summarize it, one may of course fully omit the body of the commit message. For instance when fixing a small typo in the README.md of the project - a rare case for git commit -m. Most commits are complex enough to require a longer commit message though.

The commit message body explains the “why” and the “how” of the commit in more detail. It can be a multi line text with its paragraphs separated by blank lines and wrapped at 72 characters. Gitʼs default pager less may choke and show hard to read output on other formats, so this is the agreed upon standard and your text editor will most likely already adhere to it by default.

The whole commit message could then be formatted like this example from Tim Pope:

Capitalized, short (50 chars or less) summary

More detailed explanatory text, if necessary.  Wrap it to about 72
characters or so.  In some contexts, the first line is treated as the
subject of an email and the rest of the text as the body.  The blank
line separating the summary from the body is critical (unless you omit
the body entirely); tools like rebase can get confused if you run the
two together.

Write your commit message in the imperative: "Fix bug" and not "Fixed bug"
or "Fixes bug."  This convention matches up with commit messages generated
by commands like git merge and git revert.

Further paragraphs come after blank lines.

- Bullet points are okay, too

- Typically a hyphen or asterisk is used for the bullet, followed by a
  single space, with blank lines in between, but conventions vary here

- Use a hanging indent
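If your editor does not wrap for you, the 72 character rule can be approximated with a few lines of Python. This is a rough sketch using the standard textwrap module; wrap_body is a name invented for this example. It treats every blank-line-separated block as plain prose, so it would flatten bullet lists with hanging indents like the ones above.

```python
import re
import textwrap

BODY_WIDTH = 72  # the wrapping width recommended above


def wrap_body(body: str, width: int = BODY_WIDTH) -> str:
    """Re-wrap each blank-line-separated paragraph to `width` columns."""
    paragraphs = re.split(r"\n\s*\n", body.strip())
    wrapped = [
        textwrap.fill(
            " ".join(paragraph.split()),  # collapse existing line breaks
            width=width,
            break_long_words=False,  # keep long tokens such as URLs intact
            break_on_hyphens=False,
        )
        for paragraph in paragraphs
    ]
    return "\n\n".join(wrapped)


body = (
    "Changes the configs_for method from using traditional arguments to "
    "using kwargs, so that new options can be added later without touching "
    "every caller.\n\n"
    "No behaviour change is intended."
)
print(wrap_body(body))
```

Because long words are never split, a single token longer than 72 characters (a URL, for instance) still ends up on an over-long line of its own.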

Extras 🍦

Depending on the tool you are using, commit messages can also include additional meta information. GitHub for instance supports closing issues and citing co-authors from commit message bodies formatted accordingly. The developers at thoughtbot already have the respective lines for pairing partners in their default .gitmessage template:

# 50-character subject line
#
# 72-character wrapped longer description. This should answer:
#
# * Why was this change necessary?
# * How does it address the problem?
# * Are there any side effects?
#
# Include a link to the ticket, if any.
#
# Add co-authors if you worked on this code with others:
#
# Co-authored-by: Full Name <email@example.com>
# Co-authored-by: Full Name <email@example.com>

Configuring a .gitmessage file can be useful to raise awareness and establish certain rules for commit messages within a team. Once the file is registered through Gitʼs commit.template configuration option, Git uses its contents to pre-fill the commit message in your editor. It can be used as a template, checklist or guideline for writing the actual commit message.
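Because these extras follow a fixed line format, they are easy to pick up programmatically. As a small illustration, this Python sketch collects the co-authors from a finished commit message. The function co_authors and the regular expression are assumptions made for this example, not how GitHub actually parses messages. Lines starting with # are skipped, mirroring the fact that Git by default drops such comment lines from a message written in the editor.

```python
import re
from typing import List, Tuple

# Matches lines like: Co-authored-by: Full Name <email@example.com>
CO_AUTHOR = re.compile(
    r"^Co-authored-by:[ \t]*(?P<name>[^<>\n]+?)[ \t]*<(?P<email>[^<>\n]+)>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def co_authors(message: str) -> List[Tuple[str, str]]:
    """Return (name, email) pairs for every co-author line in the message."""
    return [(m.group("name"), m.group("email")) for m in CO_AUTHOR.finditer(message)]


message = """Add include_replicas option

Co-authored-by: Full Name <email@example.com>
# Co-authored-by: Template Leftover <leftover@example.com>
"""

print(co_authors(message))
```

The commented-out template line is ignored, so only the co-author that was deliberately added shows up in the result.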

Conclusion 👏

Empathize with the readers of your code and write good commit messages. A clean Git history with meaningful, well formatted commit messages is a great way to show solidarity with fellow developers, including your future self, and to consistently document the life cycle of your project. Adopt a healthy mindset for writing commit messages and make other peopleʼs lives easier ☯️.
