
IMPORTANT NOTE: although this page has a "gitolite.com" URL, this is not about gitolite. That's just an artifact of "sitaramc.github.com" being translated to "gitolite.com" and so ALL my git related stuff gets carried over. Gitolite documentation has another /gitolite in the URL, so you can tell. My apologies for this confusion.

Newcomers to git often ask why there is such a thing as the index, and what use it is. They'd rather just do git add -A; git commit each time to avoid thinking about the index at all, because it seems like one extra (and needless) complication.

So... what use is this, apart from confusing you with multiple names like index, staging area, and cache?

what is the index

The index (or any of its other names) is essentially a "holding area" for changes that will be committed when you next do git commit. That is, unlike other VCSs, a "commit" operation does not simply take the current working tree and check it as-is into the repository. The index allows you to control what parts of the working tree go into the repository on the next "commit" operation.
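If the abstract description doesn't click, here is a minimal sketch you can run in a scratch directory. The file names (a.txt, b.txt) and the commit messages are made up for illustration; the point is only that the commit picks up what was staged, not everything that changed.

```shell
set -e
# Scratch repository in a temporary directory (all names here are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Your Name"

echo "one" > a.txt
echo "two" > b.txt
git add a.txt b.txt
git commit -q -m "initial commit"

# Change both files, but stage only a.txt.
echo "one, edited" > a.txt
echo "two, edited" > b.txt
git add a.txt

git diff --cached --name-only   # staged: what the next commit will contain
git diff --name-only            # unstaged: what stays behind in the work tree

git commit -q -m "edit a.txt only"
git status --short              # b.txt is still modified and uncommitted
```

git diff --cached is the command to remember here: it shows the index against the last commit, i.e., exactly what git commit is about to record.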

That should be sufficient background to appreciate (in abstract terms) the following discussion.

staging helps you split up one large change into multiple commits

Let's say you worked on a large-ish change, involving a lot of files and quite a few different subtasks. You didn't actually commit any of these -- you were "in the zone", as they say, and you didn't want to think about splitting up the commits the right way just then. (And you're smart enough not to make the whole thing one honking big commit!)

Now that the change is all tested and working, you need to commit all this properly, in several clean commits, each focused on one aspect of the code changes.

With the index, just stage each set of changes and commit until no more changes are pending. Really works well with git gui if you're into that too, or you can use git add -p or, with newer gits, git add -e.

staging helps in reviewing changes

Staging helps you "check off" individual changes as you review a complex commit, and to concentrate on the stuff that has not yet passed your review. Let me explain.

Before you commit, you'll probably review the whole change by using git diff. If you stage each change as you review it, you'll find that you can concentrate better on the changes that are not yet staged.

git gui is great here. Its two left panes show unstaged and staged changes respectively, and you can move files between those two panes (stage/unstage) just by clicking on the icon to the left of the filename.

Even better, you can stage partial changes to a file. In the right pane of git gui, right click on a change that you approve of and choose "stage hunk". Just that change (not the entire file) is now staged; in fact, if there are other, unstaged, changes in that same file, you'll find that the file now appears on both top and bottom left panes!

[Do remember, however, that if the change is really complex maybe you should split it into multiple commits!]
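If you prefer the command line to git gui, the same "check off as you review" routine looks roughly like this. The three file names are hypothetical; git reset -- <file> is used here to unstage something you want to look at again.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Your Name"

echo "a" > parser.c
echo "b" > lexer.c
echo "c" > README
git add parser.c lexer.c README
git commit -q -m "initial commit"

# Three pending changes waiting for review.
echo "a2" >> parser.c
echo "b2" >> lexer.c
echo "c2" >> README

git diff --name-only            # everything still to be reviewed
git add parser.c                # reviewed and approved: check it off
git diff --name-only            # parser.c no longer clutters the diff

git add lexer.c                 # approved a bit too quickly...
git reset -q -- lexer.c         # ...so move it back to the unstaged side

git diff --cached --name-only   # what has passed review so far
```

Plain git diff only ever shows what is not yet staged, which is what makes the check-off trick work.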

staging helps when a merge has conflicts

When a merge happens, changes that merge cleanly are updated both in the staging area and in your work tree. Only changes that did not merge cleanly (i.e., caused a conflict) will show up when you do a git diff, or in the top left pane of git gui.

Again, this lets you concentrate on the stuff that needs your attention -- the merge conflicts.
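To see this for yourself, here is a small sketch that manufactures one clean change and one conflict. The branch name "feature" and the two file names are made up. The --diff-filter=U option narrows the listing to unmerged paths.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Your Name"

echo "line" > clean.txt
echo "line" > conflict.txt
git add clean.txt conflict.txt
git commit -q -m "initial commit"
base=$(git symbolic-ref --short HEAD)

# The feature branch touches both files...
git checkout -q -b feature
printf 'line\nadded on feature\n' > clean.txt
echo "feature side" > conflict.txt
git commit -q -am "feature changes"

# ...while the original branch touches only conflict.txt.
git checkout -q "$base"
echo "base side" > conflict.txt
git commit -q -am "base change"

# The merge stops because conflict.txt was changed on both sides.
git merge feature > /dev/null 2>&1 || true

conflicted=$(git diff --name-only --diff-filter=U)
echo "needs attention: $conflicted"
git status --short   # clean.txt is already staged, conflict.txt is unmerged

# Resolve, stage the resolution, and conclude the merge.
echo "resolved" > conflict.txt
git add conflict.txt
git commit -q -m "merge feature"
```

Staging the resolved file is how you tell git that the conflict is dealt with; once nothing is left unmerged, the commit concludes the merge.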

staging helps you keep extra local files hanging around

Usually, files that should not be committed go into .gitignore or the local variant, .git/info/exclude.

However, sometimes you want a local change to a file that cannot be excluded (which is not good practice but can happen sometimes). For example, perhaps you upgraded your build environment and it now requires an extra flag or option for compatibility, but if you commit the change to the Makefile, the other developers will have a problem.

Of course you have to discuss with your team and work out a more permanent solution, but right now, you need that change in your working tree to do any work at all!

Another situation could be that you want a new local file that is temporary, and you don't want to bother with the ignore mechanism. This may be some test data, a log file or trace file, or a temporary shell script to automate some test... whatever.

In git, all you have to do is never to stage that file or that change. That's it.
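A sketch of what "never stage it" means in practice. The Makefile flag and the file names are invented for the example. The one thing to watch is that git add -A and git commit -a would pick up the Makefile change, so stage by name instead.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Your Name"

echo "CFLAGS = -O2" > Makefile
echo "int main(void) { return 1; }" > main.c
git add Makefile main.c
git commit -q -m "initial commit"

# A local-only tweak (hypothetical flag), a real change, and a scratch file.
echo "CFLAGS = -O2 -fcommon" > Makefile
echo "int main(void) { return 0; }" > main.c
echo "debug output" > trace.log

# Stage by name. "git add -A" or "git commit -a" would sweep the Makefile in too.
git add main.c
git commit -q -m "return 0 from main"

git status --short   # Makefile still modified, trace.log still untracked
```

The local Makefile tweak and the scratch file simply ride along in the work tree for as long as you keep leaving them out of git add.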

staging helps you sneak in small changes ;-)

Let's say you're in the middle of a somewhat large-ish change and you are told about a very important bug that needs to be fixed asap.

The usual recommendation is to do this on a separate branch, but let's say this fix is really just a line or two, and can be tested just as easily without affecting your current work.

With git, you can quickly make and commit only that change, without committing all the other stuff you're still working on.

Again, if you use git gui, whatever's on the bottom left pane gets committed, so just make sure only that change gets there and commit, then push!
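On the command line, the same thing might look like the sketch below. Everything in it is hypothetical scaffolding: the file names, the "origin" remote (a throwaway bare repository so that the push has somewhere to go), and the commit message.

```shell
set -e
work=$(mktemp -d)
remote=$(mktemp -d)
git init -q --bare "$remote"

cd "$work"
git init -q
git config user.email "you@example.com"
git config user.name "Your Name"
git remote add origin "$remote"

echo "stable" > feature.txt
echo "count < 10" > limits.txt
git add feature.txt limits.txt
git commit -q -m "initial commit"
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# You are midway through a larger change...
echo "half-finished rewrite" > feature.txt
# ...when the urgent one-liner comes in.
echo "count <= 10" > limits.txt

git add limits.txt
git diff --cached --stat      # double-check that only the fix is staged
git commit -q -m "fix off-by-one in limits.txt"
git push -q origin "$branch"

git status --short            # feature.txt is still modified and uncommitted
```

The git diff --cached step is cheap insurance: it shows exactly what is about to be committed before you commit and push it.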

a word of warning...

While the first 3 use cases are perfectly legitimate and useful uses of the index, this last example could be dangerous if you get carried away. You might end up committing something that is not exactly what you tested, because you tested with the contents of the work tree, which is not the same as the index that is being committed.

Of course, using a powerful DVCS does not absolve you from the responsibility of thinking :-) So if you want to be 100% sure, there is a simple way to test only the staged code:

# stash away the changes to git-tracked files that you have not staged yet
# (on git older than 2.13, use: git stash save --keep-index)
git stash push --keep-index
# now do your tests on the exact stuff being committed
make; make test     # etc...
# when satisfied, commit
git commit

# now bring the stashed changes back to continue normal work
git stash pop

If you need to understand how/why this works, ask me, otherwise just use it and be happy :-)
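For the curious, here is a small sketch that makes the effect visible without needing a real build. It assumes a git new enough to have git stash push (2.13 or later), and the file names are invented. While the stash is in place, the work tree holds exactly what is staged, so that is what your tests see.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Your Name"

echo "v1" > tested.txt
echo "v1" > wip.txt
git add tested.txt wip.txt
git commit -q -m "initial commit"

echo "v2" > tested.txt
git add tested.txt             # staged: this is what you intend to commit
echo "half-done" > wip.txt     # unstaged: still in progress

git stash push -q --keep-index

# While the stash is in place, the work tree matches the index.
git diff --quiet               # exits 0: nothing unstaged is left
during=$(cat wip.txt)
echo "wip.txt during testing: $during"

git commit -q -m "tested change"
git stash pop -q

git status --short             # wip.txt is modified again, ready for more work
```

Note that the stash records the staged change as well, but since that change is already in the new commit by the time you pop, only the unfinished work comes back as a modification.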