What does git diff –binary do and why might it be useful?

In Git, the ‘diff’ command is commonly used to view differences between commits, branches, and even between the working directory and the staging area. The ‘git diff’ command usually works well for text files as it shows the difference between the file versions line by line.

However, when it comes to binary files (like images or compiled code), ‘git diff’ can’t show differences in a human-readable format. This is where the ‘–binary’ option comes in.

What is git diff –binary?

The ‘git diff –binary’ command is a version of ‘git diff’ which allows Git to handle binary files. The command itself generates a binary patch that can be applied with ‘git apply’. This command treats the binary files as a whole, and instead of trying to interpret and show the actual changes in the files, it gives the entire new version of the file.

The output of ‘git diff –binary’ contains two main parts for each binary file:

  1. A line that starts with ‘GIT binary patch’ indicating the start of a binary diff.
  2. Encoded content representing the binary data.

The binary data are delta encoded, compressed with zlib and then encoded with base85. This means that the actual binary data are not stored completely, but rather the differences from the old blob to the new blob are stored in a compact and efficient format.

Why might it be useful?

‘git diff –binary’ can be particularly useful in the following situations:

  1. Binary Files: If you have binary files in your repository, ‘git diff –binary’ helps you to create a patch that includes the binary file changes. You can share this patch with others, who can then apply it to their repository using ‘git apply’. This makes it easy to distribute changes for binary files in a similar way to text files.
  2. Efficient Patch Files: When creating patch files, ‘git diff –binary’ can create more efficient patches by only including the changes (deltas) rather than the whole file content. This can lead to smaller patch files, which are faster to transmit and apply.
  3. Version Control: It enables you to use Git as a version control system for binary files. Although there are often better systems for versioning large binary files (like Git LFS), ‘git diff –binary’ gives you the option to do it directly in Git when necessary.

In conclusion, ‘git diff –binary’ is a handy command to know when working with binary files in a Git repository. Even though its usage might not be daily, it’s still an essential tool in certain situations.