
How to Create a diff of two files and patching

Last updated Apr 28, 2021

We often need to identify the differences between two files and apply those differences elsewhere, especially when updating configuration files or applications.

This is a routine part of day-to-day work on a Linux system.

There are a couple of situations where we need diff and patch operations:

  • When determining whether a particular script or configuration file has been modified
  • When comparing versions, or migrating changes from an old script to a new one, and so on
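The first case, checking whether a file has been modified, can be sketched with the exit status of diff. This is only a minimal illustration: the file names reference.conf and current.conf are hypothetical, and everything happens in a temporary directory.

```shell
# Work in a throwaway directory (all file names here are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'setting=1\n' > reference.conf
cp reference.conf current.conf

# diff exits with 0 when the files are identical and 1 when they differ.
# -q only reports whether they differ, without printing the changes.
if diff -q reference.conf current.conf > /dev/null; then
    first_check="unchanged"
else
    first_check="modified"
fi

# Simulate an edit, then check again.
printf 'setting=2\n' > current.conf
if diff -q reference.conf current.conf > /dev/null; then
    second_check="unchanged"
else
    second_check="modified"
fi

echo "before edit: $first_check"
echo "after edit:  $second_check"
```

Testing the exit status rather than parsing the output keeps the check usable inside scripts.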

So, what is a diff or differential?

Diff is an output that describes the differences between two files (file A and file B). File A is the source, and file B is assumed to be a modified file.

If diff produces no output, files A and B are identical (two empty files also count as identical). A diff in the unified format looks similar to this:

$ diff -urN fileA.txt fileB.txt 
--- fileA.txt 2017-12-11 15:06:49.972849620 -0500
+++ fileB.txt 2017-12-11 15:08:09.201177398 -0500
@@ -1,3 +1,4 @@
 12345
-abcdef
+abcZZZ
+789aaa

There are several diff formats, but the unified format is the most popular, particularly in FOSS projects.

It contains the names of both files (A and B), the starting line and line count of each changed region, and the content that was added or removed.

In the sample above, the string abcdef is removed from the original (-) and re-added (+) as abcZZZ. A new line containing 789aaa is also added, which is why the hunk header @@ -1,3 +1,4 @@ shows three lines for the old file and four for the new one.
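The sample can be reproduced with a short sketch. The exact contents of fileA.txt and fileB.txt are an assumption reconstructed from the diff shown above (including a trailing empty line), and the files are created in a temporary directory.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two sample files (contents assumed from the diff above).
printf '12345\nabcdef\n\n' > fileA.txt
printf '12345\nabcZZZ\n789aaa\n\n' > fileB.txt

# diff exits with 1 when the files differ, so capture the status explicitly.
diff_status=0
diff_output=$(diff -u fileA.txt fileB.txt) || diff_status=$?

printf '%s\n' "$diff_output"
echo "diff exit status: $diff_status"
```

Lines starting with a space are unchanged context, lines starting with - exist only in fileA.txt, and lines starting with + exist only in fileB.txt.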

A patch is a diff (usually in the unified format) that contains changes to one or more files, to be applied in a specific order. Patching is the process of applying a patch.

A patch can consist of several diffs concatenated together as well.
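As a rough sketch of a patch that holds several diffs, the following compares two directories and applies the result to a copy of the old one. The directory and file names (old, new, patched, a.conf and so on) are hypothetical, and GNU diff and GNU patch are assumed.

```shell
# Work in a throwaway directory (all names here are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir old new
printf 'alpha\n' > old/a.conf
printf 'one\n'   > old/b.conf
printf 'alpha changed\n' > new/a.conf   # modified
printf 'one\n'           > new/b.conf   # unchanged
printf 'brand new\n'     > new/c.conf   # exists only in new/

# -r walks both directories; -N treats the missing old/c.conf as empty.
# Exit status 1 only means that differences were found.
diff -urN old new > multi.patch || true

# Each per-file diff in the patch starts with a '--- ' header line.
file_diffs=$(grep -c '^--- ' multi.patch)
echo "diffs in patch: $file_diffs"

# Apply the patch to a copy of old/. -p1 strips the leading directory
# component (old/ or new/) from the paths recorded in the patch.
cp -r old patched
(cd patched && patch -p1 < ../multi.patch)

# No output and exit status 0 mean the two trees are identical.
diff -r new patched && echo "patched tree matches new/"
```

The unchanged b.conf does not appear in the patch at all, while c.conf is recorded as a diff against an empty file and is created when the patch is applied.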

Getting ready

Before starting the lab section, make sure these two utilities are installed (on Debian and Ubuntu, diff is provided by the diffutils package):

$ sudo apt-get install patch diffutils

Now, let’s create a configuration file that’s copied from a real one:

$ cp /etc/updatedb.conf ~/updatedb-v2.conf

Open updatedb-v2.conf and change the contents to look like this:

updatedb-v2.conf

PRUNE_BIND_MOUNTS="yes"
# PRUNENAMES=".git .bzr .hg .svn"
PRUNEPATHS="/tmp /var/spool /media /home/.ecryptfs /var/lib/schroot /media /mount"
PRUNEFS="NFS nfs nfs4 rpc_pipefs afs binfmt_misc proc smbfs autofs iso9660 ncpfs coda devpts ftpfs devfs mfs shfs sysfs cifs lustre tmpfs usbfs udf fuse.glusterfs fuse.sshfs curlftpfs ecryptfs fusesmb devtmpfs"

If your copy of updatedb.conf looks drastically different, simply add /media /mount to the PRUNEPATHS variable. Notice that the entries are separated by a space.

How to do it…

Open a terminal, and run the following commands in order to understand the diff command:

$ diff /etc/updatedb.conf ~/updatedb-v2.conf
$ diff -urN /etc/updatedb.conf ~/updatedb-v2.conf

At this point, the diff has only been written to the console's standard output; no patch file has been created. To create the actual patch file, redirect the output:

$ diff -urN /etc/updatedb.conf ~/updatedb-v2.conf > 001-myfirst-patch-for-updatedb.patch

Note:

Patches can be found in many forms, but they usually have the .patch extension and are preceded by a number and a human-readable name.
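The numeric prefix makes it easy to apply a series of patches in order. Here is a minimal sketch of that idea; the file app.conf and the patch names are hypothetical, and the loop relies on the shell expanding the glob in sorted order.

```shell
# Work in a throwaway directory (all names here are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'v1\n' > app.conf

# Build two patches where the second depends on the first.
printf 'v2\n' > step1.txt
printf 'v3\n' > step2.txt
diff -u app.conf  step1.txt > 001-bump-to-v2.patch || true
diff -u step1.txt step2.txt > 002-bump-to-v3.patch || true

# Pathname expansion is sorted, so 001 is applied before 002.
# Stop at the first patch that does not apply cleanly.
for patch_file in [0-9][0-9][0-9]-*.patch; do
    echo "applying $patch_file"
    patch app.conf < "$patch_file" || { echo "failed: $patch_file" >&2; break; }
done

cat app.conf
```

Naming the target file on the command line makes patch ignore the file names stored in the patch headers, which is convenient for a single-file series like this one.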

Before applying a patch, it can be tested with the --dry-run option to ensure that the results are as expected. Try the following commands:

$ echo "NEW LINE" > ~/updatedb-v3.conf
$ cat ~/updatedb-v2.conf >> ~/updatedb-v3.conf
$ patch --verbose --dry-run /etc/updatedb.conf < 001-myfirst-patch-for-updatedb.patch
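To experiment without touching /etc, the same test-then-apply workflow can be sketched on temporary files. The file names below are hypothetical, and the --dry-run option is assumed to be available, as it is in GNU patch.

```shell
# Work in a throwaway directory (all names here are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'PRUNE_BIND_MOUNTS="yes"\nPRUNEPATHS="/tmp /var/spool"\n' > original.conf
printf 'PRUNE_BIND_MOUNTS="yes"\nPRUNEPATHS="/tmp /var/spool /mount"\n' > modified.conf
diff -u original.conf modified.conf > change.patch || true

cp original.conf target.conf
before=$(cksum < target.conf)

# A dry run reports what would happen but leaves the target untouched.
dry_status=0
patch --dry-run target.conf < change.patch || dry_status=$?
after_dry=$(cksum < target.conf)

# Apply for real only if the dry run succeeded.
if [ "$dry_status" -eq 0 ]; then
    patch target.conf < change.patch
fi

cat target.conf
```

Comparing the checksums before and after the dry run confirms that the file was not modified until the second patch command ran.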

Let’s see what happens when patches fail to apply using the following commands:

$ patch --verbose --dry-run ~/updatedb-v1.conf < 001-myfirst-patch-for-updatedb.patch 
$ patch --verbose ~/fileA.txt < 001-myfirst-patch-for-updatedb.patch

How it works…

The first diff command prints the changes in the default (normal) diff format. The second uses the -urN flags: -u selects the unified format, -r compares directories recursively, and -N treats a file that is missing on one side as empty, so that new files appear in the diff. Since both arguments here are existing regular files, -r and -N do not change the result:

$ diff /etc/updatedb.conf ~/updatedb-v2.conf
3c3
< PRUNEPATHS="/tmp /var/spool /media /home/.ecryptfs /var/lib/schroot"
---
> PRUNEPATHS="/tmp /var/spool /media /home/.ecryptfs /var/lib/schroot /media /mount"

  
$ diff -urN /etc/updatedb.conf ~/updatedb-v2.conf
--- /etc/updatedb.conf 2014-11-18 02:54:29.000000000 -0500
+++ /home/rbrash/updatedb-v2.conf 2017-12-11 15:26:33.172955754 -0500
@@ -1,4 +1,4 @@
 PRUNE_BIND_MOUNTS="yes"
 # PRUNENAMES=".git .bzr .hg .svn"
-PRUNEPATHS="/tmp /var/spool /media /home/.ecryptfs /var/lib/schroot"
+PRUNEPATHS="/tmp /var/spool /media /home/.ecryptfs /var/lib/schroot /media /mount"
 PRUNEFS="NFS nfs nfs4 rpc_pipefs afs binfmt_misc proc smbfs autofs iso9660 ncpfs coda devpts ftpfs devfs mfs shfs sysfs cifs lustre tmpfs usbfs udf fuse.glusterfs fuse.sshfs curlftpfs ecryptfs fusesmb devtmpfs"

The patch itself was created by redirecting standard output to the 001-myfirst-patch-for-updatedb.patch file:

$ diff -urN /etc/updatedb.conf ~/updatedb-v2.conf > 001-myfirst-patch-for-updatedb.patch

We also created a modified version of the configuration file, ~/updatedb-v3.conf. Did you notice anything in the dry-run output?

Ignoring the warning that /etc/updatedb.conf is read-only, we can see that hunk #1 applies successfully.

A hunk is one contiguous section of a diff; a patch can contain several hunks for one file, or hunks for many files.

When the line numbers in the target file do not match those in the patch exactly, patch can still apply a hunk: it uses the context lines around the change to locate the right position and reports the offset it used.

Be aware of this behavior when dealing with large files that contain many similar sections, because a hunk may match in an unintended place. The dry run against /etc/updatedb.conf produces the following output:

$ patch --verbose --dry-run /etc/updatedb.conf < 001-myfirst-patch-for-updatedb.patch 
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|--- /etc/updatedb.conf 2014-11-18 02:54:29.000000000 -0500
|+++ /home/rbrash/updatedb-v2.conf 2017-12-11 15:26:33.172955754 -0500
--------------------------
File /etc/updatedb.conf is read-only; trying to patch anyway
checking file /etc/updatedb.conf
Using Plan A...
Hunk #1 succeeded at 1.
done
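The way patch copes with shifted line numbers can be seen in a small sketch. It is an illustration under assumptions: the file names are hypothetical, GNU patch is assumed, and LC_ALL=C is set only so that the messages are in English.

```shell
# Work in a throwaway directory (all names here are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'a\nb\nc\nd\ne\nf\ng\n' > original.txt
printf 'a\nb\nc\nD changed\ne\nf\ng\n' > modified.txt
diff -u original.txt modified.txt > change.patch || true

# Same content as original.txt, but pushed down by one extra line,
# so the line numbers recorded in the patch no longer match.
{ echo "NEW LINE"; cat original.txt; } > shifted.txt

# patch finds the hunk by its context lines and reports the offset used.
patch_output=$(LC_ALL=C patch shifted.txt < change.patch)
printf '%s\n' "$patch_output"

cat shifted.txt
```

The hunk is still applied to the correct line because the three context lines before and after the change match one line further down than the patch expected.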

If we attempt to apply the patch to a file that does not match, the hunk fails, as in the following output. With --dry-run, nothing is written. Without --dry-run, the failed hunk is saved to a reject file, as noted in this line: 1 out of 1 hunk FAILED -- saving rejects to file /home/rbrash/fileA.txt.rej:

$ patch --verbose --dry-run /etc/updatedb.conf1 < 001-myfirst-patch-for-updatedb.patch 
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|--- /etc/updatedb.conf 2014-11-18 02:54:29.000000000 -0500
|+++ /home/rbrash/updatedb-v2.conf 2017-12-11 15:26:33.172955754 -0500
--------------------------
checking file /etc/updatedb.conf1
Using Plan A...
Hunk #1 FAILED at 1.
1 out of 1 hunk FAILED
done
$


$ patch --verbose ~/fileA.txt < 001-myfirst-patch-for-updatedb.patch 
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|--- /etc/updatedb.conf 2014-11-18 02:54:29.000000000 -0500
|+++ /home/rbrash/updatedb-v2.conf 2017-12-11 15:26:33.172955754 -0500
--------------------------
patching file /home/rbrash/fileA.txt
Using Plan A...
Hunk #1 FAILED at 1.
1 out of 1 hunk FAILED -- saving rejects to file /home/rbrash/fileA.txt.rej
done
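A script can detect this kind of failure from the exit status of patch and the presence of the reject file. The following is a minimal sketch with hypothetical file names, assuming GNU patch, which exits with status 1 when one or more hunks cannot be applied.

```shell
# Work in a throwaway directory (all names here are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'line one\nline two\nline three\n' > original.txt
printf 'line one\nline 2\nline three\n'   > modified.txt
diff -u original.txt modified.txt > change.patch || true

# A file whose content has nothing in common with original.txt.
printf 'completely\nunrelated\ncontent\n' > unrelated.txt

# The hunk cannot be placed anywhere, so patch reports a failure.
patch_status=0
patch unrelated.txt < change.patch || patch_status=$?
echo "patch exit status: $patch_status"

# The rejected hunk is saved next to the target with a .rej suffix.
if [ -f unrelated.txt.rej ]; then
    echo "rejected hunks saved in unrelated.txt.rej"
fi
```

The .rej file contains the hunk that could not be applied, so it can be inspected and merged by hand.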
