Write bash script for sanitizing user input and for repeatable results

One of the best practices for scripts (or programs, for that matter) is controlling user input, not only for security, but also so that a given input produces predictable results. For example, imagine a user who enters a number where a string is expected. Did you check for that? Will it cause your script to exit prematurely? Or will something unforeseen happen, such as the user entering rm -rf /* instead of a valid username?

In any case, limiting user input is also useful to you as the author because it limits the paths users can take and reduces undefined behavior and bugs. If quality assurance is important, this also reduces the number of test cases and the amount of input/output validation needed.

Prerequisites

This recipe might introduce some readers to a concept they would rather avoid: software engineering. True, you are probably writing scripts to get a task done quickly, but if your script is going to be used by other people (or for a long time), it’s worth catching errors early and preventing the program from misbehaving.

Let’s look at a step by step example using a program that should echo the username of the user who executed the script via a prompt:

The script expects input to be read into a variable using the read command (for example).

The variable is assumed to be a string, but it could be the user’s name, a number, a post address in a foreign country, an email, or even a malicious command.

The script reads the variable and runs the echo command.

The results returned could be garbage, but could also be executed by another script—what could go wrong?

Even where security is not important, the robustness of an application may well be!
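The prompt-and-echo steps above can be sketched with a check placed between read and echo. This is a minimal sketch, not a definitive implementation: the function names is_valid_username and prompt_username are hypothetical, and the allowed pattern (a lowercase letter or underscore followed by up to 31 lowercase letters, digits, underscores, or hyphens) is an assumption that you would adapt to your own naming rules.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if $1 looks like a plain username.
# The pattern is an assumption, not a rule taken from any particular system.
is_valid_username() {
	local re='^[a-z_][a-z0-9_-]{0,31}$'
	[[ "$1" =~ $re ]]
}

# Hypothetical helper: keep prompting until the input passes the check,
# then echo it back.
prompt_username() {
	local name
	while read -r -p "Enter your username: " name; do
		if is_valid_username "$name"; then
			echo "Hello, ${name}"
			return 0
		fi
		echo "Invalid username, try again" >&2
	done
	# read hit end-of-input before a valid name was entered
	return 1
}
```

Calling prompt_username from a script would start the prompt loop. Anything that does not match the pattern, including a string such as rm -rf /*, is rejected before it reaches echo or any other command.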

How to do it…

Let’s start our activity as follows:

Begin by opening a terminal and creating a new shell script called bad_input.sh with the following contents:

bad_input.sh

#!/bin/bash 
FILE_NAME=$1 
echo $FILE_NAME 
ls $FILE_NAME

Now, run the following commands:

$ touch TEST.txt 
$ mkdir new_dir/ 
$ bash bad_input.sh "." 
$ bash bad_input.sh "../"

Create a second script called better_input.sh:

better_input.sh

#!/bin/bash
FILE_NAME=$1
# first, strip underscores
FILE_NAME_CLEAN=${FILE_NAME//_/}
# then remove any ".." sequences (the dots are escaped so they match literally)
FILE_NAME_CLEAN=$(sed 's/\.\.//g' <<< "${FILE_NAME_CLEAN}")
# next, replace spaces with underscores
FILE_NAME_CLEAN=${FILE_NAME_CLEAN// /_}
# now, clean out anything that's not alphanumeric, an underscore, or a period
FILE_NAME_CLEAN=${FILE_NAME_CLEAN//[^a-zA-Z0-9_.]/}
# here you should check to see if the file exists before running the command
ls "${FILE_NAME_CLEAN}"

Next, run the script using these commands and note the output:

$ bash better_input.sh "." 
$ bash better_input.sh "../" 
$ bash better_input.sh "anyfile"

Next, create a new script called validate_email.sh to validate email addresses (similarly to how one would validate DNS names):

validate_email.sh

#!/bin/bash
EMAIL=$1
# -E enables extended regex; + requires at least one character in each part
echo "${EMAIL}" | grep -E '^[a-zA-Z0-9._]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]+$' >/dev/null
RES=$?
# grep exits with 0 only when the pattern matched
if [ $RES -eq 0 ]; then
	echo "${EMAIL} is valid"
else
	echo "${EMAIL} is NOT valid"
fi

Again, we can test the output:

$ bash validate_email.sh ron.brash@somedomain.com
ron.brash@somedomain.com is valid
$ bash validate_email.sh ron.brashsomedomain.com
ron.brashsomedomain.com is NOT valid

Another common task would be to validate IP addresses. Create another script called validate_ip.sh with the following contents:

validate_ip.sh

#!/bin/bash
IP_ADDR=$1
# IFS is set only for read, so the rest of the script keeps the default separator
if echo "$IP_ADDR" | { IFS=. read -r octet1 octet2 octet3 octet4 extra;
	# each octet must consist of digits only and fall between 0 and 255
	[[ "$octet1" =~ ^[0-9]+$ ]] &&
	test "$octet1" -ge 0 && test "$octet1" -le 255 &&
	[[ "$octet2" =~ ^[0-9]+$ ]] &&
	test "$octet2" -ge 0 && test "$octet2" -le 255 &&
	[[ "$octet3" =~ ^[0-9]+$ ]] &&
	test "$octet3" -ge 0 && test "$octet3" -le 255 &&
	[[ "$octet4" =~ ^[0-9]+$ ]] &&
	test "$octet4" -ge 0 && test "$octet4" -le 255 &&
	test -z "$extra"; } 2> /dev/null; then
	echo "${IP_ADDR} is valid"
else
	echo "${IP_ADDR} is NOT valid"
fi

Try running the following commands:

$ bash validate_ip.sh "a.a.a.a" 
$ bash validate_ip.sh "0.a.a.a" 
$ bash validate_ip.sh "255.255.255.255" 
$ bash validate_ip.sh "0.0.0.0" 
$ bash validate_ip.sh "192.168.0.10"

How the script works…

Let’s understand our script in detail:

First, we begin by creating the bad_input.sh script. It takes $1 (the first argument), echoes it, and passes it to the ls (list) command.

By running the following commands, we can list everything in the current directory or a subdirectory, or even traverse upward into parent directories! This is clearly not good; directory traversal vulnerabilities have allowed attackers to read files outside a web server’s intended directory. The idea is to constrain the input so that results are predictable, rather than allowing everything:

$ touch TEST.txt
$ mkdir new_dir/
$ bash bad_input.sh "."
...
$ bash bad_input.sh "../"
../
all the files backwards

In the second script, better_input.sh, the input is sanitized by the following steps. One could additionally check whether the file being listed actually exists:

Remove any underscores (necessary).
Remove any sets of double periods (..).
Replace spaces with underscores.
Remove anything that is not alphanumeric, an underscore, or a period.
Then, run the ls command.

Running better_input.sh now only lets us view the current working directory or a file contained within it. Wildcards and slashes have been removed, so we can no longer traverse directories.
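The existence check that better_input.sh leaves as a comment could look something like the following. It is only a sketch under stated assumptions: safe_ls is a hypothetical helper name, and the policy of refusing empty names, "..", and anything containing a slash is one possible choice rather than the only correct one.

```shell
#!/bin/bash
# Hypothetical helper: list a name only if it exists directly inside the
# current directory; otherwise return non-zero without calling ls.
safe_ls() {
	local name=$1
	# Refuse empty names, the parent directory, and anything with a slash
	if [[ -z "$name" || "$name" == ".." || "$name" == */* ]]; then
		echo "Refusing to list '${name}'" >&2
		return 1
	fi
	# -e is true for regular files and directories alike
	if [[ ! -e "$name" ]]; then
		echo "'${name}' does not exist" >&2
		return 1
	fi
	# "--" ends option parsing, so a name such as "-l" is not read as a flag
	ls -- "$name"
}
```

In better_input.sh, the final ls line could then be replaced by a call such as safe_ls "${FILE_NAME_CLEAN}", so that a missing or suspicious name produces a clear message and a non-zero exit status instead of an ls error.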

To validate the form of an email, we use the grep command combined with a regex. We are merely looking for the form of an email account name, an @ symbol, and a domain name in the form of acme.x. It is important to note that we are not looking to see whether an email is truly valid or can make its way to the intended destination, but merely whether it fits what an email should look like. Additional tests such as testing the domain’s MX or DNS mail records could extend this functionality to improve the likelihood of a user entering a valid email.
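As an alternative to piping through grep, the same form check can be sketched with bash’s built-in regex matching. The names is_valid_email_form and email_domain are hypothetical; the second one only extracts the part after the @ symbol, which is what an MX or DNS lookup would need as input (the lookup itself is not performed here).

```shell
#!/bin/bash
# Hypothetical helper: succeed if $1 has the form name@domain.tld.
# "+" requires at least one character in each of the three parts.
is_valid_email_form() {
	local re='^[a-zA-Z0-9._]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]+$'
	[[ "$1" =~ $re ]]
}

# Hypothetical helper: print the domain part of a well-formed address,
# for example as input to a later MX record lookup.
email_domain() {
	is_valid_email_form "$1" && echo "${1##*@}"
}

for candidate in "ron.brash@somedomain.com" "ron.brashsomedomain.com"; do
	if is_valid_email_form "$candidate"; then
		echo "${candidate} is valid (domain: $(email_domain "$candidate"))"
	else
		echo "${candidate} is NOT valid"
	fi
done
```

Keeping the pattern in a variable avoids quoting surprises inside [[ ]], and using the built-in operator avoids starting an extra process for every address checked.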

In the next step, we test two email addresses: one with the @ symbol (valid) and one without it (invalid). Feel free to try several combinations.

Validating an IP address could also be done with a regex, but for the purpose of easy-to-use tools that get the job done, read and simple checks using test (and [[ ]] evaluations) work just fine. In its basic form, an IPv4 address consists of four octets (in layman’s terms, four values separated by periods). Without exploring what makes an address truly valid, each octet must be between 0 and 255 (never more and never less). IP addresses are also grouped into ranges called subnets.

In our examples, we know that an IP address containing alphabetic characters is not a valid IP address (excluding the periods), and that the values range between 0 and 255 per octet. 192.168.0.x (or 192.168.1.x) is an IP subnet many people see on their home routers.
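For comparison, the regex approach mentioned above might look like the following sketch. The function name is_valid_ipv4 is hypothetical, and the pattern makes one assumption the read-based script does not: octets with leading zeros (such as 01) are rejected.

```shell
#!/bin/bash
# Hypothetical helper: validate a dotted-quad IPv4 address with one regex.
is_valid_ipv4() {
	# One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero
	local octet='(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])'
	# Four octets separated by literal periods, anchored at both ends
	local re="^${octet}\.${octet}\.${octet}\.${octet}\$"
	[[ "$1" =~ $re ]]
}

for candidate in "a.a.a.a" "0.a.a.a" "255.255.255.255" "0.0.0.0" "192.168.0.10"; do
	if is_valid_ipv4 "$candidate"; then
		echo "${candidate} is valid"
	else
		echo "${candidate} is NOT valid"
	fi
done
```

The regex is more compact, but the range logic is harder to read than the explicit 0 and 255 comparisons, which is one reason to prefer read and test in a script others will maintain.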
