Functions in Bash

Overview

Teaching: 25 min
Exercises: 10 min
Questions
  • What is a function?

  • Why use a function?

  • Where do I write a function?

  • When should I use a function?

  • How do I write a function?

Objectives
  • Understand the use case for a function.

  • Write a simple function.

  • Write a function that accepts inputs.

  • Write a function that returns an output.

  • Refactor a piece of code to use functions.

What is a function and why use one?

This is a function:

function hello_world {
  echo Hello World from teh function 'hello_world'!
}

or with a different syntax,

hello_world () {
  echo Hello World from teh function 'hello_world'!
}

Both of these blocks define a function that, when called, prints a short greeting to the terminal.

Try it out now by copying one of the functions into the command line and pressing return to save it. Then run it by typing hello_world and pressing return.

Hello World from teh function hello_world!

Notice that the single quotes around hello_world do not appear in the output: the shell removes quotes before passing the text to echo.

You may notice that this is quite similar to running other commands we have been using in the terminal, like pwd or ls. Strictly speaking these are not shell functions (pwd is built into bash and ls is a separate program), but the idea is the same: they have been provided for us because they are useful tools that we are likely to want to use many times.
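If you are curious what kind of command a name refers to, bash’s type builtin will tell you. The sketch below is only an illustration; greet is a hypothetical function name that is not used elsewhere in this lesson.

```bash
#!/bin/bash

# 'greet' is a hypothetical function used only for this illustration
function greet {
  echo "Hello from greet"
}

# type -t prints a single word describing what bash will run for a name
type -t greet   # a function we defined ourselves
type -t pwd     # a command built into the shell
type -t ls      # usually an external program (or an alias in interactive shells)
```

The first two lines of output should be function and builtin; the third depends on how ls is set up on your machine.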

We use functions to provide utilities that we want to use quickly and repeatedly. The hello_world function above is actually a bad example in this regard: whilst we can now print its greeting somewhat more quickly, it is unlikely that this is something we will frequently want to do. On the other hand, pwd is an extremely useful tool that we will use frequently.

Where do I write a function and when should I use them?

In the above example we wrote and used a function directly on the command line. Whilst there is nothing wrong with this, it doesn’t give us any way to edit the function without completely rewriting it. For example, the function above contains a typo: we wrote teh instead of the. To fix this we need to type out the whole function again on the command line and then call it.

function hello_world {
  echo Hello World from the function 'hello_world'!
}
hello_world

Hello World from the function hello_world!

It’s therefore much more useful to define and use your functions inside a script where they can be edited and used quickly.

Using the following commands we will create a script to hold our functions, then use chmod to make it executable.

touch my_functions.sh
chmod +x my_functions.sh

You can then edit this script in whichever text editor you choose. For the sake of consistency, let’s use Nano:

nano my_functions.sh
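Functions saved in a file can also be loaded into your current terminal with source, which is handy when you want to try them out interactively. As a minimal sketch (the file name demo_functions.sh and the function say_hi are made up for this illustration, and the file is written to a temporary directory so that nothing of yours is overwritten):

```bash
#!/bin/bash

# Hypothetical file and function names, used only for this illustration
demo_dir=$(mktemp -d)
demo_file="$demo_dir/demo_functions.sh"

# Write a one-function script to the temporary file
cat > "$demo_file" << 'EOF'
function say_hi {
  echo "Hi from a sourced function"
}
EOF

# 'source' runs the file in the current shell, so the function stays defined
source "$demo_file"
say_hi

# tidy up the temporary directory
rm -r "$demo_dir"
```

Running the file as a script instead would define say_hi only for the lifetime of that script.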

Let’s start by using the following script that checks the time 100 times but only prints it if the seconds are a multiple of 3:

#!/bin/bash

# Initialise the loop counter
i=0
# Start a loop
while [ $i -lt 100 ]
do
   # get the time
   time_var=$(date "+%H:%M:%S")
   # split the time into hours, minutes, and seconds
   IFS=':' read -ra time_arr <<< "$time_var"
   # use the modulo operator to check if the number is divisible by 3
   # ('10#' makes bash read seconds such as 08 and 09 as base 10, not octal)
   if (( 10#${time_arr[2]} % 3 == 0 ));
   then
      # if the number of seconds is divisible by 3 then print the time to console
      echo $time_var
   fi
   # wait one second
   sleep 1
   ((i++))
done

This script has many lines, a decent amount of complexity, and is inflexible, so we may want to break it down. We can use functions to do this. Along the way we will learn how functions return data and how to instruct a function to behave differently.

How can I use the result of my function?

Let’s start by considering a function that reads the current time and returns the time array – the first four lines of the while loop.

function get_time_arr {
   # get the time
   time_var=$(date "+%H:%M:%S")
   # split the time into hours, minutes, and seconds
   IFS=':' read -ra time_arr <<< "$time_var"
   echo ${time_arr[@]}
}

If you have experience with other programming languages then you may be hesitant about the way this function looks. Most languages use some kind of return statement to end a function and hand back its result. In bash, however, return is only used to set the status of the function: 0 indicates success and anything else indicates failure. To check the status of the last command or function that ran, use the special variable $?. To get the result of this function we instead use ‘command substitution’: the final line of the function prints the result, and that printed output can be collected by the caller.
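To see the status and the printed output side by side, here is a small sketch. The function is_even is hypothetical and exists only for this illustration.

```bash
#!/bin/bash

# 'is_even' is a hypothetical function used only for this illustration
function is_even {
  # echo sends text to standard output, where $( ) can capture it
  echo "checked $1"
  # return sets the exit status only: 0 for success, non-zero for failure
  if (( $1 % 2 == 0 )); then
    return 0
  else
    return 1
  fi
}

message=$( is_even 4 )
echo "status: $?, output: $message"

message=$( is_even 7 )
echo "status: $?, output: $message"
```

This prints status: 0, output: checked 4 followed by status: 1, output: checked 7, showing that the status and the output travel back to the caller by two separate routes.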

Let’s take an aside to look at this behaviour.

Copy and paste the above function into the terminal, then run:

get_time_arr

echo $?

The function prints the time, and echo $? then prints the function’s status, which should be 0. But this isn’t particularly useful: we want to be able to collect and use the output. To do this we capture the output of the function in a variable.

time_arr=$( get_time_arr )
echo $?
echo $time_arr

Here we have assigned the output to the variable time_arr, then printed it at our convenience. We will look at other ways of returning variables later.
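One detail worth noticing: command substitution always hands back plain text, so the captured value is a single string rather than an array. A minimal sketch, reusing get_time_arr from above (the names as_string and as_array are made up for this illustration):

```bash
#!/bin/bash

function get_time_arr {
   # get the time
   time_var=$(date "+%H:%M:%S")
   # split the time into hours, minutes, and seconds
   IFS=':' read -ra time_arr <<< "$time_var"
   echo "${time_arr[@]}"
}

# Plain command substitution gives one string such as "14 05 09"
as_string=$( get_time_arr )
echo "string: $as_string (elements: ${#as_string[@]})"

# Wrapping the substitution in ( ) splits the text on spaces into an array
as_array=( $( get_time_arr ) )
echo "seconds: ${as_array[2]} (elements: ${#as_array[@]})"
```

The first line reports one element and the second reports three, so the second form is the one to use if you want to index into the result.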

For now let’s get back to our script and replace the first few lines of the loop with the function we just wrote.

#!/bin/bash

# Define a function to get a time and split it into an array
function get_time_arr {
   # get the time
   time_var=$(date "+%H:%M:%S")
   # split the time into hours, minutes, and seconds
   IFS=':' read -ra time_arr <<< "$time_var"
   # return the array by printing all of its elements
   echo ${time_arr[@]}
}

# Initialise the loop counter
i=0
# Start a loop
while [ $i -lt 100 ]
do
   # Run the function to get the current time and assign the result to time_arr
   # (the outer brackets split the captured text back into an array)
   time_arr=( $( get_time_arr ) )

   # use the modulo operator to check if the number is divisible by 3
   # ('10#' makes bash read seconds such as 08 and 09 as base 10, not octal)
   if (( 10#${time_arr[2]} % 3 == 0 ));
   then
      # if the number of seconds is divisible by 3 then print the time to console
      echo ${time_arr[@]}
   fi
   # wait one second
   sleep 1
   ((i++))
done

Unfortunately, whilst before we had access to both time_var and time_arr, we now only have time_arr: using ‘command substitution’ we can only return a single value, and anything else the function sets is lost. So now we will upgrade get_time_arr so that it can assign multiple values.
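The reason the other variable goes missing is that a function called through command substitution runs in a subshell, a separate copy of the shell whose variable changes never reach the caller. The sketch below illustrates this with hypothetical names (set_and_print, marker and captured are not part of the lesson’s script).

```bash
#!/bin/bash

# 'set_and_print' and 'marker' are hypothetical names for this illustration
marker='unchanged'

function set_and_print {
   marker='changed inside the function'
   echo 'some output'
}

# Called through $( ), the function runs in a subshell:
# its output is captured but its assignment to marker is lost
captured=$( set_and_print )
echo "after command substitution: $marker"

# Called directly, the function runs in the current shell
set_and_print > /dev/null
echo "after a direct call: $marker"
```

The first echo still reports unchanged, whereas the second reports changed inside the function.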

#!/bin/bash

# Define a function to get a time and split it into an array
function get_time_vars {
   # get the time
   time_var=$(date "+%H:%M:%S")
   # split the time into hours, minutes, and seconds
   IFS=':' read -ra time_arr <<< "$time_var"
}

To use the function above we first create the two variables that will store the results (their initial values are only placeholders), then call the function:

time_var='time_var'
time_arr='time_arr'
get_time_vars
# check the return status of the function
echo $?
# print the variables to the console
echo $time_var
echo ${time_arr[*]}

We can use this updated function to restructure our script like so:

#!/bin/bash

function get_time_vars {
   # get the time
   time_var=$(date "+%H:%M:%S")
   # split the time into hours, minutes, and seconds
   IFS=':' read -ra time_arr <<< "$time_var"
}

# Initialise some variables
i=0
time_var='time_var'
time_arr='time_arr'
# Start a loop
while [[ $i -lt 100 ]]; do
   # Run the function to get the current time as an array
   get_time_vars
   # use the modulo operator to check if the number is divisible by 3 n.b. we are using '10#' to let bash know seconds such as '08' and '09' are in base 10 (a leading zero normally means octal)
   seconds=10#${time_arr[2]}
   floor_val=$(( seconds % 3 ))
   # if the number of seconds is divisible by 3 then print the time to console
   if [ $floor_val -eq 0 ]; then
      echo $time_var
   fi
   # wait one second
   sleep 1
   (( i++ ))
done

We should note here that we don’t return the variables from the function; we let the function edit global variables. This makes the function less useful than it could be, as we have to know the names of the variables beforehand, but it is a simple way to hand back more than one value (where one of the values isn’t a status code). The above function is also dangerous: it can change variables without the user being aware. The underlying concept is called variable scope, and it’s worth investigating in more detail.

Let’s consider a simple example. Look at the following snippets and predict the outcome of each before you run it:

Predict the outcome

Firstly we assign and read out two variables.

var_a='I am global var_a'
var_b='I am global var_b'

echo $var_a
echo $var_b

I am global var_a
I am global var_b

Now we introduce a function, like the one above, that uses these variable names. What should the output be now?

function i_break_things {
   var_a='I am var_a inside a function'
   var_b='I am var_b inside a function'
   echo $var_a
   echo $var_b
}

var_a='I am global var_a'
var_b='I am global var_b'

echo $var_a
echo $var_b

i_break_things

echo $var_a
echo $var_b

Solution

I am global var_a
I am global var_b
I am var_a inside a function
I am var_b inside a function
I am var_a inside a function
I am var_b inside a function

The function has updated both variables. In this example, and in the function we created above, that seems intentional. But what if your code contained hundreds of lines, or the function was loaded from another file? It would be very hard to predict what any given output should be!

This time we add the local keyword to var_b. Have a guess how this might change the result…

function i_break_fewer_things {
   var_a='I am var_a inside a function'
   local var_b='I am local var_b'
   echo $var_a
   echo $var_b
}

var_a='I am global var_a'
var_b='I am global var_b'

echo $var_a
echo $var_b

i_break_fewer_things

echo $var_a
echo $var_b

Solution

I am global var_a
I am global var_b
I am var_a inside a function
I am local var_b
I am var_a inside a function
I am global var_b

The inclusion of the local keyword has changed the scope of the variable var_b: it is now local to the function, and the variable of the same name outside the function is left unchanged. The words ‘global’ and ‘local’ are the names of the scopes. The ‘global’ scope is accessible from anywhere in the script, including inside functions, whereas a ‘local’ variable exists only while the function that declared it is running.
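A subtlety worth knowing, though not needed for the rest of this lesson: in bash a local variable is also visible to any function called from the function that declared it. The sketch below uses hypothetical names (outer, inner and note) purely to illustrate this.

```bash
#!/bin/bash

# 'outer', 'inner' and 'note' are hypothetical names for this illustration
function inner {
   # inner has no variable of its own called note, so it sees outer's local one
   echo "inner sees: $note"
}

function outer {
   local note='local to outer'
   inner
}

note='global note'
outer
# back outside the functions, the global value is untouched
echo "global is still: $note"
```

This prints inner sees: local to outer and then global is still: global note.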

How can I tell my function what to do?

Let’s get back to our script. It currently looks like this:

#!/bin/bash

function get_time_vars {
   # get the time
   time_var=$(date "+%H:%M:%S")
   # split the time into hours, minutes, and seconds
   IFS=':' read -ra time_arr <<< "$time_var"
}

# Initialise some variables
i=0
time_var='time_var'
time_arr='time_arr'
# Start a loop
while [[ $i -lt 100 ]]; do
   # Run the function to get the current time as an array
   get_time_vars
   # use the modulo operator to check if the number is divisible by 3 n.b. we are using '10#' to let bash know seconds such as '08' and '09' are in base 10 (a leading zero normally means octal)
   seconds=10#${time_arr[2]}
   floor_val=$(( seconds % 3 ))
   # if the number of seconds is divisible by 3 then print the time to console
   if [ $floor_val -eq 0 ]; then
      echo $time_var
   fi
   # wait one second
   sleep 1
   (( i++ ))
done

Try modifying the function

We may want to change the divisor so that it prints on multiples of five. How would you make this change?

Solution

We need to change the line:

floor_val=$(( seconds % 3 ))

to this:

floor_val=$(( seconds % 5 ))

Now we want to print on multiples of seven. How would you make this change?

Solution

We need to change the line:

floor_val=$(( seconds % 5 ))

to this:

floor_val=$(( seconds % 7 ))

We can see that, working this way, changing something simple requires a user to edit a line in the middle of the code. Let’s move the modulo operation into a function and pass it the divisor as an argument instead.

#!/bin/bash

function get_time_vars {
   # get the time
   time_var=$(date "+%H:%M:%S")
   # split the time into hours, minutes, and seconds
   IFS=':' read -ra time_arr <<< "$time_var"
}

function check_multiple {
   #  we use '10#' to let bash know seconds '01, 02, 03...' are in base 10
   local seconds=10#${time_arr[2]}
   # use the modulo operator to get the remainder when seconds is divided by the first argument
   local floor_val=$(( seconds % $1 ))
   echo $floor_val
}

# Initialise some variables
divisor=5
i=0
time_var='time_var'
time_arr='time_arr'
# Start a loop
while [[ $i -lt 100 ]]; do
   # Run the function to get the current time as an array
   get_time_vars
   # check if the number is a multiple of divisor
   do_i_print=$( check_multiple $divisor )

   # if the number was a multiple then print the time to console
   if [ $do_i_print -eq 0 ]; then
      echo $time_var
   fi

   # wait one second
   sleep 1
   (( i++ ))
done

Check your understanding

The changes made to the script have a few advantages. They allow a value to be set once and passed to the new function check_multiple. How is this different from our scoped variables, and why might it be better in this instance?

Solution

The variable divisor is passed to the function and accessed inside it as $1. Because the function doesn’t read the divisor from a variable in the global scope, we could define two different divisors and check each of them without changing anything inside the function. More on this later…
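Arguments are numbered in the order they are given: $1 is the first, $2 the second, and so on, whilst $# holds how many were passed and "$@" expands to all of them. A small sketch, using a hypothetical function show_args:

```bash
#!/bin/bash

# 'show_args' is a hypothetical function used only for this illustration
function show_args {
   local arg
   echo "number of arguments: $#"
   echo "first: $1"
   echo "second: $2"
   # "$@" expands to every argument, each kept as a separate word
   for arg in "$@"; do
      echo "argument: $arg"
   done
}

show_args 3 5
```

Inside the function these names refer to the function’s own arguments, not to the arguments given to the script.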

But how has the code been made more readable?

Solution

Converting seconds to a base 10 number was only necessary because we wanted to use the modulo operator. We can therefore move it into the function and give it local scope. This de-clutters the while loop, making it appear simpler and more closely follow the steps we described earlier.
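As a reminder of why the base 10 prefix matters, here is a short sketch with a fixed value in place of the real clock (seconds_text and remainder are names made up for this illustration):

```bash
#!/bin/bash

# Seconds as produced by date "+%S" keep their leading zero
seconds_text='09'

# $(( 09 % 3 )) would fail with an error, because a leading zero marks an
# octal number and 9 is not a valid octal digit.
# Prefixing the value with 10# tells bash to read the digits as base 10.
remainder=$(( 10#$seconds_text % 3 ))
echo "remainder: $remainder"
```

This prints remainder: 0. Only 08 and 09 trigger the error, which is why a script without the prefix can appear to work for most of each minute.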

What can I do with a function now?

We’ve shown how a function is defined, where and why functions are written, how to use the result of a function, the effect of variable scope, and how to pass variables to a function. There is a common coding challenge called fizzbuzz, where numbers are looped over. If the number is a multiple of 3 “fizz” is printed, if it is a multiple of 5 “buzz” is printed, and if it is a multiple of both “fizzbuzz” is printed. The pseudocode logic for this is like this:

  • Loop over the numbers from 1 to the stopping value
    • Multiple of both 3 and 5?
      • print fizzbuzz
    • Otherwise, multiple of 3?
      • print fizz
    • Otherwise, multiple of 5?
      • print buzz
    • Otherwise
      • print the number

We will make one alteration: the user calling the function fizzbuzz should be able to specify the two divisors and the number at which to stop.

If you want a stretch challenge, try to write this from scratch now, using functions where possible to simplify the script. Otherwise, just reveal the answer and move on to the next question.

Fizzbuzz

function check_multiple {
   # use the modulo operator to get the remainder when the first argument is divided by the second
   local floor_val=$(( $1 % $2 ))
   echo $floor_val
}

function fizzbuzz {
# this function takes three arguments in the following order: divisor a, divisor b, stop
local i=1
# loop while i is less than or equal to the 3rd argument
while [[ $i -le $3 ]]; do
  # check if i is a multiple of the first and second argument
  a=$( check_multiple $i $1 )
  b=$( check_multiple $i $2 )

  # add a and b (if both are zero then the sum is zero and that's a fizzbuzz)
  a_and_b=$(( a + b ))

  # Check the results, being careful to use else correctly so as not to print the number if it is a multiple.
  if [ $a_and_b -eq 0 ]; then
      echo fizzbuzz
  else
    if [ $a -eq 0 ]; then
      echo fizz
    elif [ $b -eq 0 ]; then
      echo buzz
    else
      echo $i
    fi
  fi
  # increment i
  (( i++ ))
done
}

# run classic fizzbuzz to 100
fizzbuzz 3 5 100

We’ve created the function fizzbuzz, which takes three arguments that define the two numbers to check for multiples of and the number to stop at. The function check_multiple uses command substitution to return the remainder for the current loop value. By letting this function take arguments we avoid writing a separate function for each divisor we want to check.
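Because the caller now chooses the arguments, it can be worth checking that the right number were supplied before doing any work. The sketch below is one possible way to do this, not part of the solution above; check_three_args is a hypothetical helper.

```bash
#!/bin/bash

# 'check_three_args' is a hypothetical helper used only for this illustration
function check_three_args {
   # $# holds the number of arguments the function was called with
   if [[ $# -ne 3 ]]; then
      # send the message to standard error so it is not captured as output
      echo "expected 3 arguments (divisor a, divisor b, stop) but got $#" >&2
      return 1
   fi
   return 0
}

if check_three_args 3 5 100; then
   echo "arguments look fine"
fi

if ! check_three_args 3 5; then
   echo "too few arguments"
fi
```

Here the return status does the job it is designed for: it reports success or failure, and the if statement acts on it directly.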

Now you have seen the way functions can be used in bash, try these exercises to apply what you have learnt.

Exercises

Thinking about the script that prints the time when the seconds are a multiple of three and the fizzbuzz example above, write some pseudocode for a function that checks the time 100 times (not more than once a second). Each time, it should print the time unless the seconds are a multiple of one or both of the two input values, in which case it should print fizz, buzz, or (for both) fizzbuzz.

Solution

  • Loop over the values 1 to 100
    • Get the time, extract the seconds
    • Check if the seconds are a multiple of either of the inputs (call them a and b)
    • Multiple of both?
      • true
        • print fizzbuzz
      • false
        • multiple of a?
          • true
            • print fizz
          • false
            • multiple of b?
              • true
                • print buzz
              • false
                • print time
    • Wait one second

Turn your pseudocode into an actual function that uses the time to play fizzbuzz.

Solution

#!/bin/bash

# Global variables for function returns
time_var='time_var'
time_arr='time_arr'

function get_time_vars {
 # get the time
 time_var=$(date "+%H:%M:%S")
 # split the time into hours, minutes, and seconds
 IFS=':' read -ra time_arr <<< "$time_var"
}

function check_multiple {
 #  we use '10#' to let bash know seconds '01, 02, 03...' are in base 10
 local seconds=10#${time_arr[2]}
 # use the modulo operator to get the remainder when seconds is divided by the first argument
 local floor_val=$(( seconds % $1 ))
 echo $floor_val
}

function date_fizzbuzz {
 # this function takes two arguments in the following order: divisor a, divisor b
 local i=0
 while [[ $i -lt 100 ]]; do
   # Run the function to get the current time as an array
   get_time_vars

   # check if the number is a multiple of divisor
   res_a=$( check_multiple $1 )
   res_b=$( check_multiple $2 )
   # add to combine both: a remainder of zero means "is a multiple", so if the sum is 0 then both were 0
   res_ab=$(( $res_a + $res_b ))


   if [ $res_ab -eq 0 ]; then
     echo fizzbuzz
   else
     if [ $res_a -eq 0 ]; then
       echo fizz
     elif [ $res_b -eq 0 ]; then
       echo buzz
     else
       echo $time_var
     fi
   fi

   # wait one second
   sleep 1
   (( i++ ))
 done
}

# Run the function with classic fizzbuzz multiples
date_fizzbuzz 3 5

That’s all we have to say about bash functions for now. This isn’t an extensive tutorial, but it should give you a functional understanding of how they work. If you want more practice working with functions, try modifying date_fizzbuzz to do other, more interesting things.

Key Points

  • Functions reduce code duplication.

  • Functions make testing the behaviour of code easier by isolating tasks.

  • Functions make abstracting tasks easier.
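
To illustrate the point about testing, a small function such as check_multiple can be checked on its own with known inputs. The sketch below is one possible approach; expect_equal is a hypothetical helper written for this illustration, not a bash feature.

```bash
#!/bin/bash

function check_multiple {
   # print the remainder left when the first argument is divided by the second
   local floor_val=$(( $1 % $2 ))
   echo $floor_val
}

# 'expect_equal' is a hypothetical helper used only for this illustration
function expect_equal {
   # arguments: description, expected value, actual value
   if [[ "$2" == "$3" ]]; then
      echo "ok: $1"
   else
      echo "FAILED: $1 (expected $2, got $3)"
   fi
}

expect_equal "15 is a multiple of 5" 0 "$( check_multiple 15 5 )"
expect_equal "7 divided by 3 leaves 1" 1 "$( check_multiple 7 3 )"
```

Both checks should report ok. If check_multiple is changed later, rerunning the checks is a quick way to confirm that it still behaves as expected.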