Trying to keep a readable coding style in Bash

One of my problems when I first started coding in bash was that my scripts tended to be poorly organized. I assumed that, without the structures found in more sophisticated languages, the code would be hard to organize and would inevitably become a mess. In reality I did not know how to use the simple language features well, and putting them to good use makes a huge difference. I actually got better at writing bash scripts after doing more work with functions and apply in R, because that made the many ways to use functional constructs more obvious.

Take advantage of functions and imports for organization

Code can be broken up readily by using functions. The arguments aren’t named, but this pushes design toward simple, single-purpose functions, which is a good way to do it anyway.

Example of a single input, single output function

This function takes a device designation (like sda1, sdb2) and returns the path it is mounted on, or nothing if it was not found.

diskPath () {
  local path=$(grep --extended-regexp --regexp="($1.*ext4)" /proc/mounts | awk '{ printf "%s",$2 }')
  printf "%s" "$path"
}
# using it: 
Path=$(diskPath sda1)

Note the use of local

I try to declare variables as local whenever they are only used within a function, to avoid unnecessary global variables.
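A minimal sketch of the difference local makes. The function names (leaky, contained) and the variable counter are made up for illustration.

```bash
counter="outer"

# Without local, the assignment overwrites the caller's variable.
leaky () {
  counter="changed by leaky"
}

# With local, the assignment stays inside the function.
contained () {
  local counter="changed by contained"
}

contained
after_contained="$counter"   # the outer value is untouched
leaky
after_leaky="$counter"       # the outer value has been replaced

printf '%s\n' "$after_contained" "$after_leaky"
```

The first line printed is the original value and the second is the one set inside leaky, which is the kind of side effect that gets hard to track in a longer script.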

Using while and shift allows a multi argument function

I use while and shift together as a loop over the arguments. The function can take an arbitrary number of input arguments because the processing for each is the same. It can be made more complex, but I use this scaffolding when I need a command that accepts multiple arguments.

transform_toinclude () {
  local includebuild
  while (( "$#" )); do
    includebuild+="--include *$1 "
    shift
  done
  printf "%s" "$includebuild"
}
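A short usage sketch for the function above. The definition is repeated so the block runs on its own, and the suffixes .md and .txt are arbitrary sample inputs.

```bash
# Same function as above, repeated so this sketch is self-contained.
transform_toinclude () {
  local includebuild
  while (( "$#" )); do
    includebuild+="--include *$1 "
    shift
  done
  printf "%s" "$includebuild"
}

# Each argument becomes one "--include *<arg> " chunk, in order.
includes="$(transform_toinclude .md .txt)"
printf '[%s]\n' "$includes"

# With no arguments the loop body never runs and the result is empty.
none="$(transform_toinclude)"
printf '[%s]\n' "$none"
```

The brackets in the printf calls only make the trailing space visible. Keep in mind that if the result is later expanded unquoted on a command line, the shell may glob-expand the * patterns against the current directory.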

Make function libraries

I do this to keep similar functions together, and then import them when I need that functionality. This keeps everything organized. An example set of functions, saved as /usr/local/lib/diskfcns.sh:

# Get the mount path of a disk 
diskPath () {
  local path=$(grep --extended-regexp --regexp="($1.*ext4)" /proc/mounts | awk '{ printf "%s",$2 }')
  printf "%s" "$path"
}

# determine if a disk exists, return true/false. 
diskExists () {
  local retVal=false
  local diskStats="$(diskPath "$1")"
  if [[ -n "$diskStats" ]]; then retVal=true; fi
  printf "%s" "$retVal";
}

# disk size using df
diskSize () {
  local diskSizeMB=false
  # diskExists prints "true" or "false", so compare the value instead of testing for empty.
  if [[ "$(diskExists "$1")" == true ]]; then
    diskSizeMB=$(df -HBM "$(diskPath "$1")" |  awk 'NR==2 {printf "%s", gensub(/[A-Z]+/,"","g",$2)}')
  fi
  printf "%s" "$diskSizeMB";
}

Then import them with source or .

To pull in the functions above, source the file (or use ’.’), then call them. Sourcing runs the file in the current shell, so the function definitions become available there.

# These import the file, either is fine
source /usr/local/lib/diskfcns.sh
. /usr/local/lib/diskfcns.sh 

diskSize "sda1"
diskExists "sdb1"
diskPath "sdb1"
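To try sourcing without touching /usr/local/lib, here is a small self-contained sketch. The temporary library file and the demo_greet function are hypothetical stand-ins, not part of the disk functions above.

```bash
# Write a tiny library to a temporary file (a stand-in for a real library path).
libfile="$(mktemp)"
cat > "$libfile" <<'EOF'
# hypothetical helper, for illustration only
demo_greet () {
  printf "hello %s" "$1"
}
EOF

# Before sourcing, the function is not known to this shell.
if command -v demo_greet >/dev/null 2>&1; then
  before="defined"
else
  before="undefined"
fi

# Sourcing runs the file in the current shell, so the function is defined here.
. "$libfile"
greeting="$(demo_greet world)"

printf '%s\n' "$before" "$greeting"
rm -f "$libfile"
```

The library file can be removed afterwards because the definition already lives in the running shell.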

Don’t try and make Bash constructs something they aren’t

Trying to make bash more like other languages usually turns out poorly. One example is passing variables by reference. I can’t recall what use case I was trying to solve; I probably wanted named variables for functions, as in other languages.

var1="dont"
var2="do it"
poorIdea () {
  input1=${!1}
  input2=${!2}
  printf "Variables by reference: %s %s" "$input1" "$input2"; 
}
poorIdea "var1" "var2"

This is not specifically a knock on vars by reference, which can have uses. It is a warning that bash has simple constructs, and trying to build complex ones usually makes a complicated, hard to follow mess. It can be hard enough to read well written bash code.

Loops aren’t bad, but most of the time you don’t need them

Bash has a lot of ways to conduct operations that seem to require loop behavior without loops. These ways are almost always cleaner and less error prone. Often when I am thinking about using a loop it is because I forgot about using glob patterns and passing many arguments to a command.

Commands can handle many inputs without loops

This is basic, but most commands accept a glob or a list of inputs. This eliminates many loop cases right there: you pass all the arguments directly to the command or function and it handles them.

cp file1 file2 file3 ... file99 /dest
# The glob is left unquoted so the shell expands it to one argument per match
printf "No loop needed: %s\n" pattern*

I find when working in other environments/languages I sometimes forget that this is how things are designed to work in bash.
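A runnable sketch of the same idea, using a scratch directory so nothing real is touched. The file names (report1.txt, report2.txt, notes.txt) are invented for the example.

```bash
workdir="$(mktemp -d)"
touch "$workdir/report1.txt" "$workdir/report2.txt" "$workdir/notes.txt"
mkdir "$workdir/dest"

# The shell expands report* into two arguments, and printf reuses its
# format string once per argument, so no loop is needed.
listing="$(cd "$workdir" && printf 'found: %s\n' report*)"
printf '%s\n' "$listing"

# The same expansion hands every match to a single cp call.
cp "$workdir"/report* "$workdir/dest/"
copied="$(ls "$workdir/dest")"
printf '%s\n' "$copied"

rm -rf "$workdir"
```

Only the two report files are listed and copied; notes.txt does not match the pattern and is left alone.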

Using pipes to pass output from one command as input to the next

Pipes take the output from one command and pass it to another. This is also simple, but again it is something where you have to be thinking in bash mode. You can perform complex operations with a set of pipes, removing the need to store, loop over, or process the data in between steps.

grep -R 'pattern' . | sort | uniq | tail -n 10
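Here is a small pipeline that can be run anywhere, since it generates its own input. The function name most_common and the sample words are assumptions for illustration. Note that sort runs before uniq, because uniq only collapses duplicate lines that are adjacent.

```bash
# Read lines on stdin and print the line that occurs most often.
most_common () {
  sort |        # put identical lines next to each other
    uniq -c |   # collapse each run into a count followed by the line
    sort -rn |  # highest count first
    head -n 1 | # keep only the top entry
    awk '{ print $2 }'
}

top_word="$(printf '%s\n' error warn error info error warn | most_common)"
printf '%s\n' "$top_word"
```

No intermediate variables or files are needed; each stage only sees the output of the one before it.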

Using xargs to use output as arguments

This command lets output be treated as arguments, which I find very useful. This is different from pipes, where output is used as input. My best illustration of this is find piped to grep versus find piped through xargs to grep.

# The markdown file names themselves are searched by grep
find . -type f -iname "*md" | grep -i 'pattern' 
# using xargs the file names are passed as arguments to grep,
# not input, and it will search inside the files.  
find . -type f -iname "*md" -print0 | xargs -0 grep -in 'pattern' 
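The difference is easy to see with two small files in a scratch directory. The file names and contents below are made up so that one file matches by name and the other by content.

```bash
workdir="$(mktemp -d)"
printf 'nothing here\n' > "$workdir/pattern-notes.md"
printf 'the pattern is inside\n' > "$workdir/other.md"

# Piped: grep reads the list of names on stdin, so it matches the file NAME.
by_name="$(cd "$workdir" && find . -type f -name '*.md' | grep 'pattern')"

# xargs: the names become arguments, so grep opens each file and matches CONTENTS.
by_content="$(cd "$workdir" && find . -type f -name '*.md' -print0 | xargs -0 grep -l 'pattern')"

printf 'matched by name:    %s\n' "$by_name"
printf 'matched by content: %s\n' "$by_content"

rm -rf "$workdir"
```

The first search reports pattern-notes.md and the second reports other.md, even though both start from the same find output.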

With these techniques, data flows from input to final output

Using these facets of Bash scripting, I string together commands to perform operations and usually don’t need intermediate steps to handle the data. I often go straight from an input file or search to final output.

Closing thoughts

This short writeup is just a quick overview of how I would reorient my less experienced self to writing shell scripts. The overall message I would give is that you are looking for ways to create input/output flows that chain commands or functions together. If you are creating complex data and variable structures, then you need to re-evaluate how you are doing it, or consider that shell scripts might not be the right tool for the task you are performing.