Tuesday, May 30, 2006

Testing for correct usage in shell functions

Here's a simple touch you can apply to your shell scripts to aid debugging once they grow monstrous and you can't remember the syntax of all your subroutines any better than you can remember the 10th digit of Pi (which happens to be 3, counting the leading 3, for those who care about such things).

Although not strictly required to take advantage of this tweak, I recommend you begin by using good headers for each subroutine. I won't go into each one, but a specific entry I always make is usage. For example, if a subroutine do_foo takes arguments arg_one and arg_two, the header would look like this:

# ------
# do_foo
# ------
# USE: do_foo ARG_ONE ARG_TWO
# DESC: Execute foo functionality
# PRE: na
# POST: na
# ERR: na
do_foo () {
    ...
    ...
} #end do_foo


The line I want you to pay attention to in the above code begins with "USE:" (4th line). This line specifies the interface that a user of your code should be aware of. You are telling them that this code expects TWO arguments. Now, you can get fancy and use EBNF-like syntax to identify optional arguments, but let's keep it simple for this example and just recognize that we have established an interface.

What can we do as developers to make sure that when someone calls our code, they do not get something unexpected? We can check to make sure they follow our instructions. It's simple enough, although you can certainly take it to greater depths. Let's go back to our do_foo example and put a check in place...

do_foo () {
    test $# -eq 2 || exit 1
    ...
    ...
} #end do_foo


Let's break down the line I just added. test lives in /usr/bin (most shells also provide it as a builtin) and should be a fluent part of your shell vocabulary. We are "testing" to see if the number of arguments ($#) is equal to the integer 2. If not (symbolized by ||), then we exit with non-zero status, which is the UNIX convention for something other than success. Keep in mind that exit inside a function ends the whole script (or subshell), not just the function. The next level of effort would include writing a shell equivalent to Perl's die subroutine, which would allow an error message to accompany the exit. We'll save that for another article.
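
If you want to watch the check fire without killing your terminal session, here is a minimal sketch. The echo in the body is a placeholder of my own (the real body is elided above), and the status variable names are just for this demo. Each trial call runs in a subshell so that the exit only ends that subshell.

```shell
#!/bin/sh
# Sketch only: the echo stands in for whatever do_foo really does.
do_foo () {
    test $# -eq 2 || exit 1
    echo "do_foo called with: $1 $2"
} #end do_foo

# The parentheses start a subshell, so a failed check ends
# only the subshell and this demo keeps running.
( do_foo one two )
good_status=$?

( do_foo one )
bad_status=$?

echo "two args -> status $good_status"
echo "one arg  -> status $bad_status"
```

The first call prints its message and returns 0; the second never reaches the body and returns 1.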

So, what's the benefit of adding this code-bloat to our subroutine? It's common to have a function that uses optional arguments and acts differently depending on what arguments it receives. If the function expects ARG_ONE and ARG_TWO, and you call it with only ARG_ONE, it may assume that ARG_TWO is equal to "". In that case, the output may be "object not found" rather than "Whoa! You made a mistake calling me!". If you were depending on a specific output, this could cause later code blocks to break.
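
To make that concrete, here is a small sketch with two made-up functions, lookup_unchecked and lookup_checked. Both names and the toy "lookup" logic are assumptions for illustration only. The unchecked version treats a missing second argument as "" and answers as if nothing were wrong.

```shell
#!/bin/sh
# Hypothetical functions for illustration; neither comes from a real tool.

# USE: lookup_unchecked TABLE NAME
lookup_unchecked () {
    # A missing $2 expands to "", so a calling mistake
    # looks exactly like a failed lookup.
    if [ "$2" = "root" ]; then
        echo "found $2 in $1"
    else
        echo "object not found"
    fi
} #end lookup_unchecked

# USE: lookup_checked TABLE NAME
lookup_checked () {
    test $# -eq 2 || exit 1
    lookup_unchecked "$1" "$2"
} #end lookup_checked

# Both calls forget NAME. Command substitution runs each call
# in a subshell, so the exit does not end this script.
unchecked_out=$( lookup_unchecked passwd )
checked_out=$( lookup_checked passwd )
checked_status=$?

echo "unchecked said: $unchecked_out"
echo "checked exited with status $checked_status"
```

The unchecked call hands back a plausible-looking answer, while the checked call produces no output and a non-zero status that the caller can act on.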

Here's a more specific example. If we are using the ldaplist command to check on project information, we will get two totally different sets of output if we omit a second argument. Pay particular attention to the command and arguments in the examples below:

testbox# ldaplist project
dn: solarisprojectname=srs,ou=projects,dc=mydomain,dc=com
dn: solarisprojectname=bar,ou=projects,dc=mydomain,dc=com
dn: solarisprojectname=foo,ou=projects,dc=mydomain,dc=com
dn: solarisprojectname=group.staff,ou=projects,dc=mydomain,dc=com
dn: solarisprojectname=default,ou=projects,dc=mydomain,dc=com
dn: solarisprojectname=noproject,ou=projects,dc=mydomain,dc=com
dn: solarisprojectname=user.root,ou=projects,dc=mydomain,dc=com


In contrast, what we REALLY wanted was only one line that matches our criteria, not the whole set of data.

testbox# ldaplist project solarisprojectname=user.root
dn: solarisprojectname=user.root,ou=projects,dc=mydomain,dc=com


If we use an argument checker, the error would be caught immediately rather than passing on a long list of irrelevant data to whatever we do next. In this case it's particularly ugly because both outputs are identically formatted. Maybe you'd find the problem quickly, maybe you wouldn't.
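
Here is roughly how that might look as a wrapper around the ldaplist call above. The function name project_dn is my own invention for this sketch, and the body assumes the same ldaplist invocation shown in the example output.

```shell
#!/bin/sh
# Hypothetical wrapper; project_dn is an assumed name, not a real command.
# --------
# project_dn
# --------
# USE: project_dn PROJECT_NAME
# DESC: Print the dn line for a single project
project_dn () {
    test $# -eq 1 || exit 1
    ldaplist project "solarisprojectname=$1"
} #end project_dn
```

Calling project_dn user.root runs the one-line query, while calling it with no arguments exits non-zero before ldaplist ever runs, so the full project list never reaches the next step. The check only counts arguments, so an empty string passed as the name would still get through.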

When your code gets to be hundreds of lines long and you need to start debugging obscure behavior, writing self-policing code can save you a lot of time. Chances are that if you make a simple mistake calling that subroutine, it will fail immediately rather than doing the wrong thing in a hard-to-find way. A line of prevention is worth an hour of debugging!
