Tuesday, July 11, 2006

Don't Shed Your Shell

I've said it before, and I'll say it again: switching interpreters in mid-code is a practice to avoid whenever possible. There are times when it can't be avoided, but there are many more when you can sacrifice a bit of elegance for simpler maintenance.

As with most bugs, I was recently bitten by a dumb mistake. I needed the ability to look up Solaris Resource Manager (SRM) project information using tags embedded in the description field. For example, SID=TESTDB is how I would specify an Oracle database SID. I wrote a Korn shell function called getprojbyattrib() which accomplished this very thing. Tested on its own, it worked wonderfully. When I went to integrate it with the existing Oracle start-up scripts, I ran into some problems. They turned out to be easy to debug, but the root cause was my old enemy: incompatible interpreters.
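
To make the idea of a tag lookup concrete, here is a minimal sketch. It is not the original getprojbyattrib(). It assumes the standard colon-delimited /etc/project layout, where the third field is the comment (description), and it assumes tags inside that field are separated by spaces. The function name find_projects_by_tag and the optional file argument are made up for the illustration.

```shell
# Hypothetical sketch: list project names whose description field
# (third colon-delimited field) contains the tag "ATTR=VALUE".
# Usage: find_projects_by_tag TAG [PROJECT_FILE]
find_projects_by_tag() {
    fp_tag="$1"
    fp_file="${2:-/etc/project}"
    # Pad both the description and the tag with spaces so that
    # SID=TESTDB does not also match SID=TESTDB2.
    awk -F: 'index(" " $3 " ", tag) { print $1 }' tag=" $fp_tag " "$fp_file"
}
```

Calling find_projects_by_tag SID=TESTDB would print one matching project name per line, which is the kind of list a counting loop can then walk through.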

This new shell library function is used to figure out whether or not an SRM project is configured for a given Oracle database. If one and only one match is returned, the database is started in a project container. Any other result means that the database is started without SRM. To help with this, I embedded a counter in the function to return how many matches were found. The code in question was simple:

# Keep track of the number of projects we find while outputting
# them so the final tally can be used as a success indicator.
PRJCOUNT=0
for PRJ in $PRJLIST
do
    echo "$PRJ"
    PRJCOUNT=$(($PRJCOUNT+1))
done


Make note of the seventh line of code, which does the incrementing. The $(( )) arithmetic expansion is a Korn shell feature (later adopted by POSIX) that the traditional Bourne shell does not have. When the Oracle startup script called this code, it gave an error which told me that it had choked on line 7 at "PRJCOUNT=$". That is because the Bourne shell doesn't understand the operation.

The fix is simple. Either switch the calling script to use the Korn shell interpreter, since Korn is essentially a superset of Bourne, or change the increment code to be Bourne-friendly by using either bc or expr:

PRJCOUNT=`/usr/bin/expr $PRJCOUNT + 1`
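
For context, the sketch below shows one way a Bourne-friendly counter and the one-and-only-one-match rule could fit together. It is an illustration under assumptions, not the actual library code: the function names list_projects and choose_start_mode and the echoed messages are hypothetical, and the tally is simply left in PRJCOUNT for the caller to read.

```shell
# Hypothetical sketch: print each project and leave the tally in
# PRJCOUNT, using only constructs the Bourne shell understands.
list_projects() {
    PRJCOUNT=0
    # No "in" list: the loop walks the function's own arguments.
    for PRJ
    do
        echo "$PRJ"
        # expr instead of $(( )), which the legacy Bourne shell rejects
        PRJCOUNT=`expr $PRJCOUNT + 1`
    done
}

# Hypothetical caller: only exactly one match selects the project.
choose_start_mode() {
    list_projects "$@"
    if [ "$PRJCOUNT" -eq 1 ]
    then
        echo "start inside SRM project"
    else
        echo "start without SRM"
    fi
}
```

Because a sourced function runs inside the calling shell, the caller can read PRJCOUNT directly after the call. That same sharing is why the function body has to stick to syntax the calling shell accepts.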

Interestingly, the library function was written with a header that specified Korn shell as its interpreter:

#!/bin/ksh


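This becomes irrelevant when you are sourcing functions or variables, because the #! line is only honored when a file is executed as a program. A sourced file is read by the calling shell, which is the whole point: you want that shell to get access to these objects.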
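
A small experiment makes this visible. The sketch below writes a throwaway library file whose first line names ksh, then sources it. The file path and the function name demo_loaded are made up for the demonstration. When the file is sourced, the header is treated as an ordinary comment, so the function is defined in whatever shell did the sourcing.

```shell
# Create a throwaway library whose header asks for ksh.
LIB=/tmp/demo_lib.$$
cat > "$LIB" <<'EOF'
#!/bin/ksh
demo_loaded() {
    echo "loaded into the calling shell"
}
EOF

# Sourcing reads the file into the current shell; the #! line is
# only a comment here, so ksh is never started.
. "$LIB"
demo_loaded
rm -f "$LIB"
```

The function stays available after the file is removed, since its definition now lives in the calling shell rather than in a separate interpreter process.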

So what did I do? At first I switched the calling code, but on further thought I reworked the library to stick to the Bourne shell subset so it would be more portable. I don't really like the Bourne shell, as Korn is much more capable, but in this case portability outweighs elegance.

Repeat after me: Switching interpreters in mid-code is something to be avoided whenever possible.
