Thursday, May 25, 2006

Using syslog with Perl

I recently had occasion to write a fairly simple Perl script that checks for .rhosts files in every home directory configured on a system. Nothing fancy, but very useful. After getting through the file detection logic I was left with the question: what now? Should I write a custom log file? Should I call /usr/bin/logger?

As always, I looked for precedents and standard facilities. The first thing that came to mind was syslog. And of course, the fact that I was using Perl led me to believe that I wasn't going to need to execute an external process (the "duct tape hack" as I call it). I view the shell as another language, and something never really feels right when I need to embed one language within another. Don't even get me started about embedding big awk scripts inside shell scripts... That's going to be a future topic.

The duct tape method is bad for a number of reasons. There is overhead associated with forking and executing a new child process from your main script. If you are running awk and sed, or other tools thousands or millions of times against a file then you are forcing Solaris to execute far more system calls than necessary. By keeping it all inside Perl and using modules, you can let the interpreter do the work, and realize a good part of the efficiency that C system programming gives you. I'll save the specifics of this for a later time - we need to dig into the syslog example.
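To make the "keep it all inside Perl" point concrete, here is a minimal sketch of the kind of detection logic described earlier, written without awk, cut, or find. It assumes that home directories can be read from the password database through Perl's built-in getpwent(), and that the file of interest is named .rhosts. The sub names home_directories and find_rhosts are made up for this illustration and are not taken from the original script.

```perl
use strict;
use warnings;

# Collect the unique home directories known to the system using the
# built-in getpwent(), instead of parsing /etc/passwd with awk or cut.
sub home_directories {
    my %seen;
    setpwent();
    while (my @pw = getpwent()) {
        my $home = $pw[7];    # eighth field of the passwd entry is the home dir
        $seen{$home} = 1 if defined $home && length $home;
    }
    endpwent();
    return sort keys %seen;
}

# Return the paths of any .rhosts files that exist in the given directories.
sub find_rhosts {
    my @dirs = @_;
    return grep { -f $_ } map { "$_/.rhosts" } @dirs;
}

print "rhosts file found at $_\n" for find_rhosts(home_directories());
```

Everything here runs inside the interpreter, so no child process is created no matter how many accounts the system has.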

In this case I quickly found the standard Sys::Syslog module. This little gem makes it a snap to log output. I won't go into the Solaris syslog facility here, but suffice it to say that you'll need to settle on your intended facility and priority before going further. For my purposes I went with the user facility and the notice priority (LOG_USER and LOG_NOTICE).
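As a rough illustration of what those two choices amount to, Sys::Syslog can export the underlying constants through its :macros tag. The sketch below combines the facility and the priority into the single number that travels with each message. The variable names are mine, and the numeric values assume the conventional syslog encoding, where the facility occupies the upper bits and the severity the lower three.

```perl
use strict;
use warnings;
use Sys::Syslog qw(:macros);

my $facility = LOG_USER();      # the 'user' facility
my $priority = LOG_NOTICE();    # the 'notice' priority

# syslog transmits the two as one combined value.
my $pri = $facility | $priority;

printf "combined value: %d\n", $pri;
printf "facility number: %d, severity: %d\n", $pri >> 3, $pri & 7;
```

With the conventional encoding the combined value is 13, so a raw message that shows up prefixed with <13> is simply a user/notice entry that was not parsed by a syslog daemon.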

To begin with, we need to include some libraries...

use Sys::Syslog;

When we want to set up the connection with syslog we do the following:

openlog($progname, 'pid', 'user');

The above line specifies that we will use the 'user' facility, which is typically what you should be using if you don't have a specific reason to go with one of the other options. It also specifies that we want to log the pid of the logging process with each entry. Logging the pid is a convention that isn't always necessary, but I like it. The first argument, $progname, is a variable that stores the name of the script. This deserves a little extra attention.

Since I'm known to change the name of my scripts on occasion, I don't like to hard code the name. In shell scripts I usually set a progname variable using /usr/bin/basename with the $0 argument. $0 always contains the name the script was invoked with; the arguments that follow it land in $1, $2 and so on (these are what $* expands to). So, if I called a script named foo with the arguments one, two, three, the command would look something like this:

# /home/me/foo one two three

The resulting values would be:

$0 = /home/me/foo
$1 = one
$2 = two
$3 = three

To identify our program name we want $0. However, we don't want all that extra garbage of the path. It makes for a messy syslog. The basename UNIX utility helps us to prune the entry. Here's an example in shell:

$ basename /home/me/foo
foo

If we want to do the equivalent in Perl without spawning an external process we can use the File::Basename module. Again, with a simple include at the top of our script this function becomes available to us:

use File::Basename;

Now we can put it all together and create an easily referenced identity check:

my $progname = basename($0);
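For comparison, here is a small sketch that puts the duct tape version and the module version side by side. It uses the example path from above rather than $0 so the result is predictable, and it assumes an external basename command is available on the PATH. The variable names are only for illustration.

```perl
use strict;
use warnings;
use File::Basename;

my $path = '/home/me/foo';    # example path from the text above

# Duct tape version: starts a child process just to trim a string.
my $via_shell = qx(basename $path);
$via_shell = '' unless defined $via_shell;    # in case the command is missing
chomp $via_shell;

# Module version: no child process; returns 'foo' for this path.
my $via_module = basename($path);

print "external: $via_shell\n";
print "module:   $via_module\n";
```

Both lines should show the same name; the difference is only in how much work the system does to get there.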

Why don't we just hard code the script name? After all, not everyone likes to refactor their code for fun. Besides the idea that we want our code to be maintenance free, there are times when one set of code may be called from links which have different names than the primary body. For example, let's assume that the script foo performs three functions: geta, getb, and getc. To make it easier to call these functions we want to be able to call these directly without duplicating code. Here's how we could do that:

# ls -l ~/bin
-r-xr-xr-x 1 root root 5256 Jun 8 2004 foo
# ln ~/bin/foo ~/bin/geta
# ln ~/bin/foo ~/bin/getb
# ln ~/bin/foo ~/bin/getc

We can now call any of geta, getb, or getc and actually run foo. With some simple logic blocks based on what $progname evaluates to, we are able to create a convenient interface to a multi-functional program with centralized code. Nice! But I digress - let's get back to looking at syslog...
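The "simple logic blocks" could take many forms. One possible sketch, assuming a hash of code references keyed by the link name, is shown below. The handler bodies and the run_as sub are placeholders invented for this example; the real geta, getb, and getc would do actual work.

```perl
use strict;
use warnings;
use File::Basename;

# Hypothetical handlers, one per name the script may be invoked as.
my %dispatch = (
    geta => sub { return 'running geta' },
    getb => sub { return 'running getb' },
    getc => sub { return 'running getc' },
);

# Pick the handler that matches the invoked name, or explain the usage.
sub run_as {
    my ($name) = @_;
    my $handler = $dispatch{$name};
    return 'usage: invoke as geta, getb, or getc' unless $handler;
    return $handler->();
}

my $progname = basename($0);
print run_as($progname), "\n";
```

Called through the geta link this prints "running geta"; called under any other name it prints the usage line.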

We have opened a connection to the syslog, and now is the moment of truth. Let's write a syslog entry...

syslog($priority, $msg);

Let's recap... I used a facility of user, and a priority of notice. I want to record the pid, and write a message. What does this look like when it's executed?

May 25 11:01:25 testbox rhostck[833]: rhosts file found at /u01/home/cgh

That was really easy, and it's much cleaner than executing the external logger utility because it's all inside Perl.
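Putting the pieces together, a minimal end-to-end sketch might look like the following. The snippets above never show how $priority and $msg are set, so this assumes the priority is given as the string 'notice' and builds the message from a directory name. The build_message and log_finding subs are names made up for this example. Because syslog() treats its second argument as a printf-style format, the message is passed through '%s' so that a stray percent sign in a path cannot be misread.

```perl
use strict;
use warnings;
use File::Basename;
use Sys::Syslog;

my $progname = basename($0);
my $priority = 'notice';    # assumed value; pairs with the 'user' facility

# Build the log text for a home directory that contains an rhosts file.
sub build_message {
    my ($home) = @_;
    return "rhosts file found at $home";
}

# Open the log, write one entry, and close it again.
sub log_finding {
    my ($home) = @_;
    openlog($progname, 'pid', 'user');
    syslog($priority, '%s', build_message($home));
    closelog();
    return 1;
}

# syslog() may die if no syslog daemon can be reached, so trap that here.
eval { log_finding('/u01/home/cgh') }
    or warn "could not write to syslog: $@";
```

On a host with a working syslog daemon this should produce an entry along the lines of the one shown above, tagged with the script name and pid.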


fu said...

How do you set the $priority variable?

When I followed your example, the output was weird:

<13>[333] this is test.

What does <13> mean?

Bob said...

use File::Basename;

# suppose its the same, does anyone
# know if which ever is faster?