Tuesday, June 17, 2008

No space left on device? (metainit)

Here comes another rant about error messages. I was rebuilding a server today that uses SVM to manage some SAN storage, which gives a home to four very nice Solaris zones. I began by issuing a metainit command to build a concat/stripe device from two SAN devices...

testbox{lvm}$ sudo metainit -f d100
metainit: testbox: /etc/lvm/md.tab line 72: c4t6006048000018775125753594D433742d0s7: No space left on device


What?!?! I took a quick look at partitioning...

Part  Tag         Flag  Cylinders    Size     Blocks
0     unassigned  wm    0            0        (0/0/0)             0
1     unassigned  wm    0            0        (0/0/0)             0
2     backup      wu    0 - 56653    25.93GB  (56654/0/0)  54387840
3     unassigned  wm    1 - 3        1.41MB   (3/0/0)          2880
4     unassigned  wm    4 - 56653    25.93GB  (56650/0/0)  54384000
5     unassigned  wm    0            0        (0/0/0)             0
6     unassigned  wm    0            0        (0/0/0)             0
7     -           wu    0 - 56653    25.93GB  (56654/0/0)  54387840


Ok, so the partition exists. What the heck is wrong?

In my absent-minded hurry to get this trivial task completed, I made an undisciplined assumption that both devices that were to make up d100 had the same underlying VTOC. It turns out they did not. One of them was set up to use slice 4, and the other slice 7.

So, I issued a quick command to synchronize them using the traditional prtvtoc | fmthard tango, then edited the /etc/lvm/md.tab file to accommodate the s4 slice when defining d100. This time it worked nicely.
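
For anyone who has not danced the tango before, here is a rough sketch. The device names in the comments are placeholders rather than the real ones from this box, and slice_listed is a hypothetical helper invented for this post, not part of SVM. It assumes prtvtoc prints the slice number in the first column and the sector count in the fifth, so a slice that is missing or zero-length fails the check. Keep in mind that fmthard overwrites the label on the target device.

```shell
#!/bin/sh
# slice_listed N: read prtvtoc-style output on stdin and succeed only if
# slice N is listed with a non-zero sector count (fifth column).
# Hypothetical helper for illustration; not part of SVM or Solaris.
slice_listed() {
    want=$1
    while read part tag flags first count rest; do
        case $part in
            \**) continue ;;    # skip the '*' comment and header lines
        esac
        if [ "$part" = "$want" ] && [ "${count:-0}" -gt 0 ]; then
            return 0
        fi
    done
    return 1
}

# On a Solaris host (SOURCE and TARGET are placeholder device names):
#
#   # Copy the VTOC from the correctly labeled device to the other one.
#   prtvtoc /dev/rdsk/SOURCEs2 | fmthard -s - /dev/rdsk/TARGETs2
#
#   # Confirm slice 4 really exists on the target before running metainit.
#   prtvtoc /dev/rdsk/TARGETs2 | slice_listed 4 || echo "slice 4 is missing"
```

A check like this before metainit would have told me about the missing slice in plain language.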

But come on, "no space left on device?" What kind of an error message is that? How about something more like, "specified slice does not exist." Technically, a storage device of size zero would have no space available, but there sure are more direct ways to express that concept.

Thursday, June 05, 2008

The Evolution of Email

Have you ever stopped to ask yourself what benefits we have derived from the evolution of email, from the days of ASCII text to our modern world where Microsoft Word can act as the email editor?

Fortunately I don't need to ponder this question any longer. Today I received an email which simply would not have had the same impact back in the old days of low-tech correspondence.

The email started out with the following, which is a direct quote:

Starting IMMEDIATELY - ZERO TOLERANCE for any and all non compliance of the following process!
...

It looks pretty menacing in ASCII text, but thanks to Microsoft Exchange and its mind-blowing capabilities to allow more effective self-expression I was able to receive that motivational phrase in a 24-point underlined red font.

I have to admit, it's difficult to fully realize the gravity of the phrase without gratuitous aesthetic enhancement. Let's face it, it would take a PowerPoint attachment to more effectively intimidate me.

Wednesday, June 04, 2008

The Unconventional Explorer

Sun Explorer's habit of dumping output to /opt/SUNWexplo/output makes me wince a bit. In all fairness, I think the documentation could be seen as technically inconclusive, but in spirit I believe a more correct solution is not difficult to derive.

Consulting the Solaris 10 System Administration Guide: Devices and File Systems, we find a concise chart of default Solaris file systems and their raison d'être. Three specific entries jump out at me as being relevant to this topic:


  • /opt: Optional mount point for third-party software. On some systems, the /opt directory might be a UFS file system on a local disk slice.

  • /var: System files and directories that are likely to change or grow over the life of the local system. These include system logs, vi and ex backup files, and uucp files.

  • root(/): The top of the hierarchical file tree. The root (/) directory contains the directories and files that are critical for system operation, such as the kernel, the device drivers, and the programs used to boot the system. The root (/) directory also contains the mount point directories where local and remote file systems can be attached to the file tree.



Considering these practices, it makes perfect sense that Explorer is installed in /opt/SUNWexplo. So far, so good. On the systems we deploy at my current place of employment, /opt is part of the root file system, which means that Explorer is dumping output at roughly 5 MB per shot onto the root file system.

All things considered, it's pretty benign, given that we use either 72 or 146 GB boot drives. But as Solaris Jedi, we look to the harmony and availability of the system, and Explorer is definitely creating a disturbance in the Force by dumping volatile files into a subdirectory within /opt. What if someone wrote a script to manage the contents of that output directory and made a little error in their code? What file system would you want it compartmentalized within? Would you want the potential of filling root, or filling a less critical file system? Methinks there must be a better way.

As with most dilemmas, I tend to look for precedents. Where would we find a traditional location in the standard Solaris file system that might be used to spool (hint, hint) volatile files which might grow over time? I would immediately look to /var. There are two immediate paths I see as being preferable to /opt/SUNWexplo/output.

The first option would be /var/spool/explo. This would follow a convention that aligns with our use of a local explorer agent. The servers here produce an explorer on a regular schedule, which is immediately shipped to a central (on-site) repository. The most recent explorer is typically left on the system and the history is managed at the repository. This makes the output directory a traditional spool directory, and as such a perfect fit for /var/spool/explo.

Where this may not be as intuitive is the case of an environment where explorers are retained on the host rather than collected and managed centrally. In that case, the explorers are better described as log files than spools. Intuition brings me to the use of /var/opt/SUNWexplo/output for this case. It's close to the legacy directory structure of the tool, which makes the solution marginally more intuitive than using a spool directory. It also follows the rarely observed SYSV standard of pairing optional software installed in /opt with a directory in /etc/opt, /usr/opt, and /var/opt. I'm not a fan of this specific model when taken to its literal implementation, but it's worth noting.

So, which one is best? As noted earlier, it depends. If I were a member of Sun's Explorer engineering team and needed to pick one consistent location with the intent of minimizing discontent, I would select /var/opt/SUNWexplo/output. It is intuitive in the largest set of configurations, and doesn't break any rules. My secondary recommendation would be to create a symbolic link at /opt/SUNWexplo/output that redirects to the new location, for backward compatibility over the next few years until it could be phased out.
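
As a thought experiment, the symbolic link approach might look something like the sketch below. relocate_output is a hypothetical helper name of my own, and I am assuming that Explorer is happy to write through a symbolic link, which I have not verified. Treat it as a sketch, not a tested change package.

```shell
#!/bin/sh
# relocate_output OLD NEW: copy the contents of directory OLD into NEW,
# remove OLD, and leave a symbolic link at OLD that points to NEW.
# Hypothetical helper for illustration; not part of Explorer.
relocate_output() {
    old=$1
    new=$2

    # If OLD is already a symbolic link, assume the move has been done.
    if [ -h "$old" ]; then
        return 0
    fi

    mkdir -p "$new" || return 1

    if [ -d "$old" ]; then
        # "OLD/." copies the directory contents, dot files included,
        # and -p preserves ownership, modes, and timestamps where possible.
        cp -pr "$old"/. "$new"/ || return 1
        rm -rf "$old" || return 1
    fi

    # Anything still writing to the legacy path lands in the new location.
    ln -s "$new" "$old"
}

# Intended use, as root, with the paths discussed above:
#   relocate_output /opt/SUNWexplo/output /var/opt/SUNWexplo/output
```

The early return when the old path is already a link makes the function safe to run more than once, which matters if it ends up in a package postinstall script.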

Now I'm left wondering what interesting problems I might create in the data center if I put together a change package that implemented this very model... Nothing is ever as simple or benign as it appears on the surface.