
Domino on Linux Tip: top, It Just May Be an Admin’s Best Friend

Bill Malchisky  November 29 2010 11:00:00 PM
For those who do not know it, 'top' packs a ton of system information into a compact space, including the 'top' system hogs. It is a very flexible program that provides a lot of power and helps resolve problems quickly.

From the man page: "The top program provides a dynamic real-time view of a running system. It can display system summary information as well as a list of tasks currently being managed by the Linux kernel. The types of system summary information shown and the types, order and size of information displayed for tasks are all user configurable and that configuration can be made persistent across restarts."

Here is a typical screen shot with default settings:

Image:Domino on Linux Tip: top, It Just May Be an Admin’s Best Friend

OK, so you get the point...there is a lot here. So, why blog about it? One aspect that really saves time is the ability to get a top listing for each user. I typically install Domino on Linux as a partitioned server, because you really get more bang for your buck with several different servers sharing system resources efficiently, with minimal overhead or disk consumption. This setup works great on UNIX installs as well, and was quite common long before virtualization was all the rage. Heck, it's a lot cheaper too, and it still makes a lot of sense in many scenarios.

Now recall that the best practice for a partitioned installation on any OS is to run each DPAR under a unique user name. This allows one to keep track of resources by user (which maps to the server). On Linux you can organize things at a more granular level by separating tasks by user ID with this command: top -U usenotes (taking a cue from the above graphic).
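For a quick, non-interactive look at the same per-user view, top's batch mode can be scripted. The sketch below is only an illustration: usenotes comes from this post, notes2 is a made-up second account, and TOP_BIN is a hypothetical override variable (not a top feature) that lets the function be pointed at a different top binary.

```shell
#!/bin/sh
# Sketch only: print one batch-mode top snapshot per DPAR user.
# TOP_BIN is an illustrative override, not something top itself reads.
TOP_BIN=${TOP_BIN:-top}

dpar_snapshot() {
    for user in "$@"; do
        echo "=== $user ==="
        # -b: batch mode, -n 1: a single iteration, -U: filter by user
        "$TOP_BIN" -b -n 1 -U "$user"
    done
}

# Usage (account names are examples):
#   dpar_snapshot usenotes notes2
```

Redirecting the output to a file gives a point-in-time record for each server that can be compared later.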

The point here is that you can open multiple xterm/terminal windows, each with a separate SSH session to your server, run top -U in each, and then monitor all of the respective processes for each server in real time. Then, if one process or server is running too high, you can make the appropriate adjustments either from the Domino server console or via top (if the task is stuck during a shutdown, as an example). For clarity, you can kill a task from within the top console by pressing "k" (sans quotes), then entering the PID value, and then a numeral one for the signal (signal 1 is SIGHUP, which is sufficient to kill 95% of task issues).
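The same signal can also be sent from a plain shell prompt when no top session is open. The send_signal helper below is a hypothetical wrapper around the standard kill command, shown as a sketch rather than a recommended tool, and the PID in the usage lines is made up.

```shell
#!/bin/sh
# Sketch: send a signal to one PID from the shell instead of top's "k" prompt.
# Defaults to signal 1 (SIGHUP), the value used in this post.
send_signal() {
    pid=$1
    sig=${2:-1}
    # Refuse anything that is not a plain decimal PID.
    case $pid in
        ''|*[!0-9]*) echo "usage: send_signal PID [SIGNAL]" >&2; return 2 ;;
    esac
    kill -"$sig" "$pid"
}

# Usage (PID is an example):
#   send_signal 12345        # signal 1
#   send_signal 12345 15     # SIGTERM, the default top offers at the "k" prompt
```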

Real-world usage:
Unbeknownst to the primary admin, a junior admin kicked off an additional Domino server instance in the background. That server was a devbox that had been retired and scheduled to be purged. It then proceeded to create a 12GB log.nsf for a server that was not even supposed to be running, so I didn't know to look for it initially. In just a few minutes the data directory's file system became full, primarily from the log file. This all occurred during the maintenance window (fortunately). While I was applying a fix pack, the shutdown of the main Domino DPAR failed due to the full file system. Using top, I then proceeded to see which processes needed attention:

Image:Domino on Linux Tip: top, It Just May Be an Admin’s Best Friend

Note:
A properly tuned Linux box running Domino will almost never see 81% CPU utilization, so this is a big clue that something is wrong; the event task running at 122% CPU provides the other piece of the puzzle. With the process ID on the line, it is easy to kill the errant Domino task on a server that is stuck on shutdown and cannot be reached via the Domino console.
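Spotting a runaway task like that event process can also be scripted against top's batch output. The following is a sketch under stated assumptions: hot_tasks is a made-up helper name, the 50 percent threshold in the usage line is arbitrary, and the parsing relies on top printing a header row containing PID and %CPU columns with the command name in the last column (the usual procps layout).

```shell
#!/bin/sh
# Sketch: read "top -b" output on stdin and list tasks at or above a CPU limit.
hot_tasks() {
    awk -v limit="$1" '
        # Header row: remember which column holds %CPU.
        $1 == "PID" {
            for (i = 1; i <= NF; i++) if ($i == "%CPU") cpu = i
            next
        }
        # Process rows: print PID, command, and CPU when over the limit.
        cpu && ($cpu + 0) >= limit { print $1, $NF, $cpu }
    '
}

# Usage (threshold is an example):
#   top -b -n 1 -U usenotes | hot_tasks 50
```

Looking up the %CPU column from the header, rather than hard-coding a field number, keeps the filter working if the column order has been customized.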

Bonus tip:
With a default top session running, you can type "u" and then enter the username (or UID value) to immediately filter the list by user; pressing "u" again and hitting the Enter key restores the full list, or allows you to change to a different user. Very powerful. Read the man page for additional capabilities of this powerful tool.

As I firmly believe in parallel processing, while I was shutting down the runaway processes, in another xterm I used "gzip --best log.nsf" to compress the large file in question, to give me some additional free space so that I could work.
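To see how much space a compression pass like that actually frees, the before and after sizes can be printed around the gzip call. This is a minimal sketch: compress_and_report is a hypothetical helper name, and it assumes gzip is on the PATH.

```shell
#!/bin/sh
# Sketch: compress one file with gzip --best and report the size change.
compress_and_report() {
    file=$1
    before=$(( $(wc -c < "$file") ))
    # --best is gzip's highest compression level (same as -9);
    # on success gzip replaces FILE with FILE.gz.
    gzip --best "$file" || return 1
    after=$(( $(wc -c < "$file.gz") ))
    echo "$file: $before -> $after bytes"
}

# Usage:
#   compress_and_report log.nsf
```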

After clearing the errant processes for the server, the Linux box returned to normal and proceeded to clean things up very quickly. In less than 15 seconds, the free space returned as the OS cleared up all the temporary files and the errant task stopped filling NSF databases. Quite amazing.

For the command sequence below, I typed in the initial command, pressed the Enter key, observed the value, then pressed the up arrow key. I repeated this three times.
Image:Domino on Linux Tip: top, It Just May Be an Admin’s Best Friend
In all my years, I have never seen a Windows box clean up and recover from a full filesystem so quickly. Such is the power of the EXT3 filesystem and a good kernel (et al).
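The exact command in the screenshot is not reproduced in this text. Assuming it was a free-space check along the lines of df, the type, observe, and repeat routine can be approximated with a small loop. Both free_kb and watch_free are hypothetical helper names, and /local/notesdata in the usage line is a placeholder path, not one taken from this post.

```shell
#!/bin/sh
# Print available space, in 1 KB blocks, on the filesystem holding the path.
free_kb() {
    # -P: POSIX one-line-per-filesystem format, -k: 1024-byte units;
    # column 4 of that format is the "Available" figure.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Sample free space COUNT times (default 3), DELAY seconds apart (default 5).
watch_free() {
    path=$1
    count=${2:-3}
    delay=${3:-5}
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$(date '+%H:%M:%S') $(free_kb "$path") KB free"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$delay"
        fi
    done
}

# Usage (path is a placeholder):
#   watch_free /local/notesdata 3 5
```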

Once calm returned, I then gracefully shut down the primary DPAR to continue the maintenance, which completed without a hitch.

In this scenario, top proved a huge time-saver, allowing for quick work of a situation that could have taken far longer without the knowledge of these quality tools provided gratis with Linux and UNIX. Running one top session per terminal window is very powerful and allows for excellent monitoring across multiple DPARs. I also add additional xterms for console monitoring, and on one screen it is quite easy to troubleshoot upgrades, application roll-outs, and work with development to resolve complex application errors.



Two Additional Tips on Top


Toggle CPU Utilization

The default is to show the aggregate of all CPUs on one row; pressing the '1' key will show each CPU individually. This key is a toggle.
Image:Domino on Linux Tip: top, It Just May Be an Admin’s Best Friend


Verifying Full Shutdown

If you want a quick way to establish that your DPAR has shut down completely and has left no trailing processes, bring up top -u ("top -u usenotes" for this post's example) and you should see this:
Image:Domino on Linux Tip: top, It Just May Be an Admin’s Best Friend

A blank process list. If you just run top normally, you will never see an empty list, and most likely any remaining Domino processes would be at the bottom of the list and thus, scrolled off the screen. This approach guarantees things are copasetic.
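The same empty-list check can be made script-friendly by counting the process rows in top's batch output. This is a sketch, not a hardened check: count_tasks is a hypothetical helper name, and it assumes the process list starts after a header row whose first column is PID.

```shell
#!/bin/sh
# Sketch: read "top -b" output on stdin and count the process rows,
# i.e. the non-blank lines that follow the "PID ..." header.
count_tasks() {
    awk '$1 == "PID" { seen = 1; next } seen && NF { n++ } END { print n + 0 }'
}

# Usage: a result of 0 means no processes are listed for that user.
#   top -b -n 1 -u usenotes | count_tasks
```

One caveat with this approach: if top itself fails (for example, because the user name is mistyped), the count is also 0, so it is worth confirming the account name first.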

Look for more Domino and Linux tips in the near future...

© 2010 William Malchisky.