5 Unix tools that DBAs don’t use enough

Oracle runs on an operating system, and very often that operating system is a variant of Unix. I’m pointing out the obvious because I’ve worked with many excellent DBAs who were not aware of this fact. They were masters of everything that Oracle provided, usually in the form of V$ and even X$ views, but they completely ignored the fact that Oracle runs on Unix, and that Unix can also supply lots of information for monitoring, diagnosing issues, or automating tasks.

Here are a few of my favorites, in no particular order:

1) sar and vmstat – these report current and historical system information, including CPU utilization, IO, memory utilization, time spent looking for memory, load average, etc. I begin almost every performance diagnosis session by checking what these tools can tell me. They never tell the whole story, but they give good hints.
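As a first hint, I often just want one number out of vmstat. Here is a minimal sketch that averages the CPU idle percentage from vmstat output. The function name avg_idle is made up for illustration, and it assumes the Linux (procps) vmstat layout, where the idle column is headed "id" and the first sample is an average since boot:

```shell
# Average the CPU idle percentage from vmstat output read on stdin.
# The "id" column is located by name from the header row rather than by
# position, because the column layout differs between vmstat versions.
avg_idle() {
    awk '
        # Until the header row is seen, look for the field named "id".
        col == 0 { for (i = 1; i <= NF; i++) if ($i == "id") col = i; next }
        # Data rows carry a plain number in the idle column.
        $col ~ /^[0-9]+$/ {
            if (++rows == 1) next   # first sample is the average since boot
            sum += $col; n++
        }
        END { if (n == 0) exit 1; printf "%.1f\n", sum / n }
    '
}

# Usage (five-second samples, four reports, the first one is skipped):
#   vmstat 5 4 | avg_idle
```

A consistently low number here only says the CPUs are busy, not why. For history rather than a live sample, sar -u reads the same kind of data from the collected daily files, provided sar collection is enabled on the host.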

2) strace – This gives a system-call level trace of what a process is doing. You usually want to use it when a process hangs and you want to know exactly what it was doing when it hung. The output looks like this:
open("/proc/25062/environ", O_RDONLY) = -1 EACCES (Permission denied)
stat("/proc/27800", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/27800/stat", O_RDONLY) = 6
read(6, "27800 (pickup) S 1534 1534 1534 "..., 1023) = 224

Think twice before running this on a busy production system, as this baby will slow the traced process to a crawl and generate tons of IO.
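One way to keep the damage down is to attach to the process that is already running, record only a subset of calls, and send the trace to a file. A minimal sketch, assuming a Linux strace; the function name trace_pid and the default output path are hypothetical:

```shell
# Attach strace to a running process with a reduced footprint.
# Usage: trace_pid PID [OUTFILE]
trace_pid() {
    pid=$1
    out=${2:-/tmp/strace.$pid.out}   # hypothetical default location

    if [ -z "$pid" ]; then
        echo "usage: trace_pid PID [OUTFILE]" >&2
        return 2
    fi
    # kill -0 sends no signal; it only checks that the process can be signalled.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "trace_pid: no such process (or no permission): $pid" >&2
        return 1
    fi

    # -p PID         attach to an already running process
    # -e trace=file  only log system calls that take a file name
    # -tt            prefix each line with a wall-clock timestamp
    # -o FILE        write the trace to FILE instead of the terminal
    strace -p "$pid" -e trace=file -tt -o "$out"
}
```

Press Ctrl-C to detach when you have seen enough; the traced process keeps running.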

3) gdb -c – run this when you find a core file in an Oracle directory and you want to know which program crashed.
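A minimal sketch of how this can be wrapped for repeated use. The function name core_summary is made up for illustration, and the second argument (the executable that dumped core) is optional because you often don't know it yet:

```shell
# Report which program produced a core file and, if possible, where it died.
# Usage: core_summary COREFILE [EXECUTABLE]
core_summary() {
    core=$1
    exe=$2
    if [ ! -r "$core" ]; then
        echo "usage: core_summary COREFILE [EXECUTABLE]" >&2
        return 2
    fi
    if [ -n "$exe" ]; then
        # With the matching executable, gdb can resolve symbols for a backtrace.
        gdb -batch -ex bt "$exe" -c "$core"
    else
        # Without it, gdb still reports which command generated the core
        # and the signal that terminated it.
        gdb -batch -c "$core"
    fi
}
```

The -batch flag makes gdb exit after processing its arguments instead of dropping into an interactive prompt, which is what you want in a script.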

4) find – I’ll need a separate post about this wonderful tool. My favorite usage example is deleting old user traces that usually just take up space. Let’s delete those that are over two weeks old:
find /orahome/ora6410g/admin/ITGST/udump/ -name "*.trc" -mtime +14 -exec rm {} \;
This is just an example; find can do many more useful tricks. It’s one of those utilities I’ll take when I have to work as a DBA on a deserted island.
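Since a mistyped path in a delete command is hard to undo, it can help to list the matches first. A minimal sketch; the function name purge_old_traces and its --dry-run argument are hypothetical, not something find provides:

```shell
# Delete trace files older than a given number of days, with an optional dry run.
# Usage: purge_old_traces DIR DAYS [--dry-run]
purge_old_traces() {
    dir=$1
    days=$2
    mode=$3
    if [ ! -d "$dir" ] || [ -z "$days" ]; then
        echo "usage: purge_old_traces DIR DAYS [--dry-run]" >&2
        return 2
    fi
    if [ "$mode" = "--dry-run" ]; then
        # Only list what would be removed.
        find "$dir" -type f -name '*.trc' -mtime +"$days" -print
    else
        # "{} +" hands many file names to each rm instead of starting one rm per file.
        find "$dir" -type f -name '*.trc' -mtime +"$days" -exec rm -f -- {} +
    fi
}

# Usage:
#   purge_old_traces /orahome/ora6410g/admin/ITGST/udump 14 --dry-run
#   purge_old_traces /orahome/ora6410g/admin/ITGST/udump 14
```

The -type f test keeps find from handing a directory that happens to match the pattern to rm.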

5) Perl – That’s the other tool to take to a deserted island. It’s a bit large to be called a tool; it’s an operating system with a programming language thrown in. I use it for everything: our backup procedure is written in Perl, we automatically generate reports for our customers from Perl, we project disk space usage with it, etc. I’d write a post about this too, but it would probably grow into a book.


6 Comments on “5 Unix tools that DBAs don’t use enough”

  1. Adi Stav says:

    Regarding strace, you might be interested in its -e option (see man page). For instance, ‘strace -e trace=open’ to see which files it opens but no other syscalls. Reduces the footprint quite a bit (and the log size).

    Another indispensable tool is lsof. “Who’s keeping that file open?” Run ‘lsof /data/that-file’ and you’ll know. netstat is just a special case of lsof, since lsof can search and report use of any type of file descriptor, devices, etc.

  2. prodlife says:

    Thanks for the tips! I’ve added them to my knowledge base. I’ve never thought of netstat as a special case of lsof.

    My favorite lsof use case is when I try to umount a partition (mostly nfs shares) and find that the device is still busy.

    Also, thanks for commenting on my blog. Feels much more like a “proper” blog now 🙂

  3. starprogrammer says:

    A few notes on strace from the reverse-engineer perspective….

    It is possible to attach strace to a running process. This is handy if the process routinely gets stuck, but takes quite a lot of time (and processing) to get to that point. Running the process for its entire lifespan under strace could slow down that buildup. The alternative is to start the process in the regular way and attach strace once the buildup is over.

    Unlike the case in which the process is run “under” strace, when strace runs in “attached” mode, signals sent to strace are not propagated to the traced process. Therefore killing an attached strace detaches it, but doesn’t kill the traced process.

    To attach strace to a process, run strace -p PID, and it attaches to the process whose process ID is PID.

    Another lesser-known strace feature is its counting mode. It is useful if you want to see which system calls the process makes, and how much time it spends in each:

    [somewhere~]$ strace -c find /usr -name "foogoo" > /dev/null
    Process 13798 detached
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     84.03    0.102935          14      7139           getdents64
     11.28    0.013823           4      3504           lstat64
      1.41    0.001727           0     10519           close
      1.39    0.001701           0     10520         1 open

    A lesser-known cousin of strace is ltrace – it traces “standard library calls”, as opposed to operating system kernel calls.

    The standard library is a wrapper around the operating system. Sometimes this wrapper is thin, but sometimes it is significant and has its own logic.

    For example, all file operations are carried out by the operating system kernel. Opening a file, reading from it, closing it – it’s all done by the kernel. The same goes for TCP/IP sockets – Unix pretends these are a special case of files, so it is the kernel that does the real dirty work of opening, reading and closing sockets.

    The bottom line is that strace is very useful when a process gets stuck on file operations.

    There are, however, some operations that are not carried out by the operating system kernel, but rather by the surrounding wrapper – the standard library.

    These are mostly network name resolution requests, sometimes the memory management routines, and a few others.

    To trace these, strace is not sufficient, as it doesn’t really know that a packet sent to UDP port 53 is actually a name resolution request. But ltrace does know, as it sees the “gethostbyname” call that preceded that packet.

    With the -S option, ltrace can intermix library calls with operating system kernel calls, thus giving you the whole picture.

  4. prodlife says:

    Thanks for the tips – I’m sure they’ll turn out to be useful.

  5. moshez says:

    Every time you use find -exec, god kills a kitten.

    Think of the kittens, use xargs.

  6. prodlife says:

    more dead kittens = less lolcats, and more Vietnamese food 🙂

    This is an old post and I’m using xargs these days, but now that you mention the kittens, maybe it’s time to go back to find.
