Use Collectl as an Advanced System Monitoring Tool for Linux

Monitoring system resources is one of the most frequent tasks that system admins perform. Linux offers various tools for this, including top, free, and htop, but one tool that stands out is collectl, primarily because it can monitor many subsystems at once. In this article, we will discuss the basics of collectl along with the features it provides.

Collectl

As the name indicates, collectl collects data that describes the current system status. It can monitor almost any subsystem, but its biggest strength is that it can monitor several of them at the same time, unlike tools that measure only one specific system parameter.

According to the man page, you can use collectl to display information specific to the following subsystems:

SUMMARY SUBSYSTEMS

b - buddy info (memory fragmentation)
c - CPU
d - Disk
f - NFS V3 Data
i - Inode and File System
j - Interrupts
l - Lustre
m - Memory
n - Networks
s - Sockets
t - TCP
x - Interconnect
y - Slabs (system object caches)

DETAIL SUBSYSTEMS

C - CPU
D - Disk
E - Environmental data (fan, power, temp), via ipmitool
F - NFS Data
J - Interrupts
L - Lustre OST detail OR client Filesystem detail
M - Memory node data, which is also known as numa data
N - Networks
T - 65 TCP counters only available in plot format
X - Interconnect
Y - Slabs (system object caches)
Z - Processes

The lower-case options listed above produce brief (summary) measurements of the corresponding subsystems, while the upper-case options produce detailed ones. To monitor a particular subsystem, pass its letter to the -s option. Let’s discuss some of the important features of the collectl command.

Note: All the examples used in this article were tested on Ubuntu 14.04.

Download/Install

You can download and install the command line utility on Debian-based systems using the following command:

sudo apt-get install collectl

If you’re on some other Linux distribution, you can grab the tool’s latest version from its project website and install it from the source package.
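
Before moving on, it can help to confirm that the tool is actually on your PATH. The following is only a minimal sketch: have_cmd is a hypothetical helper name, not part of collectl, and the check assumes nothing beyond the shell's standard command -v lookup and collectl's -v (version) option.

```shell
# have_cmd is a hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd collectl; then
    # -v prints collectl's version information.
    collectl -v
else
    echo "collectl not found; install it first" >&2
fi
```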

Default output

When the command is run without any options, it logs CPU usage, disk I/O, and network activity each second (equivalent to passing -scdn on the command line). Since the output keeps growing, you can press “Ctrl + C” to stop the execution of the command.
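
If you would rather not stop the command by hand, collectl can also be told how often to sample and how many samples to take. The sketch below assumes the -i (interval in seconds) and -c (sample count) options described in the man page, which are not covered elsewhere in this article; the values 2 and 5 are arbitrary examples.

```shell
# Arbitrary example values: sample every 2 seconds, stop after 5 samples.
interval=2
count=5

# Build the command line first so it can be inspected before running.
cmd="collectl -i $interval -c $count"
echo "Running: $cmd"

# Only run it if collectl is actually installed.
if command -v collectl >/dev/null 2>&1; then
    $cmd
fi
```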

Monitor CPU usage

To display a summary of CPU usage, use the -sc option:

collectl -sc

To display detailed output, use the -sC option:

collectl -sC

Similarly, you can monitor memory using the -sm and -sM options, disk usage using the -sd and -sD options, and so on for the other subsystems.

Monitor multiple subsystems

Suppose you want to monitor CPU, memory, and disk usage together. You can do so by combining the corresponding subsystem letters after the -s option:

collectl -scmd

The output then contains information related to all three subsystems.
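
Because the -s argument is just a string of subsystem letters, it is easy to build in a script. Here is a minimal sketch, assuming a hypothetical helper named subsys_flag (not part of collectl) that maps a few readable names to the summary letters from the list at the top of the article:

```shell
# subsys_flag is a hypothetical helper: it maps readable names to the
# summary subsystem letters listed earlier and prints the -s argument.
subsys_flag() {
    letters=""
    for name in "$@"; do
        case "$name" in
            cpu)     letters="${letters}c" ;;
            disk)    letters="${letters}d" ;;
            memory)  letters="${letters}m" ;;
            network) letters="${letters}n" ;;
            *) echo "unknown subsystem: $name" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "-s${letters}"
}

subsys_flag cpu memory disk   # prints -scmd
```

With collectl installed, the result could then be passed along as collectl "$(subsys_flag cpu memory disk)". Only four subsystems are mapped here to keep the sketch short.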

Display time

Since collectl prints a new line of output at each interval, you can also ask the command to display the time at the beginning of each line. This can be done by using the -oT option:

collectl -oT

A timestamp is now added to the beginning of each line of output.

List processes like top

You can also use the collectl command to display output in the same way the top command does. For this, use the --top option:

collectl --top

The output now contains process-specific information.

To learn more about the command, go through its man page.

Conclusion

That was just a brief overview of what collectl is capable of, as we’ve barely scratched the surface here. It provides tons of options, and when used correctly, it can prove to be a Swiss army knife for system monitoring in Linux. Have you ever used collectl? How was your experience? Share your thoughts in the comments below.