Using systemtap to analyze latency of the kernel module function
In this post, we continue to explore how to use SystemTap to analyze the latency of the kernel module function. In the following example, we want to analyze the latency of the function “nfsd_vfs_write” from the kernel module “nfsd”.
Deploy the SystemTap packages
Refer to this post to deploy SystemTap and its required packages.
Check the nfsd kernel module info
[root@host1 ~]# uname -r
3.10.0-1160.el7.x86_64
[root@host1 ~]# ls /lib/modules/`uname -r`/kernel/
arch crypto drivers fs kernel lib mm net sound virt
[root@host1 ~]# ls /lib/modules/`uname -r`/kernel/fs
binfmt_misc.ko.xz btrfs cachefiles ceph cifs cramfs dlm exofs ext4 fat fscache fuse gfs2 isofs jbd2 lockd mbcache.ko.xz nfs nfs_common nfsd nls overlayfs pstore squashfs udf xfs
[root@host1 ~]# ls /lib/modules/`uname -r`/kernel/fs/ext4
ext4.ko.xz
[root@host1 ~]# ls /lib/modules/`uname -r`/kernel/fs/xfs
xfs.ko.xz
[root@host1 ~]# ls /lib/modules/`uname -r`/kernel/fs/btrfs
btrfs.ko.xz
[root@host1 ~]# ls /lib/modules/`uname -r`/kernel/fs/nfs
blocklayout filelayout flexfilelayout nfs.ko.xz nfsv3.ko.xz nfsv4.ko.xz objlayout
[root@host1 ~]# lsmod | grep "nfsd "
nfsd 351321 13
[root@host1 ~]# modinfo nfsd
filename: /lib/modules/3.10.0-1160.el7.x86_64/kernel/fs/nfsd/nfsd.ko.xz
license: GPL
author: Olaf Kirch <okir@monad.swb.de>
alias: fs-nfsd
retpoline: Y
rhelversion: 7.9
srcversion: 61A6390CD82AA4A7492CB06
depends: auth_rpcgss,sunrpc,grace,lockd,nfs_acl
intree: Y
vermagic: 3.10.0-1160.el7.x86_64 SMP mod_unload modversions
signer: CentOS Linux kernel signing key
sig_key: E1:FD:B0:E2:A7:E8:61:A1:D1:CA:80:A2:3D:CF:0D:BA:3A:A4:AD:F5
sig_hashalgo: sha256
parm: cltrack_prog:Path to the nfsdcltrack upcall program (string)
parm: cltrack_legacy_disable:Disable legacy recoverydir conversion. Default: false (bool)
parm: nfs4_disable_idmapping:Turn off server's NFSv4 idmapping when using 'sec=sys' (bool)
[root@host1 ~]# cat /proc/kallsyms | egrep -i -w "nfsd_vfs_write"
ffffffffc094fdd0 t nfsd_vfs_write [nfsd]
SystemTap script
In the following SystemTap script, we have implemented
a probe to detect the kernel module “nfsd” and its function “nfsd_vfs_write”
a probe to detect the return of the function “nfsd_vfs_write”
handlers to get the execname, pid, tid and timestamp in each probe function
a probe as timer(5 seconds tracing)
a probe “end” to analyze the collected runtime of the function “nfsd_vfs_write” in the end of tracing
[root@host1 ~]# cat nfs_trace.stp
global count
global start_t,diff_t[1000000]probe module(“nfsd”).function(“nfsd_vfs_write”){
count++
e = execname()
p = pid()
t = tid()
start_t[e,p,t] = gettimeofday_us()
}probe module(“nfsd”).function(“nfsd_vfs_write”).return {
e = execname()
p = pid()
t = tid()
start_ts = start_t[e,p,t]
end_ts = gettimeofday_us()
if(start_ts>0)
diff_t[e,p,t,start_ts] = end_ts - start_ts
}probe timer.s(5){exit()}
probe end{
count=1
total_time=0
foreach([e,p,t,ts] in diff_t){
printf(“nfsd_vfs_write(%d %s %d %d %d) call time: %d\n”,count,e,p,t,ts,diff_t[e,p,t,ts])
count++
total_time+=diff_t[e,p,t,ts]
}
count–
printf(“nfsd_vfs_write total call time(us): %d\n”,total_time)
printf(“nfsd_vfs_write calls: %d\n”,count)
printf(“nfsd_vfs_write average call time(us): %d\n”,total_time/count)
}
Run the SystemTap script
[root@host1 ~]# cat stp.sh
runid=$1
sleep 10
stap -D MAXACTION=1000000 nfs_trace.stp > nfs_trace.${runid}.out
[root@host1 ~]# ./stp.sh kernel-3.10-run1
Note that, the script will wait for 10 seconds before tracing. This is to make sure the network workload to be monitored will be running stable.
Get the tracing result
[root@host1 ~]# cat stp_3.10/nfs_trace.1.out | head -3
nfsd_vfs_write(1 nfsd 3597 3597 1658189341055849) call time: 139
nfsd_vfs_write(2 nfsd 3598 3598 1658189341055864) call time: 136
nfsd_vfs_write(3 nfsd 3596 3596 1658189341055977) call time: 149
[root@host1 ~]# cat stp_3.10/nfs_trace.1.out | tail -5
nfsd_vfs_write(116340 nfsd 3595 3595 1658189346055536) call time: 182
nfsd_vfs_write(116341 nfsd 3598 3598 1658189346055621) call time: 237
nfsd_vfs_write total call time(us): 23050967
nfsd_vfs_write calls: 116341
nfsd_vfs_write average call time(us): 198
As we can see, there are totally 116341 calls for the function “nfsd_vfs_write” and the average call time is 198 us. This gives us a clear sense of the function call latency so that we can compare the same for different system configurations(e.g. different kernel versions).
SystemTap runtime errors
When we run the SystemTap script, we added a option “-D MAXACTION=1000000” to fix the following runtime error.
[root@host1 ~]# stap nfs_trace.stp > nfs_trace.out
ERROR: MAXACTION exceeded near identifier 'printf' at nfs_trace.stp:27:3
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed. [man error::pass5]
What does “MAXACTION exceeded” mean?
The probe handler attempted to execute too many statements in the probe handler. The default number of actions allowed in a probe handler is 1000.
Reference
- https://sourceware.org/systemtap/SystemTap_Beginners_Guide/
- https://sourceware.org/systemtap/examples/
- https://sourceware.org/systemtap/SystemTap_Beginners_Guide/errors.html
- https://sourceware.org/systemtap/SystemTap_Beginners_Guide/runtimeerror.html
- https://mediatemple.net/blog/legacy/wrangling-nsf-load-with-systemtap/