Saturday 19 September 2009

SystemTap – DTrace for Linux ?


Ever since DTrace was released for Solaris, I have missed it on Linux systems. It can't be included in Linux for the same reason ZFS can't be: licensing. Both ZFS and DTrace are under the CDDL, which is incompatible with the GPL. So you can see DTrace and ZFS on Solaris, FreeBSD, and Mac OS X, but not on Linux.


However, I have been following the SystemTap project for a couple of years (it was started in 2005). It is supposed to provide functionality similar to DTrace.


I am interested in this tool because there is no simple way under Linux to profile a load that is not CPU-bound (for CPU-bound loads there is OProfile; see for example http://mysqlinsights.blogspot.com/2009/08/oprofile-for-io-bound-apps.html). That is, for IO-bound or mutex contention problems OProfile is not that useful.


SystemTap is included in RedHat 5 releases, but I was not able to get it running even on CentOS 5.3 (it crashed and hung every so often). The latest update, RedHat 5.4, promised more fixes to SystemTap, so I decided to give it another try as soon as I got my hands on RedHat 5.4.


Surprisingly, it now runs much more stably. I was able to profile the kernel and system calls.

Here is a simple script to show IO activity per disk per process (it is similar to iotop, but iotop is not available in RedHat / CentOS), with output like this:



CODE:

  Mon Sep 14 05:22:14 2009 , Average:20353Kb/sec, Read: 4337Kb, Write: 97428Kb

  UID  PID   PPID  CMD              DEVICE  T  BYTES
  27   3701  3651  mysqld           dm-0    W  99766272
  27   3701  3651  mysqld           dm-0    R  4440064
  0    2324  2296  hald-addon-stor  dm-0    R  1242

  Mon Sep 14 05:22:19 2009 , Average:21756Kb/sec, Read: 4263Kb, Write: 104521Kb

  UID  PID   PPID  CMD              DEVICE  T  BYTES
  27   3701  3651  mysqld           dm-0    W  107029504
  27   3701  3651  mysqld           dm-0    R  4358144
  0    2883  2879  pam_timestamp_c  dm-0    R  6528
  0    2324  2296  hald-addon-stor  dm-0    R  828
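The script itself is not reproduced here, so the following is only a sketch of how output in this shape could be produced. It is modeled on the disktop.stp example that ships with SystemTap; whether that is the script used above is an assumption. It also assumes the vfs tapset exposes `devname` in its read/write return probes and that `$return` is readable there, which may differ between SystemTap versions.

```systemtap
#!/usr/bin/env stap
# Sketch: per-process, per-device disk I/O, reported every 5 seconds.
# Assumption: vfs.read.return / vfs.write.return provide `devname`.

global io_stat, device
global read_bytes, write_bytes

probe vfs.read.return {
  # devname is "N/A" when the read did not hit a block device (e.g. cache)
  if ($return > 0 && devname != "N/A") {
    io_stat[pid(), execname(), uid(), ppid(), "R"] += $return
    device[pid(), execname(), uid(), ppid(), "R"] = devname
    read_bytes += $return
  }
}

probe vfs.write.return {
  if ($return > 0 && devname != "N/A") {
    io_stat[pid(), execname(), uid(), ppid(), "W"] += $return
    device[pid(), execname(), uid(), ppid(), "W"] = devname
    write_bytes += $return
  }
}

probe timer.s(5) {
  # Print a summary only if there was any disk traffic in this interval
  if (read_bytes + write_bytes) {
    printf("\n%s , Average:%dKb/sec, Read: %dKb, Write: %dKb\n\n",
           ctime(gettimeofday_s()),
           (read_bytes + write_bytes) / 1024 / 5,
           read_bytes / 1024, write_bytes / 1024)
    printf("%8s %8s %8s %25s %8s %4s %12s\n",
           "UID", "PID", "PPID", "CMD", "DEVICE", "T", "BYTES")
  }

  # Top ten entries by bytes, largest first
  foreach ([p, cmd, user, parent, rw] in io_stat- limit 10)
    printf("%8d %8d %8d %25s %8s %4s %12d\n",
           user, p, parent, cmd,
           device[p, cmd, user, parent, rw], rw,
           io_stat[p, cmd, user, parent, rw])

  # Reset counters for the next interval
  delete io_stat
  delete device
  read_bytes = 0
  write_bytes = 0
}
```

The summary line is consistent with the sample output: 4337Kb read plus 97428Kb written over a 5-second interval gives 20353Kb/sec.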







This example may be simple, but the point is that there is a rich scripting language with tons of probes you can intercept (kernel functions, FS driver functions, any other drivers and modules).


Another thing I find very useful in SystemTap is that it can work in userspace. That is, you can use it to profile your own or any other application that has -debuginfo packages (all -debuginfo packages for standard RedHat RPMs can be downloaded from the RedHat FTP); basically it is the info you get when compiling with gcc -g.


Well, there seems to be another war story going on. To profile a userspace application with SystemTap, your kernel has to be patched with the uprobes patch, which fortunately is included in RedHat-based kernels but is not included in the vanilla kernel yet. So I am not sure whether you can get userspace profiling running on other distributions.


Here is a quite simple script that I hacked together to try against MySQL®:



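CODE:

  # Print every function in mysqld whose name matches *innobase*,
  # together with its parameters, each time it is entered.
  probe process("/usr/libexec/mysqld").function("*innobase*")
  {
    printf("%s(%s)\n", probefunc(), $$parms)
  }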







Running a simple SELECT against an InnoDB table gives this output:



CODE:

  stap -v lsprob.stp
  Pass 1: parsed user script and 52 library script(s) in 240usr/10sys/261real ms.
  Pass 2: analyzed script: 107 probe(s), 22 function(s), 1 embed(s), 0 global(s) in 540usr/20sys/554real ms.
  Pass 3: using cached /root/.systemtap/cache/4f/stap_4f8b8738f58ff78e294c62765ac83d91_36925.c
  Pass 4: using cached /root/.systemtap/cache/4f/stap_4f8b8738f58ff78e294c62765ac83d91_36925.ko
  Pass 5: starting run.
  innobase_register_trx_and_stmt(thd=? )
  innobase_register_stmt(thd=? )
  innobase_map_isolation_level(iso=? )
  innobase_release_stat_resources(trx=0x2aaaaaddb8b8 )
  convert_search_mode_to_innobase(find_flag=? )
  innodb_srv_conc_enter_innodb(trx=? )
  srv_conc_enter_innodb(trx=0x2aaaaaddb8b8 )
  innodb_srv_conc_exit_innodb(trx=? )
  srv_conc_exit_innodb(trx=0x2aaaaaddb8b8 )
  innobase_release_temporary_latches(thd=0x1a6aced0 )
  innobase_release_stat_resources(trx=? )
  srv_conc_force_exit_innodb(trx=0x2aaaaaddb8b8 )







Again, this case may be too simple, but basically you can intercept internal MySQL functions and script whatever you want (measure time, count calls, gather statistics). I have not yet figured out how to intercept C++-style functions (i.e. ha_innobase::index_read), so there is an area to investigate.
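As a rough illustration of counting calls and measuring time, a script could pair entry and return probes on the same functions. This is a sketch, not something tested against mysqld: the array names are made up for the example, it assumes probefunc() yields the same function name in the return probe as in the entry probe (which may not hold on every SystemTap version), and it does not handle recursive calls on the same thread.

```systemtap
# Sketch: call counts and total time per InnoDB-related function in mysqld.
# calls, entry_us and spent_us are illustrative names.
global calls, entry_us, spent_us

probe process("/usr/libexec/mysqld").function("*innobase*")
{
  calls[probefunc()]++
  # Remember when this thread entered the function
  entry_us[tid(), probefunc()] = gettimeofday_us()
}

probe process("/usr/libexec/mysqld").function("*innobase*").return
{
  t0 = entry_us[tid(), probefunc()]
  if (t0) {
    spent_us[probefunc()] += gettimeofday_us() - t0
    delete entry_us[tid(), probefunc()]
  }
}

probe end
{
  printf("%-45s %10s %12s\n", "FUNCTION", "CALLS", "TOTAL_US")
  # Twenty most frequently called functions, most called first
  foreach (f in calls- limit 20)
    printf("%-45s %10d %12d\n", f, calls[f], spent_us[f])
}
```

Run it while the workload executes and stop it with Ctrl-C; the end probe then prints the summary.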


So I am going to play with it more and write some useful scripts for profiling MySQL.


It also seems that SystemTap can reuse the DTrace probes available in an application. As you may know, DTrace probes were added in MySQL 5.4, so it will be interesting to see how that works.
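If that works, attaching to one of those static probes might look like the sketch below. Several things here are assumptions: that the installed SystemTap supports `.mark()` probes on user processes, that mysqld was built with the DTrace probes enabled, that a probe named query-start exists in that build (written with a double underscore in SystemTap), and that its first argument is the statement text.

```systemtap
# Sketch: print each statement as it starts, using mysqld's static probes.
# Assumption: the first argument of query-start is the query string.
probe process("/usr/libexec/mysqld").mark("query__start")
{
  printf("%d: %s\n", pid(), user_string($arg1))
}
```

Unlike the function probes above, static markers should not need the wildcard match on function names, since the probe points are named explicitly in the application.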


I should mention that there is a second alternative to DTrace... It's... a DTrace port. Judging by the blog, it seems to be a one-man project, and the author is currently fighting with userspace issues. I gave it a try, but on my current RedHat 5.4 I got a 'Kernel panic' after several runs, so that's enough for now.




Entry posted by Vadim