Chris Ball » Tracing internal function calls in a binary

Tracing internal function calls in a binary — November 30, 2007

Dear everyone who likes Unix,

I have a binary (which uses glib and was compiled from C) and I’d like to get output with the function name each time any function in that binary is called. So, I’d like the output of ltrace(1), but for function calls rather than dynamic library calls. I am bored of adding g_debug("%s", G_STRFUNC); to the top of all my functions.

You’d think this would be easy, given that incredibly similar tools have existed for twenty years, but so far the shortest answer I’ve heard starts “well, you could write a gcc profile function stub that..”. It would be nice not to have to recompile, since gdb certainly doesn’t have to, but I’d welcome a way to achieve this with a recompile as well.

Any ideas? Thanks!

Update: jmbr wins, with the only solution that doesn’t require anything more than gdb, and no recompile. Here’s his script: http://superadditive.com/software/callgraph. I’d like to work on it to add support for modules loaded with dlopen().

Comments

Andrew Sutherland said on November 30, 2007 at 9:21 pm:

Have you looked at chronicle recorder (http://code.google.com/p/chronicle-recorder/)? It’s arguably a bit heavy-weight; you are running your program under valgrind, resulting in a complete execution trace. But it can tell you all the functions that were called, what their arguments were, and other exciting debugging functionality from the year 2020. (But the run will be slow…)
jmbr said on November 30, 2007 at 9:54 pm:

Hi Chris,

My hack to achieve this is to compile the program using gcc’s -g option and take control of gdb. The technique is illustrated in the following script:

http://superadditive.com/software/callgraph

Hope it helps
Federico Mena Quintero said on November 30, 2007 at 10:06 pm:

Dtrace can do this.

I’m not sure if Systemtap can dive into userspace these days, or if they have a related tool for that.
Ian McKellar said on November 30, 2007 at 10:44 pm:

If you can recompile you can play with GCC’s function instrumentation feature. Google found this: http://ndevilla.free.fr/etrace/
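
For readers who haven’t seen that feature, here is a minimal sketch of the hooks it expects. It is not taken from etrace; the file name trace_hooks.c and the trace_depth counter are made up for illustration, and the sketch assumes a single-threaded program. When a program is built with gcc’s -finstrument-functions option, the compiler inserts calls to __cyg_profile_func_enter and __cyg_profile_func_exit around each function, and the definitions below print the raw function address, indented by call depth.

```c
/* trace_hooks.c: hypothetical file name; a sketch, not etrace's code. */
#include <stdio.h>

/* Current call depth, used only for indentation (illustrative name).
 * Not thread-safe: this sketch assumes a single-threaded program. */
int trace_depth = 0;

/* The hooks themselves must not be instrumented, or they would recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

/* Called by the compiler-inserted code on entry to each function. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site; /* address of the caller; unused in this sketch */
    fprintf(stderr, "%*senter %p\n", 2 * trace_depth, "", this_fn);
    trace_depth++;
}

/* Called by the compiler-inserted code just before each function returns. */
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;
    if (trace_depth > 0)
        trace_depth--;
    fprintf(stderr, "%*sexit  %p\n", 2 * trace_depth, "", this_fn);
}
```

Building with something like `gcc -g -finstrument-functions myprog.c trace_hooks.c -o myprog` (myprog.c being a placeholder) should then print one line per call and return to stderr. The addresses can be mapped back to function names with a tool such as addr2line, at least for an executable that is not position-independent.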
Toby Jaffey said on November 30, 2007 at 11:06 pm:

I’d do it with instrumentation as you can recompile, but etrace looks a bit heavyweight.

Here’s a contrived example showing how to add the hooks, compile the code and get a live trace.

http://the.earth.li/~toby/hello.c

🙂
Mark Wielaard said on November 30, 2007 at 11:28 pm:

You might want to check out frysk
http://www.sourceware.org/frysk/

It very recently (read: today) got various function and syscall tracing options merged in by Petr Machata:
http://sourceware.org/ml/frysk/2007-q4/msg00164.html

man page explaining the different options of call tracing:
http://sourceware.org/git/?p=frysk.git;a=blob;f=frysk-core/frysk/bindir/ftrace.xml;hb=HEAD

I am afraid you will have to build it from a current git checkout for now though:
http://sourceware.org/frysk/build/

And you might find some bugs, but we welcome reports!
http://sourceware.org/bugzilla/
Chris said on November 30, 2007 at 11:31 pm:

jmbr, thanks so much! This looks perfect, and works.

My process opens some shared libraries with the glib g_module_open() call, though, and those obviously don’t get breakpoints set ahead of time.

Those libraries have individual debuginfo files in /usr/lib/debug; I’m going to try doing `nm /usr/lib/debug/foo/*.debug` and massaging the output so that it fits into the trace.gdb style, then I’ll hope that gdb can resolve the functions once they’re dynamically loaded. Can anyone think of a reason this won’t work, or a better way? 🙂

Thanks!
Chris said on November 30, 2007 at 11:37 pm:

Ian, many thanks to you too — I also like the look of etrace.
Chris said on November 30, 2007 at 11:56 pm:

Hi Toby!

Yes, your example is what one of my coworkers recommended (but he didn’t have code handy, so thanks for that).

It’s clear that a gdb solution is preferable, though. I wonder if a patch to accomplish jmbr’s hack through a “trace *”-like syntax would be accepted upstream.
Davyd said on December 1, 2007 at 1:30 am:

I thought that oprofile could do this.
Stoffe said on December 1, 2007 at 9:26 am:

> I thought that oprofile could do this.

So, are you surprised to learn that it can’t, or are you trying to be helpful but in a smug oh-look-I-know-more-than-someone-else-on-the-internet way?

If you are trying to be helpful, the grown-up way to do that is saying “oprofile will do what you want”.

Thanks.
Ross said on December 2, 2007 at 8:32 am:

oprofile can’t do this, because it’s a statistical profiler. If you have a function which is called frequently but is very short, oprofile may never notice it being called.
Rob said on December 2, 2007 at 1:43 pm:

LTTng can apparently do this, but I have to admit I haven’t got it working yet!
John Levon said on December 3, 2007 at 1:51 am:

It’s easy with DTrace:

dtrace -n 'pid*:::entry, pid*:::return {}' -c /bin/mycommand
TomTromey said on December 5, 2007 at 1:10 am:

Like Mark said, ftrace — part of frysk — can do this. frysk has a lot of potential for fun stuff like this; last year I was playing around writing jython scripts to debug things. Back then frysk wasn’t mature enough to do much of interest, but this has changed.

Also ltrace can do it for things loaded from shared libraries. Unfortunately this doesn’t work with the main executable. You could hack around this, though, by having a small main and then putting the rest of your program into a .so.

Pace Federico, but systemtap can’t do userspace yet. However, that is also coming.

Finally, I wonder if a valgrind skin exists to do this.