Valtrix Blog

A Lightweight Function Entry and Exit Profiler

Oct 23, 2015. | By: Shubhodeep Roy Choudhury

Recently, I had been working on a framework for logging of debug and information messages in STING. STING messages contain lot of information related to test generation and debug which are important for the developer and debugger. Since the code base is quite huge, it is essential to improve the clarity of the messages by annotating them with the names of function module and source file which contain those print statements. Here I will show how the solution for this problem in STING was developed.

Profiling function entry and exit in your computer program helps in failure debug and general understanding of the code flow. Sophisticated profiling information could have been generated by gprof integration, but I was looking for something lightweight. Something like below.  

void foo()
{
    printf("Entering foo\n");

    /*
     * Bla bla bla
     */

    printf("Exiting foo\n");
    return;
}

The function and file name to be used in the print statements can be obtained from preprocessor macros like __FUNC__ and __FUNCTION__. It might work for small programs, but is highly infeasible for a large code base like STING . Adding new lines of profiling code into the source every time a function module is created is highly prone to manual errors. Over a period of time, one would see a large number of function modules which have the profiling code missing.

So what was needed was a solution that can replicate a piece of code uniformly across all the functions in the entire source code.

Researching a bit on GCC options, helped me in finding a compile time flag -finstrument-functions. The details from the GNU GCC homepage are pasted below -  

-finstrument-functions

Generate instrumentation calls for entry and exit to functions. Just after function entry and just before function exit, the following profiling functions are called with the address of the current function and its call site. (On some platforms, __builtin_return_address does not work beyond the current function, so the call site information may not be available to the profiling functions otherwise.)

  void __cyg_profile_func_enter (void *this_fn, void *call_site);
  void __cyg_profile_func_exit  (void *this_fn, void *call_site);

The first argument is the address of the start of the current function, which may be looked up exactly in the symbol table.

This instrumentation is also done for functions expanded inline in other functions. The profiling calls indicate where, conceptually, the inline function is entered and exited. This means that addressable versions of such functions must be available. If all your uses of a function are expanded inline, this may mean an additional expansion of code size. If you use extern inline in your C code, an addressable version of such functions must be provided. (This is normally the case anyway, but if you get lucky and the optimizer always expands the functions inline, you might have gotten away without providing static copies.)

A function may be given the attribute no_instrument_function, in which case this instrumentation is not done. This can be used, for example, for the profiling functions listed above, high-priority interrupt routines, and any functions from which the profiling functions cannot safely be called (perhaps signal handlers, if the profiling routines generate output or allocate memory).

In nutshell, using this flag during compilation of a file inserts functions present in the block above, at the beginning and the end of every function present in it. These functions can be later coded up as per the developer’s requirement. This is what we did in case of STING.

The following block of code was put in a source file.  

void func_entry(void * func)
{
    std::cout << "Func_Entry: " << (uint64_t)func << "\n";
}

void func_exit(void * func)
{
    std::cout << "Func_Exit: " << (uint64_t)func << "\n";
}

extern "C"
{
    void __cyg_profile_func_enter(void *this_fn, void *call_site)
        __attribute__((no_instrument_function));
    void __cyg_profile_func_exit(void *this_fn, void *call_site)
        __attribute__((no_instrument_function));
}

void __cyg_profile_func_enter(void * func, void * caller)
{
    func_entry(func);
}

void __cyg_profile_func_exit(void * func, void * caller)
{
    func_exit(func);
}

The entire source code was compiled with -finstrument-functions except for the file that has the definition for __cyg_profile_func_enter and __cyg_profile_func_exit. You really don’t want the compiler to insert the call to instrumentation functions inside their definition itself.

Note that we are printing the function pointer address inside the definition. The output log looks something like this.
 

Func_Exit: 42c71e
Func_Exit: 4249f8
CPU0: Generating test
Func_Entry: 423e66
Func_Entry: 42e150
Func_Entry: 40af42
Func_Exit: 40af42
Func_Entry: 40af42
Func_Exit: 40af42
Func_Entry: 40af42
Func_Exit: 40af42
Func_Entry: 40af42
Func_Exit: 40af42
Func_Entry: 40af42
Func_Exit: 40af42
test_el = 1
Func_Exit: 42e150
Func_Exit: 423e66
Func_Entry: 438a68
cpu_id: 0 wait_start_count = 1
Func_Exit: 438a68

The function pointer addresses are present in the program binary and can be resolved using addr2line utility in Linux. One can write a addr2line based script which resolves the addresses and pretty prints the information. For STING, this is how the final output looks like.

| FUNC(Generator::run())
| | FUNC(Generator::emitAsmHeaders())
| | | FUNC(AsmEmitter::emitDirective(char const*))
| | | | FUNC(AsmEmitter::emitMnemonic(char const*, ...))
| | | FUNC(AsmEmitter::emitDirective(char const*))
| | | | FUNC(AsmEmitter::emitMnemonic(char const*, ...))
| | | FUNC(AsmEmitter::emitDirective(char const*))
| | | | FUNC(AsmEmitter::emitMnemonic(char const*, ...))
| CPU0: Generating test
| | FUNC(Generator::do_test_setup())
| | | FUNC(CpuResource::do_test_setup())
| | | | FUNC(Rand::get_rand32())
| | | | FUNC(Rand::get_rand32())
| | | | FUNC(Rand::get_rand32())
| | | | FUNC(Rand::get_rand32())
| | | | FUNC(Rand::get_rand32())
| | | test_el = 1
| | FUNC(wait_for_all_start(unsigned int, unsigned int))
| | cpu_id: 0 wait_start_count = 1
| | FUNC(Generator::reset())
| | | FUNC(CpuResource::reset(unsigned int))
| | | Test = 0

So this is how we achieved a lightweight function entry and exit profiling in our program. I hope all the readers find the information given here useful. Feel free to contact me for any questions that you have.

Categories: programming

© 2020 Valtrix Technologies Private Limited. All Rights Reserved.