Analyze with valgrind only some functions and subfunctions - c

I want to debug a "big" C program with Valgrind, in particular the Memcheck tool. The output is very long, due to the size of the program, and I only want to focus on some functions and their subfunctions. Is it possible to have Valgrind analyze only certain functions and their subfunctions (up to some depth level)?
Thanks

Valgrind must supervise the process from the start; it is not possible to attach it to an already running process (or, equivalently, to ignore the process until some point in execution and only then start emulating/checking).
The reverse does work -- you can "detach" Valgrind after some number of instructions; but I am guessing that's not what you want.
Please note that:
- "the output is very long" is a poor excuse -- Valgrind errors are usually true positives (unless you are using optimized code, in which case: don't do that), and should really be addressed, and
- you can concentrate on the more serious problems (heap corruption) before addressing the use of uninitialized values, by using --undef-value-errors=no
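For reference, a typical pair of invocations combining these suggestions might look like this (the program name is hypothetical):

```
# First pass: focus on heap corruption; silence uninitialized-value reports.
valgrind --tool=memcheck --undef-value-errors=no ./myprog

# Later passes: re-enable them and track where undefined values originate.
valgrind --tool=memcheck --track-origins=yes ./myprog
```

There is no option to restrict checking to particular functions, but suppression files (--suppressions=file.supp) can silence reports from code paths you don't care about.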

Related

Are memcheck errors ever acceptable?

The valgrind quickstart page mentions:
Try to make your program so clean that Memcheck reports no errors. Once you achieve this state, it is much easier to see when changes to the program cause Memcheck to report new errors. Experience from several years of Memcheck use shows that it is possible to make even huge programs run Memcheck-clean. For example, large parts of KDE, OpenOffice.org and Firefox are Memcheck-clean, or very close to it.
This block left me a little perplexed. Seeing as the way the C standard works, I would assume most (if not all) practices that produce memcheck errors would invoke undefined behavior on the program, and should therefore be avoided like the plague.
However, the last sentence in the quoted block implies there are in fact "famous" programs that run in production with memcheck errors. After reading this, I thought I'd put this to test and I tried running VLC with valgrind, getting a bunch of memcheck errors right after starting it.
This led me to this question: are there ever good reasons not to eliminate such errors from a program in production? Is there ever anything to be gained from releasing a program that contains such errors, and if so, how do the developers keep it safe, given that a program containing such errors can, to my knowledge, act unpredictably and there is no way to make assumptions about its behavior in general? If so, can you provide real-world examples of cases in which the program is better off running with those errors than without?
There has been a case where fixing the errors reported by Valgrind actually led to security flaws; see e.g. https://research.swtch.com/openssl . The intention behind the use of uninitialised memory was to increase entropy by mixing in some random bytes; the "fix" made the random numbers more predictable, indeed weakening security.
In case of VLC, feel free to investigate ;-)
One instance is when you are deliberately writing non-portable code to take advantage of system-specific optimizations. Your code might be undefined behavior with respect to the C standard, but you happen to know that your target implementation does define the behavior in a way that you want.
A famous example is optimized strlen implementations such as those discussed at vectorized strlen getting away with reading unallocated memory. You can design such algorithms more efficiently if they are allowed to potentially read past the terminating null byte of the string. This is blatant UB for standard C, since this might be past the end of the array containing the string. But on a typical real-life machine (say for instance x86 Linux), you know what will actually happen: if the read touches an unmapped page, you will get SIGSEGV, and otherwise the read will succeed and give you whatever bytes happen to be in that region of memory. So if your algorithm checks alignment to avoid crossing page boundaries unnecessarily, it may still be perfectly safe for x86 Linux. (Of course you should use appropriate ifdef's to ensure that such code isn't used on systems where you can't guarantee its safety.)
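A minimal sketch of the word-at-a-time idea (not any particular library's implementation) is below. It assumes a typical flat-memory machine where reading a whole *aligned* word is safe even if the string ends mid-word, because an aligned word never straddles a page boundary. Both the over-read and the pointer cast are UB per the C standard; they merely happen to work on common hardware:

```c
#include <stddef.h>
#include <stdint.h>

/* Word-at-a-time strlen sketch. Reads up to a word past the string's
   terminator, but never past an aligned-word (hence page) boundary. */
static size_t wordwise_strlen(const char *s) {
    const char *p = s;
    /* Align to a word boundary first, one byte at a time. */
    while ((uintptr_t)p % sizeof(uintptr_t) != 0) {
        if (*p == '\0') return (size_t)(p - s);
        p++;
    }
    /* Scan a word at a time using the classic "has a zero byte" test. */
    const uintptr_t ones  = (uintptr_t)-1 / 0xFF;  /* 0x0101...01 */
    const uintptr_t highs = ones << 7;             /* 0x8080...80 */
    const uintptr_t *w = (const uintptr_t *)p;
    while ((((*w - ones) & ~*w) & highs) == 0)     /* no zero byte yet */
        w++;
    /* A word containing a NUL was found; locate it byte by byte. */
    p = (const char *)w;
    while (*p) p++;
    return (size_t)(p - s);
}
```

Memcheck would not flag this (the memory is addressable), but stricter tools like AddressSanitizer will, which is exactly why real libraries guard such code with platform-specific ifdefs.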
Another instance, more relevant to memcheck, might be if you happen to know that your system's malloc implementation always rounds up allocation requests to, say, multiples of 32 bytes. If you have allocated a buffer with malloc(33) but now find that you need 20 more bytes, you could save yourself the overhead of realloc() because you know that you were actually given 64 bytes to play with.
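On glibc you can even ask the allocator how much it really handed out, via the non-portable malloc_usable_size() extension (other allocators have different names, or nothing at all):

```c
#include <stdlib.h>
#include <malloc.h>   /* glibc-specific: declares malloc_usable_size() */

/* Query how much memory was really handed out for a given request.
   This is a glibc extension and ties the code to one allocator. */
size_t real_capacity(size_t requested) {
    void *p = malloc(requested);
    if (p == NULL) return 0;
    size_t usable = malloc_usable_size(p);  /* >= requested */
    free(p);
    return usable;
}
```

Even when the slack exists, note that Memcheck will still (correctly, from the standard's point of view) flag any access beyond the *requested* size as an error.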
memcheck is not perfect. The following are some problems and possible reasons for a higher false positive rate:
- approximations in memcheck's shadow-bit propagation rules, made to reduce overhead at the cost of precision
- imprecise representation of flag registers
- higher optimization levels
From the Memcheck paper (published at USENIX 2005) -- though things may well have changed since then:
A system such as Memcheck cannot simultaneously be free of false
negatives and false positives, since that would be equivalent to
solving the Halting Problem. Our design attempts to almost completely
avoid false negatives and to minimise false positives. Experience in
practice shows this to be mostly successful. Even so, user feedback
over the past two years reveals an interesting fact: many users have
an (often unstated) expectation that Memcheck should not report any
false positives at all, no matter how strange the code being checked
is.
We believe this to be unrealistic. A better expectation is to accept
that false positives are rare but inevitable. Therefore it will
occasionally be necessary to add dummy initialisations to code to make
Memcheck be quiet. This may lead to code which is slightly more
conservative than it strictly needs to be, but at least it gives a
stronger assurance that it really doesn't make use of any undefined
values.
A worthy aim is to achieve Memcheck-cleanness, so that new errors are
immediately apparent. This is no different from fixing source code to
remove all compiler warnings, even ones which are obviously harmless.
Many large programs now do run Memcheck-clean, or very nearly so. In
the authors' personal experience, recent Mozilla releases come close
to that, as do cleaned-up versions of the OpenOffice.org-680
development branch, and much of the KDE desktop environment. So this
is an achievable goal.
Finally, we would observe that the most effective use of Memcheck
comes not only from ad-hoc debugging, but also when routinely used on
applications running their automatic regression test suites. Such
suites tend to exercise dark corners of implementations, thereby
increasing their Memcheck-tested code coverage.
Here's a section on avoiding false positives:
Memcheck has a very low false positive rate. However, a few hand-coded assembly sequences, and a few very
rare compiler-generated idioms can cause false positives.
You can find the origin of an error using the --track-origins=yes option; that may let you see what's going on.
If a piece of code is running in a context that would never cause uninitialized storage to contain confidential information that it must not leak, some algorithms may benefit from a guarantee that reading uninitialized storage will have no side effects beyond yielding likely-meaningless values.
For example, suppose it's necessary to quickly set up a hash map which will often have only a few items placed in it before it's torn down, but might sometimes have many items. A useful approach is to have an array which holds the data items in the order they were added, along with a hash table that maps hash values to storage slot numbers. If the number of items stored into the table is N, an item's hash is H, and attempting to access hashTable[H] is guaranteed to yield a value I that will either be the number stored there, if any, or else an arbitrary number, then one of three things will happen:
I might be greater than or equal to N. In that case, the table does not contain a value with a hash of H.
I might be less than N, but items[I].hash != H. In that case, the table does not contain a value with a hash of H.
I might be less than N, and items[I].hash == H. In that case, the table rather obviously contains at least one value (the one in slot I) with a hash of H.
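A minimal C sketch of this scheme (all names hypothetical). Each of the three cases above maps onto one check in get(); note the comment marking where the deliberate uninitialized read would live in a real implementation:

```c
#include <stddef.h>

#define TABLE_SIZE 256

struct item { unsigned hash; int value; };

static struct item items[TABLE_SIZE];  /* dense log of insertions */
static size_t n;                       /* number of items inserted so far */

/* In real use this index table would be malloc()'d and deliberately left
   uninitialized -- skipping the memset is the whole point. It is static
   here (hence zero-filled) only to keep the sketch deterministic. */
static size_t slot_of[TABLE_SIZE];

static void put(unsigned hash, int value) {
    items[n].hash  = hash;
    items[n].value = value;
    slot_of[hash % TABLE_SIZE] = n;    /* the only write to slot_of */
    n++;
}

/* Returns the index into items[], or -1 if no item has this hash. */
static int get(unsigned hash) {
    size_t i = slot_of[hash % TABLE_SIZE];  /* possibly never written */
    if (i >= n) return -1;                  /* case 1: out of range   */
    if (items[i].hash != hash) return -1;   /* case 2: slot is stale  */
    return (int)i;                          /* case 3: genuine hit    */
}
```

The payoff is that setup costs O(1) instead of O(TABLE_SIZE): correctness never depends on the garbage in an unwritten slot, because every value read from slot_of is validated against the dense array before use.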
Note that if the uninitialized hash table could contain confidential data, an adversary who can trigger hashing requests may be able to use timing attacks to gain some information about its contents. The only situations where the value read from a hash table slot could affect any aspect of function behavior other than execution time, however, would be those in which the hash table slot had been written.
To put things another way, the hash table would contain a mixture of initialized entries that would need to be read correctly, and meaningless uninitialized entries whose contents could not observably affect program behavior, but the code might not be able to determine whether the contents of an entry might affect program behavior until after it had read it.
For a program to read uninitialized data when it's expecting to read initialized data would be a bug, and since most places where a program reads data expect initialized data, most attempts to read uninitialized data would be bugs. If a language included a construct to explicitly request that an implementation either read data if it had been written, or otherwise yield some arbitrary value with no side effects, it would make sense to regard attempts to read uninitialized data without such a construct as defects. In a language without such a construct, however, the only way to avoid warnings about reading uninitialized data would be to forgo some useful algorithms that could otherwise benefit from the aforementioned guarantee.
My experience of posts concerning Valgrind on Stack Overflow is that there is often a misplaced sense of overconfidence, a lack of understanding of what the compiler and Valgrind are doing, or both [neither of these observations is aimed at the OP]. Ignoring errors for either of these reasons is a recipe for disaster.
Memcheck false positives are quite rare. I've used Valgrind for many years and I can count the types of false positives that I've encountered on one hand. That said, there is an ongoing battle between the Valgrind developers and the code that optimising compilers emit. For instance, see this link (if anyone is interested, there are plenty of other good presentations about Valgrind on the FOSDEM web site). In general, the problem is that optimizing compilers can make changes so long as there is no observable difference in behaviour. Valgrind has baked-in assumptions about how executables work, and if a new compiler optimization steps outside of those assumptions, false positives can result.
False negatives usually mean that Valgrind has not correctly encapsulated some behaviour. Usually this will be a bug in Valgrind.
What Valgrind won't be able to tell you is how serious an error is. For instance, you may have a printf that is passed a pointer to a character array that contains some uninitialized bytes but is always nul-terminated. Valgrind will detect an error, and at runtime you might get some random rubbish on the screen, which may be harmless.
One example that I've come across where a fix is probably not worth the effort is the use of the putenv function. If you need to put a dynamically allocated string into the environment then freeing that memory is a pain. You either need to save the pointer somewhere or save a flag that indicates that the env var has been set, and then call some cleanup function before your executable terminates. All that just for a leak of around 10-20 bytes.
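The bookkeeping described above might look something like this sketch (helper names hypothetical; it assumes a single variable name is managed, since putenv() only displaces the old string when the same name is reused):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* The string handed to putenv() becomes part of the environment, so it
   must stay alive while the variable is set. Keep the pointer so that
   cleanup (and leak-free Valgrind runs) is possible. */
static char *env_entry;   /* the "NAME=value" string we own */

static void set_env_var(const char *name, const char *value) {
    size_t len = strlen(name) + strlen(value) + 2;  /* '=' and NUL */
    char *s = malloc(len);
    if (s == NULL) return;
    snprintf(s, len, "%s=%s", name, value);
    putenv(s);            /* environment now references s */
    free(env_entry);      /* old string for the same name was displaced */
    env_entry = s;
}

static void env_cleanup(void) {   /* e.g. registered with atexit() */
    if (env_entry) {
        /* Remove the variable before freeing the string it points at. */
        char *eq = strchr(env_entry, '=');
        if (eq) { *eq = '\0'; unsetenv(env_entry); }
        free(env_entry);
        env_entry = NULL;
    }
}
```

All of that machinery for a leak of 10-20 bytes is exactly the cost/benefit trade-off being described; on platforms with setenv(), letting it manage the copy is usually simpler.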
My advice is
Aim for zero errors in your code. If you allow large numbers of errors then the only way to tell if you introduce new errors is to use scripts that filter the errors and compare them with some reference state.
Make sure that you understand the errors that Valgrind generates. Fix them if you can.
Use suppression files. Use them sparingly for errors in third party libraries that you cannot fix, harmless errors for which the fix is worse than the error and any false positives.
Use the -s/-v Valgrind options and remove unused suppressions when you can (this will probably require some scripting).
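For reference, a (hypothetical) suppression entry for a leak inside a third-party library looks something like this; the `...` line is a wildcard for intervening stack frames:

```
{
   third_party_lib_leak
   Memcheck:Leak
   match-leak-kinds: definite
   fun:malloc
   ...
   obj:*/libthirdparty.so*
}
```

Pass the file with --suppressions=my.supp; running with --gen-suppressions=all makes Valgrind print ready-to-paste entries for each error it reports.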

How am I writing on some spot of memory that I didn't allocate? [duplicate]

How dangerous is accessing an array outside of its bounds (in C)? It can sometimes happen that I read from outside the array (I now understand I then access memory used by some other parts of my program or even beyond that) or I am trying to set a value to an index outside of the array. The program sometimes crashes, but sometimes just runs, only giving unexpected results.
Now what I would like to know is, how dangerous is this really? If it damages my program, it is not so bad. If on the other hand it breaks something outside my program, because I somehow managed to access some totally unrelated memory, then it is very bad, I imagine.
I read a lot of 'anything can happen', 'a segmentation fault might be the least bad problem', 'your hard disk might turn pink and unicorns might be singing under your window', which is all nice, but what is really the danger?
My questions:
1. Can reading values from way outside the array damage anything apart from my program? I would imagine just looking at things does not change anything, or would it for instance change the 'last time opened' attribute of a file I happened to reach?
2. Can setting values way outside of the array damage anything apart from my program? From this Stack Overflow question I gather that it is possible to access any memory location, and that there is no safety guarantee.
3. I now run my small programs from within Xcode. Does that provide some extra protection around my program so that it cannot reach outside its own memory? Can it harm Xcode?
4. Any recommendations on how to run my inherently buggy code safely?
I use OS X 10.7, Xcode 4.6.
As far as the ISO C standard (the official definition of the language) is concerned, accessing an array outside its bounds has "undefined behavior". The literal meaning of this is:
behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this International Standard imposes no
requirements
A non-normative note expands on this:
Possible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a
diagnostic message).
So that's the theory. What's the reality?
In the "best" case, you'll access some piece of memory that's either owned by your currently running program (which might cause your program to misbehave), or that's not owned by your currently running program (which will probably cause your program to crash with something like a segmentation fault). Or you might attempt to write to memory that your program owns, but that's marked read-only; this will probably also cause your program to crash.
That's assuming your program is running under an operating system that attempts to protect concurrently running processes from each other. If your code is running on the "bare metal", say if it's part of an OS kernel or an embedded system, then there is no such protection; your misbehaving code is what was supposed to provide that protection. In that case, the possibilities for damage are considerably greater, including, in some cases, physical damage to the hardware (or to things or people nearby).
Even in a protected OS environment, the protections aren't always 100%. There are operating system bugs that permit unprivileged programs to obtain root (administrative) access, for example. Even with ordinary user privileges, a malfunctioning program can consume excessive resources (CPU, memory, disk), possibly bringing down the entire system. A lot of malware (viruses, etc.) exploits buffer overruns to gain unauthorized access to the system.
(One historical example: I've heard that on some old systems with core memory, repeatedly accessing a single memory location in a tight loop could literally cause that chunk of memory to melt. Other possibilities include destroying a CRT display, and moving the read/write head of a disk drive with the harmonic frequency of the drive cabinet, causing it to walk across a table and fall onto the floor.)
And there's always Skynet to worry about.
The bottom line is this: if you could write a program to do something bad deliberately, it's at least theoretically possible that a buggy program could do the same thing accidentally.
In practice, it's very unlikely that your buggy program running on a MacOS X system is going to do anything more serious than crash. But it's not possible to completely prevent buggy code from doing really bad things.
In general, Operating Systems of today (the popular ones anyway) run all applications in protected memory regions using a virtual memory manager. It turns out that it is not terribly EASY (per se) to simply read or write to a location that exists in REAL space outside the region(s) that have been assigned / allocated to your process.
Direct answers:
Reading will almost never directly damage another process; however, it can indirectly damage a process if you happen to read a KEY value used to encrypt, decrypt, or validate a program / process. Reading out of bounds can have somewhat adverse / unexpected effects on your code if you are making decisions based on the data you are reading.
The only way you could really DAMAGE something by writing to a location accessible by a memory address is if that memory address is actually a hardware register (a location that is not for data storage but for controlling some piece of hardware), not a RAM location. In fact, you still won't normally damage something unless you are writing to some one-time-programmable location that is not re-writable (or something of that nature).
Generally running from within the debugger runs the code in debug mode. Running in debug mode does TEND to (but not always) stop your code faster when you have done something considered out of practice or downright illegal.
Never use macros, use data structures that already have array index bounds checking built in, etc....
ADDITIONAL
I should add that the above information really only applies to systems using an operating system with memory protection windows. If writing code for an embedded system, or even a system utilizing an operating system (real-time or other) that does not have memory protection windows (or virtual address windows), you should practice a lot more caution in reading and writing to memory. Also, in these cases, SAFE and SECURE coding practices should always be employed to avoid security issues.
Not checking bounds can lead to ugly side effects, including security holes. One of the ugly ones is arbitrary code execution. A classic example: if you have a fixed-size array and use strcpy() to put a user-supplied string there, the user can give you a string that overflows the buffer and overwrites other memory locations, including the code address where the CPU should return when your function finishes.
This means your user can send you a string that will cause your program to essentially call exec("/bin/sh"), turning it into a shell that executes anything he wants on your system, including harvesting all your data and turning your machine into a botnet node.
See Smashing The Stack For Fun And Profit for details on how this can be done.
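The mechanism above can be sketched in a few lines (function names hypothetical; the unsafe variant is shown for illustration, not for calling with attacker-length input):

```c
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 16

/* Vulnerable: no bounds check. If `input` needs more than BUF_SIZE
   bytes (including the NUL), strcpy() writes past the end of `buf`,
   clobbering whatever lives next to it -- on the stack, possibly the
   saved return address, which is what "Smashing The Stack" exploits. */
void copy_unsafe(char buf[BUF_SIZE], const char *input) {
    strcpy(buf, input);
}

/* Safer: truncate to the destination size; always NUL-terminated. */
void copy_safe(char buf[BUF_SIZE], const char *input) {
    snprintf(buf, BUF_SIZE, "%s", input);
}
```

Truncation is itself a policy decision (silently dropping data can be its own bug), but it at least keeps the write inside the buffer.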
You write:
I read a lot of 'anything can happen', 'a segmentation fault might be the
least bad problem', 'your hard disk might turn pink and unicorns might
be singing under your window', which is all nice, but what is really
the danger?
Let's put it this way: load a gun. Point it out the window without any particular aim and fire. What is the danger?
The issue is that you do not know. If your code overwrites something that crashes your program, you are fine, because it will stop in a defined state. However, if it does not crash, then the issues start to arise. Which resources are under the control of your program, and what might it do to them? I know of at least one major issue that was caused by such an overflow. The issue was in a seemingly meaningless statistics function that messed up some unrelated conversion table for a production database. The result was some very expensive cleanup afterwards. It would actually have been much cheaper and easier to handle if this bug had formatted the hard disks... in other words: pink unicorns might be your least problem.
The idea that your operating system will protect you is optimistic. If possible try to avoid writing out of bounds.
Not running your program as root or any other privileged user limits the harm it can do to your system, so generally this is a good idea.
By writing data to some random memory location you won't directly "damage" any other program running on your computer, as each process runs in its own memory space.
If you try to access any memory not allocated to your process, the operating system will stop your program from executing with a segmentation fault.
So directly (without running as root and directly accessing files like /dev/mem) there is no danger that your program will interfere with any other program running on your operating system.
Nevertheless - and probably this is what you have heard about in terms of danger - by blindly writing random data to random memory locations by accident you sure can damage anything you are able to damage.
For example your program might want to delete a specific file given by a file name stored somewhere in your program. If by accident you just overwrite the location where the file name is stored you might delete a very different file instead.
NSArrays in Objective-C are assigned a specific block of memory. Exceeding the bounds of the array means that you would be accessing memory that is not assigned to the array. This means:
This memory can have any value. There's no way of knowing if the data is valid based on your data type.
This memory may contain sensitive information such as private keys or other user credentials.
The memory address may be invalid or protected.
The memory can have a changing value because it's being accessed by another program or thread.
Other things use memory address space, such as memory-mapped ports.
Writing data to unknown memory address can crash your program, overwrite OS memory space, and generally cause the sun to implode.
From the aspect of your program you always want to know when your code is exceeding the bounds of an array. This can lead to unknown values being returned, causing your application to crash or provide invalid data.
You may want to try using the memcheck tool in Valgrind when you test your code -- it won't catch individual array bounds violations within a stack frame, but it should catch many other sorts of memory problem, including ones that would cause subtle, wider problems outside the scope of a single function.
From the manual:
Memcheck is a memory error detector. It can detect the following problems that are common in C and C++ programs.
Accessing memory you shouldn't, e.g. overrunning and underrunning heap blocks, overrunning the top of the stack, and accessing memory after it has been freed.
Using undefined values, i.e. values that have not been initialised, or that have been derived from other undefined values.
Incorrect freeing of heap memory, such as double-freeing heap blocks, or mismatched use of malloc/new/new[] versus free/delete/delete[]
Overlapping src and dst pointers in memcpy and related functions.
Memory leaks.
ETA: Though, as Kaz's answer says, it's not a panacea, and doesn't always give the most helpful output, especially when you're using exciting access patterns.
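A tiny demo of the kind of thing Memcheck reports (the exact report wording varies by version; run it as `valgrind --leak-check=full ./demo`):

```c
#include <stdlib.h>
#include <string.h>

/* Leaks on purpose: the pointer is discarded, so Memcheck reports the
   block as "definitely lost" in the leak summary. */
void leaky(void) {
    char *leaked = malloc(16);
    if (leaked) strcpy(leaked, "never freed");
}

/* Allocated and freed correctly: produces no report. */
void tidy(void) {
    char *ok = malloc(16);
    free(ok);
}
```

The program itself runs "fine" either way, which is exactly why the tool is useful: the bug is invisible until you ask.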
If you ever do systems level programming or embedded systems programming, very bad things can happen if you write to random memory locations. Older systems and many micro-controllers use memory mapped IO, so writing to a memory location that maps to a peripheral register can wreak havoc, especially if it is done asynchronously.
An example is programming flash memory. Programming mode on the memory chips is enabled by writing a specific sequence of values to specific locations inside the address range of the chip. If another process were to write to any other location in the chip while that was going on, it would cause the programming cycle to fail.
In some cases the hardware will wrap addresses around (most significant bits/bytes of address are ignored) so writing to an address beyond the end of the physical address space will actually result in data being written right in the middle of things.
And finally, older CPUs like the MC68000 could lock up to the point that only a hardware reset would get them going again. I haven't worked on them for a couple of decades, but as I recall, if the CPU encountered a bus error (non-existent memory) while trying to handle an exception, it would simply halt until a hardware reset was asserted.
My biggest recommendation is a blatant plug for a product, but I have no personal interest in it and I am not affiliated with them in any way - but based on a couple of decades of C programming and embedded systems where reliability was critical, Gimpel's PC Lint will not only detect those sort of errors, it will make a better C/C++ programmer out of you by constantly harping on you about bad habits.
I'd also recommend reading the MISRA C coding standard, if you can snag a copy from someone. I haven't seen any recent ones but in ye olde days they gave a good explanation of why you should/shouldn't do the things they cover.
Dunno about you, but about the 2nd or 3rd time I get a coredump or hangup from any application, my opinion of whatever company produced it goes down by half. The 4th or 5th time and whatever the package is becomes shelfware and I drive a wooden stake through the center of the package/disc it came in just to make sure it never comes back to haunt me.
I'm working with a compiler for a DSP chip which deliberately generates code that accesses one past the end of an array out of C code which does not!
This is because the loops are structured so that the end of an iteration prefetches some data for the next iteration. So the datum prefetched at the end of the last iteration is never actually used.
Writing C code like that invokes undefined behavior, but that is only a formality from a standards document which concerns itself with maximal portability.
More often than not, a program which accesses out of bounds is not cleverly optimized. It is simply buggy. The code fetches some garbage value and, unlike the optimized loops of the aforementioned compiler, then uses the value in subsequent computations, thereby corrupting them.
It is worth catching bugs like that, and so it is worth making the behavior undefined for even just that reason alone: so that the run-time can produce a diagnostic message like "array overrun in line 42 of main.c".
On systems with virtual memory, an array could happen to be allocated such that the address which follows is in an unmapped area of virtual memory. The access will then bomb the program.
As an aside, note that in C we are permitted to create a pointer which is one past the end of an array. And this pointer has to compare greater than any pointer to the interior of an array.
This means that a C implementation cannot place an array right at the end of memory, where the one-past-the-end address would wrap around and compare smaller than other addresses into the array.
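The one-past-the-end guarantee is what makes the standard iteration idiom valid; a minimal illustration:

```c
#include <stddef.h>

/* Sum an array using the one-past-the-end pointer as the loop bound.
   Forming and comparing arr + len is fully defined by the C standard;
   only dereferencing that pointer would be undefined behavior. */
int sum(const int *arr, size_t len) {
    const int *end = arr + len;   /* one past the end: legal */
    int total = 0;
    for (const int *p = arr; p != end; p++)
        total += *p;
    return total;
}
```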
Nevertheless, access to uninitialized or out of bounds values are sometimes a valid optimization technique, even if not maximally portable. This is for instance why the Valgrind tool does not report accesses to uninitialized data when those accesses happen, but only when the value is later used in some way that could affect the outcome of the program. You get a diagnostic like "conditional branch in xxx:nnn depends on uninitialized value" and it can be sometimes hard to track down where it originates. If all such accesses were trapped immediately, there would be a lot of false positives arising from compiler optimized code as well as correctly hand-optimized code.
Speaking of which, I was working with some codec from a vendor which was giving off these errors when ported to Linux and run under Valgrind. But the vendor convinced me that only several bits of the value being used actually came from uninitialized memory, and those bits were carefully avoided by the logic. Only the good bits of the value were being used, and Valgrind doesn't have the ability to track down to the individual bit. The uninitialized material came from reading a word past the end of a bit stream of encoded data, but the code knows how many bits are in the stream and will not use more bits than there actually are. Since the access beyond the end of the bit stream array does not cause any harm on the DSP architecture (there is no virtual memory after the array, no memory-mapped ports, and the address does not wrap), it is a valid optimization technique.
"Undefined behavior" does not really mean much, because according to ISO C, simply including a header which is not defined in the C standard, or calling a function which is not defined in the program itself or the C standard, are examples of undefined behavior. Undefined behavior doesn't mean "not defined by anyone on the planet" just "not defined by the ISO C standard". But of course, sometimes undefined behavior really is absolutely not defined by anyone.
Besides your own program, I don't think you will break anything. In the worst case you will try to read or write from a memory address that corresponds to a page that the kernel didn't assign to your process, generating the proper exception and getting killed (I mean, your process getting killed).
Arrays with two or more dimensions pose a consideration beyond those mentioned in other answers. Consider the following functions:
char arr1[2][8];
char arr2[4];

int test1(int n)
{
    arr1[1][0] = 1;
    for (int i=0; i<n; i++) arr1[0][i] = arr2[i];
    return arr1[1][0];
}

int test2(int ofs, int n)
{
    arr1[1][0] = 1;
    for (int i=0; i<n; i++) *(arr1[0]+i) = arr2[i];
    return arr1[1][0];
}
The way gcc processes the first function will not allow for the possibility that an attempt to write arr1[0][i] might affect the value of arr1[1][0], and the generated code is incapable of returning anything other than a hardcoded value of 1. Although the Standard defines the meaning of array[index] as precisely equivalent to (*((array)+(index))), gcc seems to interpret the notion of array bounds and pointer decay differently in cases which involve using the [] operator on values of array type, versus those which use explicit pointer arithmetic.
I just want to add some practical examples to this questions - Imagine the following code:
#include <stdio.h>

int main(void) {
    int n[5];
    n[5] = 1;
    printf("answer %d\n", n[5]);
    return (0);
}
This has undefined behaviour. If you compile with, for example, clang optimisations enabled (-Ofast), it can result in something like:
answer 748418584
(If you compile without optimisations, it will probably output the expected "answer 1".)
This is because in the optimised case the assignment of 1 is never actually emitted in the final code (you can see this in the godbolt asm output as well).
(Note, however, that by that logic main should not even call printf, so the best advice is not to depend on the optimiser to "solve" your UB, but rather to be aware that it sometimes may work this way.)
The takeaway here is that modern optimising C compilers assume undefined behaviour (UB) never occurs, which means the above code behaves similarly to (but not the same as) the following:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int n[5];
if (0)
n[5] = 1;
printf("answer %d\n", (exit(-1), n[5]));
return (0);
}
Which, on the contrary, is perfectly defined.
That's because the first conditional statement never takes its true branch (0 is always false).
And in the second argument to printf we have a sequence point, after which exit is called and the program terminates before the right-hand operand of the comma operator would invoke the UB (so it's well defined).
So the second takeaway is that UB is not UB as long as it's never actually evaluated.
Additionally, I haven't seen mentioned here that there is a fairly modern undefined behaviour sanitiser (at least in clang) which (with the option -fsanitize=undefined) will give the following output on the first example (but not the second):
/app/example.c:5:5: runtime error: index 5 out of bounds for type 'int[5]'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /app/example.c:5:5 in
/app/example.c:7:27: runtime error: index 5 out of bounds for type 'int[5]'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /app/example.c:7:27 in
Here are all the samples in godbolt:
https://godbolt.org/z/eY9ja4fdh (first example and no flags)
https://godbolt.org/z/cGcY7Ta9M (first example and -Ofast clang)
https://godbolt.org/z/cGcY7Ta9M (second example and UB sanitiser on)
https://godbolt.org/z/vE531EKo4 (first example and UB sanitiser on)

How to count how many times a global variable was used (read and write)?

I'm trying to optimize a C code project.
I would like to count how many times a global variable was used (read or write) in order to place it at the most suitable memory type.
For example, to store commonly used variables at the fast access memory type.
Data cache is disabled for deterministic reasons.
Is there a way to count how many times a variable was used without inserting counters or adding extra code? For example, by using the assembly code?
The code is written in C.
In my possession:
A) The (.map) file generated by the GCC compiler, from which I extract the global variables' names, addresses and sizes.
B) The assembly code of the project, generated using the GCC compiler's -S flag.
Thanks a lot,
GDB has something called watchpoints: https://sourceware.org/gdb/onlinedocs/gdb/Set-Watchpoints.html
Set a watchpoint for an expression. GDB will break when the expression
expr is written into by the program and its value changes. The
simplest (and the most popular) use of this command is to watch the
value of a single variable:
(gdb) watch foo
awatch [-l|-location] expr [thread thread-id] [mask maskvalue]
Set a watchpoint that will break when expr is either read from or written
into by the program.
Commands can be attached to watchpoints: https://sourceware.org/gdb/onlinedocs/gdb/Break-Commands.html#Break-Commands
You can give any breakpoint (or watchpoint or catchpoint) a series of commands to execute when your program stops due to that breakpoint… For example, here is how you could use breakpoint commands to print
the value of x at entry to foo whenever x is positive.
break foo if x>0
commands
silent
printf "x is %d\n",x
cont
end
The command should typically increment a variable or print "read/write" to a file, but you can really add other stuff too such as a backtrace. Unsure about the best way for outward communication using gdb. Maybe it is good enough for you to run it in interactive mode.
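Concretely, the watchpoint and command list described above could be combined like this (a sketch; foo and the convenience variable $count are placeholders, not names from the question):

```
(gdb) set $count = 0
(gdb) awatch foo
(gdb) commands
> silent
> set $count = $count + 1
> cont
> end
(gdb) run
...
(gdb) print $count
```

Note that this counts accesses at run time for one particular execution, not static references in the code, and software watchpoints can slow the program down considerably unless hardware watchpoints are available.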
You can do this using Visual Studio (or another IDE): search for all places where your variable is used in the source code, put a conditional breakpoint there, logging some information, attach to the process, and exercise the features which use that variable. You can then count the instances in the output window.
I think what you need is automatic instrumentation and/or profiling. GCC can actually do profile-guided optimization for you, as well as other types of instrumentation; the documentation even mentions a hook for implementing your own custom instrumentation.
There are several performance analysis tools out there such as perf and gprof profilers.
Also, execution inside a virtual machine could (at least in theory) do what you are after; valgrind comes to mind. I think valgrind actually knows about all memory accesses. I'd look for ways to obtain this information (and then correlate it with the map files).
I don't know if any of the above tools solves exactly your problem, but you definitely could use, say, perf (if it's available for your platform) to see in what areas of code significant time is spent. Then probably there are either a lot of expensive memory accesses, or just intensive computations, you can figure out which is the case by staring at the code.
Note that the compiler already allocates frequently accessed variables to registers, so the kind of information you are after won't give you an accurate picture. I.e. while some variable might be accessed a lot, cache-allocating it might not improve things much if its value already lives on a register most of the time.
Also consider that optimization affects your program greatly at the assembly level, so any performance statistics such as memory access counters will differ with and without optimization, and it is the optimized case that should interest you. On the other hand, recovering which location corresponds to which variable is harder with an optimized program, if it is feasible at all.

Valgrind memcheck finds lots of conditional jumps and invalid reads in commercial library

I am debugging a program which links against a commercial API library (under Linux). I am using valgrind memcheck, because I am experiencing strange behavior which could be due to writes beyond allocated blocks of memory:
valgrind --tool=memcheck --error-limit=no --log-file=memcheck.log ./executable
The first thing which jumps to my eye, however, are many errors of the types
Use of uninitialised value of size (4/8/16)
Invalid read of size (4/8/16)
Conditional jump or move depends on uninitialised value(s)
which appear in the library. Some, but not all, of these occur in __intel_sse2_strcpy or __intel_sse2_strlen, and according to valgrind there are also definite memory leaks. The errors also appear when I compile one of the examples that ship with the library, so they are not my programming errors. Furthermore, they consistently occur with different versions of the library. Since the library is closed-source, I cannot clarify whether the errors are fatal or not.
Practically, this makes it hard for me to identify my own potential errors. I am a bit surprised to see so many warnings because I tend to fix my own programs until memcheck prints none (before I give them away, at least). The question is: can I consider such errors safe to ignore, do they commonly appear in packaged software, or are they likely even false positives (for instance because the library was compiled with optimizations)?
I would say:
No, you can't consider them safe to ignore. Valgrind is good.
Yes, they can be pretty common. If the original developers have never used Valgrind or a similar tool on their code, it's reasonable to expect some hits.
I don't think they are false positives; those are rare.
Quoting an answer from here which might explains the false positives encountered in string operations:
https://www.intel.com/content/www/us/en/developer/articles/troubleshooting/false-positive-diagnostic-on-string-operations-reported-by-intel-inspector.html
"there are certain string operations that use vector (SIMD) instructions to calculate the string length. They read a string pointer in 32-byte chunks and check for a NULL character in each chunk that it reads. If the string size is not a multiple of 32, then it reads garbage in the memory region after the NULL"

C code on Linux under gdb runs differently if run standalone?

I have built a plain C code on Linux (Fedora) using code-sorcery tool-chain. This is for ARM Cortex-A8 target. This code is running on a Cortex A8 board, running embedded Linux.
When I run this code for some test case, which does dynamic memory allocation (malloc) for some large size (10MB), it crashes after some time giving error message as below:
select 1 (init), adj 0, size 61, to kill
select 1030 (syslogd), adj 0, size 64, to kill
select 1032 (klogd), adj 0, size 74, to kill
select 1227 (bash), adj 0, size 378, to kill
select 1254 (ppp), adj 0, size 1069, to kill
select 1255 (TheoraDec_Corte), adj 0, size 1159, to kill
send sigkill to 1255 (TheoraDec_Corte), adj 0, size 1159
Program terminated with signal SIGKILL, Killed.
Then, when I debug this code for the same test case using gdb built for the target, at the point where this dynamic memory allocation happens, the code fails to allocate the memory and malloc returns NULL. But during a normal stand-alone run, I believe malloc should likewise be failing to allocate, yet strangely it might not be returning NULL; instead the program crashes and the OS kills my process.
Why is this behaviour different when run under gdb and when without debugger?
Why would malloc fail yet not return NULL? Could this be possible, or is the reason for the error message I am getting something else?
How do I fix this?
thanks,
-AD
So, for this part of the question, there is a surefire answer:
Why would malloc fail yet not return a NULL? Could this be possible, or is the reason for the error message I am getting something else?
In Linux, by default the kernel interfaces for allocating memory almost never fail outright. Instead, they set up your page table in such a way that on the first access to the memory you asked for, the CPU will generate a page fault, at which point the kernel handles it and looks for physical memory to back that (virtual) page. So, in an out-of-memory situation, you can ask the kernel for memory, it will "succeed", and the first time you try to touch the memory it returned, the allocation actually fails, killing your process. (Or perhaps some other unfortunate victim; there are heuristics for that which I'm not incredibly familiar with. See "oom-killer".)
Some of your other questions, the answers are less clear for me.
Why is this behaviour different when run under gdb and when without debugger?
It could be (just a guess, really) that GDB has its own malloc and is tracking your allocations somehow. On a somewhat related point, I've actually frequently found that heap bugs in my code often aren't reproducible under debuggers. This is frustrating and makes me scratch my head, but it's basically something I've pretty much figured one has to live with...
How do I fix this?
This is a bit of a sledgehammer solution (that is, it changes the behavior for all processes rather than just your own, and it's generally not a good idea to have your program alter global state like that), but you can write the string 2 to /proc/sys/vm/overcommit_memory. See this link that I got from a Google search.
Failing that... I'd just make sure you're not allocating more than you expect to.
By definition, running under a debugger is different from running standalone. Debuggers can and do hide many bugs. If you compile for debugging you can add a fair amount of code, similar to compiling completely unoptimized (allowing you to single-step or watch variables, for example). Compiling for release can remove debugging options and remove code that you needed; there are many optimization traps you can fall into. I don't know from your post who controls the compile options or what they are.
Unless you plan to deliver the product to be run under the debugger you should do your testing standalone. Ideally do your development without the debugger as well, saves you from having to do everything twice.
It sounds like a bug in your code, slowly re-read your code using new eyes as if you were explaining it to someone, or perhaps actually explain it to someone, line by line. There may be something right there that you cannot see because you have been looking at it the same way for too long. It is amazing how many times and how well that works.
It could also be a compiler bug. Doing things like printing out the return value, or not, can cause the compiler to generate different code. Adding another variable and saving the result to that variable can kick the compiler into doing something different. Try changing the compiler options: reduce or remove any optimization options, reduce or remove the debug compiler options, etc.
Is this a proven system, or are you developing on new hardware? Try running without any of the caches enabled, for example. Working in a debugger but not standalone, if not a compiler bug, can be a timing issue: single-stepping flushes the pipeline, mixes up the cache differently, and gives the cache and memory system an eternity to come up with a result which it doesn't have in real time.
In short, there is a very long list of reasons why running under a debugger hides bugs that you cannot find until you test in an environment like the final deliverable's; I have only touched on a few. Having it work in the debugger but not standalone is not unexpected; it is simply how the tools work. Based on the description you have given so far, it is likely your code, the hardware, or your tools.
The fastest way to eliminate it being your code or the tools is to disassemble the section and inspect how the passed values and return values are handled. If the return value is optimized out there is your answer.
Are you compiling for a shared C library or static? Perhaps compile for static...
