Malloc only works with valgrind on. How to debug? - c

to put you in some context here, project i test with valgrind is slightly modified version of distcc. The function that fails has not been changed. The exact place in code that is problematic is function dcc_compress_file_lzo1x in compress.c. It goes like this:
int dcc_compress_file_lzo1x(int in_fd,
size_t in_len,
char **out_buf,
size_t *out_len)
{
char *in_buf = NULL;
int ret;
if ((in_buf = malloc(in_len)) == NULL) {
rs_log_error("allocation of %ld byte buffer failed",
(long) in_len);
ret = EXIT_OUT_OF_MEMORY;
goto out;
}
The problem here is malloc in if statement. If I run this program normally it fails randomly(I wrapped whole thing in debug prints), by which i mean that program crashes while doing malloc, without producing error print. On the other hand if i start program with valgrind, everything passes and valgrind doesn't produce anything useful.
I'm not looking for easy answer. I would just like to know how to even debug this, because I'm out of ideas.

Related

Why segfaults occur with string.h functions?

With the same command in my coworker's PC, my program works without the problem.
But in my PC, the program crashes with segfault;
GDB backtrace at core reads as follows:
#0 strrchr () at ../sysdeps/x86_64/strrchr.S:32
32 ../sysdeps/x86_64/strrchr.S: no such file or directory
(gdb) bt
#0 strrchr () at ../sysdeps/x86_64/strrchr.S:32
#1 0x00007f10961236d7 in dirname (path=0x324a47a0 <error: Cannot access memory at address 0x324a47a0>) at dirname.c:31
I'm already compiling the executable with -g -ggdb options.
Odd thing is that.. with valgrind the program works without error in my PC as well.
How can I solve the problem? I've observed that the errors occur only with strrchr, strcmp, strlen, ... string.h functions.
+Edit: the gdb backtrace indicates that the program crashes here:
char* base_dir = dirname(get_abs_name(test_dir));
where get_abs_name is defined as
char* get_abs_name(char* dir) {
char abs_path[PATH_MAX];
char* c = malloc(PATH_MAX*sizeof(char));
realpath(dir, abs_path);
strcpy(c, abs_path);
return c;
}
+Edit2: 'dir' is a path of certain file, like '../program/blabla.jpg'.
Using valgrind,
printf("%s\n", dir)
normally prints '/home/frozenca/path_to_program'.
I can't guess why the program crashes without valgrind..
We cannot know for sure without a Minimal, Complete, and Verifiable example. Your code looks mostly correct (albeit convoluted), except you do not check for errors.
char* get_abs_name(char* dir) {
char abs_path[PATH_MAX];
char* c = malloc(PATH_MAX*sizeof(char)); /* this may return NULL */
realpath(dir, abs_path); /* this may return NULL */
strcpy(c, abs_path);
return c;
}
Now, how could this lead to an error like you see? Well, if malloc returns NULL, you'll get a crash right away in strcpy. But if realpath fails:
The content of abs_path remains undefined.
So strcpy(c, abs_path) will copy undefined content. Which could lead to it copying just one byte if abs_path[0] happens to be \0. But could also lead to massive heap corruption. Which happens depends on unrelated conditions, such as how the program is compiled, and whether some debugging tool such as valgrind is attached.
TL;DR: get into the habit of checking every function that may fail.
char* get_abs_name(char* dir) {
char abs_path[PATH_MAX];
char* c = malloc(PATH_MAX*sizeof(char));
if (!c) { return NULL; }
if (!realpath(dir, abs_path)) {
free(c);
return NULL;
}
strcpy(c, abs_path);
return c;
}
Or, here, you can simplify it alot assuming a GNU system or POSIX.1-2008 system:
char * get_abs_name(const char * dir) {
return realpath(dir, NULL);
}
Note however that either way, in your main program, you also must check that get_abs_name() did not return NULL, otherwise dirname() will crash.
Drop your function entirely and use the return value of realpath(dir, NULL) instead.
Convert type
char* c = malloc(PATH_MAX*sizeof(char));
Thanks!

malloc() Crashing everytime || windbg breaks in with nt!DbgLoadImageSymbols

LPTSTR name = NULL;
DWORD nameLength = 0;
namelength = host->nameLength; // returns 10
name = (LPTSTR) malloc( sizeof(nameLength * sizeof(TCHAR))); //crashes here
I don't understand the reason for its crashing at this point. Could somebody explain why?
Update =*(deleted the next line after the crashing line, had copied it by mistake. was just a commented out line in the code)
UPDATE:
Sorry guys, I had tried all the ways you have described before asking the question. Doesn't work.
I think its some other issue. heres a windows service, calling up the function above (from a dll) when the computer starts, so was doing a remote debugging the dll using windbg ( I break-in using a hard-coded debugbreak, just before the function gets called).
when I am over the malloc step and give a "next step" instruction (F10), it doesn't go to the next step, instead says the client is running, but then suddenly breaks in at nt!DbgLoadImageSymbols with "leave" instruction. Giving a go(F5) after this keeps the machine in a hanged state.
If you crash inside of malloc, then it means that you have previously corrupted the heap (or more accurately, the double-linked lists that organize the heap).
Considering that you have a glaring bug here:
name = (LPTSTR) malloc( sizeof(nameLength * sizeof(TCHAR)));
You should review all of your malloc calls and ensure that you are allocating enough memory. What you have likely done is allocate too small of a buffer, written too much data into the returned pointer corrupting the heap, and then crashed in a later call to malloc.
Since you are on Windows, you can also utilize page-heap verification (via the gflags tool). This will help you catch buffer overwrites when they happen.
Not enough info for an answer, but too much for a comment, sorry. I made a simple main() based as closely on your clues as I can see, with any previously commented errors uncorrected, but extra lines FYI how much memory is allocated. The program compiles and runs without complaint. So your problem has not been properly expressed.
#include <stdio.h>
#include <stdlib.h>
#include <windows.h>
int main(){
LPTSTR name = NULL;
DWORD nameLength = 10;
name = (LPTSTR) malloc( sizeof(nameLength * sizeof(TCHAR)));
if (name) {
printf ("Memory for %d bytes allocated\n", sizeof(nameLength * sizeof(TCHAR)));
free (name);
} else {
printf ("No memory for %d bytes\n", sizeof(nameLength * sizeof(TCHAR)));
}
return 0;
}
Program output:
Memory for 4 bytes allocated
What is clear, is that it's unlikely to be enough memory for whatever you have said is 10.

execvp call in git's source for external shell cmd returns EFAULT (Bad address) errno, seemingly only in 64 bit. Googling reveals nothing

UPDATE: running git diff with valgrind results in
Syscall param execve(argv) points to uninitialised byte(s)
And the output from strace is not fully decoded--i.e., there are hex numbers among the array of strings that is argv.
...
This started out like a superuser problem but it's definitely moved into SO's domain now.
But anyway, here is my original SU post detailing the problem before I looked at the source very much: https://superuser.com/questions/795751/various-methods-of-trying-to-set-up-a-git-diff-tool-lead-to-fatal-cannot-exec
Essentially, following standard procedure to set up vimdiff as a diff tool by setting the external directive under [diff] in .gitconfig leads to this errors like this:
fatal: cannot exec 'git_diff_wrapper': Bad address
external diff died, stopping at HEAD:switch-monitor.sh.
It happens on my Linux Mint 17 64 bit OS, as well as on an Ubuntu 14.04 64 bit OS on a virtualbox VM, but not in an Ubuntu 14.04 32 bit VM...
Googling reveals no similar problems. I've spent a lot of time looking at git's source to figure this out. Bad address is the description returned by strerror for an EFAULT error. Here is a short description of EFAULT from execve manpage:
EFAULT filename points outside your accessible address space
I've tracked down how the error message is pieced together by git, and have used that to narrow down the source of the problem quite a bit. Let's start here:
static int execv_shell_cmd(const char **argv)
{
const char **nargv = prepare_shell_cmd(argv);
trace_argv_printf(nargv, "trace: exec:");
sane_execvp(nargv[0], (char **)nargv);
free(nargv);
return -1;
}
This function should not return control, but it does due to the error. The actual execvp call is in sane_execvp, but perhaps prepare_shell_cmd is of interest, though I don't spot any problems:
static const char **prepare_shell_cmd(const char **argv)
{
int argc, nargc = 0;
const char **nargv;
for (argc = 0; argv[argc]; argc++)
; /* just counting */
/* +1 for NULL, +3 for "sh -c" plus extra $0 */
nargv = xmalloc(sizeof(*nargv) * (argc + 1 + 3));
if (argc < 1)
die("BUG: shell command is empty");
if (strcspn(argv[0], "|&;<>()$`\\\"' \t\n*?[#~=%") != strlen(argv[0])) {
#ifndef GIT_WINDOWS_NATIVE
nargv[nargc++] = SHELL_PATH;
#else
nargv[nargc++] = "sh";
#endif
nargv[nargc++] = "-c";
if (argc < 2)
nargv[nargc++] = argv[0];
else {
struct strbuf arg0 = STRBUF_INIT;
strbuf_addf(&arg0, "%s \"$#\"", argv[0]);
nargv[nargc++] = strbuf_detach(&arg0, NULL);
}
}
for (argc = 0; argv[argc]; argc++)
nargv[nargc++] = argv[argc];
nargv[nargc] = NULL;
return nargv;
}
It doesn't look like they messed up the terminating NULL pointer (the absence of which is known to cause EFAULT).
sane_execvp is pretty straightforward. It's a call to execvp and returns -1 if it fails.
I haven't quite figured out what trace_argv_printf does, though it looks like it might would affect nargv and maybe screw up the terminating NULL pointer? If you'd like me to include it in this post, let me know.
I have been unable to reproduce an EFAULT with execvp in my own C code thus far.
This is git 1.9.1, and the source code is available here: https://www.kernel.org/pub/software/scm/git/git-1.9.1.tar.gz
Any thoughts on how to move forward?
Thanks
Answer (copied from comments): it seems to be a bug in git 1.9.1. The (old) diff.c code, around line 2910-2930 or so, fills in an array of size 10 with arguments, before calling the run-command code. But in one case it puts in ten actual arguments and then an 11th NULL. Depending on the whims of the compiler, the NULL may get overwritten with some other local variable (or the NULL might overwrite something important).
Changing the array to size 11 should fix the problem. Or just update to a newer git (v2.0.0 or later); Jeff King replaced the hard-coded array with a dynamic one, in commits 82fbf269b9994d172719b2d456db5ef8453b323d and ae049c955c8858899467f6c5c0259c48a5294385.
Note: another possible cause for bad address is the use of run-command.c#exists_in_PATH() by run-command.c#sane_execvp().
This is fixed with Git 2.25.1 (Feb. 2020).
See commit 63ab08f (07 Jan 2020) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit 42096c7, 22 Jan 2020)
run-command: avoid undefined behavior in exists_in_PATH
Noticed-by: Miriam R.
Signed-off-by: brian m. carlson
In this function, we free the pointer we get from locate_in_PATH and then check whether it's NULL.
However, this is undefined behavior if the pointer is non-NULL, since the C standard no longer permits us to use a valid pointer after freeing it.
The only case in which the C standard would permit this to be defined behavior is if r were NULL, since it states that in such a case "no action occurs" as a result of calling free.
It's easy to suggest that this is not likely to be a problem, but we know that GCC does aggressively exploit the fact that undefined behavior can never occur to optimize and rewrite code, even when that's contrary to the expectations of the programmer.
It is, in fact, very common for it to omit NULL pointer checks, just as we have here.
Since it's easy to fix, let's do so, and avoid a potential headache in the future.
So instead of:
static int exists_in_PATH(const char *file)
{
char *r = locate_in_PATH(file);
free(r);
return r != NULL;
}
You now have:
static int exists_in_PATH(const char *file)
{
char *r = locate_in_PATH(file);
int found = r != NULL;
free(r);
return found;
}

C code runs in Eclipse-Kepler but fails to run in Codeblocks IDE

I am a newbie at C programming and new on Stackoverflow as well.
I have some c code that compiles and runs in Eclipse Kepler (Java EE IDE); I installed the C/C++ plugin and Cygwin Gcc compiler for c.
Everything runs ok in Eclipse IDE; however, when my friend tries to run the same code on his Codeblocks IDE, he doesn't get any output. At some point, he got some segmentation error, which we later learned was due to our program accessing memory space that didn't belong to our program.
Codeblocks IDE is using Gcc compiler not cygwin gcc, but I don't think they're that different to cause this sort of problem.
I am aware that C is extremely primitive and non-standardized, but why would my code run in eclipse with cygwin-gcc compiler but not run in Codeblocks IDE with gcc compiler?
Please help, it's important for our class project.
Thanks to all.
[EDIT] Our code is a little large to paste in here but here's a sample code of what would RUN SUCCESSFULLY in eclipse but FAIL in codeblocks, try it yourself if you have codeblocks please:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>
int main(void) {
char *traceEntry1;
FILE *ifp;
traceEntry1 = malloc(200*sizeof(char));
ifp = fopen("./program.txt", "r");
while (fgets(traceEntry1, 75, ifp))
printf("String input is %s \n", traceEntry1);
fclose(ifp);
}
It simply doesn't give any outputs in codeblocks, sometimes just results in a segmentation fault error.
I have no idea what the problem is.
We need your help please, thanks in advance.
Always and ever test the results of all (revelant) calls. "relevant" at least are those call which return results which are unusable if the call failed.
In the case of the OP's code they are:
malloc()
fopen()
fclose()
A save version of the OP's code could look like this:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int result = EXIT_SUCCESS; /* Be optimistic. */
char * traceEntry1 = NULL; /* Always initialise your variables, you might remove this during the optimisation phase later (if ever) */
FILE * ifp = NULL; /* Always initialise your variables, you might remove this during the optimisation phase later (if ever) */
traceEntry1 = malloc(200 * sizeof(*traceEntry1)); /* sizeof(char) is always 1.
Using the dereferenced target (traceEntry1) on the other hand makes this
line of code robust against modifications of the target's type declaration. */
if (NULL == traceEntry1)
{
perror("malloc() failed"); /* Log error. Never ignore useful and even free informaton. */
result = EXIT_FAILURE; /* Flag error and ... */
goto lblExit; /* ... leave via the one and only exit point. */
}
ifp = fopen("./program.txt", "r");
if (NULL == ifp)
{
perror("fopen() failed"); /* Log error. Never ignore useful and even free informaton. */
result = EXIT_FAILURE; /* Flag error ... */
goto lblExit; /* ... and leave via the one and only exit point. */
}
while (fgets(traceEntry1, 75, ifp)) /* Why 75? Why not 200 * sizeof(*traceEntry1)
as that's what was allocated to traceEntr1? */
{
printf("String input is %s \n", traceEntry1);
}
if (EOF == fclose(ifp))
{
perror("fclose() failed");
/* Be tolerant as no poisened results are returned. So do not flag error. It's logged however. */
}
lblExit: /* Only have one exit point. So there is no need to code the clean-up twice. */
free(traceEntry1); /* Always clean up, free what you allocated. */
return result; /* Return the outcome of this exercise. */
}

C segmentation faults due to fopen(). How to trace and what to look for?

(this was asked on ffmpeg-devel list, but counted way offtopic, so posting it here).
ffmpeg.c loads multiple .c's, that are using log.c's av_log -> av_log_default_callback function, that uses fputs;
void av_log_default_callback(void* ptr, int level, const char* fmt, va_list vl)
{
...
snprintf(line, sizeof(line), "[%s # %p] ", (*parent)->item_name(parent), parent);
... call to colored_fputs
Screen output:
static void colored_fputs(int level, const char *str){
...
fputs(str, stderr);
// this causes sigsegv just by fopen()
FILE * xFile;
xFile = fopen('yarr', 'w');
//fputs(str, xFile);fclose(xFile); // compile me. BOOM!
av_free(xFile); // last idea that came, using local free() version to avoid re-creatio
Each time, when fopen is put into code, it gives a segmentation fault of unknown reason. Why this kind of thing may happen here? Maybe due to blocking main I/O?
What are general 'blockers' that should be investigated in such a situation? Pthreads (involved in code somewhere)?
fopen takes strings as arguments, you're giving it char literals
xFile = fopen('yarr', 'w');
Should be
xFile = fopen("yarr", "w");
if(xFile == NULL) {
perror("fopen failed");
return;
}
The compiler should have warned about this, so make sure you've turned warning flags on (remeber to read them and fix them)

Resources