Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 7 years ago.
For an assignment, I have to sort a list of students. Each one is represented by a number (string of size 15), his father's last name (string of size 20), his mother's last name (string of size 20) and his first name (also a string of size 20).
I wrote a program that builds the list of students from a file and sorts it (using a merge sort).
When I run the program on a small number of students (<10 000), everything is fine (no memory leaks or other errors according to Valgrind).
However, as soon as I try it on bigger inputs (more than 100 000), I get a segmentation fault 11. I investigated with Valgrind, and it says the error comes from the strcpy or strcasecmp functions and reports:
==2433== Invalid write of size 8
==2433== at 0x4019BD: merge (sort.c:59)
==2433== by 0x40173B: sortBeginEnd (sort.c:38)
==2433== by 0x4014B0: sortWithoutInterval (sort.c:9)
==2433== by 0x401EE0: firstSort (sort.c:166)
==2433== by 0x4009EB: main (main.c:44)
==2433== Address 0xffe79ac88 is on thread 1's stack
==2433==
==2433==
==2433== Process terminating with default action of signal 11 (SIGSEGV)
==2433== Access not within mapped region at address 0xFFE79AC88
==2433== at 0x4019BD: merge (sort.c:59)
==2433== If you believe this happened as a result of a stack
==2433== overflow in your program's main thread (unlikely but
==2433== possible), you can try to increase the size of the
==2433== main thread stack using the --main-stacksize= flag.
==2433== The main thread stack size used in this run was 8388608.
==2433==
==2433== Process terminating with default action of signal 11 (SIGSEGV)
==2433== Access not within mapped region at address 0xFFE79AC81
==2433== at 0x4A256B0: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-amd64-linux.so)
==2433== If you believe this happened as a result of a stack
==2433== overflow in your program's main thread (unlikely but
==2433== possible), you can try to increase the size of the
==2433== main thread stack using the --main-stacksize= flag.
==2433== The main thread stack size used in this run was 8388608.
==2433==
==2433== HEAP SUMMARY:
==2433== in use at exit: 12,800,101 bytes in 500,007 blocks
==2433== total heap usage: 500,008 allocs, 1 frees, 12,800,669 bytes allocated
==2433==
==2433== LEAK SUMMARY:
==2433== definitely lost: 0 bytes in 0 blocks
==2433== indirectly lost: 0 bytes in 0 blocks
==2433== possibly lost: 0 bytes in 0 blocks
==2433== still reachable: 12,800,101 bytes in 500,007 blocks
==2433== suppressed: 0 bytes in 0 blocks
==2433== Rerun with --leak-check=full to see details of leaked memory
==2433==
==2433== For counts of detected and suppressed errors, rerun with: -v
==2433== ERROR SUMMARY: 7452721 errors from 31 contexts (suppressed: 0 from 0)
Could the error be that I use too much memory (each student takes 79 characters = 316 bytes, and I have 100 000 of them, so that is 31 600 000 bytes if I am right)?
PS: I am not really familiar with the concepts of stack and heap.
EDIT :
"Everything is fine" valgrind report :
==2454==
==2454== HEAP SUMMARY:
==2454== in use at exit: 0 bytes in 0 blocks
==2454== total heap usage: 50,008 allocs, 50,008 frees, 1,280,669 bytes allocated
==2454==
==2454== All heap blocks were freed -- no leaks are possible
==2454==
==2454== For counts of detected and suppressed errors, rerun with: -v
==2454== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
EDIT2 :
The code is available here if you want to check it.
EDIT LAST :
I finally found the solution thanks to @Lundin's answer. The problem was that I was not using malloc to allocate the temporary arrays for the merge step of the merge sort.
I will look further into heap versus stack to fully understand the problem.
You don't mention which system this is for; because of Valgrind I assume Linux. You also don't mention where you allocate the variables. Valgrind reports about 12.8 MB in use on the heap (12,800,101 bytes), well short of the 31.6 MB you estimate, and the invalid write is at an address on thread 1's stack, so some of the large data is apparently not allocated on the heap.
If I remember correctly (and I know very little about Linux), processes have a default stack size of roughly 8 MB; the Valgrind output above reports a main thread stack size of 8388608 bytes.
316 * 10000 = 3.16 MB.
316 * 100000 = 31.60 MB.
Qualified guess: if you are allocating your variables in any other way than with malloc, then stack overflow is the source of the described problems.
Whenever your program uses large amounts of memory, allocate that memory dynamically on the heap.
The stack is the place where a function holds its local/temporary data (parameters and local variables). It is organized like a stack of papers: when you call a function, its parameters are put onto the stack, and when the function finishes, everything except the result is discarded from the stack. Normally the stack has a limited size.
The heap is the memory where your dynamically allocated data is kept (e.g. from malloc()). You can have different heaps (for your application, for each process, and system-wide).
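To make the stack/heap distinction concrete, here is a minimal sketch of a merge step whose temporary array comes from malloc rather than from a local array. The asker's code is not shown, so the Student layout and the names merge_heap and cmp_student are assumptions made purely for illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Hypothetical record: field sizes from the question, plus one byte
 * each for the terminating NUL. */
typedef struct {
    char number[16];
    char father[21];
    char mother[21];
    char first[21];
} Student;

/* Hypothetical ordering: case-insensitive, by father's last name only. */
static int cmp_student(const Student *a, const Student *b) {
    return strcasecmp(a->father, b->father);
}

/* Merge the sorted runs a[lo..mid) and a[mid..hi).
 * A local array such as `Student tmp[hi - lo];` would live on the stack
 * and can overflow it for large inputs; malloc takes the buffer from
 * the heap instead. Returns 0 on success, -1 if the allocation fails. */
int merge_heap(Student *a, size_t lo, size_t mid, size_t hi) {
    size_t n = hi - lo, i = lo, j = mid, k = 0;
    if (n == 0)
        return 0;
    Student *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    while (i < mid && j < hi)
        tmp[k++] = (cmp_student(&a[j], &a[i]) < 0) ? a[j++] : a[i++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp, n * sizeof *tmp);
    free(tmp);
    return 0;
}
```

Allocating one buffer of the full array size once and passing it down the recursion would avoid the repeated malloc/free calls; the per-call version above is just the smallest change from a local array.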
Closed 1 year ago.
I'm new to Valgrind and am using it for the first time to check for memory errors. I'm running a C program and seeing errors that don't appear to come from my program but from library source files (open64.c:48, _IO_file_open (fileops.c:189), ...). I don't know where these files are located. Could you please help me resolve this?
==40910== Memcheck, a memory error detector
==40910== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==40910== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==40910== Command: ./dd
==40910==
==40910== Syscall param openat(filename) points to unaddressable byte(s)
==40910== at 0x4ABCEAB: open (open64.c:48)
==40910== by 0x4A3F195: _IO_file_open (fileops.c:189)
==40910== by 0x4A3F459: _IO_file_fopen@@GLIBC_2.2.5 (fileops.c:281)
==40910== by 0x4A31B0D: __fopen_internal (iofopen.c:75)
==40910== by 0x4A31B0D: fopen@@GLIBC_2.2.5 (iofopen.c:86)
==40910== by 0x109336: main (in /home/Desktop/dd)
==40910== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==40910==
==40910== Invalid read of size 4
==40910== at 0x4A317D7: fgets (iofgets.c:47)
==40910== by 0x109427: main (in /home/Desktop/dd)
==40910== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==40910==
==40910==
==40910== Process terminating with default action of signal 11 (SIGSEGV)
==40910== Access not within mapped region at address 0x0
==40910== at 0x4A317D7: fgets (iofgets.c:47)
==40910== by 0x109427: main (in /home/Desktop/dd)
==40910== If you believe this happened as a result of a stack
==40910== overflow in your program's main thread (unlikely but
==40910== possible), you can try to increase the size of the
==40910== main thread stack using the --main-stacksize= flag.
==40910== The main thread stack size used in this run was 16777216.
==40910==
==40910== HEAP SUMMARY:
==40910== in use at exit: 984 bytes in 3 blocks
==40910== total heap usage: 4 allocs, 1 frees, 1,456 bytes allocated
==40910==
==40910== LEAK SUMMARY:
==40910== definitely lost: 0 bytes in 0 blocks
==40910== indirectly lost: 0 bytes in 0 blocks
==40910== possibly lost: 0 bytes in 0 blocks
==40910== still reachable: 984 bytes in 3 blocks
==40910== suppressed: 0 bytes in 0 blocks
==40910== Rerun with --leak-check=full to see details of leaked memory
==40910==
==40910== For lists of detected and suppressed errors, rerun with: -s
==40910== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
Without the code this is certainly the easiest question to answer!
"Unaddressable" means the pointer refers to bytes that do not belong to you.
Valgrind warns you when you access memory that is not yours, for example memory you have already freed: it is no longer reserved for the use you asked for, it may now hold something else, and reading it can give you a value that is not what you expect.
Why doesn't it break when you run without Valgrind? For starters, that is only what you say. For one thing, your code is not doing appropriate error checking, so it may be failing internally without you noticing. Badly written code may compile and appear to run without showing any errors while still misbehaving in the background.
Address 0x0 is not stack'd, malloc'd or (recently) free'd
tells you that you're dereferencing a NULL pointer (Address 0x0 ...), meaning fopen failed and returned 0/NULL.
Try fixing it, like this:
-Check that fopen() returned a valid FILE* to avoid undefined behavior when reading from input_file.
-Check that fgets() succeeded (did not return NULL) before using the buffer, to avoid undefined behavior.
PS: Read "The Ten Commandments for C Programmers":
2. Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end.
6. If a function be advertised to return an error code in the event of difficulties, thou shalt check for that code, yea, even though the checks triple the size of thy code and produce aches in thy typing fingers, for if thou thinkest “it cannot happen to me”, the gods shall surely punish thee for thy arrogance.
Address 0x0 is not stack'd, malloc'd or (recently) free'd
That means you are using a NULL pointer (NULL is (void*)0, and 0 is 0x0 in hexadecimal). Check whether a pointer is NULL before using it.
Edit: if you are using fopen, note that it returns NULL if it cannot open the file.
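The checks suggested in these answers can be sketched as follows. The first Valgrind error ("openat(filename) points to unaddressable byte(s)" at address 0x0) suggests the file name pointer itself was NULL, so the sketch tests that too. The function name read_first_line is hypothetical; the asker's code is not shown.

```c
#include <stdio.h>

/* Read the first line of `path` into buf (at most size - 1 characters).
 * Returns 0 on success, -1 if the path is NULL, the file cannot be
 * opened, or nothing could be read. */
int read_first_line(const char *path, char *buf, size_t size) {
    if (path == NULL)
        return -1;              /* passing NULL to fopen is undefined */
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        perror(path);           /* report why the open failed */
        return -1;
    }
    /* fgets returns NULL on end-of-file or error; buf is then unusable. */
    int rc = (fgets(buf, (int)size, fp) != NULL) ? 0 : -1;
    fclose(fp);
    return rc;
}
```

With checks like these the program reports the failure and stops cleanly instead of handing a NULL FILE* to fgets.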
Should I worry about handling the case where the user sends SIGINT while my program is running?
The program in question deals with heap allocations and frees, so I am worried that such a situation would cause a memory leak. When I send SIGINT while the program is running, Valgrind states:
==30173== Process terminating with default action of signal 2 (SIGINT)
==30173== at 0x4ACC142: read (read.c:26)
==30173== by 0x4A4ED1E: _IO_file_underflow@@GLIBC_2.2.5 (fileops.c:517)
==30173== by 0x4A41897: getdelim (iogetdelim.c:73)
==30173== by 0x109566: main (main.c:55)
==30173==
==30173== HEAP SUMMARY:
==30173== in use at exit: 1,000 bytes in 1 blocks
==30173== total heap usage: 3 allocs, 2 frees, 3,048 bytes allocated
==30173==
==30173== LEAK SUMMARY:
==30173== definitely lost: 0 bytes in 0 blocks
==30173== indirectly lost: 0 bytes in 0 blocks
==30173== possibly lost: 0 bytes in 0 blocks
==30173== still reachable: 1,000 bytes in 1 blocks
==30173== suppressed: 0 bytes in 0 blocks
==30173== Rerun with --leak-check=full to see details of leaked memory
==30173==
==30173== For lists of detected and suppressed errors, rerun with: -s
==30173== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
The answer is OS-dependent. Most modern operating systems will clean up memory allocated by your process once it is killed (Windows, Linux, *nix in general, and more). This is usually just part of the OS memory isolation and protection system, where each process gets its own virtual memory mapping and the physical pages corresponding to that mapping are allocated / freed by way of reference counting (a killed / exited process will decrement the reference counts to its mapped physical pages and free them if they reach zero).
If you plan on running your process on obscure embedded systems with no such guarantees with respect to memory management, then perhaps you might need to worry about such a thing. Otherwise, if memory management is your only concern, then it's a non-issue.
If you want to account for other things which should happen on exit (e.g. saving state), then you will certainly need to trap SIGINT, likely along with other signals as well.
I wrote a linked list in C today at work on a Linux machine and everything checked out in Valgrind. Then I ran the same test (a handful of pushes and then deleting the list) at home on OS X and got a crazy amount of allocs.
==4344== HEAP SUMMARY:
==4344== in use at exit: 26,262 bytes in 187 blocks
==4344== total heap usage: 267 allocs, 80 frees, 32,374 bytes allocated
==4344==
==4344== LEAK SUMMARY:
==4344== definitely lost: 0 bytes in 0 blocks
==4344== indirectly lost: 0 bytes in 0 blocks
==4344== possibly lost: 0 bytes in 0 blocks
==4344== still reachable: 0 bytes in 0 blocks
==4344== suppressed: 26,262 bytes in 187 blocks
==4344==
==4344== For counts of detected and suppressed errors, rerun with: -v
==4344== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I know the code is fine and doesn't have any leaks. So I just commented out the list test and compiled with only printf("test\n"); in the main, and it showed 263 allocs with 76 frees (I had 4 intentional allocs in the list test). Why am I getting so many allocs on OS X? Is this just something the OS did? I don't understand why I'd have 263 allocs when I just did a printf...
OS X has a very bad architecture in this respect. Because libdl, libdyld, libm, libc and some other libraries are "packed" into libSystem, all of them are initialized when the library is loaded. Most of the allocations come from dyld. dyld is written in C and C++, and the C++ part may push up the number of allocs.
This is an Apple thing only, not an OS X thing. I have written an alternative C library, and it does not have many of these unneeded allocs.
Allocs are also caused by opening FILE *s. Note that three streams (stdin, stdout and stderr) are initialized at startup.
Valgrind support on OS X is currently being actively worked on. Your best approach is to ensure you are using a SVN trunk build, and update frequently.
The errors Valgrind is reporting to you are present within the OS X system libraries. They are not the fault of your program, but because even simple programs use these system libraries, Valgrind continues to pick them up. Suppressions within Valgrind trunk are continually being updated to catch these issues, allowing you to focus on the real problems that may be present within your code.
The following commands will allow you to use Valgrind trunk, if you're not already:
svn co svn://svn.valgrind.org/valgrind/trunk valgrind
cd valgrind
./autogen.sh
./configure
make -j4
sudo make install
Full disclosure: I'm one of the Valgrind developers who contributed patches to support OS X 10.11
The Problem
I've written a PHP extension (PHP 5.3) which appears to work fine for simple tests, but the moment I start making multiple calls to it I start seeing the error:
zend_mm_heap corrupted
I normally see this in a console or the Apache error log. I also sometimes see the error
[Thu Jun 19 16:12:31.934289 2014] [:error] [pid 560] [client 127.0.0.1:35410] PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 139678164955264 bytes) in Unknown on line 0
What I've tried to do
I've tried to find the exact spot where the issue occurs, but it appears to occur after the destructor of my PHP class (the one that calls the extension) has been called and before the first line of the constructor runs. (Note: I have mainly used phpunit to diagnose this. If I run it in a browser, it usually works once and then logs the error on the next attempt, with 'The connection was reset' in my browser window and no output.)
I've tried adding debug lines with memory_get_usage and installing the memprof extension, but none of the output shows any serious memory issues, and I've never seen memory usage greater than 8 MB.
I've looked at other Stack Overflow posts about changing PHP settings to deal with the zend_mm_heap corrupted issue and about disabling/enabling garbage collection, without any success.
What I'm looking for
I realise that there is not enough information here to know what is causing what I presume to be a memory leak, so what I want to know is: what are the possible and probable causes of my issue, and how can I go about diagnosing it to find where the problem is?
Note:
I have tried building my extension with --enable-debug, but it is reported as an unrecognised argument.
Edit: Valgrind
I have run over it with valgrind and got the following output:
--24803-- REDIR: 0x4ebde30 (__GI_strncmp) redirected to 0x4c2dd20 (__GI_strncmp)
--24803-- REDIR: 0x4ec1820 (__GI_stpcpy) redirected to 0x4c2f860 (__GI_stpcpy)
Segmentation fault (core dumped)
==24803==
==24803== HEAP SUMMARY:
==24803== in use at exit: 2,401 bytes in 72 blocks
==24803== total heap usage: 73 allocs, 1 frees, 2,417 bytes allocated
==24803==
==24803== Searching for pointers to 72 not-freed blocks
==24803== Checked 92,624 bytes
==24803==
==24803== LEAK SUMMARY:
==24803== definitely lost: 0 bytes in 0 blocks
==24803== indirectly lost: 0 bytes in 0 blocks
==24803== possibly lost: 0 bytes in 0 blocks
==24803== still reachable: 2,401 bytes in 72 blocks
==24803== suppressed: 0 bytes in 0 blocks
==24803== Reachable blocks (those to which a pointer was found) are not shown.
==24803== To see them, rerun with: --leak-check=full --show-reachable=yes
==24803==
==24803== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
--24803--
--24803-- used_suppression: 2 dl-hack3-cond-1
==24803==
==24803== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
This suggests to me that perhaps the issue isn't a memory leak, but I am not certain of this.
It appears to me that your program has heap memory corruption. This is a bit difficult to find by looking at a code snippet or the faulting call stack. You may want to run your program under a dynamic analysis tool (Valgrind, WinDbg/PageHeap) to track down the actual source of the error.
$ valgrind --tool=memcheck --db-attach=yes ./a.out
This way Valgrind attaches your program to the debugger (GDB) when the first memory error is detected, so that you can do live debugging. This should be the best way to understand and resolve your problem. (The --db-attach option was removed in Valgrind 3.11; on newer releases, --vgdb-error together with GDB serves the same purpose.)
Allowed memory size of 134217728 bytes exhausted (tried to allocate 139678164955264 bytes) in Unknown on line 0
It looks like a signed-to-unsigned conversion is happening somewhere in your program. Allocators normally take a size parameter of unsigned type, so they interpret a negative value as a very large size, and in that case the allocation fails.
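The conversion this answer describes can be illustrated with a small sketch. Whether it is what actually happens in the extension is not known from the question; the function names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* What an allocator taking size_t sees when handed a signed length.
 * The conversion is well-defined: the value is reduced modulo
 * SIZE_MAX + 1, so -1 becomes SIZE_MAX. */
size_t as_alloc_size(long len) {
    return (size_t)len;
}

/* Guarded variant: reject negative lengths before they reach the
 * allocator. Returns 0 and stores the size on success, -1 otherwise. */
int checked_alloc_size(long len, size_t *out) {
    if (len < 0)
        return -1;
    *out = (size_t)len;
    return 0;
}
```

Validating a length while it is still signed, as in checked_alloc_size, is usually easier than trying to recognise an absurdly large size afterwards.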
I have created 20 threads to read/write a shared file. I have synchronized the threads.
Now my program works fine, but when I run it with Valgrind it gives me errors like this:
LEAK SUMMARY:
definitely lost: 0 bytes in 0 blocks.
possibly lost: 624 bytes in 5 blocks.
still reachable: 1,424 bytes in 5 blocks.
suppressed: 0 bytes in 0 blocks.
Reachable blocks (those to which a pointer was found) are not shown.
Also, when I press Ctrl+C, it gives the same errors.
I have not even malloc'ed anything, but Valgrind still complains.
Any suggestions would be appreciated.
You can run valgrind --leak-check=full ./prog_name to make sure these reachable blocks are not something you can destroy in your program. Many times initializing a library such as libcurl without closing or destroying it will cause leaks. If it's not something you have control over, you can write a suppression file. http://valgrind.org/docs/manual/mc-manual.html section 4.4 has some info and a link to some examples
Still reachable blocks are probably caused by your standard library not freeing memory used in pools for standard containers (see this FAQ). That is a performance optimisation for program exit, since the memory is immediately going to be returned to the operating system anyway.
"Possibly lost" blocks are probably caused by the same thing.
The Valgrind Manual page for memcheck has a good explanation about the different sorts of leaks detected.
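As a rough illustration of two of those categories, the sketch below produces one "still reachable" block and one "definitely lost" block when run under Memcheck. The function names are hypothetical, and the program should be built without optimization (e.g. -O0), since a compiler may otherwise remove the unused allocation.

```c
#include <stdlib.h>

static char *cache;            /* file-scope pointer, still set at exit */

/* "Still reachable": the block is never freed, but a pointer to it
 * (the global `cache`) still exists when the process ends. */
int make_still_reachable(void) {
    cache = malloc(64);
    return cache != NULL;
}

/* "Definitely lost": the only pointer to the block is overwritten,
 * so nothing can free it any more. */
int make_definitely_lost(void) {
    char *p = malloc(64);
    int ok = (p != NULL);
    p = NULL;                  /* last reference dropped without free() */
    return ok;
}
```

Blocks allocated and kept by a library for the lifetime of the process fall into the first category, which is why they are generally harmless.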