Flex yy_scan_string memory leaks - c

Description
This is an example created to introduce issue from larger solution. I have to use flex and yy_scan_string(). I have an issue with memory leaks in flex (code below). In this example memory leaks are marked as "still reachable", but in the original solution they get marked as "lost memory".
I think this problem is somewhere in memory allocated by flex internally, which I don't know how to free properly and I can't find any tutorial / documentation for that issue.
file.lex
%option noyywrap
%{
#include <stdio.h>
%}
%%
. printf("%s\n", yytext);
%%
int main() {
printf("Start\n");
yy_scan_string("ABC");
yylex();
printf("Stop\n");
return 0;
}
bash$ flex file.lex
bash$ gcc lex.yy.c
bash$ ./a.out
Start
H
a
l
l
o
W
o
r
l
d
Stop
bash$ valgrind --leak-check=full --show-leak-kinds=all ./a.out
==6351== Memcheck, a memory error detector
==6351== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==6351== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==6351== Command: ./a.out
==6351==
==6351== error calling PR_SET_PTRACER, vgdb might block
Start
A
B
C
Stop
==6351==
==6351== HEAP SUMMARY:
==6351== in use at exit: 77 bytes in 3 blocks
==6351== total heap usage: 4 allocs, 1 frees, 589 bytes allocated
==6351==
==6351== 5 bytes in 1 blocks are still reachable in loss record 1 of 3
==6351== at 0x483577F: malloc (vg_replace_malloc.c:299)
==6351== by 0x10AE84: yyalloc (in ./a.out)
==6351== by 0x10ABE5: yy_scan_bytes (in ./a.out)
==6351== by 0x10ABBC: yy_scan_string (in ./a.out)
==6351== by 0x10AEE2: main (in /home/./a.out)
==6351==
==6351== 8 bytes in 1 blocks are still reachable in loss record 2 of 3
==6351== at 0x483577F: malloc (vg_replace_malloc.c:299)
==6351== by 0x10AE84: yyalloc (in ./a.out)
==6351== by 0x10A991: yyensure_buffer_stack (in ./a.out)
==6351== by 0x10A38D: yy_switch_to_buffer (in ./a.out)
==6351== by 0x10AB8E: yy_scan_buffer (in ./a.out)
==6351== by 0x10AC68: yy_scan_bytes (in ./a.out)
==6351== by 0x10ABBC: yy_scan_string (in ./a.out)
==6351== by 0x10AEE2: main (in ./a.out)
==6351==
==6351== 64 bytes in 1 blocks are still reachable in loss record 3 of 3
==6351== at 0x483577F: malloc (vg_replace_malloc.c:299)
==6351== by 0x10AE84: yyalloc (in ./a.out)
==6351== by 0x10AAEF: yy_scan_buffer (in ./a.out)
==6351== by 0x10AC68: yy_scan_bytes (in ./a.out)
==6351== by 0x10ABBC: yy_scan_string (in ./a.out)
==6351== by 0x10AEE2: main (in ./a.out)
==6351==
==6351== LEAK SUMMARY:
==6351== definitely lost: 0 bytes in 0 blocks
==6351== indirectly lost: 0 bytes in 0 blocks
==6351== possibly lost: 0 bytes in 0 blocks
==6351== still reachable: 77 bytes in 3 blocks
==6351== suppressed: 0 bytes in 0 blocks
==6351==
==6351== For counts of detected and suppressed errors, rerun with: -v
==6351== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

The answer is mentioned in the flex documentation, but you have to read carefully to find it. (See below.)
If you generate a reentrant scanner, then you are responsible for creating and destroying each scanner object you require, and the scanner object manages all of the memory required by a scanner instance. But even if you don't use the reentrant interface, you can use yylex_destroy to manage memory. In the traditional non-reentrant interface, then there is no scanner_t argument, so the prototype of yylex_destroy is simply
int yylex_destroy(void);
(Although it has a return value which is supposed to be a status code, it never returns an error.)
You can, if you want to, call yylex_init, which also takes no arguments in the non-reentrant interface, but unlike the reentrant interface, it is not necessary to call it.
From the manual chapter on memory management:
Flex allocates dynamic memory during initialization, and once in a while from within a call to yylex(). Initialization takes place during the first call to yylex(). Thereafter, flex may reallocate more memory if it needs to enlarge a buffer. As of version 2.5.9 Flex will clean up all memory when you call yylex_destroy See faq-memory-leak.
Example:
$ cat clean.l
%option noinput nounput noyywrap nodefault
%{
#include <stdio.h>
%}
%%
.|\n ECHO;
%%
int main() {
printf("Start\n");
yy_scan_string("ABC");
yylex();
printf("Stop\n");
yylex_destroy();
return 0;
}
$ flex -o clean.c clean.l
$ gcc -Wall -o clean clean.c
$ valgrind --leak-check=full --show-leak-kinds=all ./clean
==16187== Memcheck, a memory error detector
==16187== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16187== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==16187== Command: ./clean
==16187==
Start
ABCStop
==16187==
==16187== HEAP SUMMARY:
==16187== in use at exit: 0 bytes in 0 blocks
==16187== total heap usage: 4 allocs, 4 frees, 1,101 bytes allocated
==16187==
==16187== All heap blocks were freed -- no leaks are possible
==16187==
==16187== For counts of detected and suppressed errors, rerun with: -v
==16187== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Related

C variable initialization valgrind complains

I have a simple question here. I have some variable declarations as follows:
char long_name_VARA[]="TEST -- Gridded 450m daily Evapotranspiration (ET)";
int16 fill_PET_8day=32767;
Given the above valgrind complains for the char declaration as follows:
Invalid write of size 8
==21902== at 0x408166: main (main.c:253)
==21902== Location 0x7fe677840 is 0 bytes inside long_name_VARA[0]
and for the int16 declaration as follows:
==21902== Invalid write of size 2
==21902== at 0x408178: main (main.c:226)
Location 0x7fe677420 is 0 bytes inside local var "fill_PET_8day"
What am i doing wrong in my declarations here?
Also can I not declare a char array like this:
char temp_year[5]={0}
The warning messages you quoted show invalid memory accesses, which happen to hit memory areas belonging to the above two variables. The variables in question are victims of the error, not perpetrators. The variables are not to blame here. Nothing is wrong with the above declarations. Most likely these declarations are not in any way relevant here.
The perpetrators are lines at main.c:253 and main.c:226, which you haven't quoted yet. That's where your problem occurs.
A wild guess would be that you have another object declared after fill_PET_8day (an array?). When working with that other object, you overrun its memory boundary by ~10 bytes, thus clobbering fill_PET_8day and first 8 bytes of long_name_VARA. This is what valgrind is warning you about.
As mentioned, the declarations aren't the issue.
Although FYI, it may be better to declare your constant string as const...
const char* long_name_VARA = "TEST -- Gridded 450m daily Evapotranspiration (ET)";
or even...
const char* const long_name_VARA = "TEST -- Gridded 450m daily Evapotranspiration (ET)";
This prevents the string from being modified in code, (and the pointer).
When I run the following
/* test.c */
#include <stdio.h>
int main(void)
{
char s[8000000];
int x;
s[0] = '\0';
x=5;
printf("%s %d\n",s,x);
return 0;
}
with Valgrind I get
$ gcc test.c
$ valgrind ./a.out
==1828== Memcheck, a memory error detector
==1828== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==1828== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==1828== Command: ./a.out
==1828==
==1828== Warning: client switching stacks? SP change: 0xfff0002e0 --> 0xffe85f0d0
==1828== to suppress, use: --max-stackframe=8000016 or greater
==1828== Invalid write of size 1
==1828== at 0x400541: main (in /home/m/a.out)
==1828== Address 0xffe85f0d0 is on thread 1's stack
==1828== in frame #0, created by main (???)
==1828==
==1828== Invalid write of size 8
==1828== at 0x400566: main (in /home/m/a.out)
==1828== Address 0xffe85f0c8 is on thread 1's stack
==1828== in frame #0, created by main (???)
==1828==
==1828== Invalid read of size 1
==1828== at 0x4E81ED3: vfprintf (vfprintf.c:1642)
==1828== by 0x4E88038: printf (printf.c:33)
==1828== by 0x40056A: main (in /home/m/a.out)
==1828== Address 0xffe85f0d0 is on thread 1's stack
==1828== in frame #2, created by main (???)
==1828==
5
==1828== Invalid read of size 8
==1828== at 0x4E88040: printf (printf.c:37)
==1828== by 0x40056A: main (in /home/m/a.out)
==1828== Address 0xffe85f0c8 is on thread 1's stack
==1828== in frame #0, created by printf (printf.c:28)
==1828==
==1828== Warning: client switching stacks? SP change: 0xffe85f0d0 --> 0xfff0002e0
==1828== to suppress, use: --max-stackframe=8000016 or greater
==1828==
==1828== HEAP SUMMARY:
==1828== in use at exit: 0 bytes in 0 blocks
==1828== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==1828==
==1828== All heap blocks were freed -- no leaks are possible
==1828==
==1828== For counts of detected and suppressed errors, rerun with: -v
==1828== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)
If I give the option Valgrind warns with I get
$ valgrind --max-stackframe=10000000 ./a.out
==1845== Memcheck, a memory error detector
==1845== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==1845== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==1845== Command: ./a.out
==1845==
5
==1845==
==1845== HEAP SUMMARY:
==1845== in use at exit: 0 bytes in 0 blocks
==1845== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==1845==
==1845== All heap blocks were freed -- no leaks are possible
==1845==
==1845== For counts of detected and suppressed errors, rerun with: -v
==1845== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
So the "Invalid reads/writes" are due to the large stack variable without the proper --max-stackframe=... option.

Custom allocator: Valgrind shows 7 allocs, 0 frees, no leaks

I am working on a clone of the malloc (3) functions (malloc, realloc and free for now).
I would like to add support for Valgrind. I'm using these docs. However, after adding calls to the VALGRIND_MEMPOOL_FREE, VALGRIND_MEMPOOL_ALLOC and VALGRIND_CREATE_MEMPOOL macros, I get the following from Valgrind:
==22303== HEAP SUMMARY:
==22303== in use at exit: 0 bytes in 0 blocks
==22303== total heap usage: 7 allocs, 0 frees, 2,039 bytes allocated
==22303==
==22303== All heap blocks were freed -- no leaks are possible
This is despite my realloc calling VALGRIND_MEMPOOL_FREE and my free calling VALGRIND_MEMPOOL_FREE.
What could be the cause of this ?
What could be the cause of this ?
This is due to a bug in valgrind. See the link to the valgrind bug tracker in my comment to your answer.
From the other link in my comment:
A cursory search through the source code indicates that MEMPOOL_ALLOC
calls new_block, which increments cmalloc_n_mallocs, but there is no
corresponding change to cmalloc_n_frees in MEMPOOL_FREE.
/* valgrind.c */
#include <valgrind/valgrind.h>
int main(int argc, char *argv[]) {
char pool[100];
VALGRIND_CREATE_MEMPOOL(pool, 0, 0);
VALGRIND_MEMPOOL_ALLOC(pool, pool, 8);
VALGRIND_MEMPOOL_FREE(pool, pool);
VALGRIND_DESTROY_MEMPOOL(pool);
return 0;
}
$ gcc valgrind.c -g
$ valgrind --leak-check=full --show-leak-kinds=all ./a.out
==10186== Memcheck, a memory error detector
==10186== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==10186== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==10186== Command: ./a.out
==10186==
==10186==
==10186== HEAP SUMMARY:
==10186== in use at exit: 0 bytes in 0 blocks
==10186== total heap usage: 1 allocs, 0 frees, 8 bytes allocated
==10186==
==10186== All heap blocks were freed -- no leaks are possible
==10186==
==10186== For counts of detected and suppressed errors, rerun with: -v
==10186== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
$
Taken from here: http://valgrind.10908.n7.nabble.com/VALGRIND-MEMPOOL-FREE-not-reflected-in-heap-summary-td42789.html
A cursory search through the source code indicates that MEMPOOL_ALLOC
calls new_block, which increments cmalloc_n_mallocs, but there is no
corresponding change to cmalloc_n_frees in MEMPOOL_FREE. Here's a
patch that increments it at the very end of MEMPOOL_FREE. This gives
me the behavior I expect.

Should I free the pointer returned by setlocale?

int main(int argc, char *argv[])
{
char *ret = setlocale(LC_ALL, NULL);
// should I free 'ret' ???
// free(ret);
return 0;
}
I've tried both on Linux and OS X 10.10, on Linux, I must not call 'free', but on OS X, if I do not call 'free', valgrind complains a memory leak.
==62032== Memcheck, a memory error detector
==62032== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==62032== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright info
==62032== Command: ./a.out
==62032==
--62032-- ./a.out:
--62032-- dSYM directory is missing; consider using --dsymutil=yes
==62032==
==62032== HEAP SUMMARY:
==62032== in use at exit: 129,789 bytes in 436 blocks
==62032== total heap usage: 519 allocs, 83 frees, 147,421 bytes allocated
==62032==
==62032== 231 bytes in 1 blocks are definitely lost in loss record 63 of 91
==62032== at 0x10000859B: malloc (in /usr/local/Cellar/valgrind/HEAD/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
==62032== by 0x1001E68C8: currentlocale (in /usr/lib/system/libsystem_c.dylib)
==62032== by 0x100000F6B: main (in ./a.out)
==62032==
==62032== LEAK SUMMARY:
==62032== definitely lost: 231 bytes in 1 blocks
==62032== indirectly lost: 0 bytes in 0 blocks
==62032== possibly lost: 0 bytes in 0 blocks
==62032== still reachable: 94,869 bytes in 10 blocks
==62032== suppressed: 34,689 bytes in 425 blocks
==62032== Reachable blocks (those to which a pointer was found) are not shown.
==62032== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==62032==
==62032== For counts of detected and suppressed errors, rerun with: -v
==62032== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 17 from 17)
So,
in Linux, if I call 'free', it will crash.
in OS X, if I do not call 'free', it has a memory leak.
You should not free the string that you get. According to the C11 standard:
7.11.1.1 The setlocale function
The pointer to string returned by the
setlocale
function is such that a subsequent call
with that string value and its associated category will restore that part of the program’s
locale. The string pointed to shall not be modified by the program, but may be
overwritten by a subsequent call to the
setlocale
function
Additionally, the Linux man pages say:
This string may be allocated in
static storage.
which would result in your program crashing if you tried to free it.
It looks like the Linux implementation uses static storage, but the OSX one uses malloc. Regardless of what's going on under the hood, you should not modify it because the standard disallows you from doing so --- the fact that it would be safe on OSX is an implementation quirk you should ignore. Valgrind is essentially giving you a false positive here.

Valgrind Reports Invalid Realloc

I'm trying to backfill my knowledge of C memory management. I've come from a mostly scripting and managed background, and I want to learn more about C and C++. To that end I've been reading a few books, including one which included this example of using realloc to trim a string of whitespace:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* trim(char* phrase)
{
char* old = phrase;
char* new = phrase;
while(*old == ' ') {
old++;
}
while(*old) {
*(new++) = *(old++);
}
*new = 0;
return (char*)realloc(phrase, strlen(phrase)+1);
}
int main ()
{
char* buffer = (char*)malloc(strlen(" cat")+1);
strcpy(buffer, " cat");
printf("%s\n", trim(buffer));
free(buffer);
buffer=NULL;
return 0;
}
I dutifully copied the example, and compiled with c99 -Wall -Wpointer-arith -O3 -pedantic -march=native. I don't get any compile errors, and the app runs and does what's promised in the book, but when I run it against valgrind I get an error about invalid realloc.
==21601== Memcheck, a memory error detector
==21601== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==21601== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==21601== Command: ./trim
==21601==
==21601== Invalid free() / delete / delete[] / realloc()
==21601== at 0x402B3D8: free (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==21601== by 0x804844E: main (in /home/mo/programming/learning_pointers/trim)
==21601== Address 0x4202028 is 0 bytes inside a block of size 6 free'd
==21601== at 0x402C324: realloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==21601== by 0x80485A9: trim (in /home/mo/programming/learning_pointers/trim)
==21601== by 0x804842E: main (in /home/mo/programming/learning_pointers/trim)
==21601==
==21601==
==21601== HEAP SUMMARY:
==21601== in use at exit: 4 bytes in 1 blocks
==21601== total heap usage: 2 allocs, 2 frees, 10 bytes allocated
==21601==
==21601== 4 bytes in 1 blocks are definitely lost in loss record 1 of 1
==21601== at 0x402C324: realloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==21601== by 0x80485A9: trim (in /home/mo/programming/learning_pointers/trim)
==21601== by 0x804842E: main (in /home/mo/programming/learning_pointers/trim)
==21601==
==21601== LEAK SUMMARY:
==21601== definitely lost: 4 bytes in 1 blocks
==21601== indirectly lost: 0 bytes in 0 blocks
==21601== possibly lost: 0 bytes in 0 blocks
==21601== still reachable: 0 bytes in 0 blocks
==21601== suppressed: 0 bytes in 0 blocks
==21601==
==21601== For counts of detected and suppressed errors, rerun with: -v
==21601== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
So please help me understand why it's consider an invalid realloc. Is the example crap? Is there something I'm missing? I know that according to the specs, realloc expects the pointer to have been created previously by malloc, so is it because the realloc is in another function? Or is valgrind confused because they're in separate functions? I'm not a complete idiot (most days), but right now I kind of feel like one for not seeing the issue.
Thanks in advance!
You're trying to free the original pointer, not the reallocd one. You can fix it by:
buffer = trim(buffer)

Don't see line numbers using Valgrind inside Ubuntu (Vagrant+Virtualbox)

I am currently reading and fallowing along the "Learn C The Hard Way"-book. On exercise 4 I have to install Valgrind. I first did this locally on my Macbook running Maverick, but I received a warning that Valgrind might not work 100%.
So now I tried it with Vagrant (using VirtualBox) with an Ubuntu 12.04 box. You can check out the exact Vagrantfile (setup) and the exercise files here on my github repo.
The problem:
I don't see the line numbers and instead I get something like 0x40052B.
I compiled the files by doing the fallowing:
$ make clean # Just to be sure
$ make
$ valgrind --track-origins=yes ./ex4
I pasted the result to pastebin here.
I found the fallowing 3 questions on SO that (partly) describes the same problem, but the answer's and there solutions didn't work for me:
Valgrind not showing line numbers in spite of -g flag (on Ubuntu 11.10/VirtualBox)
How do you get Valgrind to show line errors?
Valgrind does not show line-numbers
What I have tried sofar:
Added libc6-dbg
installed gcc and tried compiling with that instead of cc.
added --track-origins=yes to the valgrind-command
Added (and later removed) compiling with -static and -oO flags
So I am not sure where to go from here? I could try and install the latest (instead of v3.7) off gcc manually although that looked rather difficult.
edit:
#abligh answer seems to be right. I made this with kaleidoscope:
On the left side you see the result of: valgrind --track-origins=yes ./ex4 and on the right side the result of valgrind ./ex4.
I guess I still need to learn allot about c and it's tools.
I think you are just missing them in the output.
Here is some of your output (copied from Pastebin):
==16314== by 0x40052B: main (ex4.c:9)
^^--- LINE NUMBER
==16314== Uninitialised value was created by a stack allocation
==16314== at 0x4004F4: main (ex4.c:4)
^^--- LINE NUMBER
Though I think your invocation is wrong to check memory leaks. I wrote a very simple program to leak one item:
#include <stdio.h>
#include <stdlib.h>
int
main (int argc, char **argv)
{
void *leak;
leak = malloc (1);
printf ("Leaked %p\n", leak);
exit (0);
}
and compiled it using your Makefile:
gcc -Wall -g test.c -o test
Running your command:
$ valgrind --track-origins=yes ./test
==26506== Memcheck, a memory error detector
==26506== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==26506== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==26506== Command: ./test
==26506==
Leaked 0x51f2040
==26506==
==26506== HEAP SUMMARY:
==26506== in use at exit: 1 bytes in 1 blocks
==26506== total heap usage: 1 allocs, 0 frees, 1 bytes allocated
==26506==
==26506== LEAK SUMMARY:
==26506== definitely lost: 0 bytes in 0 blocks
==26506== indirectly lost: 0 bytes in 0 blocks
==26506== possibly lost: 0 bytes in 0 blocks
==26506== still reachable: 1 bytes in 1 blocks
==26506== suppressed: 0 bytes in 0 blocks
==26506== Rerun with --leak-check=full to see details of leaked memory
==26506==
==26506== For counts of detected and suppressed errors, rerun with: -v
==26506== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
But invoking the way I normally invoke it:
$ valgrind --leak-check=full --show-reachable=yes ./test
==26524== Memcheck, a memory error detector
==26524== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==26524== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==26524== Command: ./test
==26524==
Leaked 0x51f2040
==26524==
==26524== HEAP SUMMARY:
==26524== in use at exit: 1 bytes in 1 blocks
==26524== total heap usage: 1 allocs, 0 frees, 1 bytes allocated
==26524==
==26524== 1 bytes in 1 blocks are still reachable in loss record 1 of 1
==26524== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==26524== by 0x40059C: main (test.c:8)
==26524==
==26524== LEAK SUMMARY:
==26524== definitely lost: 0 bytes in 0 blocks
==26524== indirectly lost: 0 bytes in 0 blocks
==26524== possibly lost: 0 bytes in 0 blocks
==26524== still reachable: 1 bytes in 1 blocks
==26524== suppressed: 0 bytes in 0 blocks
==26524==
==26524== For counts of detected and suppressed errors, rerun with: -v
==26524== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
Note the line numbers are again in brackets, e.g.
==26524== by 0x40059C: main (test.c:8)
^^^^^^^^ <- FILENAME AND LINE NUMBER

Resources