Determining the proper predefined array size in C?

In the following code I have the array size set to 20. In Valgrind the code tests clean. But as soon as I change the size to 30, it gives me errors (shown further below). The part that confuses me is that I can change the value to 40 and the errors go away. Change it to 50, errors again. Then 60 tests clean, and so on. So I was hoping someone might be able to explain this to me, because it's not becoming clear to me despite my best efforts to wrap my head around it. These errors were hard to pinpoint because the code by all appearances was valid.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct record {
int number;
char text[30];
};
int main(int argc, char *argv[])
{
FILE *file = fopen("testfile.bin", "w+");
if (ferror(file)) {
printf("%d: Failed to open file.", ferror(file));
}
struct record rec = { 69, "Some testing" };
fwrite(&rec, sizeof(struct record), 1, file);
if (ferror(file)) {
fprintf(stdout,"Error writing file.");
}
fflush(file);
fclose(file);
}
Valgrind errors:
valgrind --leak-check=full --show-leak-kinds=all\
--track-origins=yes ./fileio
==6675== Memcheck, a memory error detector
==6675== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==6675== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==6675== Command: ./fileio
==6675==
==6675== Syscall param write(buf) points to uninitialised byte(s)
==6675== at 0x496A818: write (in /usr/lib/libc-2.28.so)
==6675== by 0x48FA85C: _IO_file_write@@GLIBC_2.2.5 (in /usr/lib/libc-2.28.so)
==6675== by 0x48F9BBE: new_do_write (in /usr/lib/libc-2.28.so)
==6675== by 0x48FB9D8: _IO_do_write@@GLIBC_2.2.5 (in /usr/lib/libc-2.28.so)
==6675== by 0x48F9A67: _IO_file_sync@@GLIBC_2.2.5 (in /usr/lib/libc-2.28.so)
==6675== by 0x48EEDB0: fflush (in /usr/lib/libc-2.28.so)
==6675== by 0x109288: main (fileio.c:24)
==6675== Address 0x4a452d2 is 34 bytes inside a block of size 4,096 alloc'd
==6675== at 0x483777F: malloc (vg_replace_malloc.c:299)
==6675== by 0x48EE790: _IO_file_doallocate (in /usr/lib/libc-2.28.so)
==6675== by 0x48FCBBF: _IO_doallocbuf (in /usr/lib/libc-2.28.so)
==6675== by 0x48FBE47: _IO_file_overflow@@GLIBC_2.2.5 (in /usr/lib/libc-2.28.so)
==6675== by 0x48FAF36: _IO_file_xsputn@@GLIBC_2.2.5 (in /usr/lib/libc-2.28.so)
==6675== by 0x48EFBFB: fwrite (in /usr/lib/libc-2.28.so)
==6675== by 0x10924C: main (fileio.c:19)
==6675== Uninitialised value was created by a stack allocation
==6675== at 0x109199: main (fileio.c:11)
==6675==
==6675==
==6675== HEAP SUMMARY:
==6675== in use at exit: 0 bytes in 0 blocks
==6675== total heap usage: 2 allocs, 2 frees, 4,648 bytes allocated
==6675==
==6675== All heap blocks were freed -- no leaks are possible
==6675==
==6675== For counts of detected and suppressed errors, rerun with: -v
==6675== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

The problem is that there is padding in the structure so that the int member is always aligned to 4 bytes in memory, even within an array of struct records. Now, 20+4 is divisible by 4, and so are 40+4 and 60+4. But 30+4 and 50+4 are not. Hence 2 padding bytes need to be added to make sizeof (struct record) divisible by 4.
When you're running the code with array size 30, the members occupy 34 bytes but sizeof (struct record) == 36, and the last two bytes (offsets 34 and 35, matching Valgrind's "34 bytes inside a block") contain indeterminate values, even if the struct record is otherwise fully initialized. What is worse, code that writes indeterminate values can leak sensitive information, the Heartbleed bug being a prime example.
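To see the padding directly, here is a small sketch that compares the two layouts. The struct names record20 and record30, the TRAILING_PADDING macro and print_record_layouts are made up for this illustration, and the numbers assume a typical ABI where int is 4 bytes with 4-byte alignment (there I would expect 24 bytes with no padding for text[20], and 36 bytes with 2 padding bytes for text[30]).

```c
#include <stddef.h>
#include <stdio.h>

/* Same shape as the question's struct, with the two array sizes compared. */
struct record20 { int number; char text[20]; };
struct record30 { int number; char text[30]; };

/* Bytes between the end of the last member and the end of the struct. */
#define TRAILING_PADDING(type) \
    (sizeof(type) - (offsetof(type, text) + sizeof(((type *)0)->text)))

/* Prints the total size and trailing padding of both variants. */
void print_record_layouts(void)
{
    printf("text[20]: sizeof = %zu, trailing padding = %zu\n",
           sizeof(struct record20), TRAILING_PADDING(struct record20));
    printf("text[30]: sizeof = %zu, trailing padding = %zu\n",
           sizeof(struct record30), TRAILING_PADDING(struct record30));
}
```

Calling print_record_layouts() from main shows which array sizes leave trailing padding on your platform.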
The solution is actually to not write the structure using fwrite. Instead write the members individually - this improves portability too. There isn't much performance difference either, as fwrite buffers the writes and so does fread.
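A minimal sketch of writing the members individually could look like the following. The helper name write_record is an assumption for illustration, and the int is still written in the machine's native byte order, so this removes the padding problem but is not by itself a fully portable file format.

```c
#include <stdio.h>

struct record {
    int number;
    char text[30];
};

/* Writes each member separately so padding bytes never reach the file.
   Returns 0 on success, -1 if either write was short. */
int write_record(FILE *file, const struct record *rec)
{
    if (fwrite(&rec->number, sizeof rec->number, 1, file) != 1)
        return -1;
    if (fwrite(rec->text, 1, sizeof rec->text, file) != sizeof rec->text)
        return -1;
    return 0;
}
```

With the question's initializer { 69, "Some testing" } the unused tail of text is zero-filled by the initialization rules, so every byte handed to fwrite is defined.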
P.S. the road to hell is paved with packed structs; you want to avoid them like the plague in generic code.
P.P.S. ferror(file) will almost certainly never be true just after fopen - and in normal failures fopen will return NULL and ferror(NULL) will probably lead to a crash.

[edit]
My answer relates to a weakness in OP's code, yet the Valgrind write(buf) points to uninitialized byte(s) is due to other reasons answered by others.
When the open fails, ferror(file) is undefined behavior (UB).
if (ferror(file)) is not the right test for determining open success.
FILE *file = fopen("testfile.bin", "w+");
// if (ferror(file)) {
// printf("%d: Failed to open file.", ferror(file));
// }
if (file == NULL) {
printf("Failed to open file.");
return -1; // exit code, do not continue
}
I do not see other obvious errors.
ferror(file) is useful to test the result of I/O, not of opening a file.
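To show the two checks side by side, here is a sketch of a helper that tests the pointer after fopen and the error flag after the write. The name save_bytes and its return convention are assumptions made for this example, not part of the question's code.

```c
#include <stdio.h>

/* Hypothetical helper: writes n bytes to path.
   Returns 0 on success, -1 on any failure. */
int save_bytes(const char *path, const void *data, size_t n)
{
    FILE *file = fopen(path, "wb");
    if (file == NULL) {          /* opening failed: test the pointer */
        perror(path);
        return -1;
    }

    fwrite(data, 1, n, file);
    int failed = ferror(file);   /* I/O failed: test the stream's error flag */

    if (fclose(file) != 0)       /* fclose flushes, so it can fail too */
        failed = 1;

    return failed ? -1 : 0;
}
```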

I initially misinterpreted the valgrind output, so @chux's answer deserves acceptance. I'll try to put together the best answer I can though.
Checking errors
The first error (the one I didn't immediately consider) is to check the value returned by fopen(3) with ferror(3). The fopen(3) call returns NULL on error (and sets errno), so checking NULL with ferror(3) is wrong.
Serializing a structure on a file.
With the initialization you write all the fields of your structure, but you don't initialize all the memory it covers. Your compiler might for example leave some padding in the structure, in order to get better performance while accessing data. As you write the whole structure on the file, you are actually passing non-initialized data to the fwrite(3) function.
By changing the size of the array you change Valgrind's behaviour. Probably this is due to the fact that the compiler changes the layout of the structure in memory, and it uses a different padding.
Try wiping the rec variable with memset(&rec, 0, sizeof(rec)); and Valgrind should stop complaining. This will only fix the symptom though: since you are serializing binary data, you should mark struct record with __attribute__((packed)).
Initializing memory
Your original initialization is good.
An alternative way of initializing data is to use strncpy(3). Strncpy will accept as parameters a pointer to the destination to write, a pointer to the source memory chunk (where data should be taken from) and the available write size.
By using strncpy(rec.text, "hello world", sizeof(rec.text)) you write "hello world" over the rec.text buffer. (Pass rec.text rather than &rec.text: the latter has type char (*)[30], not char *.) But you should pay attention to the termination of the string: strncpy won't write beyond the given size, and if the source string does not fit together with its terminator, there won't be any string terminator.
Strncpy can be used safely as follows
strncpy(rec.text, "hello world", sizeof(rec.text) - 1);
rec.text[sizeof(rec.text) - 1] = '\0';
The first line copies "hello world" to the target string. sizeof(rec.text) - 1 is passed as size, so that we leave room for the \0 terminator, which is written explicitly as last character to cover the case in which rec.text is too short to hold "hello world".
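The same pattern can be wrapped in a small function. This is only a sketch, and copy_text is a name invented here. One property that matters for the Valgrind report: when the source is shorter than the size given, strncpy pads the remainder with null bytes, so the whole destination array ends up initialized.

```c
#include <string.h>

/* Copies src into dst, truncating if necessary; dst is always terminated.
   dst_size is the full size of the destination array and must be > 0. */
void copy_text(char *dst, size_t dst_size, const char *src)
{
    strncpy(dst, src, dst_size - 1);  /* also zero-fills any unused tail */
    dst[dst_size - 1] = '\0';
}
```

For the question's struct the call would be copy_text(rec.text, sizeof rec.text, "hello world").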
Nitpicks
Finally, error notifications should go to stderr, while stdout is for results.

Related

How to use the "$$" value returned by a PackCC parser?

This is a minimal PackCC grammar example.
I try to retrieve and print the $$ value after parsing. The word is matched but only garbage is displayed by the printf call.
%value "char*"
word <- < [a-z]+[\n]* > {$$ = $1;}
%%
int main(void)
{
char* val = "Value";
// Create a file to parse.
FILE* f = freopen("text.txt", "w", stdin);
if(f != NULL) {
// Write the text to parse.
fprintf(f, "example\n");
// Set the file in read mode.
f = freopen("text.txt", "r", stdin);
pcc_context_t *ctx = pcc_create(NULL);
// I expect val to receive the "$$" value from the parse.
while(pcc_parse(ctx, &val));
printf("val: %s\n",val);
pcc_destroy(ctx);
fclose(f);
}
else {
puts("File is NULL");
}
return 0;
}
The PackCC doc says that $$ is:
The output variable, to which the result of the rule is stored.
And it says that the pcc_parse function:
Parses an input text (from standard input by default) and returns the result in ret. The ret can be NULL if no output data is needed. This function returns 0 if no text is left to be parsed, or a non-0 value otherwise.
There is no problem with your use of $$, in the sense that the char * value stored in $$ by the word action is faithfully returned into val.
The problem is that the char* value is a pointer to dynamically-allocated memory, and by the time the parser returns, that dynamically-allocated memory has already been freed. So the pointer returned into val is a dangling pointer, and by the time printf is called, the memory region has been used for some other object.
The documentation for PackCC, such as it is, does not go into any detail about its memory management strategy, so it's not really clear how long the $1 pointer in a rule is valid. I think it would be safest to assume that it is only valid until the end of the last action in the rule. But it is certainly not reasonable to assume that the pointer will outlast a call to pcc_parse. After all, the parser has no way to know that you have stored the pointer outside of the parser context. The parser cannot rely on the programmer to free capture strings produced during rules; having to free every capture, even the ones never used, would be a severe inconvenience. To avoid memory leaks, the parser therefore must free its capture buffers.
The problem is easy to see if you are able to use valgrind or some similar tool. (Valgrind is available for most Linux distributions and for OS X since v10.9.x. Other platforms might be supported.) Running your parser under valgrind produced the following error report (truncated):
$ valgrind --leak-check=full ./test3
==2763== Memcheck, a memory error detector
==2763== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2763== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==2763== Command: ./test3
==2763==
==2763== Invalid read of size 1
==2763== at 0x4C34CF2: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2763== by 0x4E9B5D2: vfprintf (vfprintf.c:1643)
==2763== by 0x4F7017B: __printf_chk (printf_chk.c:35)
==2763== by 0x10A32D: printf (stdio2.h:104)
==2763== by 0x10A32D: main (test3.c:1013)
==2763== Address 0x5232e20 is 0 bytes inside a block of size 9 free'd
==2763== at 0x4C32D3B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2763== by 0x109498: pcc_capture_table__term (test3.c:339)
==2763== by 0x1096E3: pcc_thunk_chunk__destroy (test3.c:441)
==2763== by 0x10974F: pcc_lr_answer__destroy (test3.c:557)
==2763== by 0x109818: pcc_lr_memo_map__term (test3.c:602)
==2763== by 0x10985F: pcc_lr_table_entry__destroy (test3.c:619)
==2763== by 0x109BB8: pcc_lr_table__shift (test3.c:680)
==2763== by 0x109C1C: pcc_commit_buffer (test3.c:757)
==2763== by 0x10A22C: pcc_parse (test3.c:986)
==2763== by 0x10A314: main (test3.c:1011)
==2763== Block was alloc'd at
==2763== at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2763== by 0x108C9D: pcc_malloc_e (test3.c:225)
==2763== by 0x108FF3: pcc_strndup_e (test3.c:252)
==2763== by 0x109038: pcc_get_capture_string (test3.c:764)
==2763== by 0x10904E: pcc_action_word_0 (test3.c:892)
==2763== by 0x108C56: pcc_do_action (test3.c:872)
==2763== by 0x108C87: pcc_do_action (test3.c:875)
==2763== by 0x10A224: pcc_parse (test3.c:983)
==2763== by 0x10A314: main (test3.c:1011)
That's a lot to go through, but it shows that there was an attempt to use the first byte of a 9-byte dynamically-allocated memory region which has already been free'd. ("Address 0x5232e20 is 0 bytes inside a block of size 9 free'd".) Furthermore, the backtrace shows that the error was triggered by a call to strlen, which had been called by printf; printf was called from your main function. (Unfortunately, PackCC does not issue #line directives, making it impossible to correlate the line numbers in the generated C parser with the line numbers in the original PEG grammar file. However, in this case it's clear where the printf is, since there's really only one possibility inside the main function.) Valgrind also shows you where the memory was dynamically allocated; although you'd have to have a copy of the generated parser handy to see how all the parts fit together, the names of the functions in the call trace are somewhat helpful.
The solution is basically the same as the way you must handle yytext in a parser which relies on a (f)lex-based scanner: since the string handed to the action is in memory whose lifetime is about to end, any token which you want to use later must be copied. The simplest way to do that is to use strdup (or an equivalent, if you're not able to use standard POSIX interfaces), changing the action to:
word <- < [a-z]+[\n]* > {$$ = strdup($1);}
Once you do this, the "word" example will be printed as expected (including the newline character which terminates it).
You also must remember to free the copies you have made in order to avoid leaking memory. Valgrind will also help you detect memory leaks, so it can help you catch errors resulting from forgetting to do so.
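If strdup is not available, an equivalent is easy to write. The following is a sketch of such a stand-in; the name dup_string is made up here, and using it in the grammar assumes the helper is visible to the generated parser.

```c
#include <stdlib.h>
#include <string.h>

/* Portable stand-in for strdup: returns a malloc'd copy of s, or NULL
   if allocation fails. The caller owns the copy and must free it. */
char *dup_string(const char *s)
{
    size_t size = strlen(s) + 1;      /* include the terminator */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}
```

The action would then read {$$ = dup_string($1);}, and main would call free(val) once the value is no longer needed. That free is only valid when val really holds a copy made by the action; in the question val starts out pointing at a string literal, which must not be freed.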

Using Dynamic Memory get every second element in a char array into another

I've been tasked with writing a function that uses dynamic memory and will take a string s and pull out every second element of the string, and then return a new string with those elements. So far my code is:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
char* skipping(const char* s)
{
int inc = 0; //new list incrementer
int len = strlen(s);
char* new_s = malloc(len + 1);
for (int i = 0; i < len - 1; i+=2) {
new_s[inc] = s[i];
inc++;
}
return new_s;
}
int main(void)
{
char* s = skipping("0123456789");
printf("%s\n", s);
free(s);
return 0;
}
This works, however when I run it using Valgrind I get told I have an error, which comes from using strlen, but I can't seem to fix it. Any help would be awesome!
Error messages: (in valgrind)
==4596==
==4596== Conditional jump or move depends on uninitialised value(s)
==4596== at 0x4C32D08: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4596== by 0x4EBC9D1: puts (ioputs.c:35)
==4596== by 0x1087B4: main (in /home/ryan/ENCE260/lab6)
==4596==
02468 //this is the expected output
==4596==
==4596== HEAP SUMMARY:
==4596== in use at exit: 0 bytes in 0 blocks
==4596== total heap usage: 2 allocs, 2 frees, 1,035 bytes allocated
==4596==
==4596== All heap blocks were freed -- no leaks are possible
==4596==
==4596== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Why is Valgrind reporting this error?
From the Valgrind online manual on the use of uninitialised or unaddressable values in system calls:
Sources of uninitialised data tend to be:
- Local variables in procedures which have not been initialised.
- The contents of heap blocks (allocated with malloc, new, or a similar function) before you (or a constructor) write something there.
Valgrind will complain if:
the program has written uninitialised junk from the heap block to the standard output.
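Since skipping never writes a null terminator into new_s, the printf in main (compiled to puts here, as the trace shows) keeps reading past the bytes you wrote, into uninitialised heap memory, while strlen looks for the end of the string. That read is what causes the error.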
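A corrected version might look like the sketch below. Besides adding the terminator, it changes the loop bound from len - 1 to len so that the last character of an odd-length string is kept; that is my reading of "every second element" and is an assumption about the assignment.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding the characters of s at even
   indices (0, 2, 4, ...), or NULL if allocation fails. Caller frees. */
char *skipping(const char *s)
{
    size_t len = strlen(s);
    char *new_s = malloc((len + 1) / 2 + 1);  /* kept chars + terminator */
    if (new_s == NULL)
        return NULL;

    size_t inc = 0;
    for (size_t i = 0; i < len; i += 2)
        new_s[inc++] = s[i];
    new_s[inc] = '\0';                        /* the terminator the original lacks */
    return new_s;
}
```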

Still reachable with puts and printf

Valgrind is reporting the still reachable "error" on functions like printf and puts. I really don't know what to do about this. I need to get rid of it since it's a school project and there has to be no errors at all. How do I deal with this? From the report I can see those functions use malloc, but I always thought they handled the memory by themselves, right?
I'm using mac OS X so maybe it's a problem between valgrind and the OS?
SAMPLE CODE: The error appears on any of the puts or printf that are used
void twittear(array_t* array_tweets, hash_t* hash, queue_t* queue_input){
char* user = queue_see_first(queue_input);
char key[1] = "#";
if (!user || user[0] != key[0]) {
puts("ERROR_WRONG_COMAND");
return;
}
queue_t* queue_keys = queue_create();
char* text = join_text(queue_input, queue_keys);
if (!text) {
puts("ERROR_TWEET_TOO_LONG");
queue_destroy(queue_keys, NULL);
return;
}
int id = new_tweet(array_tweets, text);
while (!queue_is_empty(queue_keys))
hash_tweet(hash, queue_dequeue(queue_keys), id);
queue_destroy(queue_keys, NULL);
printf("OK %d\n", id);
}
ERROR:
==1954== 16,384 bytes in 1 blocks are still reachable in loss record 77 of 77
==1954== at 0x47E1: malloc (vg_replace_malloc.c:300)
==1954== by 0x183855: __smakebuf (in /usr/lib/system/libsystem_c.dylib)
==1954== by 0x198217: __swsetup (in /usr/lib/system/libsystem_c.dylib)
==1954== by 0x1B1158: __v2printf (in /usr/lib/system/libsystem_c.dylib)
==1954== by 0x1B16AF: __xvprintf (in /usr/lib/system/libsystem_c.dylib)
==1954== by 0x188B29: vfprintf_l (in /usr/lib/system/libsystem_c.dylib)
==1954== by 0x18696F: printf (in /usr/lib/system/libsystem_c.dylib)
==1954== by 0x1000036F3: twittear (main.c:138)
==1954== by 0x100003C8D: main (main.c:309)
Valgrind is somewhat glitchy on Mac OS X; it doesn't have complete suppressions for some system libraries. (That is, it doesn't properly ignore some "expected" leaks.) Results will frequently include some spurious results like this, as well as some buffer overruns resulting from optimizations in functions like memcpy().
My advice? Avoid using valgrind on Mac OS X unless you are very familiar with the tool. Compile and test your application on a Linux system for best results.
This is caused by the stdio library. A "Hello World" program is sufficient to reproduce it: just printf or fprintf to stdout or stderr. The first time you write to a FILE, it uses malloc to allocate an output buffer. This buffer allocation happens inside the __swsetup() function (download the LibC source from Apple and you will see it in there; it is actually copied from FreeBSD, so many *BSD systems have the same function). Now, when you call fclose() on the FILE, the buffer will be freed. The issue with the standard streams (stdout, stderr) is that one normally doesn't close them, and as a result this buffer will normally never be freed.
You can make this "leak" go away by adding an fclose() on stdout and/or stderr before terminating your program. But honestly, there is no need to do that; you can just ignore it. This is a fixed-size buffer which is not going to grow, so it is not a "leak" as such. Closing stdout/stderr at the end of your program does not achieve anything useful.

Where is the uninitialized value in this function?

I am running a debug-version of my C binary within valgrind, which returns numerous errors of the sort Conditional jump or move depends on uninitialised value(s).
Using the symbol table, valgrind tells me where to look in my program for this issue:
==23899== 11 errors in context 72 of 72:
==23899== Conditional jump or move depends on uninitialised value(s)
==23899== at 0x438BB0: _int_free (in /foo/bar/baz)
==23899== by 0x43CF75: free (in /foo/bar/baz)
==23899== by 0x4179E1: json_tokener_parse_ex (json_tokener.c:593)
==23899== by 0x418DC8: json_tokener_parse (json_tokener.c:108)
==23899== by 0x40122D: readJSONMetadataHeader (metadataHelpers.h:345)
==23899== by 0x4019CB: main (baz.c:90)
I have the following function readJSONMetadataHeader(...) that calls json_tokener_parse():
int readJSONMetadataHeader(...) {
char buffer[METADATA_MAX_SIZE];
json_object *metadataJSON;
int charCnt = 0;
...
/* fill up the `buffer` variable here; basically a */
/* stream of characters representing JSON data... */
...
/* terminate `buffer` */
buffer[charCnt - 1] = '\0';
...
metadataJSON = json_tokener_parse(buffer);
...
}
The function json_tokener_parse() in turn is as follows:
struct json_object* json_tokener_parse(const char *str)
{
struct json_tokener* tok;
struct json_object* obj;
tok = json_tokener_new();
obj = json_tokener_parse_ex(tok, str, -1);
if(tok->err != json_tokener_success)
obj = (struct json_object*)error_ptr(-tok->err);
json_tokener_free(tok);
return obj;
}
Following the trace back to readJSONMetadataHeader(), it seems like the uninitialized value is the char [] (or const char *) variable buffer that is fed to json_tokener_parse(), which in turn is fed to json_tokener_parse_ex().
But the buffer variable gets filled with data and then terminated before the json_tokener_parse() function is called.
So why is valgrind saying this value is uninitialized? What am I missing?
It looks from the valgrind error report as if your application is statically linked (in particular, free appears to be in the main executable, and not libc.so.6).
Valgrind will report bogus errors for statically linked applications.
More precisely, there are intentional "don't care" errors inside libc. When you link the application dynamically, such errors are suppressed by default (via suppressions file that ships with Valgrind).
But when you link your application statically, Valgrind has no clue that the faulty code comes from libc.a, and so it reports them.
Generally, statically linking applications on Linux is a bad idea (TM).
Running such an application under Valgrind is doubly so: Valgrind will not be able to intercept malloc/free calls, and will effectively catch only uninitialized memory reads, and not the heap buffer overflows (or other heap corruption bugs) that it is usually good at.
I don't see charCnt initialized.
To see if it comes from buffer, simply initialize it with = {0}, this also would make your null termination of the buffer obsolete.
Have a look in json_tokener_parse_ex() which you don't show. It's likely it's trying to free something that's not initialized.
buffer[charCnt - 1] = '\0';
This will at least fail if charCnt happens to be zero.
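A guarded version of that line could be sketched as follows. terminate_buffer is a hypothetical helper, not something from the question or from json-c; it keeps the original behaviour of overwriting the last character read, but refuses to index before the start of the array.

```c
#include <stddef.h>

/* Hypothetical helper mirroring buffer[charCnt - 1] = '\0' from the question,
   but refusing to index before the start of the array.
   Returns 0 on success, -1 if nothing was read. */
int terminate_buffer(char *buffer, size_t char_cnt)
{
    if (char_cnt == 0)
        return -1;                 /* char_cnt - 1 would be out of bounds */
    buffer[char_cnt - 1] = '\0';
    return 0;
}
```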

Valgrind errors even though all heap blocks were freed

I have recently developed a habit of running all of my programs through valgrind to check for memory leaks, but most of its results have been a bit cryptic for me.
For my latest run, valgrind -v gave me:
All heap blocks were freed -- no leaks are possible
That means my program's covered for memory leaks, right?
So what does this error mean? Is my program not reading certain memory blocks correctly?
ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 14 from 9)
1 errors in context 1 of 1:
Invalid read of size 4
at 0x804885B: findPos (in /home/a.out)
by 0xADD918: start_thread (pthread_create.c:301)
by 0xA26CCD: clone (clone.S:133)
Address 0x4a27108 is 0 bytes after a block of size 40 alloc'd
at 0x4005BDC: malloc (vg_replace_malloc.c:195)
by 0x804892F: readInput (in /home/a.out)
by 0xADD918: start_thread (pthread_create.c:301)
by 0xA26CCD: clone (clone.S:133)
used_suppression: 14 dl-hack3-cond-1
Also, what are the so-called "suppressed" errors here?
This seems obvious ... but it might be worth pointing out that the "no leaks are possible" message does not mean that your program cannot leak; it just means that it did not leak in the configuration under which it was tested.
If I run the following with valgrind with no command line parameters, it informs me that no leaks are possible. But it does leak if I provide a command line parameter.
#include <stdio.h>
#include <stdlib.h>
int main( int argc, char* argv[] )
{
if ( argc > 1 )
malloc( 5 ); /* deliberately never freed */
printf( "Enter any command line arg to cause a leak\n" );
return 0;
}
Yes, you are well covered; don't think that valgrind can easily miss a leak in user code.
Your error means that you probably have an off-by-one error in indexing an array variable. The lines that valgrind reports should be accurate, so you should find it easily, provided you compile all your code with -g.
Suppressed errors are usually from system libraries, which sometimes have small leaks or undetectable things like the state variables of threads. Your manual page should list the suppression file that is used by default.
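As an illustration of the off-by-one pattern: a 4-byte read located "0 bytes after a block of size 40" is consistent with reading element 10 of a 10-element int array, though that is a guess since findPos is not shown. The sketch below uses a hypothetical find_pos with the correct loop bound and marks where the typical mistake would sit.

```c
#include <stddef.h>

/* Hypothetical search: returns the index of target in values[0..count-1],
   or -1 if it is absent. */
long find_pos(const int *values, size_t count, int target)
{
    /* "i <= count" here would read values[count], one element past the
       allocation: 4 bytes located 0 bytes after the block. */
    for (size_t i = 0; i < count; i++) {
        if (values[i] == target)
            return (long)i;
    }
    return -1;
}
```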
Checking for memory leaks is one reason to use valgrind, but I'd say a better reason is to find more serious errors in your code, such as using an invalid array subscript or dereferencing an uninitialized pointer or a pointer to freed memory.
It's good if valgrind tells you that the code paths you exercised while running valgrind didn't result in memory leaks, but don't let that cause you to ignore reports of more serious errors, such as the one you're seeing here.
As others have suggested, rerunning valgrind after compiling with debug information (-g) would be a good next step.
If you are getting the error "Invalid read of size 4", check whether you are freeing memory and then still accessing it, for example to move on to the next element.
I was getting this error too, because in a linked list I freed the current node first and only then followed its next pointer.
Below is the code snippet that produced the error:
void free_memory(Llist **head_ref)
{
Llist *current=NULL;
current=*head_ref;
while(*head_ref != NULL)
{
current=*head_ref;
free(current);
current=NULL;
(*head_ref)=(*head_ref)->next;
}
}
After the change, which advances the head pointer before freeing the node, the code snippet is:
void free_memory(Llist **head_ref)
{
Llist *current=NULL;
current=*head_ref;
while(*head_ref != NULL)
{
current=*head_ref;
(*head_ref)=(*head_ref)->next;
free(current);
current=NULL;
}
}
