malloc puts "garbage" values - c

How can I prevent or bypass the garbage values malloc puts in my variable?
attached the code and the output!
thanks!
#include <stdio.h>
#include "stdlib.h"
#include <string.h>
int main() {
char* hour_char = "13";
char* day_char = "0";
char* time = malloc(strlen(hour_char)+strlen(day_char)+2);
time = strcat(time,day_char);
time = strcat(time,"-");
time = strcat(time,hour_char);
printf("%s",time);
free(time);
}
this is the output i get:
á[┼0-13

The first strcat is incorrect, because malloc-ed memory is uninitialized. Rather than using strcat for the first write, use strcpy. It makes sense, because initially time does not have a string to which you concatenate anything.
time = strcpy(time, day_char);
time = strcat(time, "-");
time = strcat(time, hour_char);
Better yet, use sprintf:
sprintf(time, "%s-%s", day_char, hour_char);

First of all, quoting C11, chapter 7.22.3.4 (emphasis mine)
The malloc function allocates space for an object whose size is specified by size and
whose value is indeterminate.
So, the content of the memory location is indeterminate. That is the expected behaviour.
Then, the problem starts when you use the same pointer as the argument where a string is expected, i.e, the first argument of strcat().
Quoting chapter 7.24.3.1 (again, emphasis mine)
The strcat function appends a copy of the string pointed to by s2 (including the
terminating null character) to the end of the string pointed to by s1. The initial character
of s2 overwrites the null character at the end of s1.
but, in your case, there's no guarantee of the terminating null in the target, so it causes undefined behavior.
You need to 0-initialize the memory (or, at least the first element of the memory should be a null) before doing so. You can use calloc() which returns a pointer to already 0-initialized memory, or at least, do time[0] = '\0';.
On a different note, you can also make use of snprintf() which removes the hassle of initial 0-filling.

strcat expects to get passed a null-terminated C string. You pass random garbage to it.
This can easily be fixed by turning your data into a null-terminated string of length 0.
char* time = malloc(strlen(hour_char)+strlen(day_char)+2);
time[0] = '\0';
time = strcat(time,day_char);

Related

Does setting a string char to null cause a memory leak in C?

This seems like a silly question, but I couldn't find the answer.
Anyways, if you set an arbitrary character to null in a string,
then free the string, does that cause a memory leak?
I suppose my knowledge of how the free function works is limited.
/*
char *
strchr(const char *s, int c);
char *
strrchr(const char *s, int c);
The strchr() function locates the first occurrence of c (converted to a
char) in the string pointed to by s. The terminating null character is
considered part of the string; therefore if c is ‘\0’, the functions
locate the terminating ‘\0’.
The strrchr() function is identical to strchr() except it locates the
last occurrence of c.
*/
char* string = strdup ("THIS IS, A STRING WITH, COMMAS!");
char* ch = strrchr( string, ',' );
*ch = 0;
free( string );
/*
The resulting string should be: "THIS IS, A STRING WITH"
When the string pointer is freed, does this result in a memory leak?
*/
Not a stupid question in my opinion.
TLDR: no you do not cause a memory leak.
Now the longer answer: free has no idea what a string is. If you pass it a char* or an int* it could not care less.
The way malloc and free work is the following: when you call malloc you supply a size and receive a pointer with the promise of that many bytes being reserved on the heap from the position of the pointer onwards. However, at that point the size and position are also saved internally in some way (how exactly is an implementation detail).
Now when you call free it does not need to be told the size; it can look up the entry your pointer belongs to and release it together with the recorded size.
Addendum: also not every char* points to a string, it just so happens that "abcd" becomes a null terminated char* pointing to the 'a', but a char* itself points to a single char, not multiple
malloc only allocates the chunk of memory and gives you the reference to it. If you do not read or write outside the boundaries of this chunk you can do whatever you want with it.

Pointer arithmetic in C when used as a target array for strcat()

When studying string manipulation in C, I've come across an effect that's not quite what I would have expected with strcat(). Take the following little program:
#include <stdio.h>
#include <string.h>
int main()
{
char string[20] = "abcde";
strcat(string + 1, "fghij");
printf("%s", string);
return 0;
}
I would expect this program to print out bcdefghij. My thinking was that in C, strings are arrays of characters, and the name of an array is a pointer to its first element, i.e., the element with index zero. So the variable string is a pointer to a. But if I calculate string + 1 and use that as the destination array for concatenation with strcat(), I get a pointer to a memory address that's one array element (1 * sizeof(char), in this case) away, and hence a pointer to the b. So my thinking was that the target destination is the array starting with b (and ending with the invisible null character), and to that the fghij is concatenated, giving me bcdefghij.
But that's not what I get - the output of the program is abcdefghij. It's the exact same output as I would get with strcat(string, "fghij"); - the addition of the 1 to string is ignored. I also get the same output with an addition of another number, e.g. strcat(string + 4, "fghij");, for that matter.
Can somebody explain to me why this is the case? My best guess is that it has to do with the binding precedence of the + operator, but I'm not sure about this.
Edit: I increased the size of the original array with char string[20] so that it will, in any case, be big enough to hold the concatenated string. Output is still the same, which I think means the array overflow is not key to my question.
You will get an output of abcdefghij, because your call to strcat hasn't changed the address of string (and nor can you change that – it's fixed for the duration of the block in which it is declared, just like the address of any other variable). What you are passing to strcat is the address of the second element of the string array: but that is still interpreted as the address of a nul-terminated string, to which the call appends the second (source) argument. Appending that second argument's content to string, string + 1 or string + n will produce the same result in the string array, so long as there is a nul-terminator at or after the n index.
To print the value of the string that you actually pass to the strcat call (i.e., starting from the 'b' character), you can save the return value of the call and print that:
#include <stdio.h>
#include <string.h>
int main()
{
char string[20] = "abcde";
char* result = strcat(string + 1, "fghij"); // strcat will return the "string + 1" pointer
printf("%s", result); // bcdefghij
return 0;
}
char string[] = "abcde";
strcat(string + 1, "fghij");
Append five characters to a full string array. Booom. Undefined behavior.
Adding something to a string array is a performance optimization that tells the runtime that the string is known to be at least that many characters long.
You seem to believe that a string is a thing of its own and not an array, and strcat is doing something to its first argument. That's not how that works. Strings are arrays*; and strcat is modifying the array contents.
*Somebody's going to come by and claim that heap allocated strings are not arrays. OP is not dealing with heap yet.
Arrays are non-modifiable lvalues. For example you may not write
char string[20] = "abcde";
char string2[] = "fghij";
string = string2;
Used in expressions arrays with rare exceptions are implicitly converted to pointers to their first elements.
If you will write for example string + 1 then the address of the array will not be changed.
In this call
strcat(string + 1, "fghij");
elements of the array string are being overwritten starting from the second element of the array.
In this statement
printf("%s", string);
the whole array is output starting from its first character (again the array designator used as an argument is converted to a pointer to its first element).
You could write for example
printf("%s", string + 1);
In this case the array is output starting from its second element.
These are just two pointers to different parts of the same memory inside the same array. There is nothing in your code which creates a second array. "the name of an array is a pointer to its first element" well, not really, it decays into a pointer to its first element whenever used in an expression. So in case of string + 1, this decay first happens to the string operand and then you get pointer arithmetic afterwards. You can actually never do pointer arithmetic on array types, only on decayed pointers. Details here: Do pointers support "array style indexing"?
As for strcat, it basically does two things: call strlen on the destination string to find where it ends, then call strcpy to append the new string at the position where the null terminator was stored. It's the very same thing as typing strcpy(&dst[strlen(dst)], src);
Therefore it won't matter if you pass string + 1 or string, because in either case strcat will look for the null terminator and nothing else.

I want to know why this works without having to bind memory for the string

Hello guys I recently picked up C programming and I am stuck at understanding pointers. As far as I understand to store a value in a pointer you have to bind memory (using malloc) the size of the value you want to store. Given this, the following code should not work as I have not allocated 11 bytes of memory to store my string of size 11 bytes and yet for some reason beyond my comprehension it works perfectly fine.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(){
char *str = NULL;
str = "hello world\0";
printf("filename = %s\n", str);
return 0;
}
In this case
str = "hello world\0";
str points to the address of the first element of an array of chars, initialized with "hello world\0". In other words, str points to a "string literal".
By definition, the array is allocated and the address of the first element has to be "valid".
Quoting C11, chapter §6.4.5, String literals
In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals.78) The multibyte character
sequence is then used to initialize an array of static storage duration and length just
sufficient to contain the sequence. For character string literals, the array elements have
type char, and are initialized with the individual bytes of the multibyte character
sequence. [....]
Memory allocation still happens, just not explicitly by you (via memory allocator functions).
That said, the "...\0" at the end is repetitive, as mentioned (in the first statement of the quote) above, by default, the array will be null-terminated.
Using a char pointer without malloc, as here, means the string you are assigning must be treated as read-only: you are creating a pointer to a string constant. "hello world\0" is stored somewhere in memory that may be read-only, and you are just pointing to it.
Now suppose you want to change the string, say the h to H, which would be str[0]='H'. That is not possible while str points to the literal; you need writable storage, such as memory from malloc or a char array.
When you declare a string literal in a C program, it is stored in a read-only section of the program code. A statement of the form char *str = "hello"; will assign the address of this string to the char* pointer. However, the string itself (i.e., the characters h, e, l, l and o, plus the \0 string terminator) are still located in read-only memory, so you can't change them at all.
Note that there's no need for you to explicitly add a zero byte terminator to your string declarations. The C compiler will do this for you.
Right. But in this case you are just pointing to a string literal, which is placed in the constant memory area. Your pointer is created in the stack area, so you are just pointing to another address, i.e., the starting address of the string literal.
Try copying the string literal into your pointer variable instead (for example with strcpy). That will fail because you have not allocated memory. Hope you understand now.
Storage for string literals is set aside at program startup and held until the program exits. This storage may be read-only, and attempting to modify the contents of a string literal results in undefined behavior (it may work, it may crash, it may do something in between).

Is it undefined behavior if the destination string in strcat function is not null terminated?

The following program
// Code has taken from http://ideone.com/AXClWb
#include <stdio.h>
#include <string.h>
#define SIZE1 5
#define SIZE2 10
#define SIZE3 15
int main(void){
char a[SIZE1] = "Hello";
char b[SIZE2] = " World";
char res[SIZE3] = {0};
for (int i=0 ; i<SIZE1 ; i++){
res[i] = a[i];
}
strcat(res, b);
printf("The new string is: %s\n",res);
return 0;
}
has well defined behavior. As per the requirement, source string b is null terminated. But what would be the behavior if the line
char res[SIZE3] = {0}; // Destination string
is replaced with
char res[SIZE3];
Does the standard say explicitly that the destination string must be null terminated too?
TL;DR Yes.
Since this is a language-lawyer question, let me add my two cents to it.
Quoting C11, chapter §7.24.3.1/2 (emphasis mine)
char *strcat(char * restrict s1,const char * restrict s2);
The strcat function appends a copy of the string pointed to by s2 (including the
terminating null character) to the end of the string pointed to by s1. The initial character
of s2 overwrites the null character at the end of s1.[...]
and, by definition, a string is null-terminated, quoting §7.1.1/1
A string is a contiguous sequence of characters terminated by and including the first null
character.
So, if the source char array is not null-terminated (i.e., not a string), strcat() may very well go beyond the bounds in search of the end which invokes undefined behavior.
As per your question, char res[SIZE3]; being an automatic local variable, will contain indeterminate value, and if used as the destination of strcat(), will invoke UB.
I think man explicitly says that
Description
The strcat() function appends the src string to the dest string, overwriting the terminating null byte ('\0') at the end of dest, and then adds a terminating null byte. The strings may not overlap, and the dest string must have enough space for the result. If dest is not large enough, program behavior is unpredictable; buffer overruns are a favorite avenue for attacking secure programs.
Emphasis mine
BTW I think strcat starts by searching for the null terminator in the dest string before concatenating the new string, so it is obviously UB when the dest string has automatic storage and is uninitialized.
In the proposed code
for (int i=0 ; i<SIZE1 ; i++){
res[i] = a[i];
}
copies the 5 chars of a, but not a null terminator, to res, so the bytes from 5 to 14 remain uninitialized.
The standard also describes a safer variant, strcat_s:
K.3.7.2.1 The strcat_s function
Synopsis
#define __STDC_WANT_LIB_EXT1__ 1
#include <string.h>
errno_t strcat_s(char * restrict s1,
rsize_t s1max,
const char * restrict s2);
Runtime-constraints
2 Let m denote the value s1max - strnlen_s(s1, s1max) upon entry to
strcat_s.
We can see that strnlen_s always returns a valid size for the dest buffer. From my point of view this function was introduced to avoid the UB of the question.
If you leave res uninitialized then, after the copying a into res (in for loop), there's no NUL terminator in res. So, the behaviour of strcat() is undefined if the destination string doesn't contain a NUL byte.
Basically strcat() requires both of its arguments to be strings (i.e. both must contain the terminating NUL byte). Otherwise, it's undefined behaviour. This
is obvious from the description of strcat():
§7.23.3.2, strcat() function
The strcat function appends a copy of the string pointed to by s2
(including the terminating null character) to the end of the string
pointed to by s1. The initial character of s2 overwrites the null
character at the end of s1.
(emphasis mine).
If char res[SIZE3]; is on the stack, it'll have random/undefined stuff in it.
You'll never know whether there'll be a zero byte within res[SIZE3], so yes, strcatting to that is undefined.
If char res[SIZE3]; is an uninitialized global, it'll be all zeros, which will make it behave as an empty c-string, and strcating to it will be safe (as long as SIZE3 is large enough for what you're appending).

About pointers and strcpy() in C

I am practicing allocation memory using malloc() with pointers, but 1 observation about pointers is that, why can strcpy() accept str variable without *:
char *str;
str = (char *) malloc(15);
strcpy(str, "Hello");
printf("String = %s, Address = %p\n", str, (void *) str);
But with integers, we need * to give str a value.
int *str;
str = (int *) malloc(15);
*str = 10;
printf("Int = %d, Address = %p\n", *str, (void *) str);
it really confuses me why strcpy() accepts str, because in my own understanding, "Hello" will be passed to the memory location of str that will cause some errors.
In C, a string is (by definition) an array of characters. However (whether we realize it all the time or not) we almost always end up accessing arrays using pointers. So, although C does not have a true "string" type, for most practical purposes, the type pointer-to-char (i.e. char *) serves this purpose. Almost any function that accepts or returns a string will actually use a char *. That's why strlen() and strcpy() accept char *. That's why printf %s expects a char *. In all of these cases, what these functions need is a pointer to the first character of the string. (They then read the rest of the string sequentially, stopping when they find the terminating '\0' character.)
In these cases, you don't use an explicit * character. * would extract just the character pointed to (that is, the first character of the string), but you don't want to extract the first character, you want to hand the whole string (that is, a pointer to the whole string) to strcpy so it can do its job.
In your second example, you weren't working with a string at all. (The fact that you used a variable named str confused me for a moment.) You have a pointer to some ints, and you're working with the first int pointed to. Since you're directly accessing one of the things pointed to, that's why you do need the explicit * character.
The * is called indirection or dereference operator.
In your second code,
*str = 10;
assigns the value 10 to the memory address pointed by str. This is one value (i.e., a single variable).
OTOH, strcpy() copies the whole string at once. It accepts two char * parameters, so you don't need the * to dereference to get the value while passing arguments.
You can use the dereference operator, without strcpy(), copying element by element, like
char *str;
str = (char *) malloc(15); //success check TODO
size_t len = strlen("Hello"); //need string.h header
size_t i; //declared outside the loop because it is used after it
for (i = 0; i < len; i ++)
*(str+i)= "Hello"[i]; // the * form. as you wanted
str[i] = 0; //null termination
Many string manipulation functions, including strcpy, by convention and design, accept the pointer to the first character of the array, not the pointer to the whole array, even though their values are the same.
This is because their types are different; e.g. a pointer to char[10] has a different type from that of a pointer to char[15], and passing around the pointer to the whole array would be impossible or very clumsy because of this, unless you cast them everywhere or make different functions for different lengths.
For this reason, they have established a convention of passing around a string with the pointer to its first character, not to the whole array, possibly with its length when necessary. Many functions that operate on an array, such as memset, work the same way.
Well, here's what happens in the first snippet :
You are first dynamically allocating 15 bytes of memory, storing this address to the char pointer, which is pointer to a 1-byte sequence of data (a string).
Then you call strcpy(), which iterates over the string and copy characters, byte per byte, into the newly allocated memory space. Each character is a number based on the ASCII table, eg. character a = 97 (take a look at man ascii).
Then you pass this address to printf() which reads from the string, byte per byte, then flush it to your terminal.
In the second snippet, the process is the same, you are still allocating 15 bytes, storing the address in an int * pointer. An int is typically a 4-byte data type.
When you do *str = 10, you are dereferencing the pointer to store the value 10 at the address pointed to by str. Recall what I wrote above: you could have done *str = 'a', and this integer at index 0 would have the value 97, even if you read it as an int. You can even print it if you like.
So why can strcpy() take an int * as a parameter (after a cast to char *)? Because it's a memory space where it can write, byte per byte. You can store "Hell" in an int, then "o!" in the next one.
It's all about ease of use.
See there is a difference between = operator and the function strcpy.
* is the dereference operator. When you say *str, it means the value at the memory location pointed to by str.
Also as a good practice, use this
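str = (char *) malloc( sizeof(char)*15 )
It is because the size of a data type might be different on different platforms. Hence use the sizeof operator to determine its actual size; it is evaluated at compile time. (sizeof(char) is 1 by definition, so this matters for other types such as int.)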

Resources