Do I need to assign \0 to malloc strings in c? - c

I have the following code to declare and assign a string on the heap.
char *string = malloc(10);
string[9] = '\0';
strncpy(string, "welcometotherealworld", 9);
printf("string: %s\n", string);
Do I have to manually set the \0 to ensure the string ends? string[9] = '\0';
Or, does strncpy do this for me?

Two things. First, malloc(10) reserves 10 bytes, so the valid indices are 0 through 9; string[10] would address the eleventh byte, which is illegal. Second, yes, you have to set string[9] to '\0' yourself, because according to the standard strncpy does not null terminate the destination if the source string is at least count characters long.

strncpy does not null terminate the destination array if the length of the source string (the second argument) is greater than or equal to the value of the third argument.
So here:
strncpy(string, "welcometotherealworld", 9);
strncpy will not null terminate.

welcometotherealworld is definitely longer than 9 characters, so strncpy will not implicitly add a terminating character.

strncpy(dest, source, n) copies at most n bytes from the buffer pointed to by source into the buffer pointed to by dest. However, if strlen(source) is greater than or equal to n, then strncpy will simply copy the first n bytes and will not terminate the string dest with a null byte, because there is no space for it. Therefore, to ensure that the buffer dest is always null terminated, you must do it yourself. What you are doing will always keep the buffer pointed to by string null terminated.

strcpy will always terminate the destination string with a \0. strncpy also normally NUL terminates the string, but may not do so. It copies a maximum number of bytes (n) of the string, but unfortunately (in terms of a useful API) does not copy in a NUL if the number of bytes to be copied (i.e. the length of the source string including the terminating NUL) exceeds the length specified (n). So, if you copy the string "1234567890" (which is ten characters plus a NUL, so 11) and pass 10 as the last argument to strncpy, you will get an unterminated string of 10 characters.
Here are some safe routes around this:
/* Option 1 */
dest = malloc(10); /* allocate ten bytes */
strncpy (dest, src, 10); /* copy up to 10 bytes */
dest[9] = 0; /* overwrite the 10th byte with a zero if it was not one already */
/* Option 2 */
dest = malloc(10); /* allocate ten bytes */
strncpy (dest, src, 9); /* copy up to 9 bytes */
dest[9] = 0; /* make the 10th byte zero */
/* Option 3 */
dest = calloc(10, 1); /* allocate and zero ten bytes */
strncpy (dest, src, 9); /* copy up to 9 bytes, leaving the NUL in */

To ensure a proper '\0' ending, code needs to set the '\0'.
malloc() does not initialize the data in string.
char *string = malloc(10);
strncpy(string, "welcometotherealworld", 9 /* or 10 */);
string[9] = '\0';
strncpy(char *s1, const char *s2, size_t n) writes n char to s1.
It first uses min(n, strlen(s2)) char from s2.
If fewer than n were copied, the remainder of s1 (up to n char) is filled with null characters.
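To make the behavior described in these answers concrete, here is a minimal sketch of a wrapper that truncates and always terminates. The name copy_truncated and its parameters are hypothetical, not part of any library; it assumes dst really has room for dst_size bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes),
   truncating if necessary, and always leave dst null terminated. */
static char *copy_truncated(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* Copies at most dst_size - 1 characters; if src is shorter,
       strncpy pads the rest of that range with '\0'. */
    strncpy(dst, src, dst_size - 1);

    /* strncpy does not write this byte when src is too long,
       so set it unconditionally. */
    dst[dst_size - 1] = '\0';
    return dst;
}
```

With a 10-byte buffer, copy_truncated(buf, sizeof buf, "welcometotherealworld") leaves the nine characters "welcometo" followed by the terminator.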

Related

memcpy and string literal. does it check and consider null termination -- C

I have a code
char str1[15];
char str2[15];
memcpy(str1,"abcdef",6);
memcpy(str2,"abcdef",6);
so str1 should have null termination at index 7.
but when I do printf("--%d--",strlen(str1)); it prints --9--, which makes me think that memcpy does not consider null termination when it copies the string literal "abcdef" into str1.
So shouldn't it also copy the null terminator, or is something I did in printf causing it to print --9--?
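memcpy just copies a number of bytes, whatever they are. In your case, it copies 6 bytes from a string of 6 characters and hence does not copy the null byte at the end of the string. strlen(str1) then reads whatever uninitialized bytes follow until it happens to hit a zero, which is why it printed 9.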
Better code could be written.
Given:
char str1[15];
char str2[15];
char *p = "abcdef";
This will copy "abcdef" and the nul byte to str1 and str2:
memcpy(str1, p, strlen(p) + 1);
memcpy(str2, p, strlen(p) + 1);
But this is not very good, because the str1 and str2 arrays could overflow if the string is too long!
It is better to use strncpy, which stops at the nul terminating byte of the source and never writes more than the given maximum length of the destination:
strncpy(str1, p, sizeof(str1));
strncpy(str2, p, sizeof(str2));
Warning: If there is no null byte among the first n bytes of the source, the string placed in destination will not be null-terminated. See strncpy man page.
void * memcpy( void *destination, const void *source, size_t num ); just copies num bytes of memory, pointed by source, to another memory pointed by destination pointer.
The methods which deal with copying of null-terminated strings are
char * strcpy ( char *destination, const char *source);
char * strncpy ( char *destination, const char *source, size_t num);
Based on your example, you need to use strncpy:
char str1[15];
char str2[15];
strncpy(str1,"abcdef",7);
strncpy(str2,"abcdef",7);
str1 and str2 will hold "abcdef" at the end.
If you just want to copy the whole string up to the size of str1 or str2 then you can do the following
#define STR_LEN 15
char str1[STR_LEN];
char str2[STR_LEN];
strncpy(str1,"abcdef", STR_LEN);
strncpy(str2,"abcdef", STR_LEN);
NOTE:
As the documentation of strncpy states:
No null-character is implicitly appended at the end of destination if source is longer than num. Thus, in this case, destination shall not be considered a null terminated C string (reading it as such would overflow).
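One way to avoid both the overflow risk of memcpy with strlen + 1 and the missing terminator of strncpy is to check the size first and refuse the copy if it does not fit. The following is only a sketch; copy_string_checked is a hypothetical name, and returning -1 on failure is an assumed convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the string src, terminator included, into dst
   (capacity dst_size bytes). Returns 0 on success, -1 if it does not fit. */
static int copy_string_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t needed = strlen(src) + 1;  /* the characters plus the '\0' */
    if (needed > dst_size)
        return -1;                    /* would overflow: leave dst untouched */

    memcpy(dst, src, needed);         /* copies the terminator as well */
    return 0;
}
```

For char str1[15], copy_string_checked(str1, sizeof str1, "abcdef") succeeds and strlen(str1) is then 6.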

Copying array in array of pointer to char

I have a char array and I am trying to copy a part of it (tokenization) into the 0th index of an array of pointers to char using the strncpy function. But a segmentation fault occurs at runtime.
Code example:
char array[30] = "ls -l";
char* args[10];
strncpy(args[0], array + 0, 2);
char *args[10] has the following declaration:
declare args as array 10 of pointer to char
That is to say, we have an array of uninitialized pointers. We'll need to make those pointers point somewhere first, before trying to place characters there. Remembering that we must NUL terminate ('\0') C strings, we can simultaneously allocate and NUL out space for our string by using calloc.
This will make space for just 'l', 's', and our mandatory '\0'.
char original_command[30] = "ls -l";
char *args[10] = { 0 };
args[0] = calloc(3, sizeof (char));
strncpy(args[0], original_command, 2);
Alternatively, we can use malloc, but we must remember the NUL terminating byte.
args[0] = malloc(3);
strncpy(args[0], original_command, 2);
args[0][2] = '\0';
It's generally a good idea to always initialize our variables - see how we initialize our args array to be full of NULL pointers (0). Makes it very clear they don't point anywhere useful yet.
Also note that strncpy does not place a NUL terminating byte if it was not found in the first n bytes of our source string. This is why it's very important to manually terminate our destination string.
Additionally, any call to an *alloc function must be matched later by a call to free, when we are finished using that memory.
/* Do whatever needs to be done */
/* ... */
free(args[0]);
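The allocate, copy, terminate sequence can be wrapped in one small function. This is a sketch under the assumption that the caller frees the result; dup_prefix is a hypothetical name (POSIX systems offer strndup, which behaves similarly).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of at most n
   characters of src, or NULL if allocation fails. Caller must free(). */
static char *dup_prefix(const char *src, size_t n)
{
    assert(src != NULL);

    size_t len = strlen(src);
    if (len > n)
        len = n;                    /* never read past the terminator */

    char *copy = malloc(len + 1);   /* +1 for the terminating '\0' */
    if (copy == NULL)
        return NULL;

    memcpy(copy, src, len);
    copy[len] = '\0';
    return copy;
}
```

Usage would look like args[0] = dup_prefix(original_command, 2); followed later by free(args[0]);.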
You need to allocate space for the copied string content; char* args[10] reserves only space for holding the pointer to the content, not for the content itself. And don't forget to reserve space for the string terminating character '\0' then.
args[0] = malloc(2+1);
strncpy(args[0],array+0,2+1);
args[0][2] = '\0';

C - memcpy with char * with length greater than source string length

I have the following code in C now
int length = 50;
char *target_str = (char*) malloc(length);
char *source_str = read_string_from_somewhere(); // read a string from somewhere
// with length, say 20
memcpy(target_str, source_str, length);
The scenario is that target_str is initialized with 50 bytes. source_str is a string of length 20.
If I want to copy the source_str to target_str i use memcpy() as above with length 50, which is the size of target_str. The reason I use length in memcpy is that, the source_str can have a max value of length but is usually less than that (in the above example its 20).
Now, if I want to copy till length of source_str based on its terminating character ('\0'), even if memcpy length is more than the index of terminating character, is the above code a right way to do it? or is there an alternative suggestion.
Thanks for any help.
The scenario is that target_str is initialized with 50 bytes. source_str is a string of length 20.
If I want to copy the source_str to target_str i use memcpy() as above with length 50, which is the size of target_str.
Currently you ask memcpy to read 30 bytes beyond the 20 characters of the source string, because it does not care about a possible null terminator in the source. This is undefined behavior.
Because you are copying a string, you can use strcpy rather than memcpy.
But the size problem can also be reversed: the target can be smaller than the source, and without protection you will again have undefined behavior.
So you can use strncpy, giving it the length of the target; just take care to add a final null character in case the target is smaller than the source:
int length = 50;
char *target_str = (char*) malloc(length);
char *source_str = read_string_from_somewhere(); // length unknown
strncpy(target_str, source_str, length - 1); // -1 to leave room for \0
target_str[length - 1] = 0; // force a null character at the end in case the source was truncated
If I want to copy the source_str to target_str i use memcpy() as above
with length 50, which is the size of target_str. The reason I use
length in memcpy is that, the source_str can have a max value of
length but is usually less than that (in the above example its 20).
It is crucially important to distinguish between
the size of the array to which source_str points, and
the length of the string, if any, to which source_str points (+/- the terminator).
If source_str is certain to point to an array of length 50 or more then the memcpy() approach you present is ok. If not, then it produces undefined behavior when source_str in fact points to a shorter array. Any result within the power of your C implementation may occur.
If source_str is certain to point to a (properly-terminated) C string of no more than length - 1 characters, and if it is its string value that you want to copy, then strcpy() is more natural than memcpy(). It will copy all the string contents, up to and including the terminator. This presents no problem when source_str points to an array shorter than length, so long as it contains a string terminator.
If neither of those cases is certain to hold, then it's not clear what you want to do. The strncpy() function may cover some of those cases, but it does not cover all of them.
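For the case in the question (a source that is a terminated string of unknown length, and a target of fixed size), a hand-written loop makes the two limits explicit: stop at the terminator, or stop when the target is full. This is only a sketch; copy_bounded is a hypothetical name and the return value (number of characters copied) is an assumed convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the string at src into dst (capacity dst_size
   bytes). Reads at most dst_size - 1 bytes of src and never reads past
   its terminator. Returns the number of characters copied. */
static size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t i = 0;
    while (i < dst_size - 1 && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';   /* always terminate, whether or not src was truncated */
    return i;
}
```

Unlike memcpy(target_str, source_str, length), this never touches bytes after the source's '\0', so a 20-character source and a 50-byte target are handled without reading out of bounds.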
Now, if I want to copy till length of source_str based on its terminating character ('\0'), even if memcpy length is more than the index of terminating character, is the above code a right way to do it?
No; you'd be copying the entire content of source_str, even past the null-terminator if it occurs before the end of the allocated space for the string it is pointing to.
If your concern is minimizing the auxiliary space used by your program, what you could do is use strlen to determine the length of source_str, and allocate target_str based on that. Also, strcpy is similar to memcpy but is specifically intended for null-terminated strings (observe that it has no "size" or "length" parameter):
char *target_str = NULL;
char *source_str = read_string_from_somewhere();
size_t len = strlen(source_str);
target_str = malloc(len + 1);
strcpy(target_str, source_str);
// ...
free(target_str);
target_str = NULL;
memcpy is used to copy fixed blocks of memory, so if you want to copy something shorter that is terminated by '\0' you don't want to use memcpy.
There are other functions, like strncpy or strlcpy, that do similar things.
Best to check what the implementations do. I removed the optimized versions from the original source code for the sake of readability.
This is an example memcpy implementation: https://git.musl-libc.org/cgit/musl/tree/src/string/memcpy.c
void *memcpy(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    for (; n; n--) *d++ = *s++;
    return dest;
}
It's clear that here both pieces of memory are visited n times, regardless of the size of the source or destination string, so memory past the end of your string is copied if the string is shorter. That is bad and can cause various unwanted behavior.
this is strlcpy from: https://git.musl-libc.org/cgit/musl/tree/src/string/strlcpy.c
size_t strlcpy(char *d, const char *s, size_t n)
{
    char *d0 = d;
    if (!n--) goto finish;
    for (; n && (*d=*s); n--, s++, d++);
    *d = 0;
finish:
    return d-d0 + strlen(s);
}
The trick here is that when *s is the terminator, the assignment (*d=*s) yields 0, so n && (*d=*s) evaluates to false, which breaks the looping condition and exits early.
Hence this gives you the wanted behaviour.
Use strlen to determine the exact size of source_string and allocate accordingly, remembering to add an extra byte for the null terminator. Here's a full example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
    char *source_str = "string_read_from_somewhere";
    size_t len = strlen(source_str); /* strlen returns size_t, not int */
    char *target_str = malloc(len + 1); /* +1 for the null terminator */
    if (!target_str) {
        fprintf(stderr, "%s:%d: malloc failed\n", __FILE__, __LINE__);
        return 1;
    }
    memcpy(target_str, source_str, len + 1);
    puts(target_str);
    free(target_str);
    return 0;
}
Also, there's no need to cast the result of malloc. Don't forget to free the allocated memory.
As mentioned in the comments, you probably want to restrict the size of the malloced string to a sensible amount.

C programming: how to fill an array?

I know strncpy(s1, s2, n) copies n elements of s2 into s1, but that only fills it from the beginning of s1.
For example
s1[10] = "Teacher"
s2[20] = "Assistant"
strncpy(s1, s2, 2) would yield s1[10] = "As", correct?
Now what if I want s1 to contain "TeacherAs"? How would I do that? Is strncpy the appropriate thing to use in this case?
You can use strcat() to concatenate strings, however you don't want all of the source string copied in this case, so you need to use something like:
size_t len = strlen(s1);
strncpy(s1 + len, s2, 2);
s1[len + 2] = '\0';
(Add terminating nul; thanks #FatalError).
Which is pretty horrible, and you need to worry about the amount of space remaining in the destination array.
There is also strncat(), which is part of standard C and much simpler to use:
strncat(s1, s2, 2);
Use strcat.
Make sure your string you're appending to is big enough to hold both strings. In your case it isn't.
From the link above:
char * strcat ( char * destination, const char * source );
Concatenate strings
Appends a copy of the source string to the destination string. The terminating null character in destination is overwritten by the first character of source, and a null-character is included at the end of the new string formed by the concatenation of both in destination.
destination and source shall not overlap.
In order to achieve what you need you can use strlcat (but beware: it is not part of standard C, and some consider it insecure because it truncates silently)
strlcat(s1, s2, sizeof(s1));
This appends to s1 as much of s2 as fits, stopping when the size of s1 is reached (this avoids a buffer overflow).
With s1[10] you will then get the string TeacherAs plus a NUL char to terminate it in s1.
You need to make sure that enough memory is allocated for the resulting string.
s1[10]
is only just enough space to fit 'TeacherAs' plus its terminating '\0', and too small for anything longer.
From there, you'll want to do something like
//make sure s1 is big enough to hold s1+s2
char s1[40]="Teacher";
char s2[20]="Assistant";
//how many chars from second string you want to append
int offset = 2;
//allocate a temp buffer
char subbuff[20];
//copy n chars to buffer
memcpy( subbuff, s2, offset );
//null terminate buff right after the copied chars
subbuff[offset]='\0';
//do the actual cat
strcat(s1,subbuff);
I'd suggest using snprintf(), like:
size_t len = strlen(s1);
snprintf(s1 + len, sizeof(s1) - len, "%.2s", s2);
snprintf() will always nul terminate and won't overrun your buffer. Plus, it's standard as of C99. As a note, this assumes that s1 is an array declared in the current scope so that sizeof works, otherwise you'll need to provide the size.
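A sketch of the snprintf() approach as a reusable function follows. The name append_prefix, its parameters, and the -1 return on truncation are assumptions made for illustration; passing the buffer size explicitly avoids the sizeof caveat mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append at most n characters of src to the string
   already in dst (capacity dst_size bytes). Returns 0 if everything fit,
   -1 if the result was truncated or snprintf reported an error. */
static int append_prefix(char *dst, size_t dst_size, const char *src, int n)
{
    assert(dst != NULL && src != NULL && n >= 0);

    size_t len = strlen(dst);
    assert(len < dst_size);          /* dst must already hold a valid string */

    /* "%.*s" prints at most n characters of src; snprintf never writes
       more than dst_size - len bytes, terminator included. */
    int written = snprintf(dst + len, dst_size - len, "%.*s", n, src);
    if (written < 0 || (size_t)written >= dst_size - len)
        return -1;
    return 0;
}
```

With char s1[10] = "Teacher", the call append_prefix(s1, sizeof s1, "Assistant", 2) leaves "TeacherAs" in s1; a second identical call has no room left and returns -1 without overrunning the array.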

Fatal error in wchar_t* to char* conversion

Here is a C code that converts a wchar_t* string into a char* string :
wchar_t *myXML = L"<test/>";
size_t length;
char *charString;
size_t i;
length = wcslen(myXML);
charString = (char *)malloc(length);
wcstombs_s(&i, charString, length, myXML, length);
The code compiles, but at execution it detects a fatal error at the last line and stops running.
Now, if I replace the last line with this one :
wcstombs_s(&i, charString, length+1, myXML, length);
I just added +1 to the third argument. Then it works perfectly...
Why is there a need to add this trick ? Or is there a flaw elsewhere in my code ?
You need one extra byte for the '\0' terminator character. wcslen does not include this in the length it returns!
To do this properly, you don't just need to pass length+1 to wcstombs_s but also to malloc:
charString = (char *)malloc(length+1);
wcstombs_s(&i, charString, length+1, myXML, length);
And even then, I suspect it will not work correctly. Not all wide characters can be mapped to a single char, so for non-ASCII characters you will need extra space in the multi-byte string.
DESCRIPTION
The wcslen() function is the wide-character
equivalent of the strlen(3) function. It determines
the length of the wide-character string pointed to by
s, not including the terminating L'\0' character.
The trick is that you should always look for code of the form:
string = malloc(len);
very suspiciously, because both wcslen(3) and strlen(3) return the string length without the nul byte, and malloc(3) must allocate the space with that byte. C kinda sucks sometimes.
So every time you see string = malloc(len); rather than string = malloc(len+1);, be very careful to read how len gets assigned.
charString = (char *)malloc(length + 1);
Ought to do the trick. :)
EDIT:
Better would be to ask wcstombs() for the size to allocate in the first place:
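size_t len = wcstombs(NULL,src,0); /* bytes needed, without the nul */
if (len == (size_t)-1) /* handle error */ ...
len += 1; /* room for the nul */
char *dest = malloc(len);
if (wcstombs(dest, src, len) == (size_t)-1) /* handle error */ ...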
The +1 allocates for the ascii nul, and wcstombs() will report how much memory is required to do the conversion. It'll do the conversion twice, once to keep track of the memory required, and then once to store the result, but it will be MUCH simpler to maintain. The second time, when it stores the result, it will write at most len bytes including the ascii nul.
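The two-pass idea can be sketched as a complete function. The name wide_to_multibyte is hypothetical, and the sketch assumes an implementation where wcstombs() with a NULL destination returns the required byte count (POSIX specifies this, and common C libraries behave that way). The result depends on the current locale.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a wide string to a newly allocated
   multibyte string in the current locale. Returns NULL on a conversion
   or allocation failure. Caller must free(). */
static char *wide_to_multibyte(const wchar_t *src)
{
    assert(src != NULL);

    /* First pass: ask how many bytes are needed, terminator excluded. */
    size_t needed = wcstombs(NULL, src, 0);
    if (needed == (size_t)-1)
        return NULL;                 /* a character could not be converted */

    char *dest = malloc(needed + 1); /* +1 for the terminating '\0' */
    if (dest == NULL)
        return NULL;

    /* Second pass: convert for real, terminator included. */
    if (wcstombs(dest, src, needed + 1) == (size_t)-1) {
        free(dest);
        return NULL;
    }
    return dest;
}
```

Because the size comes from wcstombs() itself rather than from wcslen(), characters that need more than one byte are accounted for.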
