C programming: how to fill an array? - c

I know strncpy(s1, s2, n) copies n elements of s2 into s1, but that only fills it from the beginning of s1.
For example
s1[10] = "Teacher"
s2[20] = "Assistant"
strncpy(s1, s2, 2) would yield s1[10] = "As", correct?
Now what if I want s1 to contain "TeacherAs"? How would I do that? Is strncpy the appropriate thing to use in this case?

You can use strcat() to concatenate strings; however, you don't want all of the source string copied in this case, so you need to use something like:
size_t len = strlen(s1);
strncpy(s1 + len, s2, 2);
s1[len + 2] = '\0';
(Add terminating nul; thanks @FatalError).
This is pretty horrible, and you need to worry about the amount of space remaining in the destination array: s1 must have room for len + 3 bytes, including the terminator.
There is also strncat(), part of the standard C library, which is much simpler to use. It appends at most n characters and always writes a terminating nul:
strncat(s1, s2, 2);
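strncat() does not know how big the destination is, so the space check is still up to the caller. A minimal sketch of that check, assuming dst already holds a terminated string; append_n and its parameter names are hypothetical, not part of any library:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append at most n characters of src to dst, where
 * dst_size is the total size of the dst array. Returns 0 on success and
 * -1 (leaving dst unchanged) if the result would not fit. */
int append_n(char *dst, size_t dst_size, const char *src, size_t n)
{
    size_t used = strlen(dst);        /* assumes dst is already terminated */
    size_t add = strlen(src);
    if (add > n)
        add = n;                      /* strncat copies at most n characters */
    if (add >= dst_size - used)       /* need room for add chars plus '\0' */
        return -1;
    strncat(dst, src, n);             /* always writes the terminator */
    return 0;
}
```

With char s1[10] = "Teacher", calling append_n(s1, sizeof s1, "Assistant", 2) leaves "TeacherAs" in s1, which fills the array exactly; any further append is refused.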

Use strcat.
Make sure the string you're appending to is big enough to hold both strings. In your case it isn't: "TeacherAssistant" needs 17 bytes including the terminator, and s1 has 10.
From the link above:
char * strcat ( char * destination, const char * source );
Concatenate strings
Appends a copy of the source string to the destination string. The terminating null character in destination is overwritten by the first character of source, and a null-character is included at the end of the new string formed by the concatenation of both in destination.
destination and source shall not overlap.

To achieve what you need you can use strlcat (but beware: it is a BSD extension rather than standard C, so some C libraries do not provide it, and it truncates silently)
strlcat(s1, s2, sizeof(s1));
This appends as much of s2 to s1 as fits within sizeof(s1) bytes, which avoids overflowing the buffer.
With s1[10], s1 then holds the string TeacherAs plus a NUL char to terminate it.

You need to make sure that enough memory is allocated for the resulting string.
s1[10]
holds 'TeacherAs' with no room to spare (9 characters plus the terminator), so anything longer overflows it.
From there, you'll want to do something like
//make sure s1 is big enough to hold s1+s2
char s1[40] = "Teacher";
char s2[20] = "Assistant";
//how many chars from second string you want to append
int offset = 2;
//allocate a temp buffer
char subbuff[20];
//copy n chars to buffer
memcpy(subbuff, s2, offset);
//null terminate buff, right after the copied chars
subbuff[offset] = '\0';
//do the actual cat
strcat(s1, subbuff);

I'd suggest using snprintf(), like:
size_t len = strlen(s1);
snprintf(s1 + len, sizeof(s1) - len, "%.2s", s2);
snprintf() will always nul terminate and won't overrun your buffer. Plus, it's standard as of C99. As a note, this assumes that s1 is an array declared in the current scope so that sizeof works, otherwise you'll need to provide the size.
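When s1 is passed around as a pointer, the size has to travel with it. The sketch below shows one way to package that, using snprintf()'s return value to detect truncation; append_prefix is a hypothetical name, and the precision is passed through "%.*s" instead of being hard-coded:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append at most n characters of src to dst using
 * snprintf. dst_size is the total size of dst, which must already hold
 * a terminated string. Returns 1 if everything fit, 0 if truncated. */
int append_prefix(char *dst, size_t dst_size, const char *src, int n)
{
    size_t len = strlen(dst);
    /* snprintf returns the length it wanted to write, excluding '\0' */
    int wanted = snprintf(dst + len, dst_size - len, "%.*s", n, src);
    return wanted >= 0 && (size_t)wanted < dst_size - len;
}
```

snprintf() still terminates the output when it truncates, so the destination is a valid string either way; the return value only tells the caller whether anything was cut off.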

Related

C - memcpy with char * with length greater than source string length

I have the following code in C now
int length = 50;
char *target_str = (char*) malloc(length);
char *source_str = read_string_from_somewhere(); // read a string from somewhere
// with length, say 20
memcpy(target_str, source_str, length);
The scenario is that target_str is initialized with 50 bytes. source_str is a string of length 20.
If I want to copy the source_str to target_str i use memcpy() as above with length 50, which is the size of target_str. The reason I use length in memcpy is that, the source_str can have a max value of length but is usually less than that (in the above example its 20).
Now, if I want to copy till length of source_str based on its terminating character ('\0'), even if memcpy length is more than the index of terminating character, is the above code a right way to do it? or is there an alternative suggestion.
Thanks for any help.
The scenario is that target_str is initialized with 50 bytes. source_str is a string of length 20.
If I want to copy the source_str to target_str i use memcpy() as above with length 50, which is the size of target_str.
Currently you ask memcpy to read 30 bytes past the end of the source string, because memcpy does not care about a null terminator in the source. This is undefined behavior.
Because you are copying a string, you can use strcpy rather than memcpy.
But the size problem can also be reversed: the target can be smaller than the source, and without protection you again get undefined behavior.
So you can use strncpy with the length of the target; just take care to add a final null character in case the target is smaller than the source:
int length = 50;
char *target_str = (char*) malloc(length);
char *source_str = read_string_from_somewhere(); // length unknown
strncpy(target_str, source_str, length - 1); // -1 to let place for \0
target_str[length - 1] = 0; // force the presence of a null character at end in case
If I want to copy the source_str to target_str i use memcpy() as above
with length 50, which is the size of target_str. The reason I use
length in memcpy is that, the source_str can have a max value of
length but is usually less than that (in the above example its 20).
It is crucially important to distinguish between
the size of the array to which source_str points, and
the length of the string, if any, to which source_str points (+/- the terminator).
If source_str is certain to point to an array of length 50 or more then the memcpy() approach you present is ok. If not, then it produces undefined behavior when source_str in fact points to a shorter array. Any result within the power of your C implementation may occur.
If source_str is certain to point to a (properly-terminated) C string of no more than length - 1 characters, and if it is its string value that you want to copy, then strcpy() is more natural than memcpy(). It will copy all the string contents, up to and including the terminator. This presents no problem when source_str points to an array shorter than length, so long as it contains a string terminator.
If neither of those cases is certain to hold, then it's not clear what you want to do. The strncpy() function may cover some of those cases, but it does not cover all of them.
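For the common case where the source is a properly terminated string of unknown length and the destination has a fixed capacity, a sketch like the following copies up to the terminator or up to the capacity, whichever comes first. copy_bounded is a hypothetical name; it assumes src is either terminated or at least dst_size - 1 bytes long:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the string at src into dst (dst_size bytes),
 * stopping at the terminator or when dst is full, whichever comes first.
 * Returns the number of characters copied, not counting the '\0'. */
size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;
    /* look for the terminator within the first dst_size - 1 bytes only */
    const char *end = memchr(src, '\0', dst_size - 1);
    size_t n = end ? (size_t)(end - src) : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

Unlike memcpy(target_str, source_str, length), this never reads past the source's terminator, and unlike a bare strcpy() it never writes past the end of the target.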
Now, if I want to copy till length of source_str based on its terminating character ('\0'), even if memcpy length is more than the index of terminating character, is the above code a right way to do it?
No; you'd be copying the entire content of source_str, even past the null-terminator if it occurs before the end of the allocated space for the string it is pointing to.
If your concern is minimizing the auxiliary space used by your program, what you could do is use strlen to determine the length of source_str, and allocate target_str based on that. Also, strcpy is similar to memcpy but is specifically intended for null-terminated strings (observe that it has no "size" or "length" parameter):
char *target_str = NULL;
char *source_str = read_string_from_somewhere();
size_t len = strlen(source_str);
target_str = malloc(len + 1);
strcpy(target_str, source_str);
// ...
free(target_str);
target_str = NULL;
memcpy is used to copy fixed blocks of memory, so if you want to copy something shorter that is terminated by '\0' you don't want to use memcpy.
There are other functions like strncpy or strlcpy that do similar things.
Best to check what the implementations do. I removed the optimized versions from the original source code for the sake of readability.
This is an example memcpy implementation: https://git.musl-libc.org/cgit/musl/tree/src/string/memcpy.c
void *memcpy(void *restrict dest, const void *restrict src, size_t n)
{
unsigned char *d = dest;
const unsigned char *s = src;
for (; n; n--) *d++ = *s++;
return dest;
}
It's clear that both pieces of memory are visited n times, regardless of the length of the source or destination string, so memory past the end of your string is copied if the string is shorter than n. That is bad and can cause various unwanted behavior.
this is strlcpy from: https://git.musl-libc.org/cgit/musl/tree/src/string/strlcpy.c
size_t strlcpy(char *d, const char *s, size_t n)
{
char *d0 = d;
if (!n--) goto finish;
for (; n && (*d=*s); n--, s++, d++);
*d = 0;
finish:
return d-d0 + strlen(s);
}
The trick here is that n && (*d = *s) evaluates to false as soon as the terminating '\0' has been copied (or n reaches 0), which breaks the looping condition and exits early.
Hence this gives you the wanted behaviour.
Use strlen to determine the exact size of source_string and allocate accordingly, remembering to add an extra byte for the null terminator. Here's a full example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
const char *source_str = "string_read_from_somewhere";
size_t len = strlen(source_str);
char *target_str = malloc(len + 1);
if (!target_str) {
fprintf(stderr, "%s:%d: malloc failed\n", __FILE__, __LINE__);
return 1;
}
memcpy(target_str, source_str, len + 1);
puts(target_str);
free(target_str);
return 0;
}
Also, there's no need to cast the result of malloc. Don't forget to free the allocated memory.
As mentioned in the comments, you probably want to restrict the size of the malloced string to a sensible amount.

Separate an array in C

So i have a buffer (array) :
char *buf;
buf = malloc(1024);
the buf is like "foo\0bar\0foo\0bar\0\0\0\0\0\0\0..."
It contains strings separated by the null terminator. I need to separate every string. I tried using strtok() with \0 as the delimiter but of course it didn't work. How can I achieve that? Also, afterwards each string needs to be "copied" somewhere else.
You can go through the array and copy every character except the \0 into another array/struct depending on what that "somewhere else" needs to be. So every string would end at \0.
Since what you have is not actually a string but a character array that may contain nulls, you can use the memchr function to search for nulls in the array. Then you can use strncpy or strcpy to copy out the individual strings.
char *p = buf;
char *list[1024];
int cnt = 0;
while (p) {
char *n = memchr(p, 0, 1024 - (p-buf));
if (n) {
list[cnt++] = strdup(p);
} else {
int size = 1024 - (p-buf);
list[cnt] = malloc(size + 1);
strncpy(list[cnt], p, size);
list[cnt++][size] = 0;
}
p = n;
if (p) p++;
}
We start by setting p to the beginning of buf. Then on each iteration, we use memchr to look for the next null byte between p and the end of the array. If we find one, we can treat p as a string and use strdup to allocate space for and duplicate the string. If we don't find a null, we copy the remaining bytes to a newly allocated buffer and manually add a null byte.
Note that you'll need to know how large your buffer is so that you don't read past the end of it.
EDIT:
There was an issue with the code as originally written. After one iteration, p was pointing to a null byte, so memchr would keep returning a pointer to that byte. I added an increment past that byte at the end of the loop so it isn't checked again.
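If the strings do not need to outlive buf, the same memchr walk can simply record a pointer to the start of each string instead of duplicating it. A sketch under the assumption that the first empty string marks the end of the data (as in the zero-filled tail of the question's buffer); split_nul and its parameters are hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: record a pointer to each NUL-terminated string in
 * buf[0..size) into out[], stopping at the first empty string, at an
 * unterminated tail, or when out[] is full. Returns the count stored. */
size_t split_nul(const char *buf, size_t size, const char **out, size_t max_out)
{
    size_t count = 0;
    const char *p = buf;
    const char *end = buf + size;

    while (p < end && count < max_out) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));
        if (nul == NULL || nul == p)   /* unterminated tail or empty string */
            break;
        out[count++] = p;              /* p is a valid C string */
        p = nul + 1;                   /* step past the terminator */
    }
    return count;
}
```

Each out[i] points into buf, so it can be handed to strdup() or strcpy() if a separate copy is needed.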

Do I need to assign \0 to malloc strings in c?

I have the following code to declare and assign a string on the heap.
char *string = malloc(10);
string[9] = '\0';
strncpy(string, "welcometotherealworld", 9);
printf("string: %s\n", string);
Do I have to manually set the \0 to ensure the string ends? string[9] = '\0';
Or, does strncpy do this for me?
Two things: First malloc(10) reserves 10 bytes, string[10] addresses the eleventh byte, so that is illegal. Second: Yes you have to set string[9] to null, because according to the standard strncpy does not ensure the string is null terminated if the source string is longer than count.
strncpy does not null terminate the destination array if the length of the source string (the second argument) is greater than or equal to the value of the third argument.
So here:
strncpy(string, "welcometotherealworld", 9);
strncpy will not null terminate.
welcometotherealworld is definitely longer than 9 characters, so strncpy should not implicitly add a terminating character.
strncpy(dest, source, n) copies at most n bytes from the buffer pointed to by source into the buffer pointed to by dest. However, if strlen(source) is greater than or equal to n, then strncpy will simply copy the first n bytes and will not terminate the string dest with a null byte because there is no space for it. Therefore to ensure that the buffer dest is always null terminated, you must do it yourself. What you are doing will always keep your buffer pointed to by string null terminated.
strcpy will always terminate the destination string with a \0. strncpy also normally NUL-terminates the string, but may not. It copies a maximum number of bytes (n) of the string, but unfortunately (in terms of a useful API) does not copy in a NUL if the number of bytes to be copied (i.e. the length of the source string including the terminating NUL) exceeds the length specified (n). So, if you copy the string "1234567890" (which is ten characters plus a NUL, so 11) and pass 10 as the last argument to strncpy, you will get an unterminated string of 10 characters.
Here are some safe routes around this:
dest = malloc(10); /* allocate ten bytes */
strncpy (dest, src, 10); /* copy up to 10 bytes */
dest[9] = 0; /* overwrite the 10th byte with a zero if it was not one already */
dest = malloc(10); /* allocate ten bytes */
strncpy (dest, src, 9); /* copy up to 9 bytes */
dest[9] = 0; /* make the 10th byte zero */
dest = calloc(10, 1); /* allocate and zero ten bytes */
strncpy (dest, src, 9); /* copy up to 9 bytes, leaving the NULL in */
To ensure a proper '\0' ending, code needs to set the '\0'.
malloc() does not initialize the data in string.
char *string = malloc(10);
strncpy(string, "welcometotherealworld", 9 /* or 10 */);
string[9] = '\0';
strncpy(char *s1, const char *s2, size_t n) writes n char to s1.
It first copies min(n, strlen(s2)) char from s2.
If more are needed, the remainder of s1 is filled with null characters.
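Both behaviours (padding when the source is short, no terminator when it is long) can be hidden behind a small wrapper. This is only a sketch; copy_terminated is a hypothetical name and dst_size is assumed to be the real size of the destination:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: strncpy into a dst of dst_size bytes and always
 * terminate. Returns 1 if src fit completely, 0 if it was truncated. */
int copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;
    strncpy(dst, src, dst_size - 1);   /* pads with '\0' if src is short */
    dst[dst_size - 1] = '\0';          /* covers the case where src is long */
    return strlen(src) < dst_size;
}
```

With a 10-byte destination, "welcometotherealworld" is stored as "welcometo" and the function reports truncation; a short source such as "hi" is copied whole and the rest of the first 9 bytes is zero-filled by strncpy.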

Fatal error in wchar_t* to char* conversion

Here is a C code that converts a wchar_t* string into a char* string :
wchar_t *myXML = L"<test/>";
size_t length;
char *charString;
size_t i;
length = wcslen(myXML);
charString = (char *)malloc(length);
wcstombs_s(&i, charString, length, myXML, length);
The code compiles, but at execution it hits a fatal error at the last line and stops running.
Now, if I replace the last line with this one :
wcstombs_s(&i, charString, length+1, myXML, length);
I just added +1 to the third argument. Then it works perfectly...
Why is there a need to add this trick ? Or is there a flaw elsewhere in my code ?
You need one extra byte for the '\0' terminator character. wcslen does not include this in the length it returns!
To do this properly, you don't just need to pass length+1 to wcstombs_s but also to malloc:
charString = (char *)malloc(length+1);
wcstombs_s(&i, charString, length+1, myXML, length);
And even then, I suspect it will not work correctly. Not all wide characters can be mapped to a single char, so for non-ASCII characters you will need extra space in the multi-byte string.
DESCRIPTION
The wcslen() function is the wide-character
equivalent of the strlen(3) function. It determines
the length of the wide-character string pointed to by
s, not including the terminating L'\0' character.
The trick is that you should always look for code of the form:
string = malloc(len);
very suspiciously, because both wcslen(3) and strlen(3) return the string length without the nul byte, and malloc(3) must allocate the space with that byte. C kinda sucks sometimes.
So every time you see string = malloc(len); rather than string = malloc(len+1);, be very careful to read how len gets assigned.
charString = (char *)malloc(length + 1);
Ought to do the trick. :)
EDIT:
Better would be to ask wcstombs() for the size to allocate in the first place:
size_t len = wcstombs(NULL, src, 0);
if (len == (size_t)-1) /* handle error */ ...
len += 1;
char *dest = malloc(len);
len = wcstombs(dest, src, len);
if (len == (size_t)-1) /* handle error */ ...
The +1 allocates for the ascii nul, and wcstombs() will report how much memory is required to do the conversion. It'll do the conversion twice, once to keep track of the memory required, and then once to store the result, but it will be MUCH simpler to maintain. The second time, when it stores the result, it will write at most len bytes including the ascii nul.
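Put together as a function, the two-pass approach might look like the sketch below. wide_to_multibyte is a hypothetical name; the code relies on wcstombs() accepting a NULL destination to report the required size, which POSIX documents, and the result depends on the current locale:

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a wide string to a newly allocated
 * multibyte string in the current locale. Returns NULL on failure;
 * the caller frees the result. */
char *wide_to_multibyte(const wchar_t *src)
{
    size_t len = wcstombs(NULL, src, 0);    /* first pass: size only */
    if (len == (size_t)-1)
        return NULL;                        /* unconvertible character */

    char *dest = malloc(len + 1);           /* +1 for the terminator */
    if (dest == NULL)
        return NULL;

    if (wcstombs(dest, src, len + 1) == (size_t)-1) {
        free(dest);
        return NULL;
    }
    return dest;
}
```

Checking for (size_t)-1 before adding 1 matters: adding 1 to the error value wraps around to 0, which would hide the failure.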

Copying some strings from pointer array in C++

I have a string pointer like below,
char *str = "This is cool stuff";
Now, I've references to this string pointer like below,
char* start = str + 1;
char* end = str + 6;
So, start and end point to different locations within str. How can I copy the characters that fall between start and end into a new string pointer? Any existing C++/C function is preferable.
Just create a new buffer called dest and use strncpy
char dest[end-start+1];
strncpy(dest,start,end-start);
dest[end-start] = '\0';
Use STL std::string:
#include <string>
const char *str = "This is cool stuff";
std::string part( str + 1, str + 6 );
This uses iterator range constructor, so the part of the C-string does not have to be zero-terminated.
It's best to do this with memcpy(), and terminate the result yourself. The standard strncpy() function has very strange semantics.
If you really want a "new string pointer", and be a bit safe with regard to lengths and static buffers, you need to dynamically allocate the new string:
char * ranged_copy(const char *start, const char *end)
{
char *s;
s = malloc(end - start + 1);
memcpy(s, start, end - start);
s[end - start] = 0;
return s;
}
If you want to do this with C++ STL:
#include <string>
...
std::string cppStr (str, 1, 5); // copy 5 characters starting at index 1, i.e. the range [start, end)
const char *newStr = cppStr.c_str(); // make new char* from substring
char *newChar = new char[end - start + 1];
char *p = newChar;
while (start < end)
*p++ = *start++;
*p = '\0'; // terminate the copy
This is one of the rare cases when function strncpy can be used. Just calculate the number of characters you need to copy and specify that exact amount in the strncpy. Remember that strncpy will not zero-terminate the result in this case, so you'll have to do it yourself (which, BTW, means that it makes more sense to use memcpy instead of the virtually useless strncpy).
And please, do yourself a favor, start using const char * pointers with string literals.
Assuming that end follows the idiomatic semantics of pointing just past the last item you want copied (STL semantics are a useful idiom even if we're dealing with straight C) and that your destination buffer is known to have enough space:
memcpy( buf, start, end-start);
buf[end-start] = '\0';
I'd wrap this in a sub-string function that also took the destination buffer size as a parameter so it could perform a check and truncate the result or return an error to prevent overruns.
I'd avoid using strncpy() because too many programmers forget about the fact that it might not terminate the destination string, so the second line might be mistakenly dropped at some point by someone believing it unnecessary. That's less likely if memcpy() were used. (In general, just say no to using strncpy())
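A minimal sketch of the sub-string function described above, choosing to return an error when the range does not fit (truncating would be an equally valid design). copy_range is a hypothetical name, and end is assumed to point just past the last character to copy:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the characters in [start, end) into buf
 * (buf_size bytes) and terminate. Returns 0 on success, -1 if the range
 * is invalid or does not fit (buf is left untouched in that case). */
int copy_range(char *buf, size_t buf_size, const char *start, const char *end)
{
    if (end < start)
        return -1;
    size_t n = (size_t)(end - start);
    if (n >= buf_size)                 /* need n characters plus '\0' */
        return -1;
    memcpy(buf, start, n);
    buf[n] = '\0';
    return 0;
}
```

For str = "This is cool stuff", copy_range(out, sizeof out, str + 1, str + 6) stores "his i" in out, provided out has at least 6 bytes.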

Resources