C Initialize Character Array from Character Pointer - c

My question should be rather simple.
I need to pass a function a char array of a predefined length, but what I have is a character pointer to a string of variable length (never longer than my array).
Here is the code:
#define SIZE_MAX_PERSON_NAME 50
char person[SIZE_MAX_PERSON_NAME];
char* currentPerson = "John";
Now, how would I get John into the person array while also setting the rest of the array to 0 (NUL)?
so that I would have
BINARY DATA: "John/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL....."
in my memory?
sorry if this is overly stupid, but I can't seem to find a solution right now.

First, zero-initialize the fixed-size array :
// Using memset here because I don't know if the whole copy operation can or will be used
// multiple times. We want to be sure that the array is properly zero-initialized if the next
// string to copy is shorter than the previous.
memset(person, '\0', SIZE_MAX_PERSON_NAME);
Then, copy the variable-size string into it :
strcpy(person, currentPerson);
If you are not certain that currentPerson will fit into person :
strncpy(person, currentPerson, SIZE_MAX_PERSON_NAME - 1);
Note that strncpy also zero-fills the remaining bytes (up to the count you pass) if
strlen(currentPerson) < SIZE_MAX_PERSON_NAME - 1
So you basically have these two options :
memset(person, '\0', SIZE_MAX_PERSON_NAME);
strcpy(person, currentPerson);
Or :
strncpy(person, currentPerson, SIZE_MAX_PERSON_NAME - 1);
person[SIZE_MAX_PERSON_NAME - 1] = '\0';
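The second option can be wrapped in a small helper. This is only a sketch: the name fill_person is made up for illustration, and it assumes the buffer always has exactly SIZE_MAX_PERSON_NAME bytes.

```c
#include <string.h>

#define SIZE_MAX_PERSON_NAME 50

/* Hypothetical helper (not part of any library): copies src into a
   fixed-size name buffer, zero-fills the unused tail and always
   leaves the buffer terminated, truncating src if it is too long. */
void fill_person(char person[SIZE_MAX_PERSON_NAME], const char *src)
{
    /* strncpy pads with '\0' up to the given count when src is shorter */
    strncpy(person, src, SIZE_MAX_PERSON_NAME - 1);
    /* the last byte is never written by the call above, so set it here */
    person[SIZE_MAX_PERSON_NAME - 1] = '\0';
}
```

Calling fill_person(person, currentPerson) should leave "John" followed by NUL bytes in every remaining position, whatever the buffer held before.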

After this answer was posted the question was retagged from C++ to C.
Use a std::string, like this:
// "using namespace std;" or "using std::string;", then:
string const person = currentPerson;
old_c_function( person.c_str() );
To do things at the C level, which I recommend that you don't, first replace the unnecessary #define with a typed constant:
int const max_person_name_size = 50;
Then zero-initialize your array:
char person[max_person_name_size] = {};
(Note: no silly memset here.)
(Also note: this zeroing is only a preventive measure. You wanted it. But it's not really necessary since strcpy will ensure a trailing zero-byte.)
Then just copy in the string:
assert( strlen( currentPerson ) < max_person_name_size );
strcpy( person, currentPerson );
But don't do this. Use std::string instead.
Update: doing other things for some minutes made me realize that this answer, as all the others so far, is completely off the mark. The OP states in a comment elsewhere that
” I've got a function in the library which only takes a character array. Not a character pointer.
Thus, apparently it's all about a misconception.
The only way this can make sense is if the array is modified by the function, and then std::string::c_str() is not a solution. But a std::string can still be used, if its length is set to something sufficient for the C function. Can go like this:
person.resize( max_person_name_size );
foo( &person[0] ); // Assuming foo modifies the array.
person.resize( strlen( person.c_str() ) );

With literal, you may do:
char person[SIZE_MAX_PERSON_NAME] = "John";
If the C string is not a literal, you have to copy it with strcpy:
strcpy(person, currentPerson);

This is the one and only reason for the existence of strncpy:
Putting a string (up to the 0-terminator or buffer end) into a fixed-length array and zeroing out the rest.
strncpy does not ensure 0-termination, so avoid it for anything else.
7.24.2.4 The strncpy function
#include <string.h>
char *strncpy(char * restrict s1, const char * restrict s2, size_t n);
2 The strncpy function copies not more than n characters (characters that follow a null character are not copied) from the array pointed to by s2 to the array pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.
3 If the array pointed to by s2 is a string that is shorter than n characters, null characters are appended to the copy in the array pointed to by s1, until n characters in all have been written.
4 The strncpy function returns the value of s1.
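As an illustration of that fixed-length use, here is a sketch of a padded field that is allowed to be unterminated when completely full. FIELD_WIDTH, field_set and field_len are hypothetical names chosen for this example.

```c
#include <string.h>

#define FIELD_WIDTH 8  /* assumed width of the fixed-length field */

/* Store s in a fixed-width field: padded with '\0' when s is short,
   NOT terminated when s has FIELD_WIDTH or more characters. */
void field_set(char field[FIELD_WIDTH], const char *s)
{
    strncpy(field, s, FIELD_WIDTH);
}

/* Length of the text in the field, without ever reading past
   FIELD_WIDTH (strlen would be unsafe on a full field). */
size_t field_len(const char field[FIELD_WIDTH])
{
    const char *end = memchr(field, '\0', FIELD_WIDTH);
    return end ? (size_t)(end - field) : FIELD_WIDTH;
}
```

Such a field has to be read back with bounded operations such as memchr or memcmp, because it is only a string when the text is shorter than the field.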

Related

Example for strncpy()

If I use the strncpy function on the strings cat and dog, I don't understand whether the \0 character is counted or not, so I would like to know if the end result will be catdo, or something like cat\0do.
strncpy("cat", "dog", 2);
You should not use the strncpy and strncat functions at all.
Their names start with str, but they do not really work with strings. In C, a string is defined as "a character sequence terminated by '\0'". These functions do not guarantee that the resulting character array is always null-terminated.
The better alternatives are strlcpy and strlcat, but these are not available everywhere.
Even better would be a separate string library in which determining the length of a string were a constant-time operation. But that gets distracting.
As torstenvl mentioned, "cat" and "dog" are string literals, so you're not using the function correctly here. The first parameter is the destination, the second parameter is the source, and the third parameter is the number of bytes to copy.
char *strncpy(char *restrict s1, const char *restrict s2, size_t n)
Source: The Open Group Base Specification - strncpy
To answer your specific question: the null terminator is copied only if it falls within the first n bytes of the source. Exactly n bytes are written: if your source string s2 is shorter than n bytes, the remainder is filled with null bytes until n bytes have been written; if it is n bytes or longer, no terminator is written at all.
In your question, it looks like you're trying to append the two strings. To do this in C, you first need to allocate a destination buffer, copy the first string into it, and then copy the second string, starting at the end of the first string. Depending on where you start the last step, you can end up with either "catdog\0" or "cat\0dog\0". This is a classic example of an off-by-one error.
To start, you have to calculate the length of the two strings you want to append. You can do this using strlen, from string.h. strlen does not count the null-terminator as part of the length, so remember that to get the length of the final string, you'll have to do strlen(s1) + strlen(s2) + 1.
You can then copy the first string over as you normally would. An easy way to do the second copy is to do this:
char* s2start = &finalString[strlen(s1)];
You can then do strncpy(s2start, s2, strlen(s2) + 1), and that way you know you're starting right on the s1 null terminator, avoiding the "cat\0dog" error.
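Put together, the steps described above might look like the following sketch. The function name concat is an assumption made for illustration, and it uses memcpy because both lengths are already known.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding s1
   followed by s2, or NULL if allocation fails. The caller frees it. */
char *concat(const char *s1, const char *s2)
{
    size_t len1 = strlen(s1);
    size_t len2 = strlen(s2);
    char *result = malloc(len1 + len2 + 1);  /* +1 for the terminator */

    if (result == NULL)
        return NULL;
    memcpy(result, s1, len1);                /* s1 without its terminator */
    memcpy(result + len1, s2, len2 + 1);     /* s2 with its terminator */
    return result;
}
```

The second copy starts at result + len1, which is exactly where the terminator of s1 would have been, so the result has no embedded null byte.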
Hope this helps, good luck.
When you write out a string literal like "cat" or "dog" in C, the resulting array must not be changed; trying to modify it results in undefined behavior. You can only pass literals where a function expects a const char * parameter; the const tells you that the function will not change the data. When you write "dog", the data in the character array will look something like this:
{'d','o','g','\0'}
Notice it is NUL terminated.
The function you are using:
char *strncpy(char *dest, const char *src, size_t n)
It copies src to dest with a maximum length of n. You cannot copy into "cat", as mentioned above: char *dest is not const, but const char *src is. So the source could be "cat" or "dog".
If you were to allocate space for the string you are allowed to modify it:
char cat_str[] = "cat";
Now the character array cat_str is initialized to "cat", but we can always change it. Note that its size will be 4 (one for each letter plus a NUL) because we did not specify the length. So be sure not to access anything past cat_str[3]; you can index it from 0 to 3.
There is a common misconception, shared by some static analysis tools, that strncpy is a safer version of strcpy. It's not; it has a different purpose. If you insist on using it to prevent buffer overflows, you need to be cognizant of the fact that, for its signature
char * strncpy ( char * destination, const char * source, size_t num );
No null-character is implicitly appended at the end of destination if source is longer than num. Thus, in this case, destination shall not be considered a null terminated C string (reading it as such would overflow).
So if you know that your source is a null terminated C string, then you can do the following:
#include <stdio.h>
#include <string.h>
int main()
{
const char* source = "dog";
char destination[4] = "cat";
printf("source is %s\n", source);
printf("destination is %s\n", destination);
/* the strlen+1 accounts for null termination on source */
/* but you need to be sure that source can fit into destination */
/* and still be null terminated - (that's on you the programmer) */
strncpy(destination, source, strlen(source) + 1);
printf("source is still %s\n", source);
printf("destination is now %s\n", destination);
return 0;
}
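If truncation is acceptable but an unterminated result is not, a small bounded copy can be written by hand. The sketch below follows strlcpy-style semantics; bounded_copy is a hypothetical name, not a standard function.

```c
#include <string.h>

/* Hypothetical helper with strlcpy-like behaviour: copies as much of
   src as fits into dst (dstsize bytes in total), always terminates
   when dstsize > 0, and returns strlen(src) so the caller can detect
   truncation by checking result >= dstsize. */
size_t bounded_copy(char *dst, size_t dstsize, const char *src)
{
    size_t srclen = strlen(src);

    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

Unlike strncpy, this never leaves dst unterminated and does not pad the rest of the buffer.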

writing myself strncat, unable to enlarge the size of the array C

As far as I understand, strncat enlarges the size of the array you want to concatenate to.
for example:
char str1[] = "This is str1";
char str2[] = "This is str2";
and here the length of str1 is 12 and str2 is also 12, but when I strncat them, str1 changes from 12 to 24.
I was asked to write strncat by my own, but I can't figure out how to enlarge the size of an array, taking in account that we didn't learn pointers yet.
I tried just putting every char in the end of the array while moving the distance by 1 each iteration, but as you would have thought, it doesn't put the data in the array because there is no such position like this in the array (str[20] when str's length is 10 for example).
Thanks in advance,
every help would be appreciated.
strlen returns the length of the string, that is, counts until the first null character. It does NOT return the size of the memory allocated for str1!
When you concatenate str2 to str1, you write beyond the memory allocated for str1. That causes undefined behavior. In your particular case, it seems nothing bad happens, and it even seems that str1 has become larger. That is not so. However (in your particular case), if str2 follows str1 in memory, you just overwrote str2. Try printing str2. It will probably print his is str2.
Since strcat() et al. does not enlarge a buffer, your implementation does not have to do it. (And it is simply not possible with the parameter list of strcat().) It is the caller's responsibility to pass a destination buffer big enough.
On the caller's side you can simply create an array big enough and pass its address. However, you can still use variable length arrays (VLA):
char str1[] = "This is str1";
char str2[] = "This is str2";
char str1str2[strlen(str1)+strlen(str2)+1];
strcpy( str1str2, str1 );
yourstrcat( str1str2, str2 );
str1str2 is big enough to store both contents plus 1 for the string terminator \0.
Thanks for everyone, I solved the problem. As some of you said, I don't need to enlarge the string, I just need to make sure it's big enough to contain all the data.
what I did eventually is this:
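void strnCat(char dest[], char src[], int length)
{
    int i = 0;
    int len = strlen(dest);
    /* copy at most length characters, stopping at the end of src */
    for(i=0; i < length && src[i] != 0; i++)
    {
        dest[len+i] = src[i];
    }
    dest[len+i] = 0; /* terminate once, after the last copied character */
}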
So my main problem was that I forgot to add the null at the end of the array to make it a string, and that I used strlen(str) instead of saving the length in a variable. I did that because I forgot that the string no longer has an end once the null is overwritten.
It is a really strange task to let students implement strncat, since this is one of the C functions that is very difficult to use correctly.
So to implement it yourself, you should read its specification in the C standard or in the POSIX standard. There you will find that strncat doesn't enlarge any array. By the way, arrays cannot be enlarged in C at all, it's impossible by definition. Note the careful distinction between the words array (can contain arbitrary bytes) and string (must contain one null byte) in the standard wording.
A saner alternative to implement is strlcat, which is not in the C standard but also widely known.
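For comparison, a bounded concatenation in the spirit of strlcat could be sketched as follows. The name bounded_cat is made up, and this is not the BSD implementation, only an approximation of its documented behaviour.

```c
#include <string.h>

/* Hypothetical strlcat-like helper: appends src to the string in dst,
   where dstsize is the TOTAL size of the dst array. The result is
   terminated whenever dst already held a terminated string. Returns
   the length the full result would have had, so a return value
   >= dstsize signals truncation. */
size_t bounded_cat(char *dst, size_t dstsize, const char *src)
{
    size_t dstlen = 0;
    size_t srclen = strlen(src);

    /* find the end of dst without reading past the array */
    while (dstlen < dstsize && dst[dstlen] != '\0')
        dstlen++;
    if (dstlen == dstsize)        /* no terminator found: leave dst alone */
        return dstsize + srclen;

    size_t room = dstsize - dstlen - 1;
    size_t n = srclen < room ? srclen : room;
    memcpy(dst + dstlen, src, n);
    dst[dstlen + n] = '\0';
    return dstlen + srclen;
}
```

The caller still owns the array and its size; the function only refuses to write past it.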

Assign string to element in structure in C

I have this structure:
typedef struct SM_DB
{
LIST_TYPE link;
char name[SM_NAME_SIZE];
} SM_DB_TYPE;
And I would like to assign a string to its 'name'. I am doing so like this:
SM_DB_TYPE one;
one.name = "Alpha";
However, after compiling I get an error: "error C2106: '=' : left operand must be l-value". I am hoping this is fairly obvious. Does anyone know what I am doing wrong?
Thanks
Assuming SM_NAME_SIZE is large enough you could just use strcpy like so:
strcpy(one.name, "Alpha");
Just make sure your destination has enough space to hold the string before doing strcpy, or you will get a buffer overflow.
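That check can be folded into a small setter. In this sketch the value of SM_NAME_SIZE is assumed, the link member is left out because LIST_TYPE is not shown in the question, and sm_set_name is a hypothetical name.

```c
#include <string.h>

#define SM_NAME_SIZE 32  /* assumed value; the question does not give it */

typedef struct SM_DB
{
    /* LIST_TYPE link; omitted here because its definition is unknown */
    char name[SM_NAME_SIZE];
} SM_DB_TYPE;

/* Hypothetical setter: copies name into entry->name only if it fits,
   including the terminating '\0'. Returns 0 on success, -1 otherwise. */
int sm_set_name(SM_DB_TYPE *entry, const char *name)
{
    if (strlen(name) >= sizeof entry->name)
        return -1;                 /* too long: leave the entry unchanged */
    strcpy(entry->name, name);
    return 0;
}
```

With this, sm_set_name(&one, "Alpha") takes the place of the failing assignment.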
If name were declared as a char * rather than an array (you cannot assign to an array, so this does not apply to the struct as posted), you could allocate exactly as much space as the string needs:
if(!(one.name = malloc(strlen("Alpha") + 1))) //+1 makes room for the NUL character that terminates C strings
{
//allocation failed
}
strcpy(one.name, "Alpha"); //strlen does not count the terminating '\0', but strcpy copies it
//do whatever with one.name
free(one.name); //release space previously allocated
Make sure you free one.name if using malloc so that you don't leak memory.
You can assign value to string only while declaring it. You can not assign it later by using =.
You have to use strcpy() function.
Use strcpy or strncpy to assign strings in C.
C does not have a built in string type. You must use an array of characters to hold the string.
Since C also does not allow the assignment of one array to another, you have to use the various functions in the Standard C Library to copy array elements from one array to another or you have to write a loop to do it yourself. Using the Standard C Library functions is much preferred though there are sometimes reasons to write your own loop.
For standard ANSI type strings used with the char type there are a large number of functions most of which begin with str such as functions to copy or compare strings strcpy(), strcmp(). There are also another set which you specify the maximum number of characters to copy or compare such as strncpy() or strncmp().
A string in C is an array of characters that is terminated by a binary zero character. So if you use a constant string such as "Constant" this will create an array of characters that has one element per character plus an additional element for the zero terminator.
This means that when sizing char arrays you must also remember to add one more extra array element to hold the zero terminator.
The strncpy() function will copy one char array to another up to either the maximum number of characters specified or when the zero terminator is found. If the maximum number of characters is reached then the destination array will not be terminated by a zero terminator so this is something to watch out for.
char one[10];
char two[20];
strncpy (one, "1234567", 10); // copy constant to the char buffer, max of 10 chars
one[9] = 0; // make sure the string is zero terminated; it already is here, this is just a demo
strcpy (two, one);
strcat (two, " suffix"); // add some more text to the end
There are also functions to work with wide characters used with UNICODE.
Use:
strcpy(one.name, "Alpha"); // strcpy appends the terminating null byte itself
Alternative:
typedef struct SM_DB
{
LIST_TYPE link;
char* name;
} SM_DB_TYPE;
SM_DB_TYPE one;
one.name = malloc(sizeof(char) * (strlen("Alpha") + 1)); //Allocate memory
if (!one.name) {
/* Error handling */
} else {
strcpy(one.name, "Alpha");
}

Wrong strlen output

I have the following piece of code in C:
char a[55] = "hello";
size_t length = strlen(a);
char b[length];
strncpy(b,a,length);
size_t length2 = strlen(b);
printf("%d\n", length); // output = 5
printf("%d\n", length2); // output = 8
Why is this the case?
it has to be 'b [length +1]'
strlen does not include the null character in the end of c strings.
You never initialized b to anything. Therefore its contents are undefined. The call to strlen(b) could read beyond the size of b and cause undefined behavior (such as a crash).
b is not initialized: it contains whatever is in your RAM when the program is run.
For the first string a, the length is 5 as it should be "hello" has 5 characters.
For the second string, b you declare it as a string of 5 characters, but you don't initialise it, so it counts the characters until it finds a byte containing the 0 terminator.
UPDATE: the following line was added after I wrote the original answer.
strncpy(b,a,length);
after this addition, the problem is that you declared b of size length, while it should be length + 1 to provision space for the string terminator.
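A corrected version of the snippet might look like this sketch. It is wrapped in a hypothetical function, copied_length, so the result can be checked, and it keeps the variable-length array from the question.

```c
#include <string.h>

/* Hypothetical wrapper around the corrected code from the question:
   copies a into a local buffer that has room for the terminator and
   returns the length of the copy. */
size_t copied_length(const char *a)
{
    size_t length = strlen(a);
    char b[length + 1];            /* +1 for the terminating '\0' */

    memcpy(b, a, length + 1);      /* copies the terminator as well */
    return strlen(b);              /* now well defined */
}
```

Also note that a size_t should be printed with %zu rather than %d.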
Others have already pointed out that you need to allocate strlen(a)+1 characters for b to be able to hold the whole string.
They've given you a set of parameters to use for strncpy that will (attempt to) cover up the fact that it's not really suitable for the job at hand (or almost any other, truth be told). What you really want is to just use strcpy instead. Also note, however, that as you've allocated it, b is also a local (auto storage class) variable. It's rarely useful to copy a string into a local variable.
Most of the time, if you're copying a string, you need to copy it to dynamically allocated storage -- otherwise, you might as well use the original and skip doing a copy at all. Copying a string into dynamically allocated storage is sufficiently common that many libraries already include a function (typically named strdup) for the purpose. If your library doesn't have that, it's fairly easy to write one of your own:
char *dupstr(char const *input) {
char *ret = malloc(strlen(input)+1);
if (ret)
strcpy(ret, input);
return ret;
}
[Edit: I've named this dupstr because strdup (along with anything else starting with str) is reserved for the implementation.]
Actually, the char array b is not terminated by '\0', so strlen has no way to know where it should stop counting the length of the string.
Its signature is size_t strlen(const char *s), and it returns the number of chars in the string before the '\0' (NUL char).
To avoid this, we have to declare b with length + 1 elements and append the NUL char (b[length] = '\0');
otherwise strlen keeps counting past the end of b until it happens to encounter a NUL byte.

Strings in C: pitfalls and techniques

I will be coaching an ACM Team next month (go figure), and the time has come to talk about strings in C. Besides a discussion on the standard lib, strcpy, strcmp, etc., I would like to give them some hints (something like str[0] is equivalent to *str, and things like that).
Do you know of any lists (like cheat sheets) or your own experience in the matter?
I'm already aware of the books for the ACM competition (which are good, see particularly this), but I'm after tricks of the trade.
Thank you.
Edit: Thank you very much everybody. I will accept the most voted answer, and have duly upvoted others which I think are relevant. I expect to do a summary here (like I did here, asap). I have enough material now and I'm certain this has improved the session on strings immensely. Once again, thanks.
It's obvious but I think it's important to know that strings are nothing more than an array of bytes, delimited by a zero byte.
C strings aren't all that user-friendly as you probably know.
Writing a zero byte somewhere in the string will truncate it.
Going out of bounds generally ends badly.
Never, ever use strcpy, strcmp, strcat, etc.; instead use their safe variants: strncmp, strncat, strndup, ...
Avoid strncpy. strncpy will not always zero delimit your string! If the source string doesn't fit in the destination buffer it truncates the string but it won't write a nul byte at the end of the buffer. Also, even if the source buffer is a lot smaller than the destination, strncpy will still overwrite the whole buffer with zeroes. I personally use strlcpy.
Don't use printf(string), instead use printf("%s", string). Try thinking of the consequences if the user puts a %d in the string.
You can't compare strings with if( s1 == s2 )
doStuff(s1);
You have to compare every character in the string. Use strcmp or better strncmp.
if( strncmp( s1, s2, BUFFER_SIZE ) == 0 )
doStuff(s1);
Abusing strlen() will dramatically worsen the performance.
for( int i = 0; i < strlen( string ); i++ ) {
processChar( string[i] );
}
will have at least O(n^2) time complexity, whereas
int length = strlen( string );
for( int i = 0; i < length; i++ ) {
processChar( string[i] );
}
will have at least O(n) time complexity. This is not so obvious for people who haven't taken time to think of it.
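Another way to avoid the repeated strlen() is to walk the string until the terminator, which needs no length at all. The function count_char below is a made-up example of that loop shape.

```c
#include <stddef.h>

/* Hypothetical example: counts how often target occurs in s with a
   single pass over the string and no call to strlen. */
size_t count_char(const char *s, char target)
{
    size_t count = 0;

    for (const char *p = s; *p != '\0'; p++) {
        if (*p == target)
            count++;
    }
    return count;
}
```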
The following functions can be used to implement a non-mutating strtok:
strcspn(string, delimiters)
strspn(string, delimiters)
The first one finds the first character in the set of delimiters you pass in. The second one finds the first character not in the set of delimiters you pass in.
I prefer these to strpbrk as they return the length of the string if they can't match.
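A non-mutating tokenizer built from these two functions could look roughly like this. The name next_token and its interface are assumptions made for the sketch; tokens are reported as a pointer plus a length because the input is never modified.

```c
#include <string.h>

/* Hypothetical non-mutating tokenizer. *cursor points into the text
   and is advanced past the token that is returned. Returns a pointer
   to the start of the next token and stores its length in *len, or
   returns NULL when no tokens remain. */
const char *next_token(const char **cursor, const char *delims, size_t *len)
{
    /* skip leading delimiters */
    const char *start = *cursor + strspn(*cursor, delims);

    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }
    /* the token runs up to the next delimiter or the end of the string */
    *len = strcspn(start, delims);
    *cursor = start + *len;
    return start;
}
```

Because the token is not terminated in place, it can be printed with printf("%.*s", (int)len, token) or copied out with memcpy.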
str[0] is equivalent to 0[str], or more generally str[i] is i[str] and i[str] is *(str + i).
NB
this is not specific to strings but it works also for C arrays
The strn* variants in stdlib do not necessarily null terminate the destination string.
As an example: from MSDN's documentation on strncpy:
The strncpy function copies the initial count characters of strSource to strDest and returns strDest. If count is less than or equal to the length of strSource, a null character is not appended automatically to the copied string. If count is greater than the length of strSource, the destination string is padded with null characters up to length count.
Don't confuse strlen() with sizeof() when using a string:
char *p = "hello!!";
strlen(p) != sizeof(p)
sizeof(p) yields, at compile time, the size of the pointer (typically 4 or 8 bytes), whereas strlen(p) counts, at runtime, the length of the null-terminated char array (7 in this example).
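The difference becomes visible when the same text is stored once as an array and once behind a pointer. The names in this sketch are made up for illustration.

```c
#include <string.h>

static const char greeting_array[] = "hello!!";  /* array: 7 chars + '\0' */
static const char *greeting_ptr = "hello!!";     /* pointer to a literal  */

/* sizeof on the array gives its storage size, terminator included */
size_t array_bytes(void)   { return sizeof greeting_array; }

/* sizeof on the pointer gives the size of the pointer itself */
size_t pointer_bytes(void) { return sizeof greeting_ptr; }

/* strlen counts characters up to, but not including, the terminator */
size_t text_length(void)   { return strlen(greeting_ptr); }
```

Here array_bytes() is 8 and text_length() is 7, while pointer_bytes() depends on the platform.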
strtok is not thread safe, since it uses a mutable private buffer to store data between calls; you also cannot interleave or nest strtok calls.
A more useful alternative is strtok_r, use it whenever you can.
kmm has already a good list. Here are the things I had problems with when I started to code C.
String literals live in their own memory section and are always accessible. Hence they can, for example, be the return value of a function.
Memory management of strings, in particular with a high level library (not libc). Who is responsible to free the string if it is returned by function or passed to a function?
When should "const char *" and when "char *" be used. And what does it tell me if a function returns a "const char *".
All these questions are not too difficult to learn, but hard to figure out if you don't get taught them.
I have found that the char buff[0] technique has been incredibly useful.
Consider:
struct foo {
int x;
char * payload;
};
vs
struct foo {
int x;
char payload[0];
};
see https://stackoverflow.com/questions/295027
See the link for implications and variations
I'd point out the performance pitfalls of over-reliance on the built-in string functions.
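char* triple(char* source)
{
    int n=strlen(source);
    char* dest=malloc(n*3+1);
    strcpy(dest,source);
    strcat(dest,source); /* rescans dest from the start to find its end */
    strcat(dest,source); /* and again, over an even longer prefix */
    return dest;
}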
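Since the length is already known, the repeated scanning can be avoided by copying to computed offsets. This is a sketch; triple_fast is a hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical faster variant: each copy goes straight to a known
   offset, so no call has to search for the end of dest.
   Returns NULL if allocation fails; the caller frees the result. */
char *triple_fast(const char *source)
{
    size_t n = strlen(source);
    char *dest = malloc(n * 3 + 1);

    if (dest == NULL)
        return NULL;
    memcpy(dest, source, n);
    memcpy(dest + n, source, n);
    memcpy(dest + 2 * n, source, n + 1);  /* last copy includes the '\0' */
    return dest;
}
```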
I would discuss when and when not to use strcpy and strncpy and what can go wrong:
char *strncpy(char* destination, const char* source, size_t n);
char *strcpy(char* destination, const char* source );
I would also mention return values of the ansi C stdlib string functions. For example ask "does this if statement pass or fail?"
if (stricmp("StrInG 1", "string 1")==0)
{
.
.
.
}
perhaps you could illustrate the value of sentinel '\0' with following example
char* a = "hello \0 world";
char b[100];
strcpy(b,a);
printf(b);
I once had my fingers burnt when in my zeal I used strcpy() to copy binary data. It worked most of the time but failed mysteriously sometimes. Mystery was revealed when I realized that binary input sometimes contained a zero byte and strcpy() would terminate there.
You could mention indexed addressing.
An element's address is the base address + index * sizeof element
A common error is:
char *p;
snprintf(p, 3, "%d", 42);
It may appear to work, but p was never initialized, so snprintf writes through an indeterminate pointer: this is undefined behavior, and then funny things happen (welcome to the jungle).
Explanation
With char *p you are only allocating space to hold a pointer (sizeof(char *) bytes) on the stack, not a buffer for the characters. The right thing here is to allocate a buffer and pass its size, which is known at compile time:
char buf[12];
char *p = buf;
snprintf(p, sizeof(buf), "%d", 42);
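The return value of snprintf also tells you whether the buffer was large enough. A sketch of such a check follows; format_int is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: formats value into buf (size bytes in total).
   Returns 1 if the whole number fit, 0 if the output was truncated
   or an encoding error occurred. */
int format_int(char *buf, size_t size, int value)
{
    /* snprintf returns the number of characters it WANTED to write,
       not counting the terminator */
    int needed = snprintf(buf, size, "%d", value);

    return needed >= 0 && (size_t)needed < size;
}
```

With a 12-byte buffer the value 42 fits; with a 2-byte buffer only "4" is stored and the function reports truncation.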
Pointers and arrays, while having similar syntax, are not at all the same. Given:
char a[100];
char *p = a;
For the array, a, there is no pointer stored anywhere. sizeof(a) != sizeof(p): for the array it is the size of the block of memory, for the pointer it is the size of the pointer. This becomes important if you use something like sizeof(a)/sizeof(a[0]). Also, you can't ++a, and you can make the pointer a 'const' pointer to 'const' chars, but the array can only be 'const' chars, in which case you have to initialize it when you declare it.
If possible, use strlcpy (instead of strncpy) and strlcat.
Even better, to make life a bit safer, you can use a macro such as:
#define strlcpy_sz(dst, src) (strlcpy(dst, src, sizeof(dst)))

Resources