Wrong strlen output - c

I have the following piece of code in C:
char a[55] = "hello";
size_t length = strlen(a);
char b[length];
strncpy(b,a,length);
size_t length2 = strlen(b);
printf("%zu\n", length); // output = 5
printf("%zu\n", length2); // output = 8
Why is this the case?

It has to be b[length + 1].
strlen does not count the null character at the end of C strings, and strncpy(b, a, length) does not write one either.

You never initialized b to anything. Therefore its contents are indeterminate. The call to strlen(b) could read beyond the end of b and cause undefined behavior (such as a crash).

b is not initialized: it contains whatever is in your RAM when the program is run.

For the first string, a, the length is 5, as it should be: "hello" has 5 characters.
For the second string, b, you declare it as an array of 5 characters but you don't initialise it, so strlen counts characters until it finds a byte containing the 0 terminator.
UPDATE: the following line was added after I wrote the original answer.
strncpy(b,a,length);
After this addition, the problem is that you declared b with size length, while it should be length + 1 to leave space for the string terminator.

Others have already pointed out that you need to allocate strlen(a)+1 characters for b to be able to hold the whole string.
They've given you a set of parameters to use for strncpy that will (attempt to) cover up the fact that it's not really suitable for the job at hand (or almost any other, truth be told). What you really want is to just use strcpy instead. Also note, however, that as you've allocated it, b is also a local (auto storage class) variable. It's rarely useful to copy a string into a local variable.
Most of the time, if you're copying a string, you need to copy it to dynamically allocated storage -- otherwise, you might as well use the original and skip doing a copy at all. Copying a string into dynamically allocated storage is sufficiently common that many libraries already include a function (typically named strdup) for the purpose. If your library doesn't have that, it's fairly easy to write one of your own:
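#include <stdlib.h> // malloc
#include <string.h> // strlen, strcpy
char *dupstr(char const *input) {
    char *ret = malloc(strlen(input)+1); // +1 for the terminator
    if (ret)
        strcpy(ret, input);
    return ret; // NULL if the allocation failed; the caller frees it
}
[Edit: I've named this dupstr because strdup (along with anything else starting with str) is reserved for the implementation.]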

Actually, the char array b is not terminated by '\0', so strlen has no way to know where it should stop counting the length of the string.
Its signature is size_t strlen(const char *s): it returns the number of chars in the string up to, but not including, the '\0' (NUL char).
To avoid this we have to append the NUL char (b[length] = '\0'), which requires b to have length + 1 elements.
Otherwise strlen keeps counting chars past the end of the array until a NUL byte happens to be encountered.
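The answers above agree on two points: the destination needs one extra byte, and the terminator has to be written explicitly. A minimal sketch of a copy that always terminates, assuming a hypothetical helper name copy_terminated that does not appear in the question:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dst_size bytes) and
   always terminates the result. Truncates if src does not fit.
   Returns the length of the string stored in dst. */
size_t copy_terminated(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t n = strlen(src);
    if (n >= dst_size)
        n = dst_size - 1;      /* leave room for the terminator */
    memcpy(dst, src, n);
    dst[n] = '\0';             /* the step that strncpy(b, a, length) skips */
    return n;
}
```

With char b[6], copy_terminated(b, sizeof b, "hello") stores all five characters plus the terminator, so strlen(b) is 5.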

Related

writing myself strncat, unable to enlarge the size of the array C

As far as I understand, strncat enlarges the array you concatenate to.
For example:
char str1[] = "This is str1";
char str2[] = "This is str2";
Here the length of str1 is 12 and the length of str2 is also 12, but when I strncat them, str1 changes from 12 to 24.
I was asked to write strncat on my own, but I can't figure out how to enlarge an array, taking into account that we haven't learned pointers yet.
I tried just putting every char at the end of the array, moving the position by 1 each iteration, but as you would expect, it doesn't put the data in the array because there is no such position in the array (str[20] when str's length is 10, for example).
Thanks in advance,
every help would be appreciated.
strlen returns the length of the string, that is, counts until the first null character. It does NOT return the size of the memory allocated for str1!
When you concatenate str2 to str1, you write beyond the memory allocated for str1. That causes undefined behavior. In your particular case, it seems nothing happens and it even seems that str1 has become larger. That is not so. However (in your particular case), if str2 follows str1 in memory, you just overwrote str2. Try printing str2. It will probably print his is str2.
Since strcat() et al. do not enlarge a buffer, your implementation does not have to do it. (And it is simply not possible with the parameter list of strcat().) It is the caller's responsibility to pass a destination buffer that is big enough.
On the caller's side you can simply create an array that is big enough and pass its address. Alternatively, you can use a variable-length array (VLA):
char str1[] = "This is str1";
char str2[] = "This is str2";
char str1str2[strlen(str1)+strlen(str2)+1];
strcpy( str1str2, str1 );
yourstrcat( str1str2, str2 );
str1str2 is big enough to store both contents plus 1 for the string terminator \0.
Thanks, everyone, I solved the problem. As some of you said, I don't need to enlarge the string, I just need to make sure it's big enough to contain all the data.
What I did eventually is this:
void strnCat(char dest[], char src[], int length)
{
    int i = 0;
    int len = strlen(dest); // save the length once: the terminator is about to be overwritten
    for(i=0; i < length && src[i] != 0; i++) // stop at the end of src, as strncat does
    {
        dest[len+i] = src[i];
    }
    dest[len+i] = 0; // terminate the result
}
So my main problem was that I forgot to add the null at the end of the array to make it a string, and that I used strlen(str) instead of saving the length in a variable. That failed because once the null has been overwritten, the string no longer has an end for strlen to find.
It is a really strange task to let students implement strncat, since this is one of the C functions that is very difficult to use correctly.
To implement it yourself, you should read its specification in the C standard or in the POSIX standard. There you will find that strncat doesn't enlarge any array. By the way, arrays cannot be enlarged in C at all; it's impossible by definition. Note the careful distinction between the words array (can contain arbitrary bytes) and string (must contain a terminating null byte) in the standard wording.
A saner alternative to implement is strlcat, which is not in the C standard but is widely known.
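For comparison, here is a rough sketch of a strlcat-style function. The name bounded_cat is made up for this example, and the behaviour follows the usual BSD description of strlcat as I understand it: the caller passes the total size of the destination, the result is always terminated, and the return value is the length the full string would have had.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical strlcat-style helper: appends src to the string in dst,
   never writing more than dst_size bytes in total and always terminating.
   Returns the length the complete result would have had; a return value
   >= dst_size means the output was truncated. */
size_t bounded_cat(char *dst, const char *src, size_t dst_size)
{
    assert(dst != NULL && src != NULL);
    size_t dlen = 0;
    while (dlen < dst_size && dst[dlen] != '\0')
        dlen++;                          /* length of dst, bounded by its size */
    size_t slen = strlen(src);
    if (dlen == dst_size)                /* dst is not terminated: change nothing */
        return dst_size + slen;
    size_t room = dst_size - dlen - 1;   /* bytes free before the terminator */
    size_t n = slen < room ? slen : room;
    memcpy(dst + dlen, src, n);
    dst[dlen + n] = '\0';
    return dlen + slen;
}
```

The caller still has to provide the storage, for example char buf[25] = "This is str1"; the function only guarantees that it never writes outside it.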

C Initialize Character Array from Character Pointer

My question should be rather simple.
I need to give a function a char array of a pre-defined length, but I have a character pointer with variable length, but not longer than the length of my array.
Here is the code:
#define SIZE_MAX_PERSON_NAME 50
char person[SIZE_MAX_PERSON_NAME];
char* currentPerson = "John";
Now, how would I get John into the person array while also setting the rest of the array to 0 (NUL),
so that I would have
BINARY DATA: "John/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL/NUL....."
in my memory?
sorry if this is overly stupid, but I can't seem to find a solution right now.
First, zero-initialize the fixed-size array :
// Using memset here because I don't know if the whole copy operation can or will be used
// multiple times. We want to be sure that the array is properly zero-initialized if the next
// string to copy is shorter than the previous.
memset(person, '\0', SIZE_MAX_PERSON_NAME);
Then, copy the variable-size string into it :
strcpy(person, currentPerson);
If you are not certain that currentPerson will fit into person :
strncpy(person, currentPerson, SIZE_MAX_PERSON_NAME - 1);
Note that strncpy also zero-fills the remaining bytes of the array (up to the count you pass) if
strlen(currentPerson) < SIZE_MAX_PERSON_NAME - 1
So you basically have these two options :
memset(person, '\0', SIZE_MAX_PERSON_NAME);
strcpy(person, currentPerson);
Or :
strncpy(person, currentPerson, SIZE_MAX_PERSON_NAME - 1);
person[SIZE_MAX_PERSON_NAME - 1] = '\0';
After this answer was posted the question was retagged from C++ to C.
Use a std::string, like this:
// "using namespace std;" or "using std::string;", then:
string const person = currentPerson;
old_c_function( person.c_str() );
To do things at the C level, which I recommend that you don't, first replace the unnecessary #define with a typed constant:
int const max_person_name_size = 50;
Then zero-initialize your array:
char person[max_person_name_size] = {};
(Note: no silly memset here.)
(Also note: this zeroing is only a preventive measure. You wanted it. But it's not really necessary since strcpy will ensure a trailing zero-byte.)
Then just copy in the string:
assert( strlen( currentPerson ) < max_person_name_size );
strcpy( person, currentPerson );
But don't do this. Use std::string instead.
Update: doing other things for some minutes made me realize that this answer, like all the others so far, is completely off the mark. The OP states in a comment elsewhere:
"I've got a function in the library which only takes a character array. Not a character pointer."
Thus, apparently it's all about a misconception.
The only way this can make sense is if the array is modified by the function, and then std::string::c_str() is not a solution. But a std::string can still be used, if its length is set to something sufficient for the C function. Can go like this:
person.resize( max_person_name_size );
foo( &person[0] ); // Assuming foo modifies the array.
person.resize( strlen( person.c_str() ) );
With a literal, you can do:
char person[SIZE_MAX_PERSON_NAME] = "John";
If the C string is not a literal, you have to copy it with strcpy:
strcpy(person, currentPerson);
This is the one and only reason for the existence of strncpy:
Putting a string (up to the 0-terminator or buffer end) into a fixed-length array and zeroing out the rest.
This does not ensure 0-termination, thus avoid it for anything else.
7.24.2.4 The strncpy function
#include <string.h>
char *strncpy(char * restrict s1, const char * restrict s2, size_t n);
2 The strncpy function copies not more than n characters (characters that follow a null character are not copied) from the array pointed to by s2 to the array pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.
3 If the array pointed to by s2 is a string that is shorter than n characters, null characters are appended to the copy in the array pointed to by s1, until n characters in all have been written.
4 The strncpy function returns the value of s1.
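Putting the standard's wording to work, a small sketch of filling the fixed-size person field. The function name set_person is an assumption made for this example; SIZE_MAX_PERSON_NAME is the constant from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIZE_MAX_PERSON_NAME 50

/* Hypothetical helper: copies name into a SIZE_MAX_PERSON_NAME-byte field
   and fills every remaining byte with '\0'.
   Returns 1 if name fit completely, 0 if it had to be truncated. */
int set_person(char *person, const char *name)
{
    assert(person != NULL && name != NULL);
    /* strncpy pads with '\0' up to the given count when name is shorter. */
    strncpy(person, name, SIZE_MAX_PERSON_NAME - 1);
    /* The last byte is not touched by the call above, so set it here. */
    person[SIZE_MAX_PERSON_NAME - 1] = '\0';
    return strlen(name) < SIZE_MAX_PERSON_NAME;
}
```

After set_person(person, "John"), the array holds "John" followed by 46 NUL bytes, which is the layout the question asks for.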

Assign string to element in structure in C

I have this structure:
typedef struct SM_DB
{
LIST_TYPE link;
char name[SM_NAME_SIZE];
} SM_DB_TYPE;
And I would like to assign a string to its 'name'. I am doing so like this:
SM_DB_TYPE one;
one.name = "Alpha";
However, after compiling I get an error: "error C2106: '=' : left operand must be l-value". I am hoping this is fairly obvious. Does anyone know what I am doing wrong?
Thanks
Assuming SM_NAME_SIZE is large enough you could just use strcpy like so:
strcpy(one.name, "Alpha");
Just make sure your destination has enough space to hold the string before calling strcpy, or you will get a buffer overflow.
If name were declared as a char * instead of an array (you cannot assign to an array), you could allocate exactly the space you need:
if(!(one.name = malloc(strlen("Alpha") + 1))) //+1 is to make room for the NUL char that terminates C strings
{
    //allocation failed
}
strcpy(one.name, "Alpha"); //note that '\0' is not counted by strlen("Alpha"); strcpy copies it
//do whatever with one.name
free(one.name); //release space previously allocated
Make sure you free one.name if using malloc so that you don't leak memory.
You can give a char array a string value only when declaring it (as an initializer). You cannot assign to it later using =.
You have to use the strcpy() function.
Use strcpy or strncpy to assign strings in C.
C does not have a built in string type. You must use an array of characters to hold the string.
Since C also does not allow the assignment of one array to another, you have to use the various functions in the Standard C Library to copy array elements from one array to another or you have to write a loop to do it yourself. Using the Standard C Library functions is much preferred though there are sometimes reasons to write your own loop.
For standard ANSI type strings used with the char type there are a large number of functions most of which begin with str such as functions to copy or compare strings strcpy(), strcmp(). There are also another set which you specify the maximum number of characters to copy or compare such as strncpy() or strncmp().
A string in C is an array of characters that is terminated by a binary zero character. So if you use a constant string such as "Constant" this will create an array of characters that has one element per character plus an additional element for the zero terminator.
This means that when sizing char arrays you must also remember to add one more extra array element to hold the zero terminator.
The strncpy() function will copy one char array to another up to either the maximum number of characters specified or when the zero terminator is found. If the maximum number of characters is reached then the destination array will not be terminated by a zero terminator so this is something to watch out for.
char one[10];
char two[20];
strncpy (one, "1234567", 10); // copy constant to the char buffer, max of 10 chars
one[9] = 0; // make sure the string is zero terminated; it already is here, but this is a demo
strcpy (two, one);
strcat (two, " suffix"); // add some more text to the end
There are also functions to work with wide characters used with UNICODE.
Use:
strcpy(one.name, "Alpha"); //Removed null byte (Read first comment by shf301)
Alternative:
typedef struct SM_DB
{
LIST_TYPE link;
char* name;
} SM_DB_TYPE;
SM_DB_TYPE one;
one.name = malloc(sizeof(char) * (strlen("Alpha") + 1)); //Allocate memory
if (!one.name) {
/* Error handling */
} else {
strcpy(one.name, "Alpha");
}
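If the struct keeps the fixed-size array, the copy can also be bounds-checked. The following is only a sketch: the value of SM_NAME_SIZE is assumed (the question does not give it), the link member is left out so the example is self-contained, and sm_set_name is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SM_NAME_SIZE 16   /* assumed value, not given in the question */

typedef struct SM_DB
{
    char name[SM_NAME_SIZE];   /* the link member is omitted in this sketch */
} SM_DB_TYPE;

/* Hypothetical helper: stores value in db->name, always terminated.
   Returns 0 on success, -1 if value was too long and had to be truncated. */
int sm_set_name(SM_DB_TYPE *db, const char *value)
{
    assert(db != NULL && value != NULL);
    /* snprintf never writes more than sizeof db->name bytes and returns the
       length the full string would have needed. */
    int n = snprintf(db->name, sizeof db->name, "%s", value);
    return (n >= 0 && (size_t)n < sizeof db->name) ? 0 : -1;
}
```

Usage would then be sm_set_name(&one, "Alpha") in place of the assignment that the compiler rejects.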

What is wrong with this function?

I got a problem today: I was given a function and I need to find the problem in it. The objective of the function is to append a new line to the string that is passed. Following is the code:
char* appendNewLine(char* str){
int len = strlen(str);
char buffer[1024];
strcpy(buffer, str);
buffer[len] = '\n';
return buffer;
}
I have identified one problem with this method. It's kind of straightforward: the method has the potential for an array index out of range. That is not my doubt. In Java, I use '\n' for newline. (I am basically a Java programmer; it's been many years since I've worked in C.) But I vaguely remember that '\n' denotes the termination of a string in C. Is that also a problem with this program?
Please advise.
There are quite a few problems in this code.
strlen and not strlent, unless you have an odd library function there.
You're defining a fixed-size buffer on the stack. This is a potential bug (and a security hole as well) since a line later you copy the string into it without checking its length.
Possible solutions to that can either be allocating the memory on the heap (with a combination of strlen and malloc), or using strncpy and accepting the cut off of the string.
Appending '\n' indeed adds a new line, but it creates a further bug: it overwrites the terminator, so the string is no longer null terminated.
Solution: Append '\n' and '\0' to null terminate the new string.
As others have mentioned, you're returning a pointer to a local variable. This is a severe bug: the memory can be reused as soon as the function returns, so the returned pointer is invalid.
To expand your understanding of these problems, please look up what C-style strings are. Also, teach yourself the difference between variables allocated on the stack and variables allocated on the heap.
EDITed: AndreyT is correct, the definition of length is valid
No, a '\n' is a new-line in c, just like in Java (Java grabbed that from C). You've identified one problem: if the input string is longer than your buffer, you'll write past the end of buffer. Worse, your return buffer; returns the address of memory that's local to the function and will cease to exist when the function exits.
First this is a function, not a program.
This function returns a pointer to a local variable. Such variables are typically created on the stack and are no longer available when the function exits.
Another problem arises if the passed string is longer than 1023 chars; in this case, strcpy() will write past the end of the buffer.
One solution is to allocate a new buffer in dynamic memory and to return a pointer to that buffer. The size of the buffer shall be len +2 (all chars + newline + \0 string terminator), but someone will have to free this buffer (and possibly the initial buffer as well).
strlent() does not exist, it should be strlen() but I suppose this is just a typo.
This function returns buffer, which is a local variable on the stack. As soon as the function returns the memory for buffer can be reused for another purpose.
You need to allocate memory using malloc or similar if you intend to return it from a function.
There are other issues with the code as well - you do not ensure that buffer is large enough to contain the string you are trying to copy to it and you do not make sure the string ends with a null-terminator.
C strings end with '\0'.
And as your objective is to append a newline, the following would do fine (it saves you from copying the entire string into a buffer):
char* appendNewLine(char* str){
    while(*str != '\0') str++; //assuming the string ends with '\0'
    *str++ = '\n'; //assign and increment the pointer
    *str = '\0';
    return str; //optional, you could also return 0 or 1, depending on whether
                //it was successful or not
}
EDIT :
The string should have space to accommodate the extra '\n', and since the OBJECTIVE itself is to append, which means adding to the original, it's safe to assume the string has space for at least one more char!!
But if you don't want to assume anything:
char* appendNewLine(const char* str){
    size_t length = strlen(str);
    char *newStr = malloc(length + 2); // the string + '\n' + '\0'
    if (newStr == NULL) return NULL;
    memcpy(newStr, str, length); // copy the original characters
    newStr[length] = '\n';
    newStr[length + 1] = '\0';
    return newStr; // the caller must free() this
}
Add a null after the newline:
buffer[len] = '\n';
buffer[len + 1] = 0;
The terminator for a string in C is '\0', not '\n'. '\n' stands only for newline.
There are at least two problems with your program.
Firstly, you seem to want to build a string, but you never zero-terminate it.
Secondly, your function returns a pointer to a locally declared buffer. Doing this makes no sense.
There are several issues with the code:
It can overflow the buffer, since buffer is hardcoded to hold only 1024 characters. Worse yet, the buffer is not even allocated on the heap.
The newline "character" is actually operating system-dependent. Strictly speaking, it's only \n in Unix etc. In Windows, and in strict internet protocol, it's \r\n, for example.
The string returned by the function is not null-terminated. This is most likely not what you'd want.
Also, taking into account your background in Java, here are some things that you should consider:
Since you're working with C char* and not (immutable) Java strings, maybe you could append the newline in-place?
Array access is no longer checked at run time, so you have to be VERY careful about going out of bounds. Make sure that all buffers are of appropriate size.
The language does not come with standard automatic garbage collection, so if you do choose to allocate new buffers for string manipulation, make sure that you manage your memory properly and aren't leaking everywhere.
char* appendNewLine(char* str){
int len = strlen(str);
char buffer[1024];
strcpy(buffer, str);
buffer[len] = '\n';
return buffer;
}
Another important issue is the buffer variable: it is a local stack variable. As soon as the function returns, it is destroyed. Returning a pointer to the buffer probably means you are going to crash your process if you try to write through the returned pointer (the address of buffer, which is an address on the stack).
Use malloc instead.
I am ignoring the return of a local, as others have eloquently addressed that.
int len = strlen(str);
char buffer[1024];
...
buffer[len] = '\n';
If strlen(str) >= 1024, then this sequence would write beyond the bounds of the declared buffer. Also as noted, the result would (probably) not be null terminated.
To safely append a new line if possible,
char buffer[1024];
strncpy(buffer, str, 1023); // truncate string if it is too long
buffer[1023] = '\0'; // strncpy does not terminate a truncated copy
int len = strlen(buffer);
if (len < 1023) {
    buffer[len] = '\n';
    buffer[len + 1] = '\0';
}
Note: if the string is too long, this leaves the truncated string WITHOUT the new line.
A C string must end with '\0'.
buffer[len+1] = '\0';
You should dynamically allocate the buffer as a pointer to char of size len + 2 (the string, the newline and the terminator):
char *buffer = malloc((len + 2) * sizeof(char));
Maybe \n should be \r\n: carriage return + new line. It's what I always use and it works for me.
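Several answers suggest appending in place but leave the question of available space open. A minimal sketch that makes the caller state the capacity, assuming a different signature from the original function (the capacity parameter and the name append_newline are my additions):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-place variant: appends '\n' to the string stored in buf,
   where capacity is the total size of buf in bytes.
   Returns 0 on success, -1 if there is no room (buf is left unchanged). */
int append_newline(char *buf, size_t capacity)
{
    assert(buf != NULL);
    size_t len = strlen(buf);
    if (len + 2 > capacity)    /* need room for '\n' and '\0' */
        return -1;
    buf[len] = '\n';
    buf[len + 1] = '\0';       /* re-terminate: '\n' replaced the old '\0' */
    return 0;
}
```

Because nothing is allocated and no local buffer is returned, this avoids both the dangling pointer and the question of who frees the result.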

Strings in C: pitfalls and techniques

I will be coaching an ACM Team next month (go figure), and the time has come to talk about strings in C. Besides a discussion on the standard lib, strcpy, strcmp, etc., I would like to give them some hints (something like str[0] is equivalent to *str, and things like that).
Do you know of any lists (like cheat sheets) or your own experience in the matter?
I'm already aware of the books for the ACM competition (which are good, see particularly this), but I'm after tricks of the trade.
Thank you.
Edit: Thank you very much everybody. I will accept the most voted answer, and have duly upvoted others which I think are relevant. I expect to do a summary here (like I did here, asap). I have enough material now and I'm certain this has improved the session on strings immensely. Once again, thanks.
It's obvious but I think it's important to know that strings are nothing more than an array of bytes, delimited by a zero byte.
C strings aren't all that user-friendly as you probably know.
Writing a zero byte somewhere in the string will truncate it.
Going out of bounds generally ends badly.
Never, ever use strcpy, strcmp, strcat, etc.; instead use their safe variants: strncmp, strncat, strndup, ...
Avoid strncpy. strncpy will not always zero delimit your string! If the source string doesn't fit in the destination buffer it truncates the string but it won't write a nul byte at the end of the buffer. Also, even if the source buffer is a lot smaller than the destination, strncpy will still overwrite the whole buffer with zeroes. I personally use strlcpy.
Don't use printf(string), instead use printf("%s", string). Try thinking of the consequences if the user puts a %d in the string.
You can't compare strings with if( s1 == s2 )
doStuff(s1);
You have to compare every character in the string. Use strcmp or better strncmp.
if( strncmp( s1, s2, BUFFER_SIZE ) == 0 )
doStuff(s1);
Abusing strlen() will dramatically worsen the performance.
for( int i = 0; i < strlen( string ); i++ ) {
    processChar( string[i] );
}
will have at least O(n^2) time complexity whereas
int length = strlen( string );
for( int i = 0; i < length; i++ ) {
    processChar( string[i] );
}
will have at least O(n) time complexity. This is not so obvious for people who haven't taken time to think of it.
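A third option, sketched here under the assumption that only a single forward pass is needed, is to skip strlen altogether and stop at the terminator. The function count_char is a made-up example, not something from the answer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: counts how often c occurs in s in one pass,
   using the terminator itself as the loop condition. */
size_t count_char(const char *s, char c)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (*s == c)
            count++;
    }
    return count;
}
```

For example, count_char("hello world", 'o') walks the string once and returns 2.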
The following functions can be used to implement a non-mutating strtok:
strcspn(string, delimiters)
strspn(string, delimiters)
The first one finds the first character in the set of delimiters you pass in. The second one finds the first character not in the set of delimiters you pass in.
I prefer these to strpbrk as they return the length of the string if they can't match.
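To make the idea concrete, here is one possible shape for a non-mutating tokenizer built on these two functions. The name next_token and its cursor-based interface are assumptions chosen for this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the next token at or after *cursor without
   modifying the string. On success it stores the token length in *len,
   advances *cursor past the token and returns a pointer to the token's
   first character (the token is NOT null-terminated).
   Returns NULL when no tokens remain. */
const char *next_token(const char **cursor, const char *delims, size_t *len)
{
    assert(cursor != NULL && *cursor != NULL && delims != NULL && len != NULL);
    /* strspn: skip any leading delimiters. */
    const char *start = *cursor + strspn(*cursor, delims);
    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }
    /* strcspn: the token runs up to the next delimiter or the end. */
    *len = strcspn(start, delims);
    *cursor = start + *len;
    return start;
}
```

Since the input is never written to, this also works on string literals and const data, and each caller keeps its own cursor, so nested loops do not interfere with each other.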
str[0] is equivalent to 0[str], or more generally str[i] is i[str] and i[str] is *(str + i).
NB
this is not specific to strings but it works also for C arrays
The strn* variants in stdlib do not necessarily null terminate the destination string.
As an example: from MSDN's documentation on strncpy:
The strncpy function copies the initial count characters of strSource to strDest and returns strDest. If count is less than or equal to the length of strSource, a null character is not appended automatically to the copied string. If count is greater than the length of strSource, the destination string is padded with null characters up to length count.
Don't confuse strlen() with sizeof() when using a string:
char *p = "hello!!";
strlen(p) != sizeof(p)
sizeof(p) yields, at compile time, the size of the pointer (4 or 8 bytes), whereas strlen(p) counts, at runtime, the length of the null-terminated char array (7 in this example).
strtok is not thread safe, since it uses a mutable private buffer to store data between calls; you also cannot interleave or nest strtok calls.
A more useful alternative is strtok_r; use it whenever you can.
kmm already has a good list. Here are the things I had problems with when I started to code in C.
String literals have their own memory section and are always accessible. Hence they can, for example, be the return value of a function.
Memory management of strings, in particular with a high-level library (not libc): who is responsible for freeing the string if it is returned by a function or passed to a function?
When should "const char *" be used and when "char *"? And what does it tell me if a function returns a "const char *"?
All these questions are not too difficult to learn, but hard to figure out if nobody teaches them to you.
I have found that the char buff[0] technique has been incredibly useful.
Consider:
struct foo {
int x;
char * payload;
};
vs
struct foo {
int x;
char payload[0];
};
see https://stackoverflow.com/questions/295027
See the link for implications and variations
I'd point out the performance pitfalls of over-reliance on the built-in string functions.
char* triple(char* source)
{
    int n=strlen(source);
    char* dest=malloc(n*3+1);
    strcpy(dest,source);
    strcat(dest,source); // rescans dest from the start to find its end
    strcat(dest,source); // and again, over an even longer string
    return dest;
}
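Since the length is already known, the repeated scans can be avoided. A sketch of the same function using memcpy at computed offsets; the name triple_fast and the NULL check are my additions:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical faster variant: returns a newly allocated string holding
   source three times, or NULL if the allocation fails.
   The caller is responsible for calling free() on the result. */
char *triple_fast(const char *source)
{
    assert(source != NULL);
    size_t n = strlen(source);          /* the only scan of the input */
    char *dest = malloc(n * 3 + 1);
    if (dest == NULL)
        return NULL;
    memcpy(dest, source, n);            /* each copy lands at a known offset */
    memcpy(dest + n, source, n);
    memcpy(dest + 2 * n, source, n);
    dest[3 * n] = '\0';
    return dest;
}
```

The point is not memcpy itself but that no call has to search for the end of dest again.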
I would discuss when and when not to use strcpy and strncpy and what can go wrong:
char *strncpy(char* destination, const char* source, size_t n);
char *strcpy(char* destination, const char* source );
I would also mention return values of the ansi C stdlib string functions. For example ask "does this if statement pass or fail?"
if (stricmp("StrInG 1", "string 1")==0)
{
.
.
.
}
Perhaps you could illustrate the value of the sentinel '\0' with the following example:
const char* a = "hello \0 world";
char b[100];
strcpy(b,a);
printf("%s", b); // prints only "hello "
I once had my fingers burnt when in my zeal I used strcpy() to copy binary data. It worked most of the time but failed mysteriously sometimes. Mystery was revealed when I realized that binary input sometimes contained a zero byte and strcpy() would terminate there.
You could mention indexed addressing.
An element's address is the base address + index * sizeof(element).
A common error is:
char *p;
snprintf(p, 3, "%d", 42);
This is undefined behavior: p is uninitialized, so snprintf writes through an indeterminate pointer. It may appear to work for a while, and then funny things happen (welcome to the jungle).
Explanation
With char *p you only allocate space for the pointer itself (sizeof(char *) bytes) on the stack, not for the characters it should point to. The right thing to do is to allocate a buffer, for example an array whose size is known at compile time, and point p at it:
char buf[12];
char *p = buf;
snprintf(p, sizeof(buf), "%d", 42);
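The same pattern can be wrapped so that truncation is reported instead of silently ignored. A small sketch, assuming a hypothetical helper name format_int:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes value as decimal text into buf, which has
   room for size bytes. Returns 0 if the whole number fit, -1 if the
   output was truncated or snprintf reported an error. */
int format_int(char *buf, size_t size, int value)
{
    assert(buf != NULL && size > 0);
    /* snprintf returns the number of characters the full output needs,
       not counting the terminator. */
    int n = snprintf(buf, size, "%d", value);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

The caller owns the storage, for example char buf[12]; format_int(buf, sizeof buf, 42);, so there is never an uninitialized pointer involved.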
Pointers and arrays, while having similar syntax, are not at all the same. Given:
char a[100];
char *p = a;
For the array a, there is no pointer stored anywhere. sizeof(a) != sizeof(p): for the array it is the size of the block of memory, for the pointer it is the size of the pointer. This becomes important if you use something like sizeof(a)/sizeof(a[0]). Also, you can't ++a, and you can make the pointer a 'const' pointer to 'const' chars, but the array can only be 'const' chars, in which case you have to initialize it when you declare it.
If possible, use strlcpy (instead of strncpy) and strlcat.
Even better, to make life a bit safer, you can use a macro such as the following (it only gives the right size when dst is a real array, not a pointer):
#define strlcpy_sz(dst, src) (strlcpy(dst, src, sizeof(dst)))

Resources