I ran into a problem today. I was given a function and asked to find what is wrong with it. The function is meant to append a newline to the string that is passed in. Here is the code:
char* appendNewLine(char* str){
int len = strlen(str);
char buffer[1024];
strcpy(buffer, str);
buffer[len] = '\n';
return buffer;
}
I have identified one problem with this function, and it is fairly straightforward: it can index the array out of range. That is not my question. In Java, I use '\n' for a newline. (I am basically a Java programmer; it has been many years since I worked in C.) But I vaguely remember that '\n' denotes the end of a string in C. Is that also a problem with this program?
Please advise.
There are quite a few problems in this code.
strlen and not strlent, unless you have an odd library function there.
You're defining a fixed-size buffer on the stack. This is a potential bug (and a security hole as well), since one line later you copy the string into it without checking its length.
Possible solutions to that can either be allocating the memory on the heap (with a combination of strlen and malloc), or using strncpy and accepting the cut off of the string.
Appending '\n' indeed solves the problem of adding a new line, but this creates a further bug in that the string is currently not null terminated.
Solution: Append '\n' and '\0' to null terminate the new string.
As others have mentioned, you're returning a pointer to a local variable, this is a severe bug and makes the return value corrupt within a short time.
To expand your understanding of these problems, please look up what C-style strings are, potentially from here. Also, teach yourself the difference between variables allocated on the stack and variables allocated on the heap.
EDITed: AndreyT is correct, the definition of length is valid
No, a '\n' is a new-line in C, just like in Java (Java grabbed that from C). You've identified one problem: if the input string is longer than your buffer, you'll write past the end of buffer. Worse, your return buffer; statement returns the address of memory that's local to the function and will cease to exist when the function exits.
First this is a function, not a program.
This function returns a pointer to a local variable. Such variables are typically created on the stack and are no longer available when the function exits.
Another problem arises if the passed string is 1024 chars or longer; in that case, strcpy() will write past the end of the buffer.
One solution is to allocate a new buffer in dynamic memory and to return a pointer to that buffer. The size of the buffer shall be len +2 (all chars + newline + \0 string terminator), but someone will have to free this buffer (and possibly the initial buffer as well).
strlent() does not exist, it should be strlen() but I suppose this is just a typo.
This function returns buffer, which is a local variable on the stack. As soon as the function returns the memory for buffer can be reused for another purpose.
You need to allocate memory using malloc or similar if you intend to return it from a function.
There are other issues with the code as well - you do not ensure that buffer is large enough to contain the string you are trying to copy to it and you do not make sure the string ends with a null-terminator.
C strings end with '\0'.
And as your objective is to append newLine, following would do fine (will save you copying the entire string into a buffer):
char* appendNewLine(char* str){
while(*str != '\0') str++; //assuming the string ended with '\0'
*str++ = '\n'; //assign and increment the pointer
*str = '\0';
return str; //optional, you could also send 0 or 1, whether
//it was successful or not
}
EDIT :
The string must have space to accommodate the extra '\n'. Since the objective itself is to append, which means adding to the original, it's safe to assume the string has space for at least one more char!
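But if you don't want to assume anything:
char* appendNewLine(char* str){
size_t length = strlen(str);
char *newStr = malloc(length + 2); // string + '\n' + '\0'
if (newStr == NULL) return NULL; // allocation failed
memcpy(newStr, str, length); // copy the original characters
newStr[length] = '\n';
newStr[length + 1] = '\0';
return newStr; // the caller must free() this
}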
Add a null after the newline:
buffer[len] = '\n';
buffer[len + 1] = 0;
The terminator for a string in C is '\0', not '\n'. The '\n' character stands only for a newline.
There are at least two problems with your program.
Firstly, you seem to want to build a string, but you never zero-terminate it.
Secondly, your function returns a pointer to a locally declared buffer. Doing this makes no sense.
There are several issues with the code:
It can buffer overflow since buffer is hardcoded to allocate only 1024 characters. Worse yet, the buffer is not even allocated in the heap.
The newline "character" is actually operating system-dependent. Strictly speaking, it's only \n in Unix etc. In Windows, and in strict internet protocol, it's \r\n, for example.
The string returned by the function is not null-terminated. This is most likely not what you'd want.
Also, taking into account your background in Java, here are some things that you should consider:
Since you're working with C char* and not (immutable) Java strings, maybe you could append the newline in-place?
Array access is no longer checked at run time, so you have to be VERY careful about going out of bounds. Make sure that all buffers are of appropriate size.
The language does not come with standard automatic garbage collection, so if you do choose to allocate new buffers for string manipulation, make sure that you manage your memory properly and aren't leaking everywhere.
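As a sketch of the in-place idea from the list above: the function below appends the newline directly into the caller's buffer, but only when there is room for it. The name append_newline_inplace and the cap parameter (the total size of the buffer) are assumptions made for this illustration; they are not part of the original code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends '\n' to the string stored in buf.
 * cap is the total size of buf in bytes, supplied by the caller.
 * Assumes buf already holds a null-terminated string.
 * Returns 0 on success, -1 if there is no room for '\n' plus '\0'. */
int append_newline_inplace(char *buf, size_t cap)
{
    size_t len = strlen(buf);

    if (len + 2 > cap)      /* need len chars + '\n' + '\0' */
        return -1;

    buf[len] = '\n';
    buf[len + 1] = '\0';
    return 0;
}
```

Nothing is allocated here, so there is nothing to free and no local buffer to return; the cost is that the caller has to know and pass the buffer size.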
char* appendNewLine(char* str){
int len = strlen(str);
char buffer[1024];
strcpy(buffer, str);
buffer[len] = '\n';
return buffer;
}
Another important issue is the buffer variable: it is a local stack variable. As soon as the function returns, it is destroyed along with the stack frame. Returning a pointer to the buffer means you will probably crash your process, or corrupt the stack, if you try to write through the returned pointer (the address of buffer is an address on the stack).
Use malloc instead
I am ignoring the return of a local, as others have eloquently addressed that.
int len = strlen(str);
char buffer[1024];
...
buffer[len] = '\n';
If strlen(str) > 1024, then this sequence would write beyond the bounds of the declared buffer. Also as noted, this would (probably) not be null terminated.
To safely append a new line if possible,
char buffer[1024];
strncpy(buffer, str, 1023); // truncate string if it is too long
buffer[1023] = '\0'; // strncpy does not terminate a truncated copy
int len = strlen(buffer);
if (len < 1023) { // room for '\n' and '\0'
buffer[len] = '\n';
buffer[len + 1] = '\0';
}
Note: If the string is too long, this leaves the truncated string WITHOUT the new line.
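One alternative to the strncpy sequence is snprintf, which always null-terminates when the size is nonzero and returns the length the full result would have had, so truncation can be detected. This is a minimal sketch; the function name format_line is made up for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes str followed by '\n' into out,
 * which has room for out_size bytes.
 * Returns 1 if the whole line fit, 0 if it was truncated or
 * snprintf reported an error. */
int format_line(char *out, size_t out_size, const char *str)
{
    int needed = snprintf(out, out_size, "%s\n", str);

    /* snprintf returns the length it wanted to write, excluding '\0' */
    return needed >= 0 && (size_t)needed < out_size;
}
```

As with the strncpy version, a string that is too long ends up truncated and without the newline, but here the return value tells the caller that this happened.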
C string must end with '\0'.
buffer[len+1] = '\0';
You should dynamically allocate the buffer as a pointer to char with room for len + 2 chars (the string, the newline, and the terminator):
char *buffer = malloc((len + 2) * sizeof(char));
Maybe \n should be \r\n. Return + new line. It's what i always use and works for me.
I wonder if there is a way to initialize a string array to a value I decide on during memory allocation, so it won't contain any garbage and the null character will be placed at the correct place. I know that using calloc the memory allocated is initialized to all zeros, but in this case involving strings it doesn't help.
I practice using pointers and allocation memory in C.
There is an exercise in which I wrote a function for copying a string to another string - In main(), I allocate memory using malloc for both strings based on the strings lengths the user provides, and then the user enters the first string.
At this point I send the pointers to the first string and the second string (uninitialized) as parameters to strCopy(char* str1, char* str2). Inside that function I also use another basic function I wrote to calculate the length of a string. But as you may guess, since the second string is full of garbage, its length calculation inside the function is messed up.
void strCopy(char* str1, char* str2)
{
int str1len = str_len(str1); // basic length calculating function
int str2len = str_len(str2);
int i;
for (i = 0; i < str2len; i++)
{
str2[i] = str1[i];
}
str2[i] = '\0';
if (str2len < str1len)
printf("There wasn't enough space to copy the entire string. %s was copied.\n", str2);
else
printf("The string %s has been copied.\n", str2);
}
Right now it works fine when initializing str2 in a loop in main(), but I am interested in other possible solutions.
Thank you very much for any help!
No, there is not. You have to manually initialize it.
If you want to copy a string while allocating memory, you can use strdup. Note that this is a POSIX function, which means this will only work on POSIX-compliant operating systems, Windows, and any other OSs that implement it.
Well, for this particular situation you can use memcpy. A simple way could be like this in your code instead of the for loop:
memcpy(str2, str1, str2len);
But your code is seriously flawed. I don't know how you have implemented str_len but there is absolutely no way (if you're not using non-standard, non-portable, dirty hacks) to get the size of the block that a pointer is pointing to.
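Since the size of the destination cannot be recovered from the pointer, a common approach is to pass it in explicitly. The sketch below assumes a hypothetical extra parameter dst_size holding the number of bytes allocated for the destination; str_copy_bounded is likewise an invented name.

```c
#include <stddef.h>

/* Copies src into dst, writing at most dst_size bytes including '\0'.
 * dst_size is an assumed extra parameter: the caller knows how much
 * it allocated, the function cannot find that out on its own.
 * Returns 1 if the whole string was copied, 0 if it was truncated. */
int str_copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t i = 0;

    if (dst_size == 0)
        return 0;               /* no room even for the terminator */

    while (i + 1 < dst_size && src[i] != '\0') {
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';              /* dst is always a valid string */

    return src[i] == '\0';
}
```

With this shape the destination never needs to be initialized before the call, because the function only writes to it and never tries to measure it.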
I try to use strcpy; is it a good way to do so?
char *stringvar;
stringvar = malloc(strlen("mi string") + 1);
strcpy(stringvar, "mi string");
stringvar[strlen("mi string")] = '\0';
What do you think?
There is exactly one bug in it: You don't check for malloc-failure.
Aside from that, it's repetitive and error-prone.
And overwriting the terminator with the terminator is useless.
Also, the repeated recalculation of the string-length is expensive.
Anyway, as you already have to determine the length, prefer memcpy() over strcpy().
What you should do is extract it into a function; let's call it strdup() (that is the name POSIX and the next C standard give it):
char* strdup(const char* s) {
size_t n = strlen(s) + 1;
char* r = malloc(n);
if (r)
memcpy(r, s, n);
return r;
}
char* stringvar = strdup("mi string");
if (!stringvar)
handle_error();
You don't need the last line
stringvar[strlen("mi string")] = '\0';
strcpy takes care of that for you.
In real code you absolutely must check malloc's return value to make sure it's not NULL.
Other than that your code is fine. In particular, you've got the vital + 1 in the call to malloc. strlen gives you the length of the string not including the terminating '\0' character, but strcpy is going to add it, so you absolutely need to allocate space for it.
The problem with strcpy -- the fatal problem, some people say -- is that at the moment you call strcpy, you have no way of telling strcpy how big the destination array is. It's your responsibility to make the array big enough, and avoid overflow. strcpy is unable, by itself, to prevent buffer overflow -- and of course if the destination does overflow, that's a big problem.
So then the question is, how can you ensure -- absolutely ensure -- that all the calls to strcpy in all the code you write are correct? And how can you ensure that later, someone modifying your program won't accidentally mess things up?
Basically, if you use strcpy at all, you want to arrange that two things are right next to each other:
the code that arranges that the pointer variable points to enough space for the string you're about to copy into it, and
the actual call to strcpy that copies the string into that pointer.
So your code
stringvar = malloc(strlen("mi string") + 1);
strcpy(stringvar, "mi string");
comes pretty close to that ideal.
I know your code is only an example, but it does let us explore the concern, what if later, someone modifying your program accidentally messes things up? What if someone changes it to
stringvar = malloc(strlen("mi string") + 1);
strcpy(stringvar, "mi string asombroso");
Obviously we've got a problem. So to make absolutely sure that there's room for the string we're copying, it's even better, I think, if the string we're copying is in a variable, so it's patently obvious that the string we're copying is the same string we allocated space for.
So here's my improved version of your code:
char *inputstring = "mi string";
char *stringvar = malloc(strlen(inputstring) + 1);
if(stringvar == NULL) {
fprintf(stderr, "out of memory\n");
exit(1);
}
strcpy(stringvar, inputstring);
(Unfortunately, the check for a NULL return from malloc gets in the way of the goal of having the strcpy call right next to the malloc call.)
Basically your code is an implementation of the C library function strdup, which takes a string you give it and returns a copy in malloc'ed memory.
One more thing. You were worried about the + 1 in the call to malloc, and as I said, it's correct. A pretty common error is
stringvar = malloc(strlen(inputstring));
strcpy(stringvar, inputstring);
This fails to allocate space for the \0, so when strcpy adds the \0 it overflows the allocated space, so it's a problem.
And with that said, make sure you don't accidentally write
stringvar = malloc(strlen("mi string" + 1));
Do you see the error? This is a surprisingly easy mistake to make, but obviously it doesn't do what you want it to do.
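To spell out the difference: in the mistaken version the + 1 is applied to the string itself, which advances the pointer past the first character, so strlen measures a shorter string. The sketch below computes both sizes for a string passed in as a pointer; the function names are illustrative only.

```c
#include <stddef.h>
#include <string.h>

/* Correct: length of the string plus one byte for '\0'. */
size_t size_correct(const char *s)
{
    return strlen(s) + 1;
}

/* Mistaken: s + 1 points at the second character, so this is the
 * length of a string one character shorter, with no room for '\0'.
 * Assumes s is not empty. */
size_t size_mistaken(const char *s)
{
    return strlen(s + 1);
}
```

For "mi string", which has 9 characters, the correct size is 10 bytes while the mistaken expression yields 8, two bytes short of what strcpy will write.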
There are some issues with the code posted:
you do not check if malloc() succeeded: if malloc() fails and returns NULL, passing this null pointer to strcpy has undefined behavior.
the last statement stringvar[strlen("mi string")] = '\0'; is useless: strcpy does copy the null terminator at the end of the source string, making the destination array a proper C string.
Here is a corrected version:
#include <stdlib.h>
#include <string.h>
...
char *stringvar;
if ((stringvar = malloc(strlen("mi string") + 1)) != NULL)
strcpy(stringvar, "mi string");
Note that it would be slightly better to store the allocation size and not use strcpy:
char *stringvar;
size_t size = strlen("mi string") + 1;
if ((stringvar = malloc(size)) != NULL)
memcpy(stringvar, "mi string", size);
Indeed it would be even simpler and safer to use strdup(), available on POSIX compliant systems, that performs exactly the above steps:
char *stringvar = strdup("mi string");
As far as I can tell, strncat enlarges the size of the array you want to concatenate to.
for example:
char str1[] = "This is str1";
char str2[] = "This is str2";
and here the length of str1 is 12 and str2 is also 12, but when I strncat them, str1 changes from 12 to 24.
I was asked to write strncat on my own, but I can't figure out how to enlarge the size of an array, taking into account that we haven't learned pointers yet.
I tried just putting every char in the end of the array while moving the distance by 1 each iteration, but as you would have thought, it doesn't put the data in the array because there is no such position like this in the array (str[20] when str's length is 10 for example).
Thanks in advance,
every help would be appreciated.
strlen returns the length of the string, that is, counts until the first null character. It does NOT return the size of the memory allocated for str1!
When you concatenate str2 to str1, you write beyond the memory allocated for str1. That will cause undefined behavior. In your particular case, it seems nothing happens and it even seems that str1 has become larger. That is not so. However (in your particular case), if str2 follows str1 in memory, you just overwrote str2. Try printing str2. It will probably print his is str2.
Since strcat() et al. do not enlarge a buffer, your implementation does not have to do it. (And it is simply not possible with the parameter list of strcat().) It is the caller's responsibility to pass a destination buffer big enough.
On the caller's side you can simply create an array big enough and pass its address. However, you can still use variable length arrays (VLA):
char str1[] = "This is str1";
char str2[] = "This is str2";
char str1str2[strlen(str1)+strlen(str2)+1];
strcpy( str1str2, str1 );
yourstrcat( str1str2, str2 );
str1str2 is big enough to store both contents plus 1 for the string terminator \0.
Thanks, everyone, I solved the problem. As some of you said, I don't need to enlarge the string, I just need to make sure it's big enough to contain all the data.
what I did eventually is this:
void strnCat(char dest[], char src[], int length)
{
int i = 0;
int len = strlen(dest);
for(i=0; i < length && src[i] != 0; i++) // stop at the end of src, as strncat does
{
dest[len+i] = src[i];
dest[len+i+1] = 0;
}
}
So my main problem was that I forgot to add the null at the end of the array to make it a string, and that I used strlen(str) inside the loop instead of saving the length in a variable. That broke because once the null is overwritten, there is no longer an end of the string for strlen to find.
It is a really strange task to let students implement strncat, since this is one of the C functions that is very difficult to use correctly.
So to implement it yourself, you should read its specification in the C standard or in the POSIX standard. There you will find that strncat doesn't enlarge any array. By the way, arrays cannot be enlarged in C at all, it's impossible by definition. Note the careful distinction between the words array (can contain arbitrary bytes) and string (must contain one null byte) in the standard wording.
A saner alternative to implement is strlcat, which is not in the C standard but also widely known.
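For comparison, here is a sketch of strlcat-style semantics as they are commonly documented: the size argument is the full size of the destination, the result is always terminated, and the return value reveals truncation. It is written under the hypothetical name my_strlcat and is not taken from any particular library.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of strlcat-style behavior under a hypothetical name.
 * dst_size is the full size of the dst buffer, not the space left.
 * Returns the length the combined string would have had; a result
 * >= dst_size means the output was truncated. */
size_t my_strlcat(char *dst, const char *src, size_t dst_size)
{
    size_t dst_len = 0;
    size_t src_len = strlen(src);

    /* find the end of dst without reading past dst_size bytes */
    while (dst_len < dst_size && dst[dst_len] != '\0')
        dst_len++;

    if (dst_len == dst_size)        /* dst was not terminated: do nothing */
        return dst_size + src_len;

    size_t room = dst_size - dst_len - 1;   /* bytes free before '\0' */
    size_t n = src_len < room ? src_len : room;

    memcpy(dst + dst_len, src, n);
    dst[dst_len + n] = '\0';

    return dst_len + src_len;
}
```

The caller still has to provide a destination that is big enough; the function only guarantees that it never writes outside it.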
I have the following piece of code in C:
char a[55] = "hello";
size_t length = strlen(a);
char b[length];
strncpy(b,a,length);
size_t length2 = strlen(b);
printf("%zu\n", length); // output = 5
printf("%zu\n", length2); // output = 8
Why is this the case?
it has to be 'b [length +1]'
strlen does not include the null character in the end of c strings.
You never initialized b to anything. Therefore its contents are undefined. The call to strlen(b) could read beyond the size of b and cause undefined behavior (such as a crash).
b is not initialized: it contains whatever is in your RAM when the program is run.
For the first string a, the length is 5 as it should be "hello" has 5 characters.
For the second string, b you declare it as a string of 5 characters, but you don't initialise it, so it counts the characters until it finds a byte containing the 0 terminator.
UPDATE: the following line was added after I wrote the original answer.
strncpy(b,a,length);
after this addition, the problem is that you declared b of size length, while it should be length + 1 to provision space for the string terminator.
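A sketch of the corrected sequence, wrapped in a function so the result can be checked (copied_length is a name made up for this example). It keeps strncpy, as in the question, and adds the terminator explicitly:

```c
#include <stddef.h>
#include <string.h>

/* Repeats the steps from the question with room for the terminator.
 * Returns the length strlen reports for the copy. */
size_t copied_length(void)
{
    char a[55] = "hello";
    size_t length = strlen(a);
    char b[length + 1];          /* variable length array, one extra byte */

    strncpy(b, a, length);       /* copies exactly length chars, no '\0' */
    b[length] = '\0';            /* terminate the copy by hand */

    return strlen(b);
}
```

With the extra byte and the explicit terminator, strlen(b) stops inside b instead of running into whatever happens to follow it in memory.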
Others have already pointed out that you need to allocate strlen(a)+1 characters for b to be able to hold the whole string.
They've given you a set of parameters to use for strncpy that will (attempt to) cover up the fact that it's not really suitable for the job at hand (or almost any other, truth be told). What you really want is to just use strcpy instead. Also note, however, that as you've allocated it, b is also a local (auto storage class) variable. It's rarely useful to copy a string into a local variable.
Most of the time, if you're copying a string, you need to copy it to dynamically allocated storage -- otherwise, you might as well use the original and skip doing a copy at all. Copying a string into dynamically allocated storage is sufficiently common that many libraries already include a function (typically named strdup) for the purpose. If your library doesn't have that, it's fairly easy to write one of your own:
char *dupstr(char const *input) {
char *ret = malloc(strlen(input)+1);
if (ret)
strcpy(ret, input);
return ret;
}
[Edit: I've named this dupstr because strdup (along with anything else starting with str) is reserved for the implementation.]
The char array b is not terminated by '\0', so strlen has no way to know where it should stop counting.
Its prototype is size_t strlen(const char *s); it returns the number of chars in the string up to, but not including, the '\0' (null character).
To avoid this, we have to append the null character ourselves (b[length] = '\0'), which requires b to have length + 1 elements.
Otherwise strlen keeps counting past the end of b until it happens to encounter a zero byte.
I will be coaching an ACM Team next month (go figure), and the time has come to talk about strings in C. Besides a discussion on the standard lib, strcpy, strcmp, etc., I would like to give them some hints (something like str[0] is equivalent to *str, and things like that).
Do you know of any lists (like cheat sheets) or your own experience in the matter?
I'm already aware of the books for the ACM competition (which are good, see particularly this), but I'm after tricks of the trade.
Thank you.
Edit: Thank you very much everybody. I will accept the most voted answer, and have duly upvoted others which I think are relevant. I expect to do a summary here (like I did here, asap). I have enough material now and I'm certain this has improved the session on strings immensely. Once again, thanks.
It's obvious but I think it's important to know that strings are nothing more than an array of bytes, delimited by a zero byte.
C strings aren't all that user-friendly as you probably know.
Writing a zero byte somewhere in the string will truncate it.
Going out of bounds generally ends badly.
Never, ever use strcpy, strcmp, strcat, etc.; instead use their bounded variants: strncmp, strncat, strndup, ...
Avoid strncpy. strncpy will not always zero-delimit your string! If the source string doesn't fit in the destination buffer, it truncates the string but it won't write a nul byte at the end of the buffer. Also, even if the source string is a lot smaller than the destination, strncpy will still pad the rest of the destination buffer with zeroes. I personally use strlcpy.
Don't use printf(string), instead use printf("%s", string). Try thinking of the consequences if the user puts a %d in the string.
You can't compare strings with if( s1 == s2 )
doStuff(s1);
You have to compare every character in the string. Use strcmp or better strncmp.
if( strncmp( s1, s2, BUFFER_SIZE ) == 0 )
doStuff(s1);
Abusing strlen() will dramatically worsen the performance.
for( int i = 0; i < strlen( string ); i++ ) {
processChar( string[i] );
}
will have at least O(n^2) time complexity whereas
int length = strlen( string );
for( int i = 0; i < length; i++ ) {
processChar( string[i] );
}
will have at least O(n) time complexity. This is not so obvious for people who haven't taken time to think of it.
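Another way to avoid the repeated strlen calls is to walk the string until the terminator, which needs no length at all. This is a minimal sketch; count_char is an illustrative function, not something from the code above.

```c
#include <stddef.h>

/* Counts occurrences of c in s in a single pass, stopping at '\0'
 * instead of calling strlen on every iteration. */
size_t count_char(const char *s, char c)
{
    size_t count = 0;

    for (const char *p = s; *p != '\0'; p++) {
        if (*p == c)
            count++;
    }
    return count;
}
```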
The following functions can be used to implement a non-mutating strtok:
strcspn(string, delimiters)
strspn(string, delimiters)
The first one returns the index of the first character that is in the set of delimiters you pass in. The second one returns the index of the first character that is not in the set of delimiters you pass in.
I prefer these to strpbrk as they return the length of the string if they can't match.
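A sketch of how the two functions can combine into a tokenizer that leaves the input untouched. The name next_token and its interface (a cursor that the caller keeps, and a length returned through a pointer) are assumptions for this example.

```c
#include <stddef.h>
#include <string.h>

/* Finds the next token at or after *cursor without modifying the string.
 * On success stores the token length in *len, advances *cursor past
 * the token, and returns a pointer to the token's first character.
 * Returns NULL when no tokens remain. */
const char *next_token(const char **cursor, const char *delims, size_t *len)
{
    /* skip any leading delimiters */
    const char *start = *cursor + strspn(*cursor, delims);

    if (*start == '\0')
        return NULL;

    *len = strcspn(start, delims);   /* length up to the next delimiter */
    *cursor = start + *len;
    return start;
}
```

Because tokens are returned as a pointer plus a length rather than as terminated strings, the input can be a string literal or shared data, and all state lives in the caller's cursor.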
str[0] is equivalent to 0[str], or more generally str[i] is i[str] and i[str] is *(str + i).
NB
this is not specific to strings but it works also for C arrays
The strn* variants in stdlib do not necessarily null terminate the destination string.
As an example, from MSDN's documentation on strncpy:
The strncpy function copies the initial count characters of strSource to strDest and returns strDest. If count is less than or equal to the length of strSource, a null character is not appended automatically to the copied string. If count is greater than the length of strSource, the destination string is padded with null characters up to length count.
Don't confuse strlen() with sizeof() when using a string:
char *p = "hello!!";
strlen(p) != sizeof(p)
sizeof(p) yields, at compile time, the size of the pointer (typically 4 or 8 bytes), whereas strlen(p) counts, at run time, the length of the null-terminated char array (7 in this example).
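The same contrast can be checked in code. The sketch below adds an array next to the pointer, since sizeof behaves differently for the two; the function name is invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the three observations in the comments all hold. */
int sizeof_vs_strlen(void)
{
    char arr[] = "hello!!";      /* array: 7 characters + '\0' */
    const char *p = arr;         /* pointer to the same characters */

    return sizeof(arr) == 8                 /* size of the whole array */
        && sizeof(p) == sizeof(char *)      /* size of a pointer, not the text */
        && strlen(p) == 7;                  /* counted at run time up to '\0' */
}
```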
strtok is not thread safe, since it uses a mutable private buffer to store data between calls; you also cannot interleave or nest strtok calls.
A more useful alternative is strtok_r, use it whenever you can.
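A sketch of the nesting that strtok cannot do: an outer loop over records and an inner loop over fields, each with its own saved position. strtok_r is a POSIX function, so the feature-test macro below is an assumption about the build environment, and count_fields is an invented example function.

```c
#define _POSIX_C_SOURCE 200809L  /* strtok_r is POSIX, not ISO C */
#include <string.h>

/* Counts the fields in text, where records are separated by ';' and
 * fields within a record by ','. The nested loops only work because
 * each strtok_r call keeps its position in a caller-supplied pointer.
 * text is modified in place, as with strtok. */
int count_fields(char *text)
{
    int count = 0;
    char *rec_state, *field_state;

    for (char *rec = strtok_r(text, ";", &rec_state); rec != NULL;
         rec = strtok_r(NULL, ";", &rec_state)) {
        for (char *field = strtok_r(rec, ",", &field_state); field != NULL;
             field = strtok_r(NULL, ",", &field_state)) {
            count++;
        }
    }
    return count;
}
```

The input must be a writable array, not a string literal, because the delimiters are overwritten with '\0' as the tokens are produced.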
kmm has already a good list. Here are the things I had problems with when I started to code C.
String literals live in their own memory section and are accessible for the whole run of the program (static storage duration). Hence they can, for example, be the return value of a function.
Memory management of strings, in particular with a high level library (not libc). Who is responsible to free the string if it is returned by function or passed to a function?
When should "const char *" and when "char *" be used. And what does it tell me if a function returns a "const char *".
All these questions are not too difficult to learn, but hard to figure out if you don't get taught them.
I have found that the char buff[0] technique has been incredibly useful.
Consider:
struct foo {
int x;
char * payload;
};
vs
struct foo {
int x;
char payload[0];
};
see https://stackoverflow.com/questions/295027
See the link for implications and variations
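In standard C (C99 and later) the same layout is written with a flexible array member, char payload[];, while the [0] form is a compiler extension. A sketch of allocating such a struct together with its string in one block; make_foo is a hypothetical helper, not part of the linked discussion.

```c
#include <stdlib.h>
#include <string.h>

struct foo {
    int x;
    char payload[];     /* C99 flexible array member; must be last */
};

/* Allocates the header and the string in one block.
 * The caller releases both with a single free(). */
struct foo *make_foo(int x, const char *text)
{
    size_t n = strlen(text) + 1;             /* include the '\0' */
    struct foo *f = malloc(sizeof *f + n);

    if (f == NULL)
        return NULL;

    f->x = x;
    memcpy(f->payload, text, n);
    return f;
}
```

Compared with the char * payload version, there is one allocation instead of two and the string sits directly after the header in memory.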
I'd point out the performance pitfalls of over-reliance on the built-in string functions.
char* triple(char* source)
{
int n=strlen(source);
char* dest=malloc(n*3+1);
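strcpy(dest,source);
strcat(dest,source); // strcat rescans dest from the start to find its end
strcat(dest,source); // and rescans again, over an even longer dest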
return dest;
}
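One way to avoid the repeated scanning is to measure the source once and place each copy with memcpy. This is a sketch under the invented name triple_memcpy, with an allocation check added:

```c
#include <stdlib.h>
#include <string.h>

/* Variant of triple() that measures the source once and copies with
 * memcpy, so no call has to rescan what was already written.
 * Returns NULL if allocation fails; the caller frees the result. */
char *triple_memcpy(const char *source)
{
    size_t n = strlen(source);
    char *dest = malloc(n * 3 + 1);

    if (dest == NULL)
        return NULL;

    memcpy(dest, source, n);
    memcpy(dest + n, source, n);
    memcpy(dest + 2 * n, source, n);
    dest[3 * n] = '\0';
    return dest;
}
```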
I would discuss when and when not to use strcpy and strncpy and what can go wrong:
char *strncpy(char* destination, const char* source, size_t n);
char *strcpy(char* destination, const char* source );
I would also mention the return values of the string functions (note that stricmp, used below, is a common extension rather than part of ANSI C). For example, ask "does this if statement pass or fail?"
if (stricmp("StrInG 1", "string 1")==0)
{
.
.
.
}
perhaps you could illustrate the value of sentinel '\0' with following example
char* a = "hello \0 world";
char b[100];
strcpy(b,a);
printf("%s\n", b); // prints "hello " because strcpy stopped at the embedded '\0'
I once had my fingers burnt when in my zeal I used strcpy() to copy binary data. It worked most of the time but failed mysteriously sometimes. Mystery was revealed when I realized that binary input sometimes contained a zero byte and strcpy() would terminate there.
You could mention indexed addressing.
An element's address is the base address + index * sizeof(element)
A common error is:
char *p;
snprintf(p, 3, "%d", 42);
p is never pointed at any storage, so writing through it is undefined behavior. It may appear to work for a while, and then funny things happen (welcome to the jungle).
Explanation
With char *p you are allocating space to hold a pointer (sizeof(void*) bytes) on the stack, not space for the characters. The right thing here is to allocate a buffer and point p at it, passing the buffer's size, which is known at compile time:
char buf[12];
char *p = buf;
snprintf(p, sizeof(buf), "%d", 42);
Pointers and arrays, while having the similar syntax, are not at all the same. Given:
char a[100];
char *p = a;
For the array, a, there is no pointer stored anywhere. sizeof(a) != sizeof(p): for the array it is the size of the block of memory, for the pointer it is the size of the pointer. This becomes important if you use something like sizeof(a)/sizeof(a[0]). Also, you can't ++a, and you can make the pointer a 'const' pointer to 'const' chars, but the array can only be 'const' chars, in which case you have to initialize it at its definition. Etc., etc.
If possible, use strlcpy (instead of strncpy) and strlcat.
Even better, to make life a bit safer, you can use a macro such as:
#define strlcpy_sz(dst, src) (strlcpy(dst, src, sizeof(dst)))