I have a function that accepts a char* as one of its parameters. I need to manipulate it, but leave the original char* intact. Essentially, I want to create a working copy of this char*. It seems like this should be easy, but I am really struggling.
My first (naive) attempt was to create another char* and set it equal to the original:
char* linkCopy = link;
This doesn't work, of course, because all I did was cause them to point to the same place.
Should I use strncpy to accomplish this?
I have tried the following, but it causes a crash:
char linkCopy[sizeof(link)] = strncpy(linkCopy, link, sizeof(link));
Am I missing something obvious...?
EDIT: My apologies, I was trying to simplify the examples, but I left some of the longer variable names in the second example. Fixed.
The sizeof will give you the size of the pointer. Which is often 4 or 8 depending on your processor/compiler, but not the size of the string pointed to. You can use strlen and strcpy:
// +1 because of '\0' at the end
char * copy = malloc(strlen(original) + 1);
strcpy(copy, original);
...
free(copy); // at the end, free it again.
I've seen some answers propose use of strdup, but that's a posix function, and not part of C.
You might want to take a look at the strdup (man strdup) function:
char *linkCopy = strdup(link);
/* Do some work here */
free(linkCopy);
Edit: And since you need it to be standard C, do as others have pointed out:
char *linkCopy = malloc(strlen(link) + 1);
/* Note that strncpy is unnecessary here since you know both the size
* of the source and destination buffers
*/
strcpy(linkCopy, link);
/* Do some work */
free(linkCopy);
Since strdup() is not in ANSI/ISO standard C, if it's not available in your compiler's runtime, go ahead and use this:
/*
** Portable, public domain strdup() originally by Bob Stout
*/
#include <stdlib.h>
#include <string.h>
char* strdup(const char* str)
{
char* newstr = (char*) malloc( strlen( str) + 1);
if (newstr) {
strcpy( newstr, str);
}
return newstr;
}
Use strdup, or strndup if you know the size (more secure).
Like:
char* new_char = strdup(original);
... manipulate it ...
free(new_char)
ps.: Not a C standard
Some answers, including the accepted one are a bit off. You do not strcpy a string you have just strlen'd. strcpy should not be used at all in modern programs.
The correct thing to do is a memcpy.
EDIT: memcpy is very likely to be faster in any architecture, strcpy can only possibly perform better for very short strings and should be avoided for security reasons even if they are not relevant in this case.
You are on the right track, you need to use strcpy/strncpy to make copies of strings. Simply assigning them just makes an "alias" of it, a different name that points to the same thing.
Your main problem in your second attempt is that you can't assign to an array that way. The second problem is you seem to have come up with some new names in the function call that I can't tell where they came from.
What you want is:
char linkCopy[sizeof(link)];
strncpy(linkCopy, chLastLink, sizeof(link));
but be careful, sizeof does not always work the way you want it to on strings. Use strlen, or use strdup.
Like sean.bright said strdup() is the easiest way to deal with the copy. But strdup() while widely available is not std C. This method also keeps the copied string in the heap.
char *linkCopy = strdup(link);
/* Do some work here */
free(linkCopy);
If you are committed to using a stack allocated string and strncpy() you need some changes. You wrote:
char linkCopy[sizeof(link)]
That creates a char array (aka string) on the stack that is the size of a pointer (probably 4 bytes). Your third parameter to strncpy() has the same problem. You probably want to write:
char linkCopy[strlen(link)+1];
strncpy(linkCopy,link,strlen(link)+1);
You don't say whether you can use C++ instead of C, but if you can use C++ and the STL it's even easier:
std::string newString( original );
Use newString as you would have used the C-style copy above, its semantics are identical. You don't need to free() it, it is a stack object and will be disposed of automatically.
Related
I'm fairly new to the concept of pointers in C. Let's say I have two variables:
char *arch_file_name;
char *tmp_arch_file_name;
Now, I want to copy the value of arch_file_name to tmp_arch_file_name and add the word "tmp" to the end of it. I'm looking at them as strings, so I have:
strcpy(&tmp_arch_file_name, &arch_file_name);
strcat(tmp_arch_file_name, "tmp");
However, when strcat() is called, both of the variables change and are the same. I want one of them to change and the other to stay intact. I have to use pointers because I use the names later for the fopen(), rename() and delete() functions. How can I achieve this?
What you want is:
strcpy(tmp_arch_file_name, arch_file_name);
strcat(tmp_arch_file_name, "tmp");
You are just copying the pointers (and other random bits until you hit a 0 byte) in the original code, that's why they end up the same.
As shinkou correctly notes, make sure tmp_arch_file_name points to a buffer of sufficient size (it's not clear if you're doing this in your code). Simplest way is to do something like:
char buffer[256];
char* tmp_arch_file_name = buffer;
Before you use pointers, you need to allocate memory. Assuming that arch_file_name is assigned a value already, you should calculate the length of the result string, allocate memory, do strcpy, and then strcat, like this:
char *arch_file_name = "/temp/my.arch";
// Add lengths of the two strings together; add one for the \0 terminator:
char * tmp_arch_file_name = malloc((strlen(arch_file_name)+strlen("tmp")+1)*sizeof(char));
strcpy(tmp_arch_file_name, arch_file_name);
// ^ this and this ^ are pointers already; no ampersands!
strcat(tmp_arch_file_name, "tmp");
// use tmp_arch_file_name, and then...
free(tmp_arch_file_name);
First, you need to make sure those pointers actually point to valid memory. As they are, they're either NULL pointers or arbitrary values, neither of which will work very well:
char *arch_file_name = "somestring";
char tmp_arch_file_name[100]; // or malloc
Then you cpy and cat, but with the pointers, not pointers-to-the-pointers that you currently have:
strcpy (tmp_arch_file_name, arch_file_name); // i.e., no "&" chars
strcat (tmp_arch_file_name, "tmp");
Note that there is no bounds checking going on in this code - the sample doesn't need it since it's clear that all the strings will fit in the allocated buffers.
However, unless you totally control the data, a more robust solution would check sizes before blindly copying or appending. Since it's not directly related to the question, I won't add it in here, but it's something to be aware of.
The & operator is the address-of operator, that is it returns the address of a variable. However using it on a pointer returns the address of where the pointer is stored, not what it points to.
In C I have a path in one of my strings
/home/frankv/
I now want to add the name of files that are contained in this folder - e.g. file1.txt file123.txt etc.
Having declared my variable either like this
char pathToFile[strlen("/home/frankv/")+1]
or
char *pathToFile = malloc(strlen("/home/frankv/")+1)
My problem is that I cannot simply add more characters because it would cause a buffer overflow. Also, what do I do in case I do not know how long the filenames will be?
I've really gotten used to PHP lazy $string1.$string2 .. What is the easiest way to do this in C?
If you've allocated a buffer with malloc(), you can use realloc() to expand it:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char *buf;
const char s1[] = "hello";
const char s2[] = ", world";
buf = malloc(sizeof s1);
strcpy(buf, s1);
buf = realloc(buf, sizeof s1 + sizeof s2 - 1);
strcat(buf, s2);
puts(buf);
return 0;
}
NOTE: I have omitted error checking. You shouldn't. Always check whether malloc() returns a null pointer; if it does, take some corrective action, even if it's just terminating the program. Likewise for realloc(). And if you want to be able to recover from a realloc() failure, store the result in a temporary so you don't clobber your original pointer.
Use std::string, if possible. Else, reallocate another block of memory and use strcpy and strcat.
You have a couple options, but, if you want to do this dynamically using no additional libraries, realloc() is the stdlib function you're looking for:
char *pathToFile = malloc(strlen("/home/frankv/")+1);
char *string_to_add = "filename.txt";
char *p = realloc(pathToFile, strlen(pathToFile) + strlen(string_to_add) + 1);
if (!p) abort();
pathToFile = p;
strcat(p, string_to_add);
Note: you should always assign the result of realloc to a new pointer first, as realloc() returns NULL on failure. If you assign to the original pointer, you are begging for a memory leak.
If you're going to be doing much string manipulation, though, you may want to consider using a string library. Two I've found useful are bstring and ustr.
In case you can use C++, use the std::string. In case you must to use pure C, use what's call doubling - i.e. when out of space in the string - double the memory and copy the string into the new memory. And you'll have to use the second syntax:
char *pathToFile = malloc(strlen("/home/frankv/")+1);
You have chosen the wrong language for manipulating strings!
The easy and conventional way out is to do something like:
#define MAX_PATH 260
char pathToFile[MAX_PATH+1] = "/home/frankv/";
strcat(pathToFile, "wibble/");
Of course, this is error prone - if the resulting string exceeds MAX_PATH characters, anything can happen, and it is this sort of programming which is the route many trojans and worms use to penetrate security (by corrupting memory in a carefully defined way). Hence my deliberate choice of 260 for MAX_PATH, which is what it used to be in Windows - you can still make Windows Explorer do strange things to your files with paths over 260 characters, possibly because of code like this!
strncat may be a small help - you can at least tell it the maximum size of the destination, and it won't copy beyond that.
To do it robustly you need a string library which does variable length strings correctly. But I don't know if there is such a thing for C (C++ is a different matter, of course).
I will be coaching an ACM Team next month (go figure), and the time has come to talk about strings in C. Besides a discussion on the standard lib, strcpy, strcmp, etc., I would like to give them some hints (something like str[0] is equivalent to *str, and things like that).
Do you know of any lists (like cheat sheets) or your own experience in the matter?
I'm already aware of the books for the ACM competition (which are good, see particularly this), but I'm after tricks of the trade.
Thank you.
Edit: Thank you very much everybody. I will accept the most voted answer, and have duly upvoted others which I think are relevant. I expect to do a summary here (like I did here, asap). I have enough material now and I'm certain this has improved the session on strings immensely. Once again, thanks.
It's obvious but I think it's important to know that strings are nothing more than an array of bytes, delimited by a zero byte.
C strings aren't all that user-friendly as you probably know.
Writing a zero byte somewhere in the string will truncate it.
Going out of bounds generally ends bad.
Never, ever use strcpy, strcmp, strcat, etc.., instead use their safe variants: strncmp, strncat, strndup,...
Avoid strncpy. strncpy will not always zero delimit your string! If the source string doesn't fit in the destination buffer it truncates the string but it won't write a nul byte at the end of the buffer. Also, even if the source buffer is a lot smaller than the destination, strncpy will still overwrite the whole buffer with zeroes. I personally use strlcpy.
Don't use printf(string), instead use printf("%s", string). Try thinking of the consequences if the user puts a %d in the string.
You can't compare strings with if( s1 == s2 )
doStuff(s1);
You have to compare every character in the string. Use strcmp or better strncmp.
if( strncmp( s1, s2, BUFFER_SIZE ) == 0 )
doStuff(s1);
Abusing strlen() will dramatically worsen the performance.
for( int i = 0; i < strlen( string ); i++ ) {
processChar( string[i] );
}
will have at least O(n2) time complexity whereas
int length = strlen( string );
for( int i = 0; i < length; i++ ) {
processChar( string[i] );
}
will have at least O(n) time complexity. This is not so obvious for people who haven't taken time to think of it.
The following functions can be used to implement a non-mutating strtok:
strcspn(string, delimiters)
strspn(string, delimiters)
The first one finds the first character in the set of delimiters you pass in. The second one finds the first character not in the set of delimiters you pass in.
I prefer these to strpbrk as they return the length of the string if they can't match.
str[0] is equivalent to 0[str], or more generally str[i] is i[str] and i[str] is *(str + i).
NB
this is not specific to strings but it works also for C arrays
The strn* variants in stdlib do not necessarily null terminate the destination string.
As an example: from MSDN's documentation on strncpy:
The strncpy function copies the
initial count characters of strSource
to strDest and returns strDest. If
count is less than or equal to the
length of strSource, a null character
is not appended automatically to the
copied string. If count is greater
than the length of strSource, the
destination string is padded with null
characters up to length count.
confuse strlen() with sizeof() when using a string:
char *p = "hello!!";
strlen(p) != sizeof(p)
sizeof(p) yield, at compile time, the size of the pointer (4 or 8 bytes) whereas strlen(p) counts, at runtime, the lenght of the null terminated char array (7 in this example).
strtok is not thread safe, since it uses a mutable private buffer to store data between calls; you cannot interleave or annidate strtok calls also.
A more useful alternative is strtok_r, use it whenever you can.
kmm has already a good list. Here are the things I had problems with when I started to code C.
String literals have an own memory section and are always accessible. Hence they can for example be a return value of function.
Memory management of strings, in particular with a high level library (not libc). Who is responsible to free the string if it is returned by function or passed to a function?
When should "const char *" and when "char *" be used. And what does it tell me if a function returns a "const char *".
All these questions are not too difficult to learn, but hard to figure out if you don't get taught them.
I have found that the char buff[0] technique has been incredibly useful.
Consider:
struct foo {
int x;
char * payload;
};
vs
struct foo {
int x;
char payload[0];
};
see https://stackoverflow.com/questions/295027
See the link for implications and variations
I'd point out the performance pitfalls of over-reliance on the built-in string functions.
char* triple(char* source)
{
int n=strlen(source);
char* dest=malloc(n*3+1);
strcpy(dest,src);
strcat(dest,src);
strcat(dest,src);
return dest;
}
I would discuss when and when not to use strcpy and strncpy and what can go wrong:
char *strncpy(char* destination, const char* source, size_t n);
char *strcpy(char* destination, const char* source );
I would also mention return values of the ansi C stdlib string functions. For example ask "does this if statement pass or fail?"
if (stricmp("StrInG 1", "string 1")==0)
{
.
.
.
}
perhaps you could illustrate the value of sentinel '\0' with following example
char* a = "hello \0 world";
char b[100];
strcpy(b,a);
printf(b);
I once had my fingers burnt when in my zeal I used strcpy() to copy binary data. It worked most of the time but failed mysteriously sometimes. Mystery was revealed when I realized that binary input sometimes contained a zero byte and strcpy() would terminate there.
You could mention indexed addressing.
An elements address is the base address + index * sizeof element
A common error is:
char *p;
snprintf(p, 3, "%d", 42);
it works until you use up to sizeof(p) bytes.. then funny things happens (welcome to the jungle).
Explaination
with char *p you are allocating space for holding a pointer (sizeof(void*) bytes) on the stack. The right thing here is to allocate a buffer or just to specify the size of the pointer at compile time:
char buf[12];
char *p = buf;
snprintf(p, sizeof(buf), "%d", 42);
Pointers and arrays, while having the similar syntax, are not at all the same. Given:
char a[100];
char *p = a;
For the array, a, there is no pointer stored anywhere. sizeof(a) != sizeof(p), for the array it is the size of the block of memory, for the pointer it is the size of the pointer. This become important if you use something like: sizeof(a)/sizeof(a[0]). Also, you can't ++a, and you can make the pointer a 'const' pointer to 'const' chars, but the array can only be 'const' chars, in which case you'd be init it first. etc etc etc
If possible, use strlcpy (instead of strncpy) and strlcat.
Even better, to make life a bit safer, you can use a macro such as:
#define strlcpy_sz(dst, src) (strlcpy(dst, src, sizeof(dst)))
I have a old program in which some library function is used and i dont have that library.
So I am writing that program using libraries of c++.
In that old code some function is there which is called like this
*string = newstrdup("Some string goes here");
the string variable is declared as char **string;
What he may be doing in that function named "newstrdup" ?
I tried many things but i dont know what he is doing ... Can anyone help
The function is used to make a copy of c-strings. That's often needed to get a writable version of a string literal. They (string literals) are itself not writable, so such a function copies them into an allocated writable buffer. You can then pass them to functions that modify their argument given, like strtok which writes into the string it has to tokenize.
I think you can come up with something like this, since it is called newstrdup:
char * newstrdup(char const* str) {
char *c = new char[std::strlen(str) + 1];
std::strcpy(c, str);
return c;
}
You would be supposed to free it once done using the string using
delete[] *string;
An alternative way of writing it is using malloc. If the library is old, it may have used that, which C++ inherited from C:
char * newstrdup(char const* str) {
char *c = (char*) malloc(std::strlen(str) + 1);
if(c != NULL) {
std::strcpy(c, str);
}
return c;
}
Now, you are supposed to free the string using free when done:
free(*string);
Prefer the first version if you are writing with C++. But if the existing code uses free to deallocate the memory again, use the second version. Beware that the second version returns NULL if no memory is available for dup'ing the string, while the first throws an exception in that case. Another note should be taken about behavior when you pass a NULL argument to your newstrdup. Depending on your library that may be allowed or may be not allowed. So insert appropriate checks into the above functions if necessary. There is a function called strdup available in POSIX systems, but that one allows neither NULL arguments nor does it use the C++ operator new to allocate memory.
Anyway, i've looked with google codesearch for newstrdup functions and found quite a few. Maybe your library is among the results:
Google CodeSearch, newstrdup
there has to be a reason that they wrote a "new" version of strdup. So there must be a corner case that it handles differently. like perhaps a null string returns an empty string.
litb's answer is a replacement for strdup, but I would think there is a reason they did what they did.
If you want to use strdup directly, use a define to rename it, rather than write new code.
The line *string = newstrdup("Some string goes here"); is not showing any weirdness to newstrdup. If string has type char ** then newstrdup is just returning char * as expected. Presumably string was already set to point to a variable of type char * in which the result is to be placed. Otherwise the code is writing through an uninitialized pointer..
newstrdup is probably making a new string that is a duplicate of the passed string; it returns a pointer to the string (which is itself a pointier to the characters).
It looks like he's written a strdup() function to operate on an existing pointer, probably to re-allocate it to a new size and then fill its contents. Likely, he's doing this to re-use the same pointer in a loop where *string is going to change frequently while preventing a leak on every subsequent call to strdup().
I'd probably implement that like string = redup(&string, "new contents") .. but that's just me.
Edit:
Here's a snip of my 'redup' function which might be doing something similar to what you posted, just in a different way:
int redup(char **s1, const char *s2)
{
size_t len, size;
if (s2 == NULL)
return -1;
len = strlen(s2);
size = len + 1;
*s1 = realloc(*s1, size);
if (*s1 == NULL)
return -1;
memset(*s1, 0, size);
memcpy(*s1, s2, len);
return len;
}
Of course, I should probably save a copy of *s1 and restore it if realloc() fails, but I didn't need to get that paranoid.
I think you need to look at what is happening with the "string" variable within the code as the prototype for the newstrdup() function would appear to be identical to the library strdup() version.
Are there any free(*string) calls in the code?
It would appear to be a strange thing do to, unless it's internally keeping a copy of the duplicated string and returning a pointer back to the same string again.
Again, I would ask why?
I have the following C code fragment and have to identify the error and suggest a way of writing it more safely:
char somestring[] = "Send money!\n";
char *copy;
copy = (char *) malloc(strlen(somestring));
strcpy(copy, somestring);
printf(copy);
So the error is that strlen ignores the trailing '\0' of a string and therefore it is not going to be allocated enough memory for the copy but I'm not sure what they're getting at about writing it more safely?
I could just use malloc(strlen(somestring)+1)) I assume but I'm thinking there must be a better way than that?
EDIT: OK, I've accepted an answer, I suspect that the strdup solution would not be expected from us as it's not part of ANSI C. It seems to be quite a subjective question so I'm not sure if what I've accepted is actually the best. Thanks anyway for all the answers.
I can't comment on the responses above, but in addition to checking the
return code and using strncpy, you should never do:
printf(string)
But use:
printf("%s", string);
ref: http://en.wikipedia.org/wiki/Format_string_attack
char somestring[] = "Send money!\n";
char *copy = strdup(something);
if (copy == NULL) {
// error
}
or just put this logic in a separate function xstrdup:
char * xstrdup(const char *src)
{
char *copy = strdup(src);
if (copy == NULL) {
abort();
}
return copy;
}
char somestring[] = "Send money!\n";
char *copy;
size_t copysize;
copysize = strlen(somestring)+1;
copy = (char *) malloc(copysize);
if (copy == NULL)
bail("Oh noes!\n");
strncpy(copy, somestring, copysize);
printf("%s", copy);
Noted differences above:
Result of malloc() must be checked!
Compute and store the memory size!
Use strncpy() because strcpy() is naughty. In this contrived example it won't hurt, but don't get into the habit of using it.
EDIT:
To those thinking I should be using strdup()... that only works if you take the very narrowest view of the question. That's not only silly, it's overlooking an even better answer:
char somestring[] = "Send money!\n";
char *copy = somestring;
printf(copy);
If you're going to be obtuse, at least be good at it.
strlen + 1, for the \0 terminator
malloc may fail; always check malloc return value
Ick... use strdup() like everyone else said and write it yourself if you have to. Since you have time to think about this now... check out the 25 Most Dangerous Programming Errors at Mitre, then consider why the phrase printf(copy) should never appear in code. That is right up there with malloc(strlen(str)) in terms of utter badness not to mention the headache of tracking down why it causes lots of grief when copy is something like "%s%n"...
I would comment to previous solutions but I do not have enough rep.
Using strncpy here is as wrong as using strcpy(As there is absolutely no risk of overflow). There is a function called memcpy in < string.h > and it is meant exactly for this. It is not only significantly faster, but also the correct function to use to copy strings of known length in standard C.
From the accepted answer:
char somestring[] = "Send money!\n";
char *copy;
size_t copysize;
copysize = strlen(somestring)+1;
copy = (char *) malloc(copysize);
if (copy == NULL)
bail("Oh noes!\n");
memcpy(copy, somestring, copysize); /* You don't use str* functions for this! */
printf("%s", copy);
to add more to Adrian McCarthy's ways to make safer code,
Use a static code analyzer, they are very good at finding this kind of errors
Ways to make the code safer (and more correct).
Don't make an unnecessary copy. From the example, there's no apparent requirement that you actually need to copy somestring. You can output it directly.
If you have to make a copy of a string, write a function to do it (or use strdup if you have it). Then you only have to get it right in one place.
Whenever possible, initialize the pointer to the copy immediately when you declare it.
Remember to allocate space for the null terminator.
Remember to check the return value from malloc.
Remember to free the malloc'ed memory.
Don't call printf with an untrusted format string. Use printf("%s", copy) or puts(copy).
Use an object-oriented language with a string class or any language with built-in string support to avoid most of these problems.
The best way to write it more safely, if one were to be truly interested in such a thing, would be to write it in Ada.
somestring : constant string := "Send money!";
declare
copy : constant string := somestring;
begin
put_line (somestring);
end;
Same result, so what are the differences?
The whole thing is done on the stack
(no pointers). Deallocation is
automatic and safe.
Everything is automaticly range-checked so
there's no chance of buffer-overflow
exploits
Both strings are constants,
so there's no chance of screwing up
and modifying them.
It will probably be way faster than the C, not only because of the lack of dynamic allocation, but because there isn't that extra scan through the string required by strlen().
Note that in Ada "string" is not some special dynamic construct. It's the built-in array of characters. However, Ada arrays can be sized at declaration by the array you assign into them.
The safer way would be to use strncpy instead of strcpy. That function takes a third argument: the length of the string to copy. This solution doesn't stretch beyond ANSI C, so this will work under all environments (whereas other methods may only work under POSIX-compliant systems).
char somestring[] = "Send money!\n";
char *copy;
copy = (char *) malloc(strlen(somestring));
strncpy(copy, somestring, strlen(somestring));
printf(copy);