strdup() - what does it do in C? - c

What is the purpose of the strdup() function in C?

Exactly what it sounds like, assuming you're used to the abbreviated way in which C and UNIX assigns words, it duplicates strings :-)
Keeping in mind it's actually not part of the current (C17) ISO C standard itself(a) (it's a POSIX thing), it's effectively doing the same as the following code:
char *strdup(const char *src) {
char *dst = malloc(strlen (src) + 1); // Space for length plus nul
if (dst == NULL) return NULL; // No memory
strcpy(dst, src); // Copy the characters
return dst; // Return the new string
}
In other words:
It tries to allocate enough memory to hold the old string (plus a '\0' character to mark the end of the string).
If the allocation failed, it sets errno to ENOMEM and returns NULL immediately. Setting of errno to ENOMEM is something malloc does in POSIX so we don't need to explicitly do it in our strdup. If you're not POSIX compliant, ISO C doesn't actually mandate the existence of ENOMEM so I haven't included that here(b).
Otherwise the allocation worked so we copy the old string to the new string(c) and return the new address (which the caller is responsible for freeing at some point).
Keep in mind that's the conceptual definition. Any library writer worth their salary may have provided heavily optimised code targeting the particular processor being used.
One other thing to keep in mind, it looks like this is currently slated to be in the C2x iteration of the standard, along with strndup, as per draft N2912 of the document.
(a) However, functions starting with str and a lower case letter are reserved by the standard for future directions. From C11 7.1.3 Reserved identifiers:
Each header declares or defines all identifiers listed in its associated sub-clause, and optionally declares or defines identifiers listed in its associated future library directions sub-clause.*
The future directions for string.h can be found in C11 7.31.13 String handling <string.h>:
Function names that begin with str, mem, or wcs and a lowercase letter may be added to the declarations in the <string.h> header.
So you should probably call it something else if you want to be safe.
(b) The change would basically be replacing if (d == NULL) return NULL; with:
if (d == NULL) {
errno = ENOMEM;
return NULL;
}
(c) Note that I use strcpy for that since that clearly shows the intent. In some implementations, it may be faster (since you already know the length) to use memcpy, as they may allow for transferring the data in larger chunks, or in parallel. Or it may not :-) Optimisation mantra #1: "measure, don't guess".
In any case, should you decide to go that route, you would do something like:
char *strdup(const char *src) {
size_t len = strlen(src) + 1; // String plus '\0'
char *dst = malloc(len); // Allocate space
if (dst == NULL) return NULL; // No memory
memcpy (dst, src, len); // Copy the block
return dst; // Return the new string
}

char * strdup(const char * s)
{
size_t len = 1+strlen(s);
char *p = malloc(len);
return p ? memcpy(p, s, len) : NULL;
}
Maybe the code is a bit faster than with strcpy() as the \0 char doesn't need to be searched again (It already was with strlen()).

No point repeating the other answers, but please note that strdup() can do anything it wants from a C perspective, since it is not part of any C standard. It is however defined by POSIX.1-2001.

From strdup man:
The strdup() function shall return a pointer to a new string, which is a duplicate of the string pointed to by s1. The returned pointer can be passed to free(). A null pointer is returned if the new string cannot be created.

strdup() does dynamic memory allocation for the character array including the end character '\0' and returns the address of the heap memory:
char *strdup (const char *s)
{
char *p = malloc (strlen (s) + 1); // allocate memory
if (p != NULL)
strcpy (p,s); // copy string
return p; // return the memory
}
So, what it does is give us another string identical to the string given by its argument, without requiring us to allocate memory. But we still need to free it, later.

strdup and strndup are defined in POSIX compliant systems as:
char *strdup(const char *str);
char *strndup(const char *str, size_t len);
The strdup() function allocates sufficient memory for a copy of the
string str, does the copy, and returns a pointer to it.
The pointer may subsequently be used as an argument to the function free.
If insufficient memory is available, NULL is returned and errno is set to
ENOMEM.
The strndup() function copies at most len characters from the string str always null terminating the copied string.

It makes a duplicate copy of the string passed in by running a malloc and strcpy of the string passed in. The malloc'ed buffer is returned to the caller, hence the need to run free on the return value.

The most valuable thing it does is give you another string identical to the first, without requiring you to allocate memory (location and size) yourself. But, as noted, you still need to free it (but which doesn't require a quantity calculation, either.)

The statement:
strcpy(ptr2, ptr1);
is equivalent to (other than the fact this changes the pointers):
while(*ptr2++ = *ptr1++);
Whereas:
ptr2 = strdup(ptr1);
is equivalent to:
ptr2 = malloc(strlen(ptr1) + 1);
if (ptr2 != NULL) strcpy(ptr2, ptr1);
So, if you want the string which you have copied to be used in another function (as it is created in heap section), you can use strdup, else strcpy is enough,

The strdup() function is a shorthand for string duplicate, it takes in a parameter as a string constant or a string literal and allocates just enough space for the string and writes the corresponding characters in the space allocated and finally returns the address of the allocated space to the calling routine.

Related

in c why the dereference of the s point to string value not working?

why when i use the program it return s = null
the get_string function can have update to make the program work
it is : string s = malloc(sizeof(string));
but in the end of the function and after return s; i cant free(s);
or before return s; i will lose the data i stored
i tried to search more about dereference pointers but i got nothing.
#include <stdio.h>
typedef char* string;
string get_string(string q)
{
string s = NULL;
printf("%s",q);
scanf("%s",s);
return s;
}
int main(void)
{
string a = get_string("name : ");
printf("name is %s\n",a);
}
Here are two correct uses of scanf to read a string:
char s1[10];
scanf("%9s", s1);
char *s2 = malloc(100);
scanf("%99s", s2);
Notice that in both cases — s1 and s2 — I had to explicitly say how much memory I wanted for my string. Then, when I called scanf, I included that information — 1 less than the overall string size — in the %s format, so that I could tell scanf not to read a bigger string than my string variable could hold.
Notice, by contrast, that in your get_string function, you did not allocate any memory to hold your string at all. Your variable s was a null pointer, explicitly pointing to no memory at all.
This is something that's very easy to overlook at first: Most of the time, C does not allocate memory for strings for you. Most of the time, this is your responsibility.
Now, an additional concern is that even when you do allocate memory for a string, you have to think about how long that memory will stick around, and whether anyone has to explicitly deallocate it. And there are some additional mistakes that are easy to make. In particular, suppose you took my first example (s1) to heart, and tried to fix your get_string function like this:
char *get_string(char *q)
{
char s[100]; /* WRONG */
printf("%s",q);
scanf("%99s",s);
return s;
}
Here you have given scanf a proper array to read in to, but it's local to the get_string function, meaning that it will disappear after get_string returns, and it will be useless to the caller.
Another possibility is
#include <stdlib.h>
char *get_string(char *q)
{
char s = malloc(100); /* better */
if(s == NULL) {
fprintf(stderr, "out of memory!\n");
exit(1);
}
printf("%s",q);
scanf("%99s",s);
return s;
}
This will work just fine. Note that I have checked to see whether malloc succeeded. (If it fails, it returns a NULL pointer.) But now we have the issue that some memory has been allocated which might never be freed. That is, get_string returns a pointer to dynamically-allocated memory, and it's the caller's responsibility to free that memory when it's no longer needed. The caller doesn't have to, but if there end up being 1,000,000 calls to get_string, and if none of the allocated blocks ever gets freed, that's a lot of memory wasted.
First as other people have noted in the comments the Typedef in this case isn't very helpful as it hides the fact its a pointer. Also char* is vary commonly used and not a type complicated enough for a typedef IMO.
For your other issues the problem appears to be that you are thinking of the value as a C++ string instead of a char pointer. In C there aren't string objects but instead people use char* which can pointer blocks of chars and we determine the end of the string by putting a null character at the end of list of characters.
So the reason you can't print the NULL string is because it is undefined behavior to pass a NULL pointer to printf. When you change it to s = malloc(sizeof(string)); the pointer is no longer null but instead pointing to the start of a block of memory that is sizeof(string) bytes long. You should be doing malloc(sizeof(char)*strlen(q)); instead so you have a block of memory holding a string with the length of string q instead of just one character. More generally it would be malloc(sizeof(char)*NUM_CHARS);.
When it come to the free call. You can't call free(s) after return s; because no statements after return s; will occur because the function has returned and no longer executing. As for calling before, calling free(s) deallocates that block of memory that s is pointing too from the malloc(sizeof(string)) is pointing to. Here you have to remember that the function isn't returning the memory or the string but instead it returns the pointer to the string. So if you delete the memory the pointer is pointing to then you lose it once you return.

strdup and memory leaking

Does strdup allocate another memory zone and create another pointer every time?
For example: does the following code result in a memory leak?
void x(char** d, char* s){
*d = strdup(s);
}
int main(){
char* test = NULL;
x(&test, "abcd");
x(&test, "etc");
return 0;
}
Yes, the program leaks memory because it allocates objects and then loses references to them.
The first time this happens is in the line:
x(&test, "etc");
The variable test holds the one and only copy of a pointer that was allocated in a previous call to x. The new call to x overwrites that pointer. At that point, the pointer leaks.
This is what it means to leak memory: to lose all references to an existing dynamically allocated piece of storage.*
The second leak occurs when the main function returns. At that point, the test variable is destroyed, and that variable holds the one and only copy of a pointer to a duplicate of the "etc" string.
Sometimes in C programs, we sometimes not care about leaks of this second type: memory that is not freed when the program terminates, but that is not allocated over and over again in a loop (so it doesn't cause a runaway memory growth problem).
If the program is ever integrated into another program (for instance as a shared library) where the original main function becomes a startup function that could be invoked repeatedly in the same program environment, both the leaks will turn into problems.
The POSIX strdup function behaves similarly to this:
char *strdup(const char *orig)
{
size_t bytes = strlen(orig) + 1;
char *copy = malloc(bytes);
if (copy != 0)
memcpy(copy, orig, bytes);
return copy;
}
Yes; it allocates new storage each time.
If you have a garbage collector (such as Boehm) in your C image, then it's possible that the leaked storage is recycled, and so strdup is able to re-use the same memory for the second allocation. (However, a garbage collector is not going to kick in after just one allocation, unless it is operated in a stress-test mode for flushing out bugs.)
Now if you want to actually reuse the memory with realloc, then you can change your x function along these lines:
#include <stdlib.h>
#include <string.h>
void *strealloc(char *origptr, char *strdata)
{
size_t nbytes = strlen(strdata) + 1;
char *newptr = (char *) realloc(origptr, nbytes); /* cast needed for C++ */
if (newptr)
memcpy(newptr, strdata, nbytes);
return newptr;
}
(By the way, external names starting with str are in an ISO C reserved namespace, but strealloc is too nice a name to refuse.)
Note that the interface is different. We do not pass in a pointer-to-pointer, but instead present a realloc-like interface. The caller can check the return value for null to detect an allocation error, without having the pointer inconveniently overwritten with null in that case.
The main function now looks like:
int main(void)
{
char *test = strealloc(NULL, "abcd");
test = strealloc(test, "etc");
free(test);
return 0;
}
Like before, there is no error checking. If the first strealloc were to fail, test is then null. That doesn't since it gets overwritten anyway, and the first argument of strealloc may be null.
Only one free is needed to plug the memory leak.
* It's possible to have a semantic memory leak with objects that the program hasn't lost a reference to. For instance, suppose that a program keeps adding information to a list which is never used for any purpose and just keeps growing.
Yes, it allocates memory and leaks if you don't free it. From the man page:
The strdup() function returns a pointer to a new string which is a duplicate of the string s. Memory for the new string is obtained with malloc(3), and can be freed with free(3).
new_s = strdup(s) is essentially equivalent to:
new_s = malloc(strlen(s)+1);
strcpy(new_s, s);
Consider the following definition for strdup:
#include <string.h>
char *strdup(const char *string);
strdup reserves storage space for a copy of string by calling malloc. The string argument to this function is expected to contain a null character (\0) marking the end of the string. Remember to free the storage reserved with the call to strdup.
You must free the string yourself.

copy of a string pointer

I have a function that has input and pointer to an array of char in C. in that function I am manipulating the main string, however I want to make a backup copy in another variable before I use it. I want to put it in char backup[2000], so if the pointer changes the backup won't change.
How can I do that?
void function (const char *string)
{
char *stringcopy = malloc (1 + strlen (string));
if (stringcopy)
strcpy (stringcopy, string);
else fprintf (stderr, "malloc failure!"):
...
do whatever needs to be done with `stringcopy`
}
To duplicate strings in C there is a library function called strdup made for that:
The memory allocated by strdup must be freed after usage using free.
strdup provides the memory allocation and string copy operation in one step. Using an char array can become problematic if at some point in time the string to copy happens to be larger than the array's size.
Your friends are the following functions
malloc
memset
calloc
memcpy
strcpy
strdup
strncpy
But best of all, greatest friend is man :)
Use strncpy().
void myfunc(char* inputstr)
{
char backup[2000];
strncpy(backup, inputstr, 1999);
backup[1999] = '\0';
}
Copying 1999 characters to the 2000 character array leaves the last character for the null terminator.
You can use strcpy to make a copy of your char array (string) before you start modifying.
Use memcpy to create the backup.
char backup[2000];
char original[2000];
sprintf(original, "lovely data here");
memcpy(backup, original, 2000);
char* orig_ptr = original;
strcpy reference.
strcpy( backup, source );
Note from link.
To avoid overflows, the size of the array pointed by destination shall be long enough to contain the same C string as source (including the terminating null character), and should not overlap in memory with source.

Possibility of strcat failing?

Is there a possibility that strcat can ever fail?
If we pass some incorrect buffer or string, then it might lead to memory corruption. But, apart from that is it possible that this function can return failure like strcat returning NULL even if destination string passed is Non-NULL? If no, why strcat has a return type specified at all?
I have just mentioned strcat as an example. But, this question applies to many string and memory related (like memcpy etc) functions. I just want to know the reasoning behind some of these seemingly "always successful" functions having return types.
Returning a pointer to the target string makes it easy to use the output in this sort of (perhaps not-so-clever) way:
int len = strlen(strcat(firstString, secondString));
Most of them go back to a time when C didn't include 'void', so there was no way to specify that it had no return value. As a result, they specified them to return something, even if it was pretty useless.
The implicit contract of these functions is the following: if you pass-in pointers to valid strings, then the functions will perform as advertised. Pass-in a NULL pointer, and the function may do anything (usually, it will raise a SIGSEGV). Given that the arguments are valid (i.e., point to strings) then the algorithms used can not fail.
I always ignored the return types (wondering who uses them) until today I saw this in glibc-2.11 (copied exactly from the source file) and everything became much more clear:
wchar_t *
wcsdup (s)
const wchar_t *s;
{
size_t len = (__wcslen (s) + 1) * sizeof (wchar_t);
void *new = malloc (len);
if (new == NULL)
return NULL;
return (wchar_t *) memcpy (new, (void *) s, len);
}
It makes it easier to write less code ("chain" it?) I guess.
Here's a pretty standard implementation of strcat from OpenBSD:
char *
strcat(char *s, const char *append)
{
char *save = s;
for (; *s; ++s);
while ((*s++ = *append++) != '\0');
return(save);
}
As long as the inputs passed to it are valid (i.e. append is properly terminated and s is large enough to concatenate it), this can't really fail - it's a simple memory manipulation. That memory is entirely under the control of the caller.
The return value here could be used to chain concatenations, for example:
strcat(strcat(s, t1), t2);
Although this is hardly efficient...

converting char** to char* or char

I have a old program in which some library function is used and i dont have that library.
So I am writing that program using libraries of c++.
In that old code some function is there which is called like this
*string = newstrdup("Some string goes here");
the string variable is declared as char **string;
What he may be doing in that function named "newstrdup" ?
I tried many things but i dont know what he is doing ... Can anyone help
The function is used to make a copy of c-strings. That's often needed to get a writable version of a string literal. They (string literals) are itself not writable, so such a function copies them into an allocated writable buffer. You can then pass them to functions that modify their argument given, like strtok which writes into the string it has to tokenize.
I think you can come up with something like this, since it is called newstrdup:
char * newstrdup(char const* str) {
char *c = new char[std::strlen(str) + 1];
std::strcpy(c, str);
return c;
}
You would be supposed to free it once done using the string using
delete[] *string;
An alternative way of writing it is using malloc. If the library is old, it may have used that, which C++ inherited from C:
char * newstrdup(char const* str) {
char *c = (char*) malloc(std::strlen(str) + 1);
if(c != NULL) {
std::strcpy(c, str);
}
return c;
}
Now, you are supposed to free the string using free when done:
free(*string);
Prefer the first version if you are writing with C++. But if the existing code uses free to deallocate the memory again, use the second version. Beware that the second version returns NULL if no memory is available for dup'ing the string, while the first throws an exception in that case. Another note should be taken about behavior when you pass a NULL argument to your newstrdup. Depending on your library that may be allowed or may be not allowed. So insert appropriate checks into the above functions if necessary. There is a function called strdup available in POSIX systems, but that one allows neither NULL arguments nor does it use the C++ operator new to allocate memory.
Anyway, i've looked with google codesearch for newstrdup functions and found quite a few. Maybe your library is among the results:
Google CodeSearch, newstrdup
there has to be a reason that they wrote a "new" version of strdup. So there must be a corner case that it handles differently. like perhaps a null string returns an empty string.
litb's answer is a replacement for strdup, but I would think there is a reason they did what they did.
If you want to use strdup directly, use a define to rename it, rather than write new code.
The line *string = newstrdup("Some string goes here"); is not showing any weirdness to newstrdup. If string has type char ** then newstrdup is just returning char * as expected. Presumably string was already set to point to a variable of type char * in which the result is to be placed. Otherwise the code is writing through an uninitialized pointer..
newstrdup is probably making a new string that is a duplicate of the passed string; it returns a pointer to the string (which is itself a pointier to the characters).
It looks like he's written a strdup() function to operate on an existing pointer, probably to re-allocate it to a new size and then fill its contents. Likely, he's doing this to re-use the same pointer in a loop where *string is going to change frequently while preventing a leak on every subsequent call to strdup().
I'd probably implement that like string = redup(&string, "new contents") .. but that's just me.
Edit:
Here's a snip of my 'redup' function which might be doing something similar to what you posted, just in a different way:
int redup(char **s1, const char *s2)
{
size_t len, size;
if (s2 == NULL)
return -1;
len = strlen(s2);
size = len + 1;
*s1 = realloc(*s1, size);
if (*s1 == NULL)
return -1;
memset(*s1, 0, size);
memcpy(*s1, s2, len);
return len;
}
Of course, I should probably save a copy of *s1 and restore it if realloc() fails, but I didn't need to get that paranoid.
I think you need to look at what is happening with the "string" variable within the code as the prototype for the newstrdup() function would appear to be identical to the library strdup() version.
Are there any free(*string) calls in the code?
It would appear to be a strange thing do to, unless it's internally keeping a copy of the duplicated string and returning a pointer back to the same string again.
Again, I would ask why?

Resources