Memcpy with a void pointer - c

I am having trouble with the memcpy function. I need a function that stores data into a void pointer. Here is a simple example of it:
void saveData(void* data)
{
char inputData[] = "some data";
memcpy((char*)data, inputData, sizeof(inputData));
}
However I get segmentation errors when I do this, even though it compiles just fine. My function argument has to be a void pointer because I may have different data formats to input and I may not know the size of the data ahead of time. Could somebody tell me what I am doing wrong?
Please and thank you.
***UPDATE:
Thanks all for the helpful responses. As most of you pointed out, I did not initialize void* data. That fixed it.
Now my question is: when I do dynamic allocation for the void* data (or even char* data), I give it a size, but when I do memcpy, it allows me to write an ever bigger string than I first assigned space for. I also tried just doing char* data, and the same thing happens. This is my sample code:
char inputData[] = "123456789";
void* data1 = malloc(5*sizeof(char));
char* data2 = (char*)malloc(5*sizeof(char));
memcpy(data1,inputData,sizeof(inputData));
memcpy(data2,inputData,sizeof(inputData));
When I print out the results, the entire string of inputData gets copied even though I only allocated enough space for 5 chars. Shouldn't that give me an error?

Your function parameter data needs to point to somewhere that there is memory available.
You could do something like this:
int main()
{
char myString[256];
saveData((void*)myString);
}
If you prefer to use malloc and free, then your code would be more like this:
int main()
{
char* myString = (char*)malloc(256);
saveData((void*)myString);
free(myString);
}
The trouble with both of these is that you don't know how long the string needs to be. If C++ is an option, it would be far safer and easier to use std::string.
std::string myString; // an empty string
myString += "some data"; // append a string

Note that when the source is passed as a pointer, its size is given by strlen, not by sizeof. Assuming the caller does not want to allocate memory, you need to malloc storage and have the void* store a handle to that. Given that the save function calls malloc, I'd set up a corresponding release function.
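#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
bool saveData(void** retSave, const char* input)
{
    bool success = false;
    size_t storageSize = strlen(input) + 1; // +1 for the terminating '\0'
    void* store = malloc(storageSize);
    if (store)
    {
        memcpy(store, input, storageSize);
        *retSave = store; // hand the allocated copy back to the caller
        success = true;
    }
    return success;
}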
void releaseSavedData(void* savedData)
{
free(savedData);
}
int main()
{
void* saveHandle = 0;
bool ok = saveData(&saveHandle, "some data");
if (ok)
{
releaseSavedData(saveHandle);
}
return 0;
}

data needs to point to somewhere you have write access to. You don't have access to the memory at the null pointer address (unless you're running DOS or something).
Use malloc to allocate a chunk of memory to copy from inputData to.
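A minimal sketch of that suggestion, assuming the caller knows how many bytes it is handing over. The name save_bytes is hypothetical and not from the question; the returned pointer has to be passed to free when it is no longer needed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of size bytes from src,
 * or NULL if size is zero or the allocation fails. The caller frees it. */
void *save_bytes(const void *src, size_t size)
{
    if (size == 0) {
        return NULL;
    }
    void *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, src, size);
    }
    return copy;
}
```

With the question's data this would be called as save_bytes(inputData, sizeof(inputData)), which also copies the terminating '\0' because inputData is an array.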

If you don't know the size of the data ahead of time, the only way to get this to work is to malloc some arbitrarily large buffer and make sure you NEVER copy more than that number of bytes into the void*. Not the best design, but it will work.
Also, don't forget to allocate an extra byte for the null character ('\0') at the end of the string, and don't forget to actually put the null character at the end of the buffer. If you copy only strlen bytes, your string in data won't have a terminator on the end.
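On the follow-up in the question: memcpy never checks the destination size, and writing past a malloc'd block is undefined behavior, so no error is guaranteed even though the code is wrong. A rough sketch of a checked copy follows; copy_checked is a hypothetical name, and it assumes the caller tracks the destination size itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src_size bytes into dst only if they fit.
 * memcpy itself never checks, so the check has to happen here. */
bool copy_checked(void *dst, size_t dst_size, const void *src, size_t src_size)
{
    if (dst == NULL || src == NULL || src_size > dst_size) {
        return false;   /* refuse instead of writing past the buffer */
    }
    memcpy(dst, src, src_size);
    return true;
}
```

With a 5-byte destination and the 10-byte inputData from the question, this returns false instead of overflowing.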

Related

Returning allocated buffer, vs buffer passed to a function

When passing values to my functions, I often consider returning an allocated buffer from my function rather than letting the function take a buffer as an argument. I was trying to figure out whether there is any significant benefit to passing a buffer to my function, e.g.:
void f(char **buff) {
/* operations */
strcpy(*buff, value);
}
Versus
char *f() {
char *buff = malloc(BUF_SIZE);
/* operations */
return buff;
}
These are obviously not super advanced examples, but I think the point stands. But yeah, are there any benefits to letting the user pass an allocated buffer, or is it better to return an allocated buffer?
Are there any benefits to using one over the other, or is it just useless?
This is a specific case of the more general question of whether a function should return data to its caller via its return value or via an out parameter. Both approaches work fine, and the pros and cons are mostly stylistic, not technical.
The main technical consideration is that each function has only one return value, but can have any number of out parameters. That can be worked around, but doing so might not be acceptable. For example, if you want to reserve your functions' return values for use as status codes such as many standard library functions produce, then that limits your options for sending back other data.
Some of the stylistic considerations are
using the return value is more aligned with the idiom of a mathematical function;
many people have trouble understanding pointers; and in particular,
non-local modifications effected through pointers sometimes confuse people. On the other hand,
the return value of a function can be used directly in an expression.
With respect to modifications to the question since this answer was initially posted, if the question is about whether to dynamically allocate and populate a new object vs populating an object presented by the caller, then there are these additional considerations:
allocating the object inside the function frees the caller from allocating it themselves, which is a convenience. On the other hand,
allocating the object inside the function prevents the caller from allocating it themselves (maybe automatically or statically), and does not provide for re-initializing an existing object. Also,
returning a pointer to an allocated object can obscure the fact that the caller has an obligation to free it.
Of course, you can have it both ways:
void init_thing(thing *t, char *name) {
    t->name = name;
}
thing *create_thing(char *name) {
    thing *t = malloc(sizeof(*t));
    if (t) {
        init_thing(t, name);
    }
    return t;
}
Both options work.
But in general, returning information through the parameters (the second option) is preferable because we usually reserve the return value of the function to report an error. We can also return several pieces of information through multiple parameters. Hence, it is easier for the caller to check whether the function succeeded by first checking the returned value. Most of the services from the C library or the Linux system calls work like this.
Concerning your examples, both options work because you are referencing a constant string which is globally allocated at program's loading time. So, in both solutions, you return the address of this string.
But if you do something like the following:
char *func(void) {
char buff[] = "example";
return buff;
}
You actually copy the content of the constant string "example" into the stack area of the function pointed by buff. In the caller the returned address is no longer valid as it refers to a stack location which can be reused by any other function called by the caller.
Let's compile a program using this function:
#include <stdio.h>
char *func(void) {
char buff[] = "example";
return buff;
}
int main(void) {
char *p = func();
printf("%s\n", p);
return 0;
}
If the relevant compiler warnings are enabled, we get a first red flag with a warning like this:
$ gcc -g bad.c -o bad
bad.c: In function 'func':
bad.c:5:11: warning: function returns address of local variable [-Wreturn-local-addr]
5 | return buff;
| ^~~~
The compiler points out the fact that func() is returning the address of a local space in its stack which is no longer valid when the function returns. This is the compiler option -Wreturn-local-addr which triggers this warning. Let's deactivate this option to remove the warning:
$ gcc -g bad.c -o bad -Wno-return-local-addr
So now we have a program compiled with zero warnings, but this is misleading, as the execution fails or may trigger unpredictable behavior:
$ ./bad
Segmentation fault (core dumped)
You can't return the address of local memory.
Your first example works because the memory in "example" will not be deallocated. But if you allocate local (aka automatic) memory, it will automatically be deallocated when the function returns; the returned pointer will be invalid.
char *func() {
char buff[10];
// Copy into local memory
strcpy(buff, "example");
// buff will be deallocated after returning.
// warning: function returns address of local variable
return buff;
}
You either return dynamic memory, using malloc, which the caller must then free.
char *func() {
    char *buff = malloc(10);
    strcpy(buff, "example");
    return buff;
}
int main() {
char *buf = func();
puts(buf);
free(buf);
}
Or you let the caller allocate the memory and pass it in.
void func(char *buff) {
    // Copy a string into the caller's memory.
    // The caller must provide at least 8 bytes.
    strcpy(buff, "example");
}
int main() {
    char buf[10];
    func(buf);
    puts(buf);
}
The upside is the caller has full control of the memory. They can reuse existing memory, and they can use local memory.
The downside is the caller must allocate the correct amount of memory. This might lead to allocating too much memory, and also too little.
An additional downside is the function has no control over the memory which has been passed in. It cannot grow nor shrink nor free the memory.
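One common way to soften the "correct amount of memory" problem is to pass the buffer size along and have the function report how much space the full result needs. This is only a sketch under that assumption; func_sized is a hypothetical name, and the behavior is modeled loosely on how snprintf reports lengths.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of func: the caller passes the buffer and its size.
 * Copies as much of "example" as fits, always terminates the result when
 * size > 0, and returns the length the full string needs (without '\0'). */
size_t func_sized(char *buff, size_t size)
{
    const char *text = "example";
    size_t needed = strlen(text);
    if (size > 0) {
        size_t n = needed < size ? needed : size - 1;
        memcpy(buff, text, n);
        buff[n] = '\0';
    }
    return needed;
}
```

A return value greater than or equal to the size that was passed in tells the caller the result was truncated and how large a buffer to retry with.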
You can only return one thing from a function.
For example, if you want to convert a string to an integer you could return the integer like atoi does. int atoi( const char *str ).
int num = atoi("42");
But then what happens when the conversion fails? atoi returns 0, but how do you tell the difference between atoi("0") and atoi("purple")?
You can instead pass in an int * for the converted value. int my_atoi( const char *str, int *ret ).
int num;
int err = my_atoi("42", &num);
if(err) {
    exit(1);
}
else {
    printf("%d\n", num);
}
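The answer only gives the prototype of my_atoi, so the body below is an assumption: one possible implementation built on strtol, returning 0 on success and a nonzero code otherwise. The specific error codes are made up for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Sketch of the my_atoi idea: returns 0 on success and stores the value in
 * *ret; returns nonzero if str is not a complete decimal int. */
int my_atoi(const char *str, int *ret)
{
    char *end;
    errno = 0;
    long value = strtol(str, &end, 10);
    if (end == str || *end != '\0') {
        return 1;   /* no digits at all, or trailing junk */
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return 2;   /* out of range for int */
    }
    *ret = (int)value;
    return 0;
}
```

With this shape, "0" succeeds and stores 0, while "purple" fails and leaves the output untouched, which is exactly the distinction atoi cannot make.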

Error trying to change contents of string pointer in C

I'm working on a program in C and one of my key functions is defined as follows:
void changeIndex(char* current_index)
{
char temp_index[41]; // note: same size as current_index
// do stuff with temp_index (inserting characters and such)
current_index = temp_index;
}
However, this function has no effect on current_index. I thought I found a fix and tried changing the last line to
strcpy(current_index, temp_index)
but this gave me yet another error. Can anyone spot what I'm doing wrong here? I basically just want to set the contents of current_index equal to that of temp_index at each call of changeIndex.
If more information is needed, please let me know.
strcpy should work if current_index points to allocated memory of sufficient size. Consider the following example, where changeIndex requires an additional parameter, the size of the destination string:
void changeIndex(char* current_index, int max_length)
{
// check the destination memory
if(current_index == NULL)
{
return; // do nothing
}
char temp_index[41];
// do stuff with temp_index (inserting characters and such)
// copy to external memory, that should be allocated
strncpy(current_index, temp_index, max_length-1);
current_index[max_length-1] = '\0';
}
Note: strncpy is better for the case when temp_index is longer than current_index.
Examples of usage:
// example with automatic memory
char str[20];
changeIndex(str, 20);
// example with dynamic memory
char * ptr = (char *) malloc(50);
changeIndex(ptr, 50);
Defining a local char array on the stack and returning a pointer to it is wrong: the array's lifetime ends when the function returns, so the pointer must never be used afterwards.
Besides the options in the previous answers (strncpy into a caller-supplied pointer, which seems unsafe in my opinion, and malloc, which is safer but requires you to remember to free the memory outside the function and is inconsistent with the hierarchy of the program), you can do the following:
char* changeIndex()
{
static char temp_index[41]; // note: same size as current_index
// do stuff with temp_index (inserting characters and such)
return temp_index;
}
As the char array is static it will not be undefined at the end of the function and you do not need to remember to free the pointer at the end of the use.
Caveat: If you are using multiple threads, you cannot use this option, as the static memory could be changed by different threads entering the function at the same time.
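A related limitation shows up even in single-threaded code: every call returns the same buffer, so a second call overwrites the result of the first. The sketch below uses a hypothetical index_label function in the same style to make that visible.

```c
#include <stdio.h>

/* Hypothetical function in the same style: formats into a static buffer
 * and returns a pointer to it. */
const char *index_label(int n)
{
    static char buffer[41];
    snprintf(buffer, sizeof buffer, "index-%d", n);
    return buffer;
}
```

If a caller keeps the pointer from index_label(1) and then calls index_label(2), both pointers refer to the same memory and both now read "index-2". Copy the result out if it has to survive the next call.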
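Your array temp_index is local to the function, so assigning it to current_index does not give the caller what you want: the assignment only changes the function's own copy of the pointer.
You can also use the function strdup. It returns a pointer to the start of a newly allocated copy of the string, or NULL if an error occurred. Its prototype is char *strdup(const char *), and the caller must free the result.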
char temp[] = "fruit";
char *line = strdup(temp );
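strdup comes from POSIX (it was only added to the C standard in C23), so it may be missing on some toolchains. As a sketch of what it does, here is a hypothetical stand-in called duplicate_string; the name is an assumption, and the result must be released with free just like a strdup result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup, for toolchains that lack it:
 * returns a malloc'd copy of s, or NULL if the allocation fails. */
char *duplicate_string(const char *s)
{
    size_t size = strlen(s) + 1;   /* include the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, s, size);
    }
    return copy;
}
```

The copy lives on the heap, so it stays valid after the function that created it returns, unlike a local array.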

Pointers and assignment in a sub-function

I have a small program that creates a semver struct with some variables in it:
typedef struct {
unsigned major;
unsigned minor;
unsigned patch;
char * note;
char * tag;
} semver;
Then, I would like to create a function which creates a semver struct and returns it to the caller. Basically, a Factory.
That factory would call an initialize function to set the default values of the semver struct:
void init_semver(semver * s) {
s->major = 0;
s->minor = 0;
s->patch = 0;
s->note = "alpha";
generate_semver(s->tag, s);
}
And on top of that, I would like a function to generate a string of the complete semver tag.
void generate_semver(char * tag, semver * s) {
sprintf( tag, "v%d.%d.%d-%s",
s->major, s->minor, s->patch, s->note);
}
My problem appears to lie in this function. I have tried returning a string, but have heard that mallocing some space is bad unless you explicitly free it later ;) In order to avoid this problem, I decided to try to pass a string to the function to have it be changed within the function with no return value. I'm trying to loosely follow something like DI practices, even though I'd really like to separate the concerns of these functions and have the generate_semver function return a string that I can use like so:
char * generate_semver(semver * s) {
char * full_semver;
sprintf( full_semver, "v%d.%d.%d-%s",
s->major, s->minor, s->patch, s->note);
return full_semver; // I know this won't work because it is defined in the local stack and not outside.
}
semver->tag = generate_semver(semver);
How can I do this?
My problem appears to lie in this function. I have tried returning a string, but have heard that mallocing some space is bad unless you explicitly free it later.
Explicitly freeing dynamically allocated memory is required to avoid memory leaks. However, it is not necessarily a task that the end users need to perform directly: an API often provides a function to deal with this.
In your case, you should provide a deinit_semver function that does the clean up of memory that init_semver has allocated dynamically. These two functions behave in a way that is similar to constructor and destructor; init_semver is not a factory function, because it expects the semver struct to be allocated, rather than allocating it internally.
Here is one way of doing it:
void init_semver(semver * s, int major, int minor, int patch, const char * note) {
    s->major = major;
    s->minor = minor;
    s->patch = patch;
    size_t len = strlen(note);
    s->note = malloc(len+1);   // copy of the note, owned by the struct
    strcpy(s->note, note);
    s->tag = malloc(40 + len); // room for "v", three numbers, separators and the note
    sprintf(s->tag, "v%d.%d.%d-%s", major, minor, patch, note);
}
void deinit_semver(semver *s) {
free(s->note);
free(s->tag);
}
Note the changes above: rather than using fixed values for the components of struct semver, this code takes the values as parameters. In addition, the code copies the note into a dynamically allocated buffer, rather than pointing to it directly.
The deinit function does the clean-up by free-ing both fields that were allocated dynamically.
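The 40 + len above is an estimate of the space the tag needs. If you would rather size the buffer exactly, one option is to ask snprintf for the length first. The sketch below assumes a hypothetical helper named make_tag and uses %u because the struct fields are unsigned.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: builds the tag in a buffer of exactly the right size.
 * Returns NULL on failure; the caller frees the result. */
char *make_tag(unsigned major, unsigned minor, unsigned patch, const char *note)
{
    /* With a NULL buffer and size 0, snprintf only reports the length needed. */
    int len = snprintf(NULL, 0, "v%u.%u.%u-%s", major, minor, patch, note);
    if (len < 0) {
        return NULL;
    }
    char *tag = malloc((size_t)len + 1);
    if (tag != NULL) {
        snprintf(tag, (size_t)len + 1, "v%u.%u.%u-%s", major, minor, patch, note);
    }
    return tag;
}
```

The result can be stored in s->tag and released by the same deinit function, since it is an ordinary malloc'd block.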
A char * on its own is just a pointer to memory. To accomplish what you want you will either need to instead use a fixed size field, i.e. char[33], or you can dynamically allocate the memory as needed.
As it is, your generate_semver function is attempting to print to an unknown address. Let's look at one solution.
typedef struct {
unsigned major;
unsigned minor;
unsigned patch;
char note[32];
char tag[32];
} semver;
Now, in your init_semver function, the line that was previously s->note = "alpha"; will become a string copy, as arrays are not assignable.
strncpy(s->note, "alpha", 31);
s->note[31] = '\0';
strncpy will copy a string from the second parameter to the first up to the number of bytes in the third parameter. The second line ensures that a trailing null terminator is in place.
Similarly, in the generate_semver function, it would directly work in the buffer:
void generate_semver(semver * s) {
snprintf( s->tag, 32, "v%d.%d.%d-%s",
s->major, s->minor, s->patch, s->note);
}
This will directly print to the array in the structure, with a maximum character limit. snprintf does append a trailing null terminator (unlike strncpy), so we don't need to worry about adding it ourselves.
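snprintf also tells you when the limit was hit: its return value is the length the full output would have had. A small sketch of checking that, using a hypothetical wrapper named format_tag (not part of the question's code):

```c
#include <stdio.h>

/* Hypothetical wrapper: formats the tag into a caller-supplied buffer and
 * reports whether it fit. Returns 0 on success, -1 on error or truncation. */
int format_tag(char *tag, size_t size,
               unsigned major, unsigned minor, unsigned patch, const char *note)
{
    int n = snprintf(tag, size, "v%u.%u.%u-%s", major, minor, patch, note);
    if (n < 0 || (size_t)n >= size) {
        return -1;   /* output was cut short (or formatting failed) */
    }
    return 0;
}
```

With the 32-byte tag field this would be called as format_tag(s->tag, sizeof s->tag, s->major, s->minor, s->patch, s->note), and a long note then shows up as an error rather than a silently shortened tag.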
You mention having to free allocated memory, and then say: "In order to avoid this problem". Well, it's not so much a problem, but rather a necessity of the C language. It's common to have functions that allocate memory, and require the caller to free it again.
The idiomatic way is to have a pair of "create" and "destroy" functions. So I'd suggest doing it like this:
// Your factory function
semver* create_semver() {
    semver* instance = malloc(sizeof(*instance));
    if (instance) {
        init_semver(instance); // will also allocate instance->tag and ->note
    }
    return instance;
}
// Your destruction function
void free_semver(semver* s) {
    free(s->tag);
    free(s->note);
    free(s);
}

Returning populated string array from function C

It's the first time posting so I apologise for any confusion:
I am writing a function like this:
int myFunc(char* inputStr, int *argCTemp, char** argVTemp[]);
The purpose of my function is to take a copy of the input string (basically any user input) and then use strtok to convert it to tokens and populate an array via an array pointer (argV). When myFunc is finished, hopefully I have the argument count and array of strings from my inputStr string.
Here is an example of how I call it:
int main(int argc, char** argv[])
{
int argCTemp = -1;
char** argVTemp;
// 1 Do Stuff
// 2 Get input string from user
// 3 then call myfunc like this:
myFunc(inputStr, &argCTemp, &argVTemp);
// 4: I get garbage whenever I try to use "argVTemp[i]"
}
My Questions: How should I best do this in a safe and consistent way. How do the pro's do this?
I don't use malloc because:
I don't know the number of arguments or the length of each for my input (to dynamically allocate the space). I figured that's why I use pointers
since I declare it in the main function, I thought the pointers to/memory used by argCTemp and argVTemp would be fine/remain in scope even if they are on the stack.
I know when myFunc exits it invalidates any stack references it created, so that's why I sent it pointers from a calling function. Should I be using pointers and malloc and such or what?
Last thing: before myfunc exits, I check to see the values of argCTemp and argVTemp and they have valid content. I am setting argCtemp and argVtemp like this:
(*argCTemp) = argCount;
(*argVTemp)[0] = "foo";
and it seems to be working just fine BEFORE the function exits. Since I'm setting pointers somewhere else in memory, I'm confused why the reference is failing. I tried using malloc INSIDE myFunc when setting the pointers and it is still becoming garbage when myFunc ends and is read by the calling function.
I'm sorry if any of this is confusing and thank you in advance for any help.
Since you "don't know the number of arguments or the length of each" for your input, you can still use malloc. When your buffer is about to fill up, realloc it.
A better way: you don't need to store the whole input. Storing a line, a token, or a block at a time is better; just use a static array to hold them. A hash may be better if your input is more than 100 MB.
I'm sorry for my poor English.
You pass an uninitialized pointer to the function (your call isn't quite right either; you don't need the &). This pointer points to some random place, which is why you get garbage; you could also get a segmentation fault.
You can do one of two things.
Preallocate a large enough array, which can be static, for example
static char * arr[MAX_SIZE], and pass it as (char **)&arr in the function call, or make two passes and use malloc.
You should also pass the maximum size, or use a constant, and make sure you don't exceed it.
Let's say you have the number of tokens in int n; then
char ** arr = malloc(sizeof(char *)*n);
will create an array of pointers. Now you pass it to your populate function by calling
it with &arr, and use it like you did in your code,
for example (*argVTemp)[0] = ;.
(When the array is not needed any more, don't forget to free it by calling free(arr).)
Generally speaking, since you don't know how many tokens will be in the result you'll need to allocate the array dynamically using malloc(), realloc() and/or some equivalent. Alternatively you can have the caller pass in array along with the array's size and return an error indication if the array isn't large enough (I do this for simple command parsers on embedded systems where dynamic allocation isn't appropriate).
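The caller-supplied-array alternative might look roughly like the sketch below. The name tokenize_fixed and the choice of -1 as the error indication are assumptions for illustration, not taken from the question.

```c
#include <string.h>

/* Hypothetical fixed-capacity variant: stores up to max_args - 1 token
 * pointers plus a NULL sentinel in the caller's array. Returns the token
 * count, or -1 if the array is too small. Modifies input, as strtok does. */
int tokenize_fixed(char *input, char *args[], int max_args)
{
    int count = 0;
    if (max_args < 1) {
        return -1;
    }
    for (char *tok = strtok(input, " "); tok != NULL; tok = strtok(NULL, " ")) {
        if (count >= max_args - 1) {
            args[count] = NULL;
            return -1;   /* out of room: report instead of overflowing */
        }
        args[count++] = tok;
    }
    args[count] = NULL;
    return count;
}
```

No allocation happens here, so there is nothing to free; the token pointers stay valid for as long as the input buffer does.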
Here's an example that allocates the returned array in small increments:
static
char** myFunc_realloc( char** arr, size_t* elements)
{
enum {
allocation_chunk = 16
};
*elements += allocation_chunk;
char** tmp = (char**) realloc( arr, (*elements) * sizeof(char*));
if (!tmp) {
abort(); // or whatever error handling
}
return tmp;
}
void myFunc_free( char** argv)
{
free(argv);
}
int myFunc(char* inputStr, int *argCTemp, char** argVTemp[])
{
size_t argv_elements = 0;
size_t argv_used = 0;
char** argv_arr = NULL;
char* token = strtok( inputStr, " ");
while (token) {
if ((argv_used+1) >= argv_elements) {
// we need to realloc - the +1 is because we want an extra
// element for the NULL sentinel
argv_arr = myFunc_realloc( argv_arr, &argv_elements);
}
argv_arr[argv_used] = token;
++argv_used;
token = strtok( NULL, " ");
}
if ((argv_used+1) >= argv_elements) {
argv_arr = myFunc_realloc( argv_arr, &argv_elements);
}
argv_arr[argv_used] = NULL;
*argCTemp = argv_used;
*argVTemp = argv_arr;
return argv_used;
}
Some notes:
if an allocation fails, the program is terminated. You may need different error handling.
the passed in input string is 'corrupted'. This might not be an appropriate interface for your function (in general, I'd prefer that a function like this not destroy the input data).
the user of the function should call myFunc_free() to deallocate the returned array. Currently this is a simple wrapper for free(), but this gives you flexibility to do more sophisticated things (like allocating memory for the tokens so you don't have to destroy the input string).
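To illustrate the last two notes, here is one possible shape for a version that leaves the input alone by tokenizing a private copy. Everything in it (the token_list struct, token_list_split, token_list_free) is hypothetical; it is a sketch of the idea, not a drop-in replacement for myFunc.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: owns a private copy of the input plus the token array. */
typedef struct {
    char *storage;   /* writable copy of the input that strtok cuts up */
    char **items;    /* pointers into storage, NULL-terminated */
    size_t count;    /* number of tokens, not counting the NULL sentinel */
} token_list;

/* Returns 0 on success, -1 on allocation failure. The input is left untouched. */
int token_list_split(token_list *out, const char *input)
{
    size_t len = strlen(input);
    size_t capacity = len / 2 + 2;   /* upper bound on tokens, plus the sentinel */

    out->storage = malloc(len + 1);
    out->items = malloc(capacity * sizeof *out->items);
    out->count = 0;
    if (out->storage == NULL || out->items == NULL) {
        free(out->storage);
        free(out->items);
        out->storage = NULL;
        out->items = NULL;
        return -1;
    }
    memcpy(out->storage, input, len + 1);
    for (char *tok = strtok(out->storage, " "); tok; tok = strtok(NULL, " ")) {
        out->items[out->count++] = tok;
    }
    out->items[out->count] = NULL;
    return 0;
}

/* Releases everything token_list_split allocated. */
void token_list_free(token_list *list)
{
    free(list->items);
    free(list->storage);
    list->items = NULL;
    list->storage = NULL;
    list->count = 0;
}
```

Because the tokens point into storage, the dedicated free function matters here: calling plain free on items alone would leak the copied string.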

How do I return an array of strings from a recursive function?

How do I return an array of strings from a recursive function?
For example::
char ** jumble( char *jumbStr)//reccurring function
{
char *finalJumble[100];
...code goes here...call jumble again..code goes here
return finalJumble;
}
Thanks in advance.
In C, you cannot return a string from a function. You can only return a pointer to a string. Therefore, you have to pass the string you want returned as a parameter to the function (DO NOT use global variables, or function local static variables) as follows:
char *func(char *string, size_t stringSize) {
/* Fill the string as wanted */
return string;
}
If you want to return an array of strings, this is even more complex, above all if the size of the array varies. The best IMHO could be to return all the strings in the same string, concatenating the strings in the string buffer, and an empty string as marker for the last string.
char *string = "foo\0bar\0foobar\0";
Your current implementation is not correct as it returns a pointer to variables that are defined in the local function scope.
(If you really do C++, then return an std::vector<std::string>.)
Your implementation is not correct since you are returning a pointer to a local variable that will go out of scope as soon as the function returns; you are then left with a dangling pointer and eventually a crash.
If you still want to continue this approach, then pass by reference (&) an array of characters to that function and stop recursing once you have reached the desired end point. Once you are finished, you should have the 'jumbled' characters you need.
You don't :-)
Seriously, your code will create a copy of the finalJumble array on every iteration and you don't want that I believe. And as noted elsewhere finalJumble will go out of scope ... it will sometimes work but other times that memory will be reclaimed and the application will crash.
So you'd generate the jumble array outside the jumble method:
void jumble_client( char *jumbStr)
{
    char *finalJumble[100];
    jumble(finalJumble, jumbStr);
    ... use finalJumble ...
}
void jumble( char **jumble, char *jumbStr)
{
...code goes here...call jumble again..code goes here
}
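The question does not say what jumble actually computes, so the sketch below assumes it means "all permutations of the string". It shows the pattern of a recursive function filling an array owned by the caller; the names collect_permutations and swap_chars are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Swap two characters in place. */
static void swap_chars(char *a, char *b)
{
    char t = *a;
    *a = *b;
    *b = t;
}

/*
 * Hypothetical recursive helper: writes every permutation of str[pos..] into
 * out[], starting at index count, and returns the new count. Each stored
 * string is a malloc'd copy that the caller must free. Stops storing once
 * capacity is reached.
 */
size_t collect_permutations(char *str, size_t pos, char **out,
                            size_t capacity, size_t count)
{
    size_t len = strlen(str);
    if (pos + 1 >= len) {   /* base case: nothing left to rearrange */
        if (count < capacity) {
            char *copy = malloc(len + 1);
            if (copy != NULL) {
                memcpy(copy, str, len + 1);
                out[count++] = copy;
            }
        }
        return count;
    }
    for (size_t i = pos; i < len; ++i) {
        swap_chars(&str[pos], &str[i]);
        count = collect_permutations(str, pos + 1, out, capacity, count);
        swap_chars(&str[pos], &str[i]);   /* restore before the next choice */
    }
    return count;
}
```

The running count is threaded through the recursion as a parameter and return value, so no global or static state is needed, and the array itself never has to be returned.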
And of course you'd use the stl datatypes instead of char arrays and you might want to examine whether it might be sensible to write a jumble class that has the finalJumble data as a member. But all that is a little further down the road. Nevertheless once you got the original problem solved try to find out how to do that to learn more.
I would pass a vector of strings as a parameter, by reference. You can always use the return value for error checking.
typedef std::vector<std::string> TJumbleVector;
int jumble(char* jumbStr, TJumbleVector& finalJumble) //recurring function
{
int err = 0; // error checking
...code goes here...call jumble again..code goes here
// finalJumble.push_back(aGivenString);
return err;
}
If you want to do it in C, you can keep track of the number of strings, do a malloc at the last recursive call, and fill the array after each recursive call. You should keep in mind that the caller should free the allocated memory. Another option is that the caller does a first call to see how much space he needs for the array, then does the malloc, and the call to jumble:
char** recursiveJumble(char* jumbStr, unsigned int numberOfElements);
char** jumble(char* jumbStr)
{
    return recursiveJumble(jumbStr, 0);
}
char** recursiveJumble(char* jumbStr, unsigned int numberOfElements)
{
    char** ret = NULL;
    if (/*baseCase*/)
    {
        // deepest call: the total number of strings is now known
        ret = (char**) malloc(numberOfElements * sizeof(char*));
    }
    else
    {
        ret = recursiveJumble(/*restOfJumbStr*/, numberOfElements+1);
        if (ret != NULL)
        {
            ret[numberOfElements] = /*aGivenString*/;
        }
    }
    return ret;
}

Resources