In this toy code example:
int MAX = 5;
void fillArray(int** someArray, int* blah) {
int i;
for (i=0; i<MAX; i++)
(*someArray)[i] = blah[i]; // segfault happens here
}
int main() {
int someArray[MAX];
int blah[] = {1, 2, 3, 4, 5};
fillArray(&someArray, blah);
return 0;
}
... I want to fill the array someArray, and have the changes persist outside the function.
This is part of a very large homework assignment, and this question addresses the issue without allowing me to copy the solution. I am given a function signature that accepts an int** as a parameter, and I'm supposed to code the logic to fill that array. I was under the impression that dereferencing &someArray within the fillArray() function would give me the required array (a pointer to the first element), and that using bracketed array element access on that array would give me the necessary position that needs to be assigned. However, I cannot figure out why I'm getting a segfault.
Many thanks!
I want to fill the array someArray, and have the changes persist outside the function.
Just pass the array to the function as it decays to a pointer to the first element:
void fillArray(int* someArray, int* blah) {
int i;
for (i=0; i<MAX; i++)
someArray[i] = blah[i];
}
and invoked:
fillArray(someArray, blah);
The changes to the elements will be visible outside of the function.
If the actual code was to allocate an array within fillArray() then an int** would be required:
void fillArray(int** someArray, int* blah) {
int i;
*someArray = malloc(sizeof(int) * MAX);
if (*someArray)
{
for (i=0; i<MAX; i++) /* or memcpy() instead of loop */
(*someArray)[i] = blah[i];
}
}
and invoked:
int* someArray = NULL;
fillArray(&someArray, blah);
free(someArray);
When you create an array, such as int myArray[10][20], a guaranteed contiguous block of memory is allocated from the stack, and normal array arithmetic is used to find any given element in the array.
If you want to allocate that 2D "array" from the heap, you use malloc() and get some memory back. That memory is "dumb". It's just a chunk of memory, which should be thought of as a vector. None of the navigational logic that comes with an array is attached to it, which means you must find another way to navigate your desired 2D array.
Since your call to malloc() returns a pointer, the first variable you need is a pointer to hold the vector of int*s that will in turn point to the actual integer data, i.e.:
int *pArray;
...but this still isn't the storage you want for your integers. What you need first is an array of pointers, which initially point to nothing. To get storage for your data, you need to call malloc() 10 times, with each call allocating space for 20 integers; the returned pointers are stored in the pArray vector of pointers. This means that
int *pArray
needs to be changed to
int **pArray
to correctly indicate that it is a pointer to the base of a vector of pointers.
The first dereference, pArray[i], lands you somewhere in an array of int pointers, and the second dereference, pArray[i][j], lands you somewhere inside an array of ints, pointed to by the int pointer in pArray[i].
In other words, you have a cloud of integer vectors scattered all over the heap, pointed to by an array of pointers keeping track of their locations. That is not at all similar to Array[10][20] allocated on the stack, which is all contiguous storage and doesn't have a single pointer in it anywhere.
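The layout described above can be sketched as follows. This is a minimal illustration, not code from the answer: the names alloc_table and free_table are hypothetical, and calloc() is used so that every element starts at zero.

```c
#include <stdlib.h>

/* Allocate a rows x cols table as a vector of row pointers.
   Returns NULL if any allocation fails; nothing is leaked in that case. */
int **alloc_table(size_t rows, size_t cols)
{
    int **table = calloc(rows, sizeof *table);    /* the vector of int* */
    if (table == NULL)
        return NULL;

    for (size_t i = 0; i < rows; i++) {
        table[i] = calloc(cols, sizeof **table);  /* one zeroed row of ints */
        if (table[i] == NULL) {
            while (i-- > 0)                       /* undo the rows allocated so far */
                free(table[i]);
            free(table);
            return NULL;
        }
    }
    return table;
}

/* Release every row, then the vector of row pointers itself. */
void free_table(int **table, size_t rows)
{
    if (table == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(table[i]);
    free(table);
}
```

A caller would write something like int **t = alloc_table(10, 20); and then use t[i][j] exactly as with a real 2D array, even though the rows are separate heap blocks.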
As others have alluded to, the pointer-based heap method doesn't seem to have a lot going for it at first glance, but it turns out to be massively superior.
First and foremost, you can free() or realloc() heap memory whenever you want, and it doesn't go out of scope when the function returns. More importantly, experienced C coders arrange their functions to operate on vectors where possible, so that one level of indirection is removed in the function call. Finally, for arrays that are large relative to available memory, and especially on large, shared machines, big chunks of contiguous memory are often not available, and requesting them is not friendly to other programs that need memory to operate. Code with large fixed-size arrays allocated on the stack is a maintenance nightmare.
Here you can see that the table is just a shell collecting vector pointers returned from vector operations, where everything interesting happens at the vector level or element level. In this particular case, the vector code in VecRand() is calloc()ing its own storage and returning calloc()'s return pointer to TblRand(), but TblRand() has the flexibility to allocate VecRand()'s storage as well, just by replacing the NULL argument to VecRand() with a call to calloc().
/*-------------------------------------------------------------------------------------*/
dbl **TblRand(dbl **TblPtr, int rows, int cols)
{
int i=0;
if ( NULL == TblPtr ){
if (NULL == (TblPtr=(dbl **)calloc(rows, sizeof(dbl*)))){
printf("\nCalloc for pointer array in TblRand failed");
return NULL; /* do not fall through and index a null pointer */
}
}
for (; i!=rows; i++){
TblPtr[i] = VecRand(NULL, cols);
}
return TblPtr;
}
/*-------------------------------------------------------------------------------------*/
dbl *VecRand(dbl *VecPtr, int cols)
{
if ( NULL == VecPtr ){
if (NULL == (VecPtr=(dbl *)calloc(cols, sizeof(dbl)))){
printf("\nCalloc for random number vector in VecRand failed");
return NULL; /* GenRand must not write through a null pointer */
}
}
Randx = GenRand(VecPtr, cols, Randx);
return VecPtr;
}
/*--------------------------------------------------------------------------------------*/
static long GenRand(dbl *VecPtr, int cols, long RandSeed)
{
dbl r=0, Denom=2147483647.0;
while ( cols-- )
{
RandSeed= (314159269 * RandSeed) & 0x7FFFFFFF;
r = sqrt(-2.0 * log((dbl)(RandSeed/Denom)));
RandSeed= (314159269 * RandSeed) & 0x7FFFFFFF;
*VecPtr = r * sin(TWOPI * (dbl)(RandSeed/Denom));
VecPtr++;
}
return RandSeed;
}
There is no "array/pointer" equivalence, and arrays and pointers are very different. Never confuse them. someArray is an array. &someArray is a pointer to an array, and has type int (*)[MAX]. The function takes a pointer to a pointer, i.e. int **, which needs to point to a pointer variable somewhere in memory. There is no pointer variable anywhere in your code. What could it possibly point to?
An array value can implicitly decay into a pointer rvalue for its first element in certain expressions. Something that requires an lvalue like taking the address (&) obviously does not work this way. Here are some differences between array types and pointer types:
Array types cannot be assigned or passed. Pointer types can
Pointer to array and pointer to pointer are different types
Array of arrays and array of pointers are different types
The sizeof of an array type is the length times the size of the component type; the sizeof of a pointer is just the size of a pointer
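Since &someArray is not an int**, one way to keep the int** signature from the question is to create an actual pointer variable and pass its address. The sketch below assumes MAX is a compile-time constant (an enum here, which differs from the question's int MAX), and fill_and_sum is a hypothetical helper added only to make the example checkable.

```c
#include <stddef.h>

enum { MAX = 5 };

/* Same shape as the signature in the question: the callee reaches the
   data through a pointer to a pointer. */
void fillArray(int **someArray, int *blah)
{
    for (size_t i = 0; i < MAX; i++)
        (*someArray)[i] = blah[i];
}

/* Hypothetical helper: fills a local array through fillArray and
   returns the sum of its elements. */
int fill_and_sum(void)
{
    int someArray[MAX];
    int blah[] = {1, 2, 3, 4, 5};

    int *p = someArray;   /* the array decays to int*; p is a real pointer object */
    fillArray(&p, blah);  /* &p has type int**, unlike &someArray (int (*)[MAX]) */

    int sum = 0;
    for (size_t i = 0; i < MAX; i++)
        sum += someArray[i];
    return sum;
}
```

Because p points at someArray's first element, the writes made through (*someArray)[i] land in the caller's array and persist after fillArray() returns.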
I have a function that I pass an array into and an int into from my main function. I am doing operations to the array inside this new function, let's call it foo. In foo, I initialize another array with 52 cells all with 0. I do operations on the array that I passed from main, and transfer that data to the newly initialized array. I want to return the new array back to the main function. But of course, I can't return data structures like arrays. So I instead return an int pointer that points to this array. Inside the int main, I pass the pointer to have it point to various cells in the array. When I print the results of what the pointer is pointing to, it should either be pointing to 0 or an integer greater than 0. But instead, I get inconsistent results. For some reason, some of the values that SHOULD be 0, prints out garbage data. I've been trying to spot the bug for some time, but I just wanted a second hand look at it. Here is just the GENERAL idea for the code for this portion anyways...
int main(){
int *retPtr;
char input[] = "abaecedg";
retPtr = foo(input, size);
for(i=0; i<52; i++){
// error displayed here
printf("%d\n", *(retPtr + i));
}
}
int foo(char input[], int size)
{
int arr[52] = {0}; // should initialize all 52 cells with 0.
int i=0, value; // looking for non-zero results in the end.
int *ptr = &arr[0];
for(i=0; i<size; i++){
if(arr[i] > 64 && arr[i] < 91){
value = input[i] - 65;
arr[value]++;
}
}
return ptr;
}
Hopefully this makes sense of what I'm trying to do. In the foo function, I am trying to find the frequency of certain alphabets. I know this might be a bit cryptic, but the code is quite long with comments and everything so I wanted to make it as succinct as possible. Is there any possible reason why I'm getting correct values for some (numbers > 0, 0) and garbage values in the other?
The reason you get garbage back is that the array created in foo is allocated in foo's stack frame, and you then return a pointer into that frame. That frame is discarded when foo returns.
You should allocate the array on the heap (using malloc and friends) if you want it to remain after foo returns. Don't forget to free() it when you're done with the array.
int main(){
char input[] = "abaecedg";
int *retPtr = foo(input, size); // foo returns a pointer to heap memory; an array cannot be initialized from it
...
free(retPtr);
}
int *foo(char input[], int size)
{
int *arr = calloc(52, sizeof(int)); // zero-initializes all 52 cells; check for NULL before use
...
arr[value]++;
...
return arr;
}
Another way is to let foo take an array as a parameter and work with that, in this way:
int main(){
int ret[52] = {0};
...
foo(input, size, ret);
...
}
void foo(char input[], int size, int *arr)
{
...
arr[value]++;
...
return; //Don't return anything, you have changed the array in-place
}
The reason this works is that an array passed to a function decays to a pointer to its first element, so foo receives the address of ret's storage rather than a copy of the array. arr will be pointing to the same place as ret, into the stack frame of main.
In function foo the array arr is a local array, that is, allocated on the stack. You must not return a pointer to data allocated on the stack, since the stack is unwound after you return from the function, and its content is no longer guaranteed.
If you want to return an array you should allocate it on the heap using malloc, for example, and return the pointer malloc returned. But you will then have to free that memory somewhere in your program. If you fail to free it you will have what's called a "memory leak", which may or may not crash or disturb this program when it runs again, depending on your environment. Not a clean situation, that's for sure.
That's why I consider C not so good for functional programming idioms, such as returning things from a function (unless they are primitive types). I would achieve what you tried to do by passing another array to foo, an output array accompanied by a size variable, and filling that array.
Alternately, you could wrap the array within a struct and return that struct. Structs can be returned by value, in which case they are copied via the stack to the caller function's returned value.
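A minimal sketch of the struct approach, under some assumptions: the struct letter_counts and the function count_uppercase are hypothetical names, only the uppercase range tested in the question (codes 65 to 90) is counted, and the character set is assumed to have contiguous letters, as ASCII does.

```c
#include <stddef.h>

enum { LETTERS = 26 };

/* Hypothetical wrapper: a struct that contains an array can be
   returned by value, unlike a bare array. */
struct letter_counts {
    int count[LETTERS];
};

/* Count how often each uppercase letter occurs in the first `size`
   characters of `input`. */
struct letter_counts count_uppercase(const char *input, size_t size)
{
    struct letter_counts result = { {0} };  /* all counters start at zero */

    for (size_t i = 0; i < size; i++) {
        if (input[i] >= 'A' && input[i] <= 'Z')
            result.count[input[i] - 'A']++;
    }
    return result;                          /* the whole struct is copied to the caller */
}
```

The caller writes struct letter_counts c = count_uppercase(text, len); and owns an independent copy, so there is nothing to free and no dangling pointer. The cost is copying the whole array on return, which is negligible for 26 ints.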
I am supposed to follow the following criteria:
Implement function answer4 (pointer parameter and n):
Prepare an array of student_record using malloc() of n items.
Duplicate the student record from the parameter to the array n
times.
Return the array.
And I came with the code below, but it's obviously not correct. What's the correct way to implement this?
student_record *answer4(student_record* p, unsigned int n)
{
int i;
student_record* q = malloc(sizeof(student_record)*n);
for(i = 0; i < n ; i++){
q[i] = p[i];
}
free(q);
return q;
};
p = malloc(sizeof(student_record)*n);
This is problematic: you're overwriting the p input argument, so you can't reference the data you were handed after that line.
Which means that your inner loop reads uninitialized data.
This:
return a;
is problematic too - it would return a pointer to a local variable, and that's not good - that pointer becomes invalid as soon as the function returns.
What you need is something like:
student_record* ret = malloc(...);
for (int i=...) {
// copy p[i] to ret[i]
}
return ret;
1) You reassigned p, the array you were supposed to copy, by calling malloc().
2) You can't return the address of a local stack variable (a). Change a to a pointer, malloc it to the size of p, and copy p into it. Malloc'd memory is heap memory, and so you can return such an address.
a[] is a local automatic array. Once you return from the function, it is erased from memory, so the calling function can't use the array you returned.
What you probably wanted to do is to malloc a new array (ie, not p), into which you should assign the duplicates and return its values w/o freeing the malloced memory.
Try to use better names, it might help in avoiding the obvious mix-up errors you have in your code.
For instance, start the function with:
student_record * answer4(const student_record *template, size_t n)
{
...
}
It also makes the code clearer. Note that I added const to make it clearer that the first argument is input-only, and made the type of the second one size_t which is good when dealing with "counts" and sizes of things.
The code in this question is evolving quite quickly but at the time of this answer it contains these two lines:
free(q);
return q;
This is guaranteed to be wrong - after the call to free its argument points to invalid memory and anything could happen subsequently upon using the value of q. i.e. you're returning an invalid pointer. Since you're returning q, don't free it yet! It becomes a "caller-owned" variable and it becomes the caller's responsibility to free it.
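The ownership rule can be sketched as below. The layout of student_record is not shown in the question, so the struct here is purely a stand-in, and the second parameter is written as size_t rather than the question's unsigned int.

```c
#include <stdlib.h>

/* Hypothetical record layout; the real one is not shown in the question. */
typedef struct {
    int  id;
    char name[32];
} student_record;

/* Return a heap array holding n copies of *p, or NULL if malloc fails.
   The array is NOT freed here: ownership passes to the caller. */
student_record *answer4(const student_record *p, size_t n)
{
    student_record *copies = malloc(n * sizeof *copies);
    if (copies == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        copies[i] = *p;   /* struct assignment copies the whole record */

    return copies;
}
```

The caller then does student_record *a = answer4(&rec, n); uses a[0] through a[n-1], and calls free(a) once it is done with the array.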
student_record* answer4(student_record* p, unsigned int n)
{
uint8_t *data, *pos;
size_t size = sizeof(student_record);
data = malloc(size*n);
if (data == NULL)
return NULL;
pos = data;
for(unsigned int i = 0; i < n ; i++, pos=&pos[size])
memcpy(pos,p,size);
return (student_record *)data;
}
You could do it like this.
This compiles and, I think, does what you want:
student_record *answer4(const student_record *const p, const unsigned int n)
{
unsigned int i;
student_record *const a = malloc(sizeof(student_record)*n);
for(i = 0; i < n; ++i)
{
a[i] = p[i];
}
return a;
}
Several points:
The existing array is identified as p. You want to copy from it. You probably do not want to free it (to free it is probably the caller's job).
The new array is a. You want to copy to it. The function cannot free it, because the caller will need it. Therefore, the caller must take the responsibility to free it, once the caller has done with it.
The array has n elements, indexed 0 through n-1. The usual way to express the upper bound on the index thus is i < n.
The consts I have added are not required, but well-written code will probably include them.
Although there are already GOOD answers to this question, I couldn't resist adding my own. Since I learned Pascal programming in college, I am used to doing this in C-related programming languages:
void* AnyFunction(int AnyParameter)
{
void* Result = NULL;
DoSomethingWith(Result);
return Result;
}
This helps me debug easily and avoid bugs like the one mentioned by #ysap, related to pointers.
Something important to remember is that the question asks to return a SINGLE pointer. This is a common caveat, because a pointer can be used to address a single item or a consecutive array!
This question suggests using an array as A CONCEPT, with pointers, NOT USING ARRAY SYNTAX.
// returns a single pointer to an array:
student_record* answer4(student_record* student, unsigned int n)
{
// empty result variable for this function:
student_record* Result = NULL;
// the result will allocate a conceptual array, even if it is a single pointer:
Result = malloc(sizeof(student_record)*n);
// a copy of the destination result, will move for each item
student_record* dest = Result;
int i;
for(i = 0; i < n ; i++){
// copy contents, not address:
*dest = *student;
// move to next item of "Result"
dest++;
}
// the data referenced by "Result", was changed using "dest"
return Result;
} // student_record* answer4(...)
Note that there is no subscript operator here, because the elements are addressed through pointers.
Please don't start a Pascal vs. C flame war; this is just a suggestion.
I have a recursive struct which is:
typedef struct dict dict;
struct dict {
dict *children[M];
list *words[M];
};
Initialized this way:
dict *d = malloc(sizeof(dict));
bzero(d, sizeof(dict));
I would like to know what bzero() exactly does here, and how can I malloc() recursively for children.
Edit: This is how I would like to be able to malloc() the children and words:
void dict_insert(dict *d, char *signature, unsigned int current_letter, char *w) {
int occur;
occur = (int) signature[current_letter];
if (current_letter == LAST_LETTER) {
printf("word found : %s!\n",w);
list_print(d->words[occur]);
char *new;
new = malloc(strlen(w) + 1);
strcpy(new, w);
list_append(d->words[occur],new);
list_print(d->words[occur]);
}
else {
d = d->children[occur];
dict_insert(d,signature,current_letter+1,w);
}
}
bzero(3) initializes the memory to zero. It's equivalent to calling memset(3) with a second parameter of 0. In this case, it initializes all of the member variables to null pointers. bzero is considered deprecated, so you should replace uses of it with memset; alternatively, you can just call calloc(3) instead of malloc, which automatically zeroes out the returned memory for you upon success.
You should not use either of the two casts you have written: in C, a void* pointer can be implicitly converted to any other object pointer type, and any object pointer type can be implicitly converted to void*. malloc returns a void*, so you can just assign it to your dict *d variable without a cast. Similarly, the first parameter of bzero is a void*, so you can just pass it your d variable directly without a cast.
To understand recursion, you must first understand recursion. Make sure you have an appropriate base case if you want to avoid allocating memory infinitely.
In general, when you are unsure what the compiler is generating for you, it is a good idea to use a printf to report the size of the struct. In this case, the size of dict should be 2 * M * the size of a pointer. In this case, bzero will fill a dict with zeros. In other words, all M elements of the children and words arrays will be zero.
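A small sketch of that check, with assumptions spelled out: M is not given in the question, so 26 is a guess; list is left as an opaque type; and dict_new and dict_print_size are hypothetical helper names. calloc() stands in for malloc() followed by bzero().

```c
#include <stdio.h>
#include <stdlib.h>

enum { M = 26 };            /* assumed; the question does not give M */

typedef struct list list;   /* opaque here; only pointers to it are stored */
typedef struct dict dict;
struct dict {
    dict *children[M];
    list *words[M];
};

/* Allocate one node with every byte zeroed. On mainstream platforms
   that makes all children[] and words[] entries null pointers. */
dict *dict_new(void)
{
    return calloc(1, sizeof(dict));
}

/* Report the size the compiler actually uses for the struct. */
void dict_print_size(void)
{
    printf("sizeof(dict) = %zu bytes (%d pointers)\n", sizeof(dict), 2 * M);
}
```

On a typical 64-bit system this would report 2 * 26 * 8 bytes, but the exact figure depends on the platform's pointer size.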
To initialize the structure, I recommend creating a function that takes a pointer to a dict and mallocs each child and then calls itself to initialize it:
void init_dict(dict* d)
{
int i;
for (i = 0; i < M; i++)
{
d->children[i] = malloc(sizeof(dict));
init_dict(d->children[i]);
/* initialize the words elements, too */
}
}
+1 to you if you can see why this code won't work as is. (Hint: it has an infinite recursion bug and needs a rule that tells it how deep the children tree needs to be so it can stop recursing.)
bzero just zeros the memory. bzero(addr, size) is essentially equivalent to memset(addr, 0, size). As to why you'd use it, from what I've seen around half the time it's used, it's just because somebody thought zeroing the memory seemed like a good idea, even though it didn't really accomplish anything. In this case, it looks like the effect would be to set some pointers to NULL (though it's not entirely portable for that purpose).
To allocate recursively, you'd basically just keep track of the current depth, and allocate child nodes until you reached the desired depth. Code along these lines would do the job:
void alloc_tree(dict **root, size_t depth) {
int i;
if (depth == 0) {
(*root) = NULL;
return;
}
(*root) = malloc(sizeof(**root));
if (*root == NULL)
return; /* allocation failed; leave this subtree empty */
for (i=0; i<M; i++)
alloc_tree((*root)->children+i, depth-1);
}
I should add that I can't quite imagine doing recursive allocation like this though. In a typical case, you insert data, and allocate new nodes as needed to hold the data. The exact details of that will vary depending on whether (and if so how) you're keeping the tree balanced. For a multi-way tree like this, it's fairly common to use some B-tree variant, in which case the code I've given above won't normally apply at all -- with a B-tree, you fill a node, and when it's reached its limit, you split it in half and promote the middle item to the parent node. You allocate a new node when this reaches the top of the tree, and the root node is already full.
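The "allocate new nodes as needed" idea can be sketched as follows. This is an illustration only: dict_child and dict_free are hypothetical helpers, M is assumed to be 26, and the word lists are treated as opaque and not freed here.

```c
#include <stdlib.h>

enum { M = 26 };            /* assumed; the question does not give M */

typedef struct list list;   /* opaque in this sketch */
typedef struct dict dict;
struct dict {
    dict *children[M];
    list *words[M];
};

/* Return the child at `index`, creating it on first use.
   Returns NULL for a bad index or if the allocation fails. */
dict *dict_child(dict *d, size_t index)
{
    if (d == NULL || index >= (size_t)M)
        return NULL;
    if (d->children[index] == NULL)
        d->children[index] = calloc(1, sizeof(dict));  /* zeroed node */
    return d->children[index];
}

/* Free a node and every node below it (word lists are not handled). */
void dict_free(dict *d)
{
    if (d == NULL)
        return;
    for (size_t i = 0; i < (size_t)M; i++)
        dict_free(d->children[i]);
    free(d);
}
```

In an insert routine, the descent step would then become d = dict_child(d, occur); followed by a NULL check, so the tree only ever contains the paths that were actually inserted.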
I'm trying to create a function which takes an array as an argument, adds values to it (increasing its size if necessary) and returns the count of items.
So far I have:
int main(int argc, char** argv) {
int mSize = 10;
ent a[mSize];
int n;
n = addValues(a,mSize);
for(i=0;i<n;i++) {
//Print values from a
}
}
int addValues(ent *a, int mSize) {
int size = mSize;
i = 0;
while(....) { //Loop to add items to array
if(i>=size-1) {
size = size*2;
a = realloc(a, (size)*sizeof(ent));
}
//Add to array
i++;
}
return i;
}
This works if mSize is large enough to hold all the potential elements of the array, but if it needs resizing, I get a Segmentation Fault.
I have also tried:
int main(int argc, char** argv) {
...
ent *a;
...
}
int addValues(ent *a, int mSize) {
...
a = calloc(1, sizeof(ent));
//usual loop
...
}
To no avail.
I assume this is because when I call realloc, the copy of 'a' is pointed elsewhere - how is it possible to modify this so that 'a' always points to the same location?
Am I going about this correctly? Are there better ways to deal with dynamic structures in C? Should I be implementing a linked list to deal with these?
The main problem here is that you're trying to use realloc with a stack-allocated array. You have:
ent a[mSize];
That's automatic allocation on the stack. If you wanted to use realloc() on this later, you would create the array on the heap using malloc(), like this:
ent *a = (ent*)malloc(mSize * sizeof(ent));
So that the malloc library (and thus realloc(), etc.) knows about your array. From the looks of this, you may be confusing C99 variable-length arrays with true dynamic arrays, so be sure you understand the difference there before trying to fix this.
Really, though, if you are writing dynamic arrays in C, you should try to use OOP-ish design to encapsulate information about your arrays and hide it from the user. You want to consolidate information (e.g. pointer and size) about your array into a struct and operations (e.g. allocation, adding elements, removing elements, freeing, etc.) into special functions that work with your struct. So you might have:
typedef struct dynarray {
elt *data;
int size;
} dynarray;
And you might define some functions to work with dynarrays:
// malloc a dynarray and its data and returns a pointer to the dynarray
dynarray *dynarray_create();
// add an element to dynarray and adjust its size if necessary
void dynarray_add_elt(dynarray *arr, elt value);
// return a particular element in the dynarray
elt dynarray_get_elt(dynarray *arr, int index);
// free the dynarray and its data.
void dynarray_free(dynarray *arr);
This way the user doesn't have to remember exactly how to allocate things or what size the array is currently. Hope that gets you started.
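One possible implementation of that interface is sketched below. Several things are assumptions rather than part of the answer: elt is taken to be int, a capacity field is added so the array can grow by doubling, and dynarray_add_elt returns a status code instead of void so that allocation failure can be reported.

```c
#include <stdlib.h>

typedef int elt;   /* assumed element type */

typedef struct dynarray {
    elt *data;
    int  size;      /* elements in use */
    int  capacity;  /* elements allocated (added for this sketch) */
} dynarray;

/* Allocate an empty dynarray with a small initial capacity. */
dynarray *dynarray_create(void)
{
    dynarray *arr = malloc(sizeof *arr);
    if (arr == NULL)
        return NULL;
    arr->size = 0;
    arr->capacity = 4;
    arr->data = malloc((size_t)arr->capacity * sizeof *arr->data);
    if (arr->data == NULL) {
        free(arr);
        return NULL;
    }
    return arr;
}

/* Append a value, doubling the storage when it is full.
   Returns 0 on success, -1 if growing failed (the array is unchanged). */
int dynarray_add_elt(dynarray *arr, elt value)
{
    if (arr->size == arr->capacity) {
        int new_capacity = arr->capacity * 2;
        elt *tmp = realloc(arr->data, (size_t)new_capacity * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        arr->data = tmp;
        arr->capacity = new_capacity;
    }
    arr->data[arr->size++] = value;
    return 0;
}

/* The caller must pass 0 <= index < arr->size. */
elt dynarray_get_elt(const dynarray *arr, int index)
{
    return arr->data[index];
}

void dynarray_free(dynarray *arr)
{
    if (arr == NULL)
        return;
    free(arr->data);
    free(arr);
}
```

Because callers only ever hold a dynarray pointer, realloc() is free to move the data block without invalidating anything they keep.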
Try reworking it so a pointer to a pointer to the array is passed in, i.e. ent **a. Then you will be able to update the caller on the new location of the array.
This is a nice reason to use OOP. Yes, you can do OOP in C, and it even looks nice if done correctly.
In this simple case you don't need inheritance or polymorphism, just the encapsulation and method concepts:
Define a structure with a length and a data pointer, and maybe an element size.
Write getter/setter functions that operate on pointers to that struct.
The 'grow' function modifies the data pointer within the struct, but any struct pointer stays valid.
If you changed the variable declaration in main to be
ent *a = NULL;
the code would work more like you envisioned, because it would no longer call realloc on a stack-allocated array. Setting a to NULL works because realloc treats a NULL pointer as if the user had called malloc(size). Keep in mind that with this change, the prototype of addValues needs to change to
int addValues(ent **a, int mSize)
and that the code needs to handle the case of realloc failing. For example
ent *tmp; // temporary, so *a is not lost if realloc fails
while(....) { //Loop to add items to array
tmp = realloc(*a, size*sizeof(ent));
if (tmp) {
*a = tmp;
} else {
// allocation failed. either free *a or keep *a and
// return an error
}
//Add to array
i++;
}
Some implementations of realloc may internally reserve more memory than requested when a buffer grows, but the C standard only guarantees the size you ask for, so the original code's
size = size * 2;
is still required whenever the buffer is full.
You are passing the array pointer by value. What this means is:
int main(int argc, char** argv) {
...
ent *a; // This...
...
}
int addValues(ent *a, int mSize) {
...
a = calloc(1, sizeof(ent)); // ...is not the same as this
//usual loop
...
}
so changing the value of a in the addValues function does not change the value of a in main. To change the value of a in main you need to pass a reference to it to addValues. At the moment, the value of a is being copied and passed to addValues. To pass a reference to a use:
int addValues (ent **a, int mSize)
and call it like:
int main(int argc, char** argv) {
...
ent *a; // This...
...
addValues (&a, mSize);
}
In the addValues, access the elements of a like this:
(*a)[element]
and reallocate the array like this:
(*a) = calloc (...);
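Putting these pieces together, a minimal sketch might look like this. It is not the answer's exact function: ent is assumed to be int, the elided input loop is replaced by a hypothetical src array of count values, and an extra capacity parameter is added so the caller knows how much was allocated.

```c
#include <stdlib.h>

typedef int ent;   /* assumed element type */

/* Append `count` values from `src` to the array at *a, growing it as needed.
   *a must be NULL or a pointer obtained from malloc/calloc/realloc, and
   *capacity is the number of elements currently allocated.
   Returns the number of elements stored, or -1 if an allocation fails
   (in which case *a is still valid and must still be freed). */
int addValues(ent **a, int *capacity, const ent *src, int count)
{
    int n = 0;

    for (int k = 0; k < count; k++) {
        if (n >= *capacity) {
            int new_capacity = (*capacity > 0) ? *capacity * 2 : 4;
            ent *tmp = realloc(*a, (size_t)new_capacity * sizeof *tmp);
            if (tmp == NULL)
                return -1;
            *a = tmp;                 /* the caller's pointer is updated here */
            *capacity = new_capacity;
        }
        (*a)[n++] = src[k];
    }
    return n;
}
```

In main this would be called as ent *a = NULL; int cap = 0; n = addValues(&a, &cap, src, count); followed by free(a) when the array is no longer needed.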
Xahtep explains how your caller can deal with the fact that realloc() might move the array to a new location. As long as you do this, you should be fine.
realloc() might get expensive if you start working with large arrays. That's when it's time to start thinking of using other data structures -- a linked list, a binary tree, etc.
As stated you should pass pointer to pointer to update the pointer value.
But I would suggest a redesign that avoids this technique; in most cases it can and should be avoided. Without knowing what exactly you are trying to achieve it's hard to suggest an alternative design, but I'm 99% sure that it's doable another way. And as Javier said, think object-oriented and you will always get better code.
Are you really required to use C? This would be a great application of C++'s "std::vector", which is precisely a dynamically-sized array (easily resizable with a single call you don't have to write and debug yourself).